Abstract:
Grouping income data is often essential in social science research, since respondents are generally inclined to report income as an interval rather than an exact amount. This thesis studies the form of the household income distribution and methods of grouping income data that produce optimal group boundaries. The methods considered are (i) Dalenius and Hodges's cumulative √f rule, (ii) Walter D. Fisher's squared distance criterion and (iii) B.B. Aghevli and F. Mehran's average condition method. The first two methods aim at homogeneity of the elements within each group, while the third aims at estimation of the Gini index of income inequality. Researchers most often group income data into 10, 5 or 3 classes. Therefore, this study investigates grouping of income data into 10, 5 and 3 classes, using household income data from the Household Master Sample Formation and Analysis for Survey in Bangkok Metropolis. The study finds that the distributions of household income of Bangkok Metropolis, of Bangkok municipal areas and of Bangkok non-municipal areas all have the form $1 - \frac{x_0^a}{x_0^a + x^a}$, where $x_0$ is the median income and $a$ is the Pareto constant. In addition, household income of Bangkok non-municipal areas is also found to follow a gamma-type distribution. As for the grouping of income, the squared distance method is found to be inappropriate for large quantities of data and inconvenient to compute. Hence, when grouping into 3 classes is desired, the appropriate method is the cumulative √f method which, although it gives only an approximate result, is simpler and more convenient to compute than the average condition method, which requires knowledge of the form of the income distribution. However, when grouping into 5 classes is desired, the skewness of the distribution should also be considered. If the distribution is highly skewed, the cumulative √f method is recommended.
But if the degree of skewness is low, the purpose of the grouping methods should be considered instead. For grouping into 10 classes, the choice of an appropriate method depends on how the resulting grouping will be used.
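
As an illustration of the cumulative √f rule named above, the following is a minimal sketch in Python and not the procedure used in the thesis. It assumes equal-width preliminary bins and places each class boundary at the upper edge of the first bin whose cumulative √f reaches the next equal share of the total. The function name `cum_sqrt_f_boundaries` and the parameter `n_bins` are hypothetical, introduced only for this example.

```python
import math


def cum_sqrt_f_boundaries(values, n_classes, n_bins=50):
    """Approximate class boundaries by the cumulative sqrt(f) rule.

    Assumptions (not taken from the thesis): equal-width preliminary
    bins, and each boundary is the upper edge of the first bin whose
    cumulative sqrt(frequency) reaches k/n_classes of the total.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to group")
    width = (hi - lo) / n_bins

    # Frequency f of each preliminary bin.
    freq = [0] * n_bins
    for v in values:
        i = min(int((v - lo) / width), n_bins - 1)  # max value goes in last bin
        freq[i] += 1

    # Cumulative sum of sqrt(f) over the bins.
    cum = []
    total = 0.0
    for f in freq:
        total += math.sqrt(f)
        cum.append(total)

    # Cut where cumulative sqrt(f) crosses equal shares of the total.
    # With very coarse bins two targets can fall in the same bin,
    # which would repeat a boundary; more bins avoids this.
    boundaries = []
    k = 1
    for i, c in enumerate(cum):
        while k < n_classes and c >= k * total / n_classes:
            boundaries.append(lo + (i + 1) * width)
            k += 1
    return boundaries
```

Calling `cum_sqrt_f_boundaries(incomes, 3)` on a list of reported incomes returns two cut points that define three classes. The result depends on the number of preliminary bins, which is consistent with the abstract's remark that the rule gives only an approximate result.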