How Accurate Are the Incomes Reported in the Household Economic Survey?

This paper was preliminary and was circulated for discussion in 1997 (this version was revised in 2000). The issues it raises were taken up by Statistics New Zealand and have been largely dealt with. (A major revision has been to the household weightings.) It is placed here on the website because researchers using the earlier data occasionally ask for it. It also illustrates the universal rule of always checking the quality of one’s data before using it.

Keywords: Statistics;


As the result of a generous grant from the Prince Albert College Trust, it has been possible to place in the public domain, for research purposes, quasi-unit records (QURs) from the Household Economic Survey (HES),[1] one of the regular surveys administered by Statistics New Zealand.

Each QUR is an average of three household unit records from the HES. This ensures the privacy protection that the law requires for Statistics New Zealand respondents. Yet in many ways a QUR behaves like a single unit record, especially where the analysis is linear. Moreover, the three households for each quasi-unit record are not chosen randomly: they belong to the same household type, so they have a common household structure, and they are combined so as to preserve the total household income structure.[2] Thus the income characteristics of the households in the HES are preserved in the QURs more precisely than if the three households had been combined by simple random selection (or by random selection stratified by household type).
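The grouping rule (rank the households of one type by income, then average consecutive groups of three) can be sketched as follows. This is only an illustration, not Statistics New Zealand's procedure: the function name `build_qurs`, the record keys `type` and `income`, and the decision to ignore any households left over when a type does not divide by three are all assumptions made for the example.

```python
from collections import defaultdict


def build_qurs(households, group_size=3):
    """Combine unit records into quasi-unit records (QURs).

    `households` is a list of dicts with the hypothetical keys
    'type' (household type) and 'income' (total household income).
    """
    by_type = defaultdict(list)
    for hh in households:
        by_type[hh["type"]].append(hh)

    qurs = []
    for hh_type, members in by_type.items():
        # Rank from largest to smallest income within the household type.
        ranked = sorted(members, key=lambda h: h["income"], reverse=True)
        # Average consecutive groups of three, so each QUR combines
        # households with similar incomes. Leftover households are
        # ignored here (an assumption of this sketch).
        for i in range(0, len(ranked) - group_size + 1, group_size):
            group = ranked[i:i + group_size]
            mean_income = sum(h["income"] for h in group) / group_size
            qurs.append({"type": hh_type, "income": mean_income})
    return qurs


# Illustrative incomes only; these are not HES figures.
example = [
    {"type": "couple", "income": income}
    for income in (90000, 60000, 30000, 80000, 50000, 40000)
]
qurs = build_qurs(example)
for q in qurs:
    print(q["type"], round(q["income"], 2))
```

Because neighbours in the income ranking are averaged together, the QUR incomes stay close to the incomes of the households they replace, which is the sense in which the income structure is preserved.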

Thus far only the 1994/5 data year is available, but quasi-unit records will be obtained for all years back to 1981/2. The current state of the research project is an exploration of the 1994/5 data, to obtain an understanding of its significance and robustness. Lessons learned from this study are likely to influence the structure of the QURs for other years.

This paper explores the aggregate income implied by the data. With each QUR is Statistics New Zealand’s estimate of the number of households that the QUR represents, based upon the sampling weights of the HES. Thus by aggregating up the QUR we obtain an estimate of income totals for all households. Since all the procedures used here are linear, they will give the same estimates as if the figures had been based upon the unit records. By comparing those estimates with known official estimates we can gain some notion of the income coverage ratio of the HES.
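The aggregation step amounts to a weighted sum: each QUR's income is multiplied by the number of households it represents, and the products are added up. A minimal sketch, assuming a hypothetical record layout (a `weight` field and an `income` dict keyed by component) and made-up figures:

```python
def aggregate_by_component(qurs, components):
    """Weighted totals: sum of (households represented) x (QUR income)."""
    totals = {c: 0.0 for c in components}
    for q in qurs:
        for c in components:
            totals[c] += q["weight"] * q["income"][c]
    return totals


# Illustrative records only; weights and incomes are not HES values.
sample_qurs = [
    {"weight": 1000.0, "income": {"wages": 40000.0, "self_employed": 5000.0}},
    {"weight": 500.0, "income": {"wages": 20000.0, "self_employed": 0.0}},
]
totals = aggregate_by_component(sample_qurs, ["wages", "self_employed"])
for component, total in totals.items():
    print(f"{component}: ${total / 1e6:.1f}m")
```

The totals are linear in the incomes, which is why averaging households into QURs before aggregating need not change the result.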

The Income Data

While the data is said to apply to the year to March 1995, it represents the information collected from households surveyed in that period. Each household is asked to report its income for the previous 12 months. Thus while a household surveyed in March 1995 would report its estimate of income for the year to March 1995, a household surveyed in April 1994 (a part of the same data year) would report its income for the year to April 1994, some 11 months earlier.

It follows that the income data from the HES for the year 1994/5 (i.e. to March 1995) corresponds more closely, on average, to income for the year to September 1994. Since most other income data are collected on a March year basis, this complicates the comparison. To estimate the September 1994 year, the figures for the March 1995 year and the March 1994 year are averaged (which is not very different from what, in effect, the HES is doing).
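The timing argument and the averaging step can be sketched as follows. The function name `september_year` and the input figures are illustrative assumptions, as is the even spread of interviews over the twelve survey months.

```python
MONTHS = [
    "Apr 1994", "May 1994", "Jun 1994", "Jul 1994", "Aug 1994", "Sep 1994",
    "Oct 1994", "Nov 1994", "Dec 1994", "Jan 1995", "Feb 1995", "Mar 1995",
]

# Each household reports the 12 months ending in its survey month.
# Assuming interviews are spread evenly, the average reference year
# ends at the mean position in the list of survey months.
mean_offset = sum(range(len(MONTHS))) / len(MONTHS)
lower, upper = MONTHS[int(mean_offset)], MONTHS[int(mean_offset) + 1]
print(f"Average reference year ends between {lower} and {upper}")


def september_year(march_year_before, march_year_after):
    """Approximate a September-year total as the mean of the two March years."""
    return (march_year_before + march_year_after) / 2


# Made-up March-year totals in $ million, for illustration only.
estimate = september_year(100.0, 110.0)
print(estimate)
```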

Incomes of households are collected by the following components: Wages & Salaries; Self Employed Income; Public Superannuation; Government Benefits; Investment Income; Private Superannuation; Other Income; TOTAL INCOME.

Household Income for the September 1994 year.

The conceptual framework chosen for comparison with the aggregate income of the QURs is that of the System of National Accounts (SNA). The tabulation which seems most relevant is the estimate of household income for the household institutional sector.[3]

Two SNA items, the imputed rent on owner occupied housing and the imputed interest on pension funds, are omitted because they are not requested for the HES, nor are they relevant for our purposes. Estimates for National Superannuation and Government Benefits were derived separately because the SNA concept for “Social Assistance Grants” includes health and pharmaceutical benefits (reflecting international definitions).

Note that the last component “other income including private superannuation payments” may not correspond to the aggregate derived from the institutional sector accounts. Further work is necessary. (A particular issue is where the ACC benefits should best go.)

The Table below gives the comparison.


Component                                HES (QUR)   National Accounts   HES / National Accounts   Notes
Wages & Salaries                         $32,667m    $36,056m            90.6%                     Compensation of Employees
Self Employed Income                     $4,885m     $9,139m             53.5%                     Entrepreneurial
National Superannuation                  $4,137m     $5,091m             81.3%                     NZOYB
Government Benefits                      $3,433m     $4,727m             72.6%                     NZOYB
Investment Income                        $2,433m     $2,809m             86.6%                     Received Interest & Dividends
Other Income inc Private Superannuation  $838m       $3,273m             25.7%                     Other Income inc ACC benefits
TOTAL INCOME                             $49,650m    $61,095m            81.3%

Sources of National Accounts estimates: NZIER estimates for March years, unless the notes indicate NZOYB (New Zealand Official Yearbook).

It is evident that there is not a very good match between the two sets of aggregates. In summary the HES covers the following proportions of the SNA estimates:
Wages & Salaries – 90.6%
Self Employed Income – 53.5%
National Superannuation – 81.3%
Government Benefits – 72.6%
Investment Income – 86.6%
Other Income (including Private Superannuation) – 25.7%
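Each proportion is the HES aggregate divided by the corresponding National Accounts aggregate. The sketch below recomputes them from the dollar figures in the table. The National Accounts figures for wages and for government benefits are taken here as $36,056m and $4,727m, which is an assumption: these are the values implied by the stated ratios and by the National Accounts total of $61,095m.

```python
# (HES, National Accounts) aggregates in $ million.
figures = {
    "Wages & Salaries": (32667, 36056),      # National Accounts value assumed
    "Self Employed Income": (4885, 9139),
    "National Superannuation": (4137, 5091),
    "Government Benefits": (3433, 4727),     # National Accounts value assumed
    "Investment Income": (2433, 2809),
    "Other Income (including Private Superannuation)": (838, 3273),
    "TOTAL INCOME": (49650, 61095),
}


def coverage(hes, national_accounts):
    """HES aggregate as a percentage of the National Accounts aggregate."""
    return 100.0 * hes / national_accounts


ratios = {name: coverage(hes, na) for name, (hes, na) in figures.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}%")
```

The recomputed ratios agree with the listed ones to within about a tenth of a percentage point.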

Assuming the SNA estimates are reasonably accurate, the reasons for the divergence may include:
(i) conceptual differences (especially for “other income” and between categories);
(ii) omissions from the HES of the income of those not living in households (but, for example, in rest homes and hostels). It may be possible to allow for this;
(iii) poor recall by the respondent;
(iv) confusion between before-tax and after-tax incomes (and the complication in government transfers of abatements and surcharges);
(v) deliberate deceit.

It should be emphasized that Statistics New Zealand are aware of problems which arise from the survey (i.e. iii, iv, v), and have taken steps to minimize them.

The most serious gaps are the underestimates for other incomes (probably explanation (i) and possibly (iii)), and self employed incomes (probably a mix of (iii) and (v)). But all the differences are uncomfortably large.

What is to be done? The short answer is – more work refining this note.

At some stage we may want to adjust the HES data for the under-reporting. A quick solution would be to increase each component proportionally (scaling every reported amount of that component by a common factor), but this may not be particularly reliable, especially where very large increases are required.

In any case this involves two assumptions which do not seem especially plausible. The first is that there is no failure to report any income (actual income reported as zero would still be zero after the adjustment). The second is that misreporting (for whatever reason) is proportional, where other work shows that it is not.[4] Note that because different income components have different reportage errors, distributional inferences could be wildly wrong unless some adjustment is made.
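The proportional adjustment and its first weakness can be seen in a short sketch. The coverage ratios are those reported in the summary above; the component keys, the function name `scale_up` and the household figures are hypothetical.

```python
# HES coverage of the SNA estimates, as proportions.
COVERAGE = {
    "wages": 0.906,
    "self_employed": 0.535,
    "national_super": 0.813,
    "benefits": 0.726,
    "investment": 0.866,
    "other": 0.257,
}


def scale_up(reported):
    """Divide each reported component by its coverage ratio."""
    return {c: amount / COVERAGE[c] for c, amount in reported.items()}


# A made-up household that reports no self-employed income.
reported = {"wages": 30000.0, "self_employed": 0.0, "other": 1000.0}
adjusted = scale_up(reported)
for component, amount in adjusted.items():
    print(f"{component}: {reported[component]:.0f} -> {amount:.0f}")
```

Income reported as zero stays zero whatever the factor, while reported "other" income is multiplied nearly fourfold, so the adjustment changes the relative position of households according to the mix of income they report.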

In the interim we need to be cautious about drawing any conclusions where income levels are important. Unless some adjustment is made, no matter how crude, the conclusions will almost certainly be wrong.


[1] The HES and the QURs are documented in Statistics New Zealand, 1994/95 Household Economic Survey: Background Notes, Wellington, 1996, and Statistics New Zealand, Quasi Unit Record Data From the 1994/95 HES, Wellington, 1996.
[2] For each household type, the households are ranked from largest to smallest income, and then combined in groups of three.
[3] Statistics New Zealand, New Zealand Institutional Sector Accounts: Issues and Experimental Accounts, 1987-1995, Wellington, 1996.
[4] B.H. Easton, Income Distribution in New Zealand, NZIER Research Paper No 28, Wellington, 1983.
