In a forthcoming article, we describe a new methodology for estimating tax liabilities in the public-use Survey of Consumer Finances (SCF) microdata files. Conducted most recently in 2019, the SCF is a triennial household survey containing detailed information on the demographics, income, and balance sheet of the designated survey respondent and, where applicable, the respondent's spouse or partner. The survey also collects basic demographic information, indicators of financial dependence, and summary measures of income for up to ten additional household members. The SCF is unique among public-use household surveys because it oversamples wealthy households and is therefore suitable for studying trends in top wealth and income shares (Bhutta et al. 2020; Bricker et al. 2016). Like most household surveys, however, the SCF does not ask detailed questions about household tax filing or tax liabilities.
At the outset, a natural question is why adding tax estimates to SCF data is useful. There are general and specific answers. At the most general level, tax data alone are insufficient for analyzing tax policy, because income reported on tax forms is already shaped by tax laws, avoidance strategies, and evasion practices. In terms of specific application, we show in a companion paper that less than half of all income generated by closely held businesses in the United States actually shows up on tax forms. The massive "leakage" between the generation of economic income and the recording of income on tax forms is an area ripe for policy analysis and recommendation. A dataset like the SCF, which contains information on household income as well as wealth, can be a valuable tool in undertaking such an analysis, provided it also contains tax information.
Our overall strategy is to divide SCF households into tax units, reconcile the survey and taxable measures of income, create the other inputs needed to estimate income tax liabilities, and then estimate income taxes for the SCF tax-unit micro files using the most recent version of the NBER TAXSIM online tax calculator. TAXSIM replicates U.S. federal tax rules over time, including the survey years 1995 to 2019 (tax years 1994 to 2018) covered by the SCF microdata files we use.
We proceed in several stages. First, we create tax units within SCF households. For most SCF households, such as a single person or a married couple living alone or with dependent children, this process is straightforward. These households also account for the vast majority of income. Some households, however, contain more than one potential tax-filing unit because they are made up of different generations or unrelated individuals. In these cases, we use data on demographic relationships, measures of financial dependence, marital history, and income to simulate the filing units. We also compare our simulated results with the tax return statistics published in the Statistics of Income (SOI).
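To make the idea of splitting a household into tax units concrete, the following minimal sketch applies one simplified rule. It is an illustration only, not the procedure used in the paper: the `Member` and `TaxUnit` structures, the `role` coding, and the single `dependent` flag are all hypothetical, and the actual method also draws on marital history and income.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    person_id: int
    role: str        # hypothetical coding: "respondent", "spouse", or "other"
    dependent: bool  # hypothetical flag: financially dependent on respondent/spouse
    income: float


@dataclass
class TaxUnit:
    filers: list = field(default_factory=list)
    dependents: list = field(default_factory=list)

    @property
    def income(self) -> float:
        # Only filers' income counts toward the unit in this simplified sketch.
        return sum(m.income for m in self.filers)


def split_household(members):
    """Assign household members to tax units under a simplified rule.

    The respondent and spouse form the primary unit, financially dependent
    members attach to it, and every other member files as a separate unit.
    """
    primary = TaxUnit()
    units = [primary]
    for m in members:
        if m.role in ("respondent", "spouse"):
            primary.filers.append(m)
        elif m.dependent:
            primary.dependents.append(m)
        else:
            units.append(TaxUnit(filers=[m]))
    return units


household = [
    Member(1, "respondent", False, 60_000.0),
    Member(2, "spouse", False, 40_000.0),
    Member(3, "other", True, 0.0),        # dependent child
    Member(4, "other", False, 25_000.0),  # financially independent adult
]
units = split_household(household)
```

In this example the household yields two tax units: the couple with one dependent, and the independent adult filing alone.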
Second, we map SCF income into income concepts consistent with those reported on tax forms. SCF income is largely designed to be consistent with its taxable counterparts, but even after resolving the conceptual differences, we show that the survey values are consistently higher than the published tax values. While we do not fully explore the aggregate and distributional differences across income categories, the key observation that emerges is that the business income gap (mathematically) explains most of the overall income gap.
Third, we model itemized deductions. Taxpayers can choose between itemizing deductions and taking a standard deduction that varies by filing status. The SCF captures about half of itemizable expenses, and we impute the other half using published SOI deductions. Our two key benchmarks are how closely we track the number of filers who choose to itemize and the total value of itemized deductions.
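The choice between itemizing and taking the standard deduction, together with the two benchmarks just described, can be sketched as follows. This is a simplified illustration under stated assumptions: the function and field names are hypothetical, the standard deduction amounts are illustrative placeholders rather than values taken from the paper, and the imputed amounts are simply supplied as inputs rather than derived from SOI tables.

```python
# Illustrative standard deduction amounts by filing status (assumed values).
STANDARD_DEDUCTION = {"single": 12_000.0, "married_joint": 24_000.0}


def chosen_deduction(filing_status, observed_itemized, imputed_itemized):
    """Return (deduction, itemizes) for one tax unit.

    observed_itemized: itemizable expenses captured in the survey
    imputed_itemized:  expenses not captured in the survey (assumed given here)
    """
    standard = STANDARD_DEDUCTION[filing_status]
    itemized = observed_itemized + imputed_itemized
    itemizes = itemized > standard
    return (itemized if itemizes else standard), itemizes


# Hypothetical tax units: (filing status, observed, imputed)
tax_units = [
    ("single", 8_000.0, 7_000.0),
    ("single", 3_000.0, 2_000.0),
    ("married_joint", 20_000.0, 10_000.0),
]

results = [chosen_deduction(*u) for u in tax_units]

# The two benchmarks: number of itemizers and total itemized deductions.
n_itemizers = sum(1 for _, itemizes in results if itemizes)
total_itemized = sum(d for d, itemizes in results if itemizes)
```

Comparing `n_itemizers` and `total_itemized` against published totals is the kind of check the benchmarks imply; in practice each tax unit would also carry a survey weight.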
Fourth, we present basic estimates of tax liabilities, before and after credits, using the NBER TAXSIM model and compare them with published SOI values. Because incomes are consistently higher in the SCF than in the SOI, our estimated tax liabilities are also higher. Because the income gap between the SCF and the SOI is concentrated at the top of the income distribution and the tax system is progressive, the tax liability gap is, not surprisingly, larger than the income gap.
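A small numerical sketch shows why a progressive schedule turns an income gap concentrated at the top into a proportionally larger tax gap. The two-bracket schedule and the incomes below are invented for illustration; they are not TAXSIM rules or figures from the paper.

```python
def tax(income, threshold=50_000.0, low_rate=0.10, high_rate=0.30):
    """Hypothetical two-bracket progressive schedule (not actual tax law)."""
    below = min(income, threshold)
    above = max(income - threshold, 0.0)
    return low_rate * below + high_rate * above


# Two hypothetical tax units; only the top unit's income differs by source.
soi_incomes = [40_000.0, 200_000.0]  # stand-in for published tax values
scf_incomes = [40_000.0, 300_000.0]  # stand-in for survey values

income_ratio = sum(scf_incomes) / sum(soi_incomes)
tax_ratio = sum(tax(y) for y in scf_incomes) / sum(tax(y) for y in soi_incomes)

print(f"income ratio (survey / tax data): {income_ratio:.3f}")
print(f"tax ratio    (survey / tax data): {tax_ratio:.3f}")
```

Because the additional survey income is taxed entirely at the higher marginal rate, the ratio of tax liabilities exceeds the ratio of incomes.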
We conclude by noting that the results in this methodological paper, in particular the differences in business income between the data sources, have important implications for recent controversies regarding the distribution of income and wealth. We explore these topics in a companion paper that builds on the methodology developed here.