-------------------------------------------------------------------------------
help for taxpuf27
-------------------------------------------------------------------------------

NBER TAXSIM model for federal and state income taxes - Full
-----------------------------------------------------------

Description
-----------

    taxpuf27[,full output secondary|interest|long temp]

calculates federal and state income tax liability from a transformed version
of the SOI public use file. Whereas the Stata procedure taxsim32 uses a few
input variables likely to be available in a survey, taxpuf27 uses all the data
available in the public use files, about 200 values per taxpayer.

The TAXSIM version of the public use file is documented elsewhere, but it
includes variables named data1 through data210 for various income, deduction,
and demographic characteristics. For example, data100 is the taxpayer id
variable and data11 is wages. A complete list is at
http://www.nber.org/taxsim-ndx.txt.

One of the TAXSIM PUF files must be in the workspace before calling
{hi:taxpuf27}. The program restores your Stata workspace after creating the
file taxsim_out.dta, which holds the various liabilities and marginal tax
rates. The two files can be merged for further analysis.

Here is an example of a job that calculates year 2000 tax liabilities with
TAXSIM and compares them to taxpayer-reported liabilities:

    . use /home/data/soi/taxsim/dta/s2000
    . taxpuf27
    . gen fiitax = max(0,c1)
    . reg fiitax data16 [pw=data1]

This loads the year 2000 2% subset, calculates taxes, truncates tax liability
at zero (to match SOI conventions), and regresses the TAXSIM-calculated value
of federal liability on the taxpayer-reported value using probability weights.
Because c1 is a TAXSIM output variable, taxsim_out.dta must be merged with the
original data before the gen step. You should see an R-squared above 0.99.

The law used is that of the year specified in data103. It can be set to any
value from 1960 to 2018; however, state tax will be zero for any year outside
1977-2018.
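
A minimal sketch of recalculating under another year's law, assuming only what
is stated above: data103 holds the law year, and the year 2000 2% subset sits
in the directory used in the earlier example. The choice of 1995 is arbitrary
and purely illustrative.

```stata
* Sketch only: apply 1995 law to the year 2000 2% subset.
* Assumes data103 is the law-year variable, as described above.
use /home/data/soi/taxsim/dta/s2000, clear
replace data103 = 1995      // any year from 1960 to 2018; 1995 is arbitrary
taxpuf27                    // writes taxsim_out.dta in the current directory
```

Since 1995 falls inside 1977-2018, state tax should not be forced to zero in
this run.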

When tax liabilities are calculated under another year's law, some variables
may be missing; these are silently set to zero.

The tax calculator itself is the same FORTRAN program that the NBER has been
updating annually since 1974. This interface converts the data to ASCII, runs
the tax calculator against it, then reads the calculator's output and converts
it to a Stata dataset. The full tax calculator is available only while logged
onto the NBER Unix cluster, not via the Internet.

Data
----

Input files are available for all years from 1960 through 2018, except for
1961, 1963, and 1965. Each year is available as a 2% subset (about 2,000
taxpayers) or a full version. For example, x1999.dta is the full dataset and
s1999.dta is the subset. All files are kept in /home/data/soi/taxsim/dta.

Options
-------

output: Specify the name of the output dataset. The default is
    taxsim_out.dta in the current directory.

secondary: Calculate marginal tax rates with respect to the secondary wage
    earner. The default is a weighted average of the primary and secondary
    wage earners.

interest: Calculate marginal tax rates with respect to interest income.

long: Calculate marginal tax rates with respect to long term gains.

temp: Save temporary files to disk.

Notes:
------

{p 4 4 2}
Please examine or read all of the material below.

Dollar amounts are rounded to the nearest dollar before transmission to the
calculator, and calculated amounts are rounded in the same way.

A general description of TAXSIM is given in
http://www.nber.org/taxsim/feenberg-coutts.pdf.

More information about the data collection at NBER is given in
http://www.nber.org/taxsim-notes.html.

A variable list is given in http://www.nber.org/taxsim-ndx.txt.

Daniel Feenberg
feenberg@nber.org
617-863-0343

Online: help for taxpuf