SOI Sample Methodology
(Provided by Paul Arnsberger, IRS SOI Division, October 17, 2007)
We (SOI) draw two samples per study year: one of 501(c)(3) organizations and one from a pool of 501(c)(4)-(9) organizations. In addition to subsection code, the samples are stratified by end-of-year book value of assets as reported on Forms 990 and 990-EZ. These are the only two sampling criteria we use. Here is a breakdown of the stratified random sample designs for Tax Year 2004.
We have used $50,000,000 as the 100 percent sampling threshold for 501(c)(3) organizations since Tax Year 2003. (Prior to that, it was $30,000,000.) We change the sample rates annually and re-evaluate the sample design every five years or so.
For more information, I am attaching a copy of the "Data Sources and Limitations" section from our most recent SOI Bulletin article on exempt organizations:
The statistics in this article are based on a sample of the 2004 Forms 990, Return of Organization Exempt From Income Tax, and Forms 990-EZ, Short Form Return of Organization Exempt From Income Tax. Organizations were required to file the 2004 form when their accounting periods ended any time between December 31, 2004, and November 30, 2005. The sample did not include private foundations, which were required to file Form 990-PF. Most churches and certain other types of religious organizations were also excluded from the sample because they were not required to file Form 990 or Form 990-EZ. The sample included only those returns with average receipts of more than $25,000, the filing threshold.
The sample design was split into two parts: the first sampling frame contained all returns filed by organizations exempt under section 501(c)(3); the second sampling frame comprised a pool of all returns filed by organizations exempt under sections 501(c)(4) through (9). Organizations tax-exempt under other Code sections were excluded from the sample frames. The data presented were obtained from returns as originally filed with the Internal Revenue Service. They were subjected to comprehensive testing and correction procedures in order to improve statistical reliability and validity. However, in most cases, changes made to the original return as a result of either administrative processing or taxpayer amendment were not incorporated into the database.
The two samples were classified into strata based on the size of end-of-year total assets, with each stratum sampled at a different rate. For section 501(c)(3) organizations, a sample of 15,070 returns was selected from a population of 279,415. Sampling rates ranged from 1.24 percent for organizations reporting total assets less than $500,000 to 100 percent for organizations with total assets of $50,000,000 or more. The second sample contained 6,669 records selected from the population of 111,010 returns filed by organizations exempt under sections 501(c)(4) through (9). Sampling rates ranged from 1.11 percent for organizations reporting total assets less than $150,000 to 100 percent for organizations with assets of $10,000,000 or more. The filing populations for these organizations included some returns of terminated organizations, returns of inactive organizations, duplicate returns, and returns of organizations filed with tax periods prior to 2004. However, these returns were excluded from the final sample and the estimated population counts.
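As a rough illustration of how the quoted sampling rates relate to population weights, the sketch below computes a design weight as the inverse of the sampling rate, along with the overall sampling fraction for each sample. The inverse-rate formula is an assumption made for illustration; the weights SOI actually assigns may be derived from realized stratum counts. The function names are hypothetical.

```python
def design_weight(sampling_rate_percent: float) -> float:
    """Approximate number of population returns that one sampled return represents.

    Assumes weight = 1 / sampling rate, which is an illustrative simplification.
    """
    if not 0 < sampling_rate_percent <= 100:
        raise ValueError("sampling rate must be greater than 0 and at most 100")
    return 100.0 / sampling_rate_percent


def overall_sampling_fraction(sample_size: int, population_size: int) -> float:
    """Share of the filing population that was selected into the sample."""
    return sample_size / population_size


# Rates and counts quoted in the text for Tax Year 2004.
print(round(design_weight(1.24), 1))   # smallest 501(c)(3) stratum
print(design_weight(100))              # 100 percent stratum
print(round(design_weight(1.11), 1))   # smallest 501(c)(4)-(9) stratum
print(round(overall_sampling_fraction(15_070, 279_415), 4))
print(round(overall_sampling_fraction(6_669, 111_010), 4))
```

Under this assumption, a return sampled at 1.24 percent stands in for roughly 81 returns in the population, while a return in a 100 percent stratum represents only itself.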
CHANGES TO THE IRS SOI SAMPLING STRATA OVER TIME
(Jon Durnford, NCCS, Oct 2007)
The largest sample categories (marked with * below) have a population weight of 1, meaning that one hundred percent of the organizations in these categories are included in the sample. Organizations in smaller categories have population weights greater than 1.
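To show how such weights are typically applied, here is a minimal sketch of a weighted estimate of a population total and a population count. The records and their weights are made-up values for illustration only and are not taken from any SOI file; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def estimate_population_total(records: Iterable[Tuple[float, float]]) -> float:
    """Sum of (reported amount * population weight) over the sampled records."""
    return sum(amount * weight for amount, weight in records)


# Hypothetical sampled records: (reported amount, population weight).
sample = [
    (60_000_000.0, 1.0),  # large-organization category: represents only itself
    (200_000.0, 80.0),    # smaller category: stands in for about 80 returns
    (350_000.0, 80.0),
]

estimated_total = estimate_population_total(sample)
estimated_count = sum(weight for _, weight in sample)

print(estimated_total)
print(estimated_count)
```

Records with weight 1 contribute their reported amount unchanged, while records from smaller categories are scaled up to represent the returns that were not sampled.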
SOI changed the sample code categories used for Forms 990/990-EZ in 1998 and in 2003, as follows:
NOTE: * = large 990/990-EZ organization sample threshold (weight = 1)
SOI did not change the sample code categories used for Form 990-PF from at least 1997 (most likely earlier) through 2004 (shown below). Throughout that period, the large organization threshold (weight = 1) was assigned to private foundations in categories 16 and 17. For charitable trusts, the large organization threshold was assigned to category 23 until 2002. In 2003, all charitable trust categories (21, 22, and 23) were assigned weight = 1.
NOTE: WEIGHT STATS: See the attached file for the minimum and maximum WEIGHT values in SOI file years 1982-2004.