On November 1-2, the Aspen Institute’s Program on Philanthropy and Social Innovation hosted a Form 990 “Vali-Datathon,” the second in a series of hands-on events designed to improve access to electronically filed nonprofit tax forms, specifically Form 990s.
Form 990s contain useful information on the missions, governance and finances of nonprofit organizations. Until recently, however, the IRS made these forms available, and sold them, only as non-searchable images, which greatly limited their potential.
After years of advocacy and a federal lawsuit, the Internal Revenue Service (IRS) released electronically filed Form 990s as open data on Amazon Web Services in June 2016. Currently, approximately 60% of Form 990s are filed electronically and are thus available on Amazon Web Services.
This historic release by the IRS was a huge step toward free, searchable data on the nonprofit sector. Even so, researchers, journalists, independent data analysts and others immediately faced challenges working with the complex XML dataset, which contains dozens of versions of the form. The need for a unified set of standards quickly became apparent. To address this, the Aspen Institute coordinated regular conversations among data scientists and researchers to discuss technical concerns and promote shared fixes.
These conversations grew into the Nonprofit Open Data Collective, a loose collaboration among leading organizations and individuals working with Form 990 data, including GuideStar, Charity Navigator, the Urban Institute, nonprofit scholars, and independent data professionals, all of whom recognize the value of the 990 data and the importance of improving its accessibility.
At the group’s first event – a Datathon held at the Aspen Institute in May – participants mapped variables from the Form 990 to “xpath” locations in the electronic 990s.
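To illustrate what such a mapping looks like in practice, here is a minimal sketch in Python. It is not the Collective's actual tooling: the variable name `total_revenue`, the helper functions, and the tiny sample documents are hypothetical, and the element names and namespace are assumptions chosen to show how the same form field can sit at a different location in different versions of the form.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the e-file XML; treat as illustrative.
NS = {"efile": "http://www.irs.gov/efile"}

# Hypothetical mapping: one variable name, several candidate locations,
# one for each form version in which the element name differs.
XPATHS = {
    "total_revenue": [
        "efile:ReturnData/efile:IRS990/efile:CYTotalRevenueAmt",
        "efile:ReturnData/efile:IRS990/efile:TotalRevenueCurrentYear",
    ],
}


def extract(xml_text, variable):
    """Return the text at the first mapped location that exists, else None."""
    root = ET.fromstring(xml_text)
    for path in XPATHS[variable]:
        node = root.find(path, NS)
        if node is not None and node.text:
            return node.text.strip()
    return None


def make_sample(tag, value):
    """Build a tiny stand-in filing with a single element (not real data)."""
    return (
        '<Return xmlns="http://www.irs.gov/efile">'
        "<ReturnData><IRS990>"
        f"<{tag}>{value}</{tag}>"
        "</IRS990></ReturnData></Return>"
    )


# Two made-up filings that store the same field under different names.
NEWER = make_sample("CYTotalRevenueAmt", "125000")
OLDER = make_sample("TotalRevenueCurrentYear", "98000")

if __name__ == "__main__":
    print(extract(NEWER, "total_revenue"))
    print(extract(OLDER, "total_revenue"))
```

Checking a mapping then amounts to confirming, for filings of each version, that the value found at the mapped location matches the corresponding line on the form.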
At the November “Vali-Datathon,” participants verified these mappings and noted corrections where necessary. Jesse Lecy of Arizona State University and David Borenstein, the former lead data scientist at Charity Navigator and now an independent 990 consultant, led the effort and gave participants hands-on guidance. More than twenty individuals from universities, nonprofit organizations, and media groups participated in the Vali-Datathon.
Simultaneously, another dozen individuals from the West Coast and Indiana participated remotely, accessing shared software and instructions online under the guidance of journalist and software consultant Jacob Fenton and Citizen Audit’s Miguel Barbosa.
Over the course of the two-day event, participants finished checking a major portion of the core Form 990, though thousands of variables still need to be checked. The Nonprofit Open Data Collective is gratified by this coordinated progress, but validation of the IRS-released XML files is far from complete.
To ease future data access, the Aspen Institute and its partners have asked the IRS to address certain technical and communications gaps so that the full potential of these data can be realized. The Aspen Institute also continues to lead the push for mandatory electronic filing of nonprofit tax forms, coupled with the release of the data by the IRS, which would ensure that all Form 990s are available to the public for research, fraud reduction and other valuable purposes.