Data found in audit
data library on WRDS (/wrds/audit/sasdata/
on
SAS server)
Audit Analytics “is an independent research provider of audit, regulatory and disclosure intelligence. Audit Analytics provides detailed data on over 150,000 active audits and more than 10,000 accounting firms.” Please refer to the Manuals and Overviews or download a zip file to view the data structures and variable definitions of Audit Analytics via WRDS.
Audit Analytics comprises five sets of data:
- Audit and Compliance
- Corporate and Legal
- Accounting and Oversight
- Other Independent Audit
- Canada (SEDAR)
The University of Melbourne currently has subscriptions to the first two sets of data
This comprises 14 tables (table names listed in parentheses):
- Auditors (
feed01_auditors
) - Auditor Changes (
feed02_auditor_changes
) - Audit Fees (
feed03_audit_fees
) - Audit Fees with Restatements (
feed04_audit_fees_restated
) - Benefit Plan Opinions (
feed06_benefit_plan_opinions
) - Audit Opinions (
feed05_audit_opinions
) - Current Auditors (
feed07_current_auditor
) - Non-Reliance Restatements (
feed09_nonreliance_restatements
) - SOX 302 Disclosure Controls (
feed10_sox_302_disclosure_contro
) - SOX 404 Internal Controls (
feed11_sox_404_internal_controls
) - Accelerated Filer (
feed16_accelerated_filer
) - Director and Officer Changes (
feed17_director_and_officer_chan
) - Non-timely Filer Information And Analysis (
feed20_nt
) - Revised Audit Opinions (
feed34_revised_audit_opinions
)
This comprises 12 tables (table names listed in parentheses):
- Legal Case (
feed13_legal_case_feed
) - Legal Parties (
feed14_company_legal_party_feed
) - Mergers and Acquisitions (
feed18_merger_acquisition
) - IPO (
ipo
): To be renamedfeed19_ipo
? - Bankruptcy Notification (
feed21_bankruptcy_notification
) - Comment Letter (
commlett
): To be renamedfeed25_comment_letters
? - Comment Letter Conversations (
feed26_comment_letter_conversati
) - Shareholder Activism (
feed31_shareholder_activism
) - Form D (
feed37_form_d
) - Form D Most Recent Report (
feed38_form_d_most_recent_offeri
) - Comment Threading (
feed40_comment_letter_threads
) - Transfer Agents (
feed41_transfer_agents
)
In each case, the MCCGR database includes a table with the same name, but in general, we do not include all variables provided in the WRDS version of the table. A detailed discussion of the changes we make is given in the next section.
While Audit Analytics provides comprehensive data, the tables provided by WRDS, unfortunately, are a complete mess and the data are often not well-formatted. One particular issue is that the tables are very much not normalized. So there is a lot of duplicated data and tables are loaded with extraneous information. Also, some textual variables are split across a number of columns, and some textual data is placed in separate tables. We identified and fixed the following three issues.
Specifically, the WRDS tables often include variables from the “company
financial block”, which are financial statement variables such as net
income or total assets for the “closest
”, “match
”, “hiwater
”, or
“prior
” periods (either quarters [qu
] or years [yr
]). These
variables expand the size of the tables dramatically, and their
provenance is unclear (likely scraped from SEC filings, which are the
primary source for Audit Analytics data). Also, the meaning of the terms
“closest
”, “match
”, “hiwater
”, and “prior
” is unclear and the
data are very poorly documented on WRDS. So, we simply omit these
variables when importing the data.
Several tables include fields with the text (or portions of the text) of
the underlying SEC filings. Presumably due to constraints on field size
in SAS, these text variables are split into a number of columns. For
example, the text in filings related to rows in diroffichange
is found
in do_change_text1
, do_change_text2
, and do_change_text3
. It is
likely that by pasting do_change_text1
, do_change_text2
, and
do_change_text3
back together, one gets the original text that was fed
to WRDS by Audit Analytics.
In addition to the data being split across a number of columns, the
textual data often causes problems in importing tables. For this reason,
often the textual data is placed in a separate table from the other data
(and in some cases, a number of separate tables). For example, the
variables do_change_text1
, do_change_text2
, and do_change_text3
are found in tables diroffichange_text1
, diroffichange_text2
, and
diroffichange_text3
respectively along with the primary key variable
for the original table (do_off_pers_key
in the case of
diroffichange
).
Separate text tables are provided for the following tables:
auditsox302
:auditsox302_text
auditsox404
:auditsox404_text1
,auditsox404_text2
feed17_director_and_officer_chan
: Text field is currently omitted (see below).
The WRDS tables include auditor names as well as an auditor key (i.e.,
auditor_key
as found on auditorsinfo
). Because the auditor_name
appears on auditors
, it is redundant to include the same information
on each table in which auditor_key
appears. So we have deleted it from
many tables. Simply join the table with auditors
using auditor_key
to recover this variable.
The feed17_director_and_officer_chan
text variables currently do not
work due to the text column being too wide.