Data auditing is the assessment of the quality and fitness for purpose of data through key metrics and properties of the data (Techopedia, n.d.). Data auditing processes and procedures are a business's way of assessing and controlling its data quality (Eichhorn, 2014). Conducting data audits allows a business to fully realize the value of its data and gives its data analytics results higher fidelity (Jones, Ross, Ruusalepp, & Dobreva, 2009). Data auditing is needed because the data could contain human error or could be subject to IT data compliance regulations such as HIPAA and SOX (Eichhorn, 2014). In health care, data audits can help detect unauthorized access to confidential patient data, reduce the risk of such access, and detect defects, threats, and intrusion attempts (Walsh & Miaolis, 2014).
Data auditors can perform a data audit by considering the following aspects of a dataset (Jones et al., 2009):
- Data by origin: observational, computed, experimental
- Data by data type: text, images, audio, video, databases, etc.
- Data by characteristics: value, condition, location
A condensed data audit process for research was proposed by Shamoo (1989):
- Select published, claimed, or random data from a figure, table, or data source
- Evaluate if all the formulas and equations are correct and used correctly
- Convert all the data into numerical values
- Re-derive the original data using the formulas and equations
- Segregate the various parameters and values to identify the sources of the original data
- If the re-derived data match the data selected in the first step, the audit found no quality issues; if not, a cause analysis must be conducted to understand where the data quality broke down
- Formulate a report based on the results of the audit
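The re-derive-and-compare core of these steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Shamoo's own procedure: the function `audit_reported_values`, the `mean_of` formula, the sample names, and the tolerance are all hypothetical.

```python
import math

def audit_reported_values(raw_inputs, reported, formula, rel_tol=1e-6):
    """Re-derive each reported value from its raw inputs and collect mismatches."""
    findings = []
    for item, inputs in raw_inputs.items():
        rederived = formula(*inputs)   # re-derive using the stated formula
        claimed = reported[item]       # value selected from the table or figure
        if not math.isclose(rederived, claimed, rel_tol=rel_tol):
            findings.append(
                {"item": item, "reported": claimed, "rederived": rederived}
            )
    return findings

# Hypothetical formula under audit: the arithmetic mean of the raw values.
def mean_of(*values):
    return sum(values) / len(values)

# Hypothetical raw data and the values a publication claims for them.
raw_inputs = {"sample_a": (2.0, 4.0, 6.0), "sample_b": (1.0, 2.0, 3.0)}
reported = {"sample_a": 4.0, "sample_b": 2.5}

findings = audit_reported_values(raw_inputs, reported, mean_of)

# Final step: summarize the outcome for the audit report.
if findings:
    summary = f"{len(findings)} discrepancy(ies) found; cause analysis needed"
else:
    summary = "no quality issues found"
```

Here `sample_b` is flagged because the re-derived mean does not match the reported value, which is the point at which a cause analysis would begin.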
Jones et al. (2009) provided a four-stage process with a detailed swim lane diagram.
For some organizations, creating a log file of all data transactions can help improve data integrity (Eichhorn, 2014). Log creation must be scalable and kept separate from the system under audit (Eichhorn, 2015). Log files can be created for one system or for many, but all of them should be centralized in one location, and the log data must be abstracted into a common, universal format for easy searching (Eichhorn, 2015). Regardless of the technique, HIPAA sections 164.308 through 164.312 address information and audits in the health care system (Walsh & Miaolis, 2014).
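As a rough sketch of what abstracting log data into a common format could look like, the Python below maps records from two source systems onto one shared set of fields and then searches the centralized list. The system names (`billing`, `ehr`), their raw field names, and the common schema are assumptions made for illustration; they do not come from Eichhorn (2015).

```python
from datetime import datetime, timezone

# Assumed common schema shared by every entry in the central log.
COMMON_FIELDS = ("timestamp", "system", "user", "action", "record_id")

def normalize_entry(system, raw):
    """Map one system-specific log record onto the common format."""
    if system == "billing":
        # Hypothetical layout: epoch seconds plus short field names.
        ts = datetime.fromtimestamp(raw["epoch"], tz=timezone.utc)
        user, action, record_id = raw["uid"], raw["op"], raw["rec"]
    elif system == "ehr":
        # Hypothetical layout: ISO 8601 timestamp with an explicit offset.
        ts = datetime.fromisoformat(raw["time"])
        user, action, record_id = raw["who"], raw["event"], raw["patient"]
    else:
        raise ValueError(f"unknown source system: {system}")
    return {
        "timestamp": ts.isoformat(),
        "system": system,
        "user": user,
        "action": action.lower(),
        "record_id": str(record_id),
    }

def search(log, **criteria):
    """Return entries whose fields equal every given criterion."""
    return [e for e in log if all(e.get(k) == v for k, v in criteria.items())]

# Central log kept apart from the audited systems (here, just a list).
central_log = [
    normalize_entry("billing", {"epoch": 0, "uid": "jdoe", "op": "READ", "rec": 17}),
    normalize_entry(
        "ehr",
        {"time": "2017-02-01T10:15:00+00:00", "who": "asmith",
         "event": "Update", "patient": "P-42"},
    ),
]

hits = search(central_log, user="jdoe")
```

Because every entry shares the same keys, one `search` call works across all source systems, which is the practical benefit of the common format.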
HIPAA identifies key activities for a health care system's data auditing protocol (Walsh & Miaolis, 2014):
- Determine the activities that will be tracked or audited: create a process flow or swim lane diagram like the one described by Jones et al. (2009), involve key data stakeholders, and evaluate which audit tools will be used.
- Select the tools that will be deployed for auditing and system activity reviews: the tools should detect unauthorized access to data, allow drilling down into the data, collect audit logs, and present the findings in a report or dashboard.
- Develop and employ the information system activity review/audit policy: determine the frequency of the audits and which events would trigger additional audits.
- Develop appropriate standard operating procedures: cover how results are presented, how the fallout from what the audit reveals is handled, and how audit follow-up is carried out efficiently.
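The kind of check an audit tool performs when detecting unauthorized access, together with a simple rule for triggering further audits, might look like the Python sketch below. The authorization table, the log entries, and the follow-up threshold are hypothetical; a real policy would define these according to the organization's own review procedures.

```python
from collections import Counter

def review_access_log(entries, authorized):
    """Return entries whose user is not authorized for the accessed record."""
    return [
        e for e in entries
        if e["user"] not in authorized.get(e["record_id"], set())
    ]

def users_triggering_followup(flagged, threshold=2):
    """Users with at least `threshold` flagged accesses (assumed trigger rule)."""
    counts = Counter(e["user"] for e in flagged)
    return sorted(user for user, n in counts.items() if n >= threshold)

# Hypothetical authorization table: record id -> users allowed to access it.
authorized = {"P-1": {"dr_lee", "nurse_kim"}, "P-2": {"dr_lee"}}

# Hypothetical audit log entries collected from the system.
entries = [
    {"user": "dr_lee", "record_id": "P-1", "action": "read"},
    {"user": "clerk_ray", "record_id": "P-1", "action": "read"},
    {"user": "clerk_ray", "record_id": "P-2", "action": "read"},
    {"user": "nurse_kim", "record_id": "P-2", "action": "read"},
]

flagged = review_access_log(entries, authorized)
followup = users_triggering_followup(flagged)
```

The flagged entries would feed the report or dashboard, while the follow-up list is one possible example of an event that triggers an additional audit.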
- Eichhorn, G. (2014). Why exactly is data auditing important? Retrieved from http://www.realisedatasystems.com/why-exactly-is-data-auditing-important/
- Eichhorn, G. (2015). The four key features of a data audit log system. Retrieved from http://www.realisedatasystems.com/the-four-key-features-of-a-data-audit-log-system/
- Jones, S., Ross, S., Ruusalepp, R., & Dobreva, M. (2009). Data audit framework methodology. Retrieved from http://www.data-audit.eu/DAF_Methodology.pdf
- Shamoo, A. E. (1989). Principles of research data audit. Retrieved from https://www.google.com/search?q=data+audit+system&sourceid=ie7&rls=com.microsoft:en-US:IE-Address&ie=&oe=#q=%22data+audit%22+system&start=0
- Techopedia. (n.d.). Data audit. Retrieved from https://www.techopedia.com/definition/28032/data-audit
- Walsh, T., & Miaolis, W. (2014). Privacy and security audits of electronic health information: 2014 update. Retrieved from http://library.ahima.org/doc?oid=300276#.WJkSpT8zXtQ