McIntire Professor Julia Yu and Co-Authors Use Machine Learning to Detect Fraud

Their machine learning method red-flags more publicly listed companies that commit fraud than models from prior research, and could help regulators and investors focus on a small set of "highly suspicious" companies.

Julia Yu

In a study published in the Journal of Accounting Research, McIntire Professor Julia Yu and her co-authors Yang Bao (Shanghai Jiao Tong University), Bin Ke (NUS Business School), Bin Li (Wuhan University), and Jie Zhang (Nanyang Technological University) introduce a new fraud prediction model based on machine learning.

The paper, “Detecting Accounting Fraud in Publicly Traded U.S. Firms Using a Machine Learning Approach,” samples from all publicly listed U.S. firms between 1991 and 2008, and the study’s results carry important implications for ongoing accounting research.

Using machine learning to uncover potential corporate fraud offers many benefits, and the speed it brings to the process is a clear advantage over earlier approaches.

Yu says that though corporate fraud is a worldwide problem, much of it is not discovered until two or three years after the fact. Because of these delays, the criminal activity often becomes public only after the damage has been done. Hence, efficient and effective methods of detecting corporate accounting fraud offer significant value to regulators, auditors, and investors.

Introducing machine learning into the fraud detection process has the potential to save a great deal of human effort and labor hours.

“It is very costly to investigate publicly listed companies, and the regulators can only allocate limited resources to this task. Moreover, on average, only a small percentage of publicly listed companies involve severe fraudulent activities; hence, identifying a company committing an act of fraud among all the publicly traded companies is like looking for a needle in a haystack,” she explains. “Naturally, regulators and other monitors seek to investigate the smallest number of companies with the highest predicted likelihood of fraud. To address this need, we introduce a performance evaluation metric commonly used for ranking problems. We found that under this more practical-oriented performance metric, our machine learning method can accurately red-flag more publicly listed fraud companies compared with other models in prior research. Our finding would help regulators and investors to focus on a small set of ‘highly suspicious’ companies.”
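To make the idea of a ranking-oriented metric concrete, here is a minimal Python sketch of NDCG@k (normalized discounted cumulative gain at position k), one metric commonly used for ranking problems. The article does not name the metric, so treat the choice of NDCG@k, the function name `ndcg_at_k`, and the toy scores and labels as illustrative assumptions, not as the authors' implementation. The metric rewards a model for placing true fraud cases near the top of its "most suspicious" list, which matches the goal of investigating only a small number of companies.

```python
import math


def ndcg_at_k(scores, labels, k):
    """NDCG@k for binary labels (1 = fraud, 0 = no fraud).

    scores: predicted fraud likelihood per firm (higher = more suspicious).
    labels: true outcome per firm, in the same order as scores.
    k:      how many top-ranked firms an investigator can afford to review.
    """
    # Rank firms from most to least suspicious and keep the top k.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = order[:k]

    # Discounted cumulative gain: a fraud firm found at rank r (0-based)
    # contributes 1 / log2(r + 2), so earlier hits count more.
    dcg = sum(labels[i] / math.log2(rank + 2) for rank, i in enumerate(top_k))

    # Ideal DCG: the value obtained if every fraud firm were ranked first.
    ideal_hits = min(k, sum(labels))
    idcg = sum(1 / math.log2(rank + 2) for rank in range(ideal_hits))

    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy data: four firms, two of which committed fraud.
scores = [0.9, 0.1, 0.8, 0.2]
labels = [1, 0, 0, 1]
result = ndcg_at_k(scores, labels, k=2)
```

In this toy case the model places one of the two fraud firms first and a non-fraud firm second, so it earns partial credit. A model that ranked both fraud firms at the top would score 1.0, and one that placed neither in the top k would score 0.0.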

Another practical contribution of the team’s research is that it improves fraud detection accuracy while relying solely on raw accounting data rather than any proprietary information. Yu says that limiting the type of data in this way makes fraud detection more feasible for ordinary investors, who lack access to the specialized information typically available only to institutional investors and regulators.

This study has positively impacted Yu’s own research interests in financial reporting and corporate disclosure.

“My prior publications on voluntary disclosures have involved textual analysis to generate key features to capture characteristics of corporate disclosures. After using machine learning methods in this fraud detection paper, I have started to think about how to introduce machine learning methods into textual analysis in accounting research to further improve textual measurements used in my current research projects,” Yu says.

Read the full paper published in the Journal of Accounting Research.
