Data-driven fraud prevention allows businesses to proactively identify and mitigate suspicious activities by leveraging historical patterns, real-time analytics, and predictive modeling.
Institutions need to ensure that a proper data development and maintenance approach is implemented and followed by all employees. Data may take many forms: goods purchase data, sales data, expense data, asset purchase data, liabilities data, and so on. Appropriate recording and maintenance of accounts are mandatory parts of an institution’s financial reporting practices. All financial reporting facts and figures must be backed by relevant transactions and data sources.
Purchase data, for instance, must be backed by date-wise vendor or supplier invoices, reviews, authorizations, bank payments, and receipts of goods from the vendor or seller. Similarly, fraud data to be maintained may include the fraud type, fraud timing, the amount lost, fraudster details (if available), controls breached by the fraudster to commit the fraud, technology misused, systems and software attacked, the fraud technique used, etc.
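As a sketch, such a fraud record could be captured in a simple structure like the one below; the field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class FraudRecord:
    """Illustrative record of the fraud data points listed above."""
    fraud_type: str                  # e.g. "card-not-present"
    occurred_at: datetime            # fraud timing
    amount_lost: float               # monetary loss
    fraudster_detail: Optional[str] = None                 # identity, if available
    controls_breached: list = field(default_factory=list)  # controls bypassed
    technique: str = ""              # fraud technique used
    systems_attacked: list = field(default_factory=list)   # systems/software hit

record = FraudRecord(
    fraud_type="card-not-present",
    occurred_at=datetime(2023, 5, 1, 14, 30),
    amount_lost=420.00,
    controls_breached=["CVV check"],
    technique="stolen card credentials",
)
```

Keeping these fields consistent across incidents is what later makes aggregate analysis possible.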
Fraud, with its many different and ever-changing manifestations, is never static. It continuously adapts to time, place, circumstance, and technology, necessitating agile fraud strategies. Fraud categorization is not straightforward either, which is why data helps institutions identify frauds.
In its coursework, the Association of Certified Fraud Examiners describes more than a hundred different fraud attack vectors, each with subcategories.
The nature of data used for fraud detection therefore differs from that of a classical machine learning or data science project: the target is moving, which makes labeled ground truth only ‘time-relevant’. Time-relevant because, by the time your model has stabilized, fraud attack vectors may have changed or adapted.
The current industry thrust is towards AI and machine learning, but that leap should come only after learning to crawl, then walk, then run. Invest in predictive modeling only after carefully studying your data readiness.
Building a ‘data culture’ of acquiring, storing, curating, and making decisions from data should precede any predictive modeling project.
Data-Driven Fraud Prevention
These are some practical steps your institution can take to inculcate the practice of data-driven decisions for fraud prevention:
Having a complete understanding of the data collection process along with proper documentation
This first step is an obvious one, but it is often neglected. Data collection is primarily an engineering project and requires different skills. If it is not done in synchronization with your fraud operations, however, you can end up with big “holes” in your data.
Enterprise-level data can get unwieldy and it is best to dedicate a resource from the fraud team to play the role of a liaison with the data engineering team.
Remember that this resource should be able to examine the forensic value of the data points. Documentation is, needless to say, important. The ever-growing number of possible data points can only be kept manageable through documentation.
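One lightweight way to enforce this is a data dictionary that every new data point must be registered in before it ships. A minimal sketch, with hypothetical entries:

```python
# Hypothetical data dictionary: one documented entry per data point.
data_dictionary = {
    "login_ip": {
        "source": "auth service logs",
        "type": "string (IPv4/IPv6)",
        "forensic_value": "links sessions across accounts; geo-velocity checks",
    },
    "invoice_amount": {
        "source": "purchasing system",
        "type": "decimal",
        "forensic_value": "flags duplicate or inflated vendor invoices",
    },
}

def undocumented(fields, dictionary):
    """Return the data points that lack documentation -- the 'holes'."""
    return [f for f in fields if f not in dictionary]
```

The fraud liaison can run this check against new pipeline fields and push the missing entries back to the engineering team.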
Synergizing your fraud and data experts
Fraud experts understand attack vectors and are well-informed on existing and emerging threats. They could be white hat hackers, vulnerability specialists, information security experts, or working in other similar roles. Building a fraud solution is a cross-functional task and some level of product management is necessary to help pivot different teams and synergize them.
The output of this synergy is in the form of forensic data points. These, by themselves, may not be indicative of fraud but contain signals about interesting events or occurrences.
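To make this concrete, here is a sketch of one such forensic signal: a large payout shortly after a password reset. Neither event is fraud on its own, but together they form an interesting occurrence worth surfacing. The event kinds, window, and threshold are illustrative assumptions:

```python
def reset_before_payout(events, window_minutes=30, amount_threshold=1000):
    """Flag payouts that occur soon after a password reset.

    events: list of (timestamp_minutes, kind, amount) tuples, sorted by time.
    """
    resets = [t for t, kind, _ in events if kind == "password_reset"]
    flags = []
    for t, kind, amount in events:
        if kind == "payout" and amount >= amount_threshold:
            # interesting only if a reset happened within the window before it
            if any(0 <= t - r <= window_minutes for r in resets):
                flags.append((t, amount))
    return flags

events = [
    (10, "password_reset", 0),
    (25, "payout", 5000),   # flagged: 15 minutes after a reset
    (400, "payout", 2000),  # not flagged: no recent reset
]
```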
Dashboards and Visualization
Time is crucial, and running manual offline analysis of your signals is tedious. Real-time dashboards can make this easier. Your team can work with the engineering team to create meaningful dashboards and project them on screens where your team can monitor them. Although it may sound easy, visualization requires non-negligible effort and many iterations before it reaches a point where fraud analysts use and like it.
Once we have a fair idea of which insights are actionable, creating automatic alerts helps greatly. Alerts can be delivered via email or instant messaging. To preserve the seriousness of an alert, we should avoid a proliferation of alerts.
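One simple way to keep alert volume in check is a cool-down per alert key, so the same condition does not fire repeatedly. A minimal sketch; in practice the delivery callback would be an email or chat webhook rather than `print`:

```python
import time

class Alerter:
    """Rate-limited alerter: suppresses repeats of an alert key within a cool-down."""

    def __init__(self, cooldown_seconds=3600, send=print):
        self.cooldown = cooldown_seconds
        self.last_sent = {}   # alert key -> time it was last sent
        self.send = send      # delivery callback, e.g. email or chat webhook

    def fire(self, key, message, now=None):
        now = time.time() if now is None else now
        last = self.last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False      # suppressed: same alert fired recently
        self.last_sent[key] = now
        self.send(message)
        return True
```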
Leveraging multiple signals
You cannot always generate insights by looking at one data point in isolation. Using relationships between different data points, we can create complex yet meaningful signals. There are two approaches here: a generative approach considers how the data point could have been generated in order to identify these relationships, while a deductive approach uses statistical methods to find them. A fraudster very likely leaves multiple traces on different channels or touchpoints with your institution, so it helps to be able to look at these data points in one view.
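A sketch of such a combined view, with illustrative channel names and scores: each channel emits weak per-account signals, and a naive composite sums them so that an account touched on several channels stands out:

```python
from collections import defaultdict

def combine_signals(signal_streams):
    """Correlate single-channel signals into one per-account view.

    signal_streams: dict of channel -> list of (account_id, score).
    """
    view = defaultdict(dict)
    for channel, signals in signal_streams.items():
        for account_id, score in signals:
            # keep the strongest score seen per channel
            view[account_id][channel] = max(score, view[account_id].get(channel, 0.0))
    # naive composite: sum the per-channel scores
    return {acct: sum(scores.values()) for acct, scores in view.items()}

streams = {
    "web":         [("acct-1", 0.4), ("acct-2", 0.1)],
    "mobile":      [("acct-1", 0.5)],
    "call_center": [("acct-1", 0.3)],
}
```

In practice the composite would be weighted or learned rather than a plain sum, but even this naive version surfaces acct-1, which no single channel flags strongly.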
A label feedback process
Two things are important and need to happen independently of one another.
- Creating a process for fraud analysts to record objective feedback on investigated events.
- Recording these labels in a way that can easily be joined with the rest of your data.
This is not trivial at all. In my work with multiple clients, this has repeatedly been the sticking point: either the feedback from an investigation is not objective, or it is recorded in a way that cannot be correlated with any other data source. Events happen independently, and we need to be able to correlate them.
The data engineering pipeline needs to support a common identifier that can help you link cross-channel events to the same account or transaction. Then, feedback should ideally be multi-categorical and not just subjective. This greatly speeds up the data-cleansing task of an analyst or data scientist.
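A sketch of what such a feedback store might look like: labels drawn from a fixed multi-categorical set, keyed by a shared event identifier so they join cleanly with transaction data. All names here are illustrative:

```python
# Fixed, objective label set -- avoids free-text, subjective feedback.
LABELS = {"confirmed_fraud", "suspected_fraud", "false_positive", "legitimate"}

def record_label(store, event_id, label, analyst):
    """Record an analyst's verdict against the fixed label set."""
    if label not in LABELS:
        raise ValueError(f"label must be one of {sorted(LABELS)}")
    store[event_id] = {"label": label, "analyst": analyst}
    return store

def join_labels(transactions, label_store):
    """Attach a label (or None) to each transaction via the shared event_id."""
    return [
        {**txn, "label": label_store.get(txn["event_id"], {}).get("label")}
        for txn in transactions
    ]
```

The common `event_id` is the piece the data engineering pipeline has to supply; with it in place, labels join against any other data source with one lookup.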
These are, I believe, some of the fundamental initial steps toward building a culture of data-driven fraud management. At some point in time, I will share some machine-learning methods that are fairly common in fraud detection.
Final Thoughts
Institutions face a multifaceted challenge in establishing robust data development and maintenance approaches to support their financial reporting and fraud detection endeavors. As fraud continually evolves, adapting to various conditions and exploiting technological advancements, it is imperative for institutions to cultivate a comprehensive understanding of their data collection processes. Building a data culture, where data acquisition, storage, and utilization become integral components, is a prelude to employing predictive modeling effectively. Synchronizing efforts between fraud experts and data professionals is vital, ensuring the extraction of forensic data points that provide insights into potential anomalies.
Dashboards and visualization tools aid in real-time monitoring, while leveraging multifaceted data points can unveil complex fraud patterns. Instituting a feedback process, that allows for objective evaluations and easy integration with other data, further reinforces the effectiveness of data-driven fraud management. Institutions must proceed methodically, ensuring foundational data practices are in place before venturing into advanced machine-learning fraud detection techniques.