How we quadrupled the efficiency of our Anti Money-Laundering teams with Machine Learning | by Bastien Rolando | The Qonto Way

In the relentless battle against financial fraud and evolving money laundering tactics, staying ahead is essential.

Preventing money laundering means fighting the funding of criminal activities, including drug trafficking, terrorism, corruption, and organized crime.

Qonto acknowledges the gravity of this threat, and we’re committed to continually enhancing our fraud detection tools. Qonto’s Data Products team recently collaborated with the Risk, Anti-Money Laundering (AML), and Financial Crime (FC) teams to significantly improve the efficiency of our tools and, as a result, our whole money laundering detection process.

This Jan. 19, 1931, file photo shows Chicago mobster Al Capone at a football game. (AP File)

In the era of Prohibition, the notorious gangster Al Capone ruled the streets of Chicago. His illegal activities generated immense wealth, but he faced a problem: he couldn’t simply deposit his ill-gotten gains in banks without attracting unwanted attention from the authorities.

To solve this, Capone turned to a seemingly legitimate business: a chain of laundromats. But these laundromats weren’t for cleaning clothes; they were for “cleaning” money. The dirty cash from bootlegging (the illegal production and sale of alcohol) and other criminal ventures would flow into the laundromats, mingling with the legitimate earnings. Through a series of clever transactions, the origins of the money became nearly impossible to trace.

This process, known as money laundering, allowed Capone to enjoy his wealth while avoiding suspicion. The historical tale of Al Capone’s laundromats underscores the art of disguising illegal proceeds as lawful income.

Nowadays, money laundering still consists of making a business seem legitimate by fabricating transactions, but the methods are increasingly diverse and harder to detect.

Today, the banking industry uses machine learning to detect fraud. However, most of the transaction monitoring still relies on an expert rule-based approach (also called scenarios, or alerts).

This is what an expert rule could look like:

The company withdrew more than €X in cash during the past week.
And the company received money from Switzerland.

All companies that meet these criteria would be considered suspicious and would usually have to be reviewed by AML experts or agents.
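To make the example rule concrete, here is a minimal Python sketch. It is an illustration only, not Qonto's implementation: the `CompanyActivity` record, its field names, and the threshold value (standing in for the undisclosed "€X") are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold standing in for the undisclosed "€X" in the rule.
CASH_THRESHOLD_EUR = 10_000.0


@dataclass
class CompanyActivity:
    """Hypothetical weekly summary of one company's activity."""
    company_id: str
    cash_withdrawn_last_week_eur: float
    incoming_transfer_countries: frozenset


def rule_cash_and_swiss_inflow(activity, threshold=CASH_THRESHOLD_EUR):
    """Return True when both conditions of the example rule hold."""
    withdrew_a_lot = activity.cash_withdrawn_last_week_eur > threshold
    received_from_ch = "CH" in activity.incoming_transfer_countries
    return withdrew_a_lot and received_from_ch


companies = [
    CompanyActivity("acme", 12_500.0, frozenset({"CH", "FR"})),
    CompanyActivity("globex", 15_000.0, frozenset({"FR"})),
    CompanyActivity("initech", 500.0, frozenset({"CH"})),
]

# Only companies meeting both criteria are flagged for review.
flagged = [c.company_id for c in companies if rule_cash_and_swiss_inflow(c)]
```

Note how the threshold is a hand-picked constant: this is the manual, subjective choice that rule-based systems depend on.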

This approach has various advantages:

It comes from expert knowledge.
Agents understand the reason why they have to investigate a company.
Implementation is quick and easy.

However, this process does come with certain drawbacks. Manual threshold selection introduces subjectivity, and the rules are often too strict, flagging too many companies. Also, over time, rules tend to stack up on one another and become unwieldy to maintain, monitor, and review.

In the early days of Qonto, we used the industry standard and created many rules to identify fraudulent activities. However, after two years of development, maintaining them became cumbersome. We recognized the need to optimize our approach. So, we created a filtering tool on top of existing alerts.

That meant using the rules as inputs and prioritizing a number of alerts within our risk appetite. Then, our AML experts reviewed companies flagged by these alerts.

We used a Bandit algorithm, specifically Thompson Sampling, to optimize the choice of alerts. This method involved a trade-off between two approaches:

Exploitation: We evaluated the historical performance of each rule and selected the most effective ones.
Exploration: When we didn’t have enough observations to assess a rule, we reviewed more companies until we could confirm that the rule performed well.

This provided a distribution of how many alerts of each type we wanted to review. Then, we randomly picked the right number of companies, among all candidates.
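The allocation step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the original algorithm: it assumes each rule's past reviews are summarized as (confirmed, dismissed) counts, models each rule's hit rate with a Beta posterior over a uniform prior, and uses hypothetical rule names and a hypothetical daily budget.

```python
import random
from collections import Counter


def allocate_reviews(rule_stats, budget, rng):
    """Split a review budget across rules with Thompson Sampling.

    rule_stats maps a rule name to (confirmed, dismissed) counts from past
    reviews. Each review slot goes to the rule with the highest draw from
    its Beta(confirmed + 1, dismissed + 1) posterior.
    """
    allocation = Counter()
    for _ in range(budget):
        draws = {
            rule: rng.betavariate(confirmed + 1, dismissed + 1)
            for rule, (confirmed, dismissed) in rule_stats.items()
        }
        allocation[max(draws, key=draws.get)] += 1
    return allocation


def pick_companies(candidates, allocation, rng):
    """Randomly pick the allotted number of companies for each rule."""
    picked = {}
    for rule, n in allocation.items():
        pool = candidates.get(rule, [])
        picked[rule] = rng.sample(pool, min(n, len(pool)))
    return picked


rng = random.Random(0)  # seeded only to make the sketch reproducible
rule_stats = {
    "cash_withdrawal": (40, 60),  # well observed, decent hit rate
    "swiss_inflow": (2, 98),      # well observed, poor hit rate
    "new_rule": (0, 0),           # no observations yet
}
allocation = allocate_reviews(rule_stats, budget=50, rng=rng)
candidates = {
    rule: [f"{rule}-company-{i}" for i in range(100)] for rule in rule_stats
}
picked = pick_companies(candidates, allocation, rng)
```

A rule with few observations has a wide posterior, so it sometimes wins a slot (exploration), while a rule with a long, strong track record wins slots consistently (exploitation).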

Even though this method allowed us to optimize the efficiency of our anti-money laundering set-up, it was possible to improve it even further by:

Considering market specificities, as money laundering patterns can differ from one country to another.
Removing the ‘randomness’ to prioritize companies according to our risk assessment, and to reduce delays between the first weak signal and the investigation.
Considering other risk drivers than solely the alerts themselves.

This algorithm was created back in 2019, when Qonto had 50,000 customers. Now that we have more than 400,000 customers, the algorithm has started to struggle to handle the increasing volume of alerts and to prioritize them effectively across our entire client base.

We needed to improve our precision to meet our growth targets. As a result, we deemed it necessary to pause and completely reconsider our prioritization system.

Machine learning is commonly used for fraud detection and identifying money laundering. By the time we came to machine learning as a solution, we had already developed several fraud detection models internally. However, this was not a classic detection challenge (as in the case of our previous models), but a prioritization problem.

Focusing on alert prioritization would not enable us to improve our previous algorithm. Laundering money is a characteristic of the organization, not the alert. We recognized the need to prioritize companies themselves.

The goal became clear: each day, our aim was to identify and present the companies that were most likely to engage in money laundering. This problem could be solved using supervised machine learning: we could evaluate the risk for each company with a score, then showcase the highest scores on a daily basis.
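Once every company has a score, the daily selection step reduces to picking the highest ones. A minimal sketch, with made-up company names and scores and a hypothetical daily capacity `n`:

```python
import heapq


def top_n_companies(scores, n):
    """Return the n (company_id, score) pairs with the highest scores."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Made-up scores for illustration; in practice they would come from the model.
daily_scores = {"acme": 0.91, "globex": 0.12, "initech": 0.67, "umbrella": 0.88}
review_queue = top_n_companies(daily_scores, n=2)
```

Unlike the random pick used previously, the review queue is fully determined by the scores.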

However, there were compelling reasons to keep the rules:

Certain rules remained effective at detecting money laundering within specific contexts.
We aimed to enable experts to manually address highly specific patterns by creating rules.

So, we opted to incorporate rules as features in our new model. Specifically, each existing rule translates into a feature that counts the number of non-reviewed alerts raised by that rule for a given company.
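A rough sketch of how such rule features could be built, assuming that what gets counted is the alerts a rule has raised for a company that agents have not yet reviewed. The tuple layout, rule names, and the `n_open_` feature prefix are hypothetical.

```python
from collections import Counter, defaultdict


def rule_count_features(alerts, rule_names):
    """Count non-reviewed alerts per rule for each company.

    alerts is an iterable of (company_id, rule_name, reviewed) tuples.
    Every company seen gets one feature per rule, defaulting to 0.
    """
    counts = defaultdict(Counter)
    for company_id, rule_name, reviewed in alerts:
        per_rule = counts[company_id]  # registers the company even at 0
        if not reviewed:
            per_rule[rule_name] += 1
    return {
        company_id: {f"n_open_{rule}": per_rule[rule] for rule in rule_names}
        for company_id, per_rule in counts.items()
    }


alerts = [
    ("acme", "cash_withdrawal", False),
    ("acme", "cash_withdrawal", False),
    ("acme", "swiss_inflow", True),   # already reviewed, so not counted
    ("globex", "swiss_inflow", False),
]
features = rule_count_features(alerts, ["cash_withdrawal", "swiss_inflow"])
```

With this layout, a newly created expert rule simply becomes one more column that the model can pick up at the next retraining.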

As for the rest of the features, referred to as “context features,” we identified them using two methods:

Piece analysis: We closely examined actual instances of money laundering, particularly those that weren’t identified by rules. This allowed us to pinpoint the key indicators that aid in distinguishing potential money laundering cases.
Shadowing sessions: We observed AML agents while they reviewed various cases, and identified the factors used by the agents to tell whether a company was likely to engage in money laundering or not.

Both of these methods leverage real-world examples and expert knowledge to uncover features that may not have been initially apparent. This contributes to a more accurate and effective model for identifying potential money laundering activities.

Based on existing research¹ ² ³ in the field of fraud detection, and our observations at Qonto, boosted tree models usually emerge as the most effective models for detecting fraudulent activities.

Our project was no exception; the model that showed the best performance was CatBoost. This model also had some other perks, such as its fast training and its built-in handling of categorical variables.

Despite their effectiveness, however, boosted tree models lack the transparency that rule-based systems offer. To bridge this gap, a significant aspect of the project revolved around extracting SHAP (SHapley Additive exPlanations) values for the top N companies daily. These values were presented graphically to show agents why a particular company needed review, thus enhancing the model’s interpretability.
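To give an intuition for what SHAP values measure, the sketch below computes exact Shapley values by brute force for a toy scoring function. Everything here is an assumption made for illustration: the scoring function is a made-up formula rather than a boosted tree model, the feature names are invented, and "absent" features are replaced by a single baseline value. In practice, dedicated libraries use far more efficient tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of predict at instance, relative to baseline.

    Features outside a coalition take their baseline value. The cost grows
    as 2**n_features, so this is only usable on toy examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: instance[k] if k in coalition else baseline[k] for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


def toy_score(x):
    """Made-up risk score with one interaction term (not a real model)."""
    return (
        0.2 * x["open_cash_alerts"]
        + 0.3 * x["foreign_inflow"]
        + 0.1 * x["open_cash_alerts"] * x["foreign_inflow"]
        - 0.05 * x["company_age_years"]
    )


company = {"open_cash_alerts": 2, "foreign_inflow": 1, "company_age_years": 4}
baseline = {"open_cash_alerts": 0, "foreign_inflow": 0, "company_age_years": 1}
phi = shapley_values(toy_score, company, baseline)
```

The contributions add up to the gap between the company's score and the baseline score, which is what lets an agent read them as "how much each feature pushed the score up or down".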

Example of a normalized SHAP values graph (feature names have been anonymized): AML agents can see which features drive the score up or down.

Finally, each country Qonto operates in has its own model trained on its country’s data, enabling us to better leverage the unique characteristics of each market. This tailored approach ensures that our efforts to detect money laundering are finely tuned to the specific nuances of each region.

We’ve built a service to run those models and compute the scores in batches every day. A few months following its implementation, the outcome is clear. Overall, our filtering system’s precision surged by 200% to reach 300%, and in our largest market the precision improved by a staggering 588%. This single initiative has yielded a profound and direct influence on the operations of our AML and Financial Crime teams.

Today, we possess the capability to swiftly and effectively uncover companies involved in money laundering. Moreover, AML agents invest significantly less time reviewing legitimate companies, allowing them to focus on more relevant cases and reduce waste.

The model is easy to retrain, and we retrain it regularly to improve its performance and keep it up to date with current patterns. Experts can still create rules, which are taken into account at the next retraining. This approach ensures ongoing adaptation and effectiveness in detecting potential money laundering activities.
