Life is Like a Boat

Personal notes, memos, and tips on economics, investing, programming, and more

PyCon APAC 2023 report 1

Fighting Money Laundering with Python and Open Source Software

I attended PyCon APAC 2023, which was held in Tokyo. There were many interesting talks given by "Pythonistas" during the two-day conference. Having previously worked in the banking industry as a Java/Python programmer, I was particularly interested in this talk.

The talk explored how Python and open source software can be used as tools to fight money laundering.

For instance, when law enforcement agencies investigate fraud or money laundering, they need to search for links between multiple bank accounts. The process is often time-consuming. Deshpande explained the approach that someone with a Python and data science background would take to solve the problem:

  1. Law enforcement provides a dataset of accounts

  2. Generate a graph showing the links between accounts and transactions

  3. Apply machine learning to the graph

  4. Predict possible money laundering cases

  5. Generate a report
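The talk's own code is not reproduced here, but step 2 can be sketched with the standard library alone. The snippet below is only an illustration under my own assumptions: the account names and transactions are made up, transfers are treated as undirected links, and `build_graph` and `find_link` are hypothetical helper names. It skips the machine learning step and only answers the question "is there a chain of transactions connecting these two accounts?".

```python
from collections import defaultdict, deque

# Hypothetical transactions: (sender, receiver, amount). Not real data.
transactions = [
    ("A", "B", 9500.0),
    ("B", "C", 9400.0),
    ("C", "D", 9300.0),
    ("E", "F", 120.0),
]


def build_graph(transactions):
    """Build an undirected adjacency map: account -> set of linked accounts."""
    graph = defaultdict(set)
    for sender, receiver, _amount in transactions:
        graph[sender].add(receiver)
        graph[receiver].add(sender)
    return graph


def find_link(graph, start, target):
    """Return the shortest chain of accounts from start to target, or None."""
    if start not in graph or target not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # Sorting only makes the traversal order deterministic.
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(transactions)
print(find_link(graph, "A", "D"))  # chain of accounts linking A to D
print(find_link(graph, "A", "F"))  # None: no chain connects A and F
```

A breadth-first search is used so that the first chain found is also the shortest one. A real investigation would likely use a graph library and keep the direction, amount, and timing of each transfer.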

According to Deshpande, there are two important aspects in providing solutions: the Daubert Standard and Benford's Law. Let's delve into these points.

The first is the standard a court uses to decide whether scientific evidence is sound enough to be admitted at trial. From what I understood of his talk, this means the program and its algorithm have to be explained in plain language rather than code, and the error rate has to be calculated and validated independently.

The second point he touched upon was Benford's Law. I remember coming across the term in a statistics class back in my college days, but it had been buried deep in my memory. One might expect the first digit of figures such as city populations or baseball statistics to be anything from 1 to 9, with each digit showing up about equally often. The twist is that, according to Benford's Law, that's not how it works. In many real-life datasets, "1" is the most common first digit, accounting for about 30% of cases. The Medium article by Alex Freeman listed at the end of this post explains more.

(image from https://medium.com/thealexfreeman/benfords-law-9b93f21f4c40)
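The 30% figure comes from the formula behind Benford's Law: the expected share of leading digit d is log10(1 + 1/d). As a quick check (my own snippet, not from the talk, and `benford_probability` is just a name I picked), the expected shares can be printed like this:

```python
import math


def benford_probability(d: int) -> float:
    """Expected share of leading digit d (1-9) under Benford's Law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The digit 1 comes out at about 30.1% and the digit 9 at about 4.6%, and the nine shares add up to 1.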

During his talk, he showed how the law is applied in money laundering investigations and gave a demo with the following sample case:

  1. Load a sample dataset containing transaction amounts, bank balances and so on

  2. Load it into a pandas DataFrame

  3. Get the first digit of each transaction amount

  4. Plot that as the actual distribution and compare it against Benford's Law
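I don't have the code from the demo, so here is a rough sketch of the same steps under my own assumptions: the amounts are made up, the column name `amount` and the helper `first_digit` are hypothetical, and the result is a comparison table rather than a plot, so that only pandas is required.

```python
import math

import pandas as pd

# Hypothetical sample data; the dataset used in the talk is not reproduced here.
df = pd.DataFrame(
    {"amount": [120.50, 1999.00, 310.00, 18.75, 940.10,
                1100.00, 27.30, 150.00, 4820.00, 13.99]}
)


def first_digit(x) -> int:
    """Return the leading non-zero digit of a positive number."""
    # str(float) yields e.g. '120.5', '0.042' or '1e+22'; in every case
    # the first character in 1-9 is the leading significant digit.
    return int(next(ch for ch in str(abs(float(x))) if ch in "123456789"))


# Benford's Law only applies to non-zero values, so drop anything <= 0.
amounts = df.loc[df["amount"] > 0, "amount"]
digits = amounts.map(first_digit)

# Observed share of each leading digit, with 0.0 for digits that never occur.
observed = digits.value_counts(normalize=True).reindex(range(1, 10), fill_value=0.0)

# Expected share under Benford's Law: log10(1 + 1/d).
expected = pd.Series({d: math.log10(1 + 1 / d) for d in range(1, 10)})

comparison = pd.DataFrame({"observed": observed, "expected": expected})
comparison["diff"] = comparison["observed"] - comparison["expected"]
print(comparison.round(3))
```

With only ten made-up rows the differences mean nothing statistically. On a real dataset, large gaps between the two columns are what would show up as bars sitting far from the Benford points in a plot.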

Here is a screenshot from his talk showing a detected anomaly. The red points indicate the Benford distribution, whereas the bars indicate the empirical distribution; the gap between the two suggests an anomaly.

Luckily, in Python, there is a library for Benford's Law! pypi.org

It is always interesting to see how people like Deshpande approach real-world problems using Python and scientific theories.

PyCon APAC 2023 Day 1 #pyconapac_3 - YouTube

Benford’s Law (Python). What is Benford’s Law? | by Alex Freeman | thealexfreeman | Medium