To compare the efficacy of different algorithms, we’ve created the world’s largest synthetic trade-based money laundering (TBML) dataset.

The synthetic dataset contains 100,000 customers and a total of 100 million transactions. We assumed that 1 in 1,000 transactions (0.1%) is trade-based money laundering. As transactions in AML analytics are continuous, we assumed that 10% of all transactions are too small (i.e., a statistically insignificant sample size).
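To make these proportions concrete, the short sketch below works out the absolute counts implied by the figures above. The variable names are illustrative only and do not reflect the dataset's actual schema.

```python
# Figures quoted in the dataset description.
customers = 100_000
transactions = 100_000_000

# Assumption from the text: 1 in 1,000 transactions is TBML (0.1%).
laundering_transactions = transactions // 1_000

# Assumption from the text: 10% of all transactions are "too small".
too_small_transactions = transactions * 10 // 100

# Average number of transactions per customer (simple mean, not a distribution).
avg_transactions_per_customer = transactions // customers

print(f"TBML transactions:         {laundering_transactions:,}")
print(f"Too-small transactions:    {too_small_transactions:,}")
print(f"Avg transactions/customer: {avg_transactions_per_customer:,}")
```

In other words, the stated assumptions imply roughly 100,000 laundering transactions among 100 million, with an average of 1,000 transactions per customer.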


The dataset is available to qualified customers only (i.e., banks, financial service companies, and government agencies).

The dataset is manually labeled and large enough to test your own algorithm against it.
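Because only about 0.1% of transactions are labeled as laundering, plain accuracy can be misleading: an algorithm that flags nothing would still be 99.9% accurate. A minimal sketch of scoring predictions with precision and recall instead is shown below. The function name and the toy label lists are hypothetical, not part of the dataset; in practice the labels would come from whichever labeled column the delivered data provides.

```python
def precision_recall(labels, predictions):
    """Return (precision, recall) for binary labels, where 1 = laundering."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)        # correctly flagged
    fp = sum(1 for y, p in pairs if not y and p)    # false alarms
    fn = sum(1 for y, p in pairs if y and not p)    # missed cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data for illustration only (not drawn from the dataset).
labels = [1, 0, 0, 1, 0, 0, 0, 1]
predictions = [1, 0, 1, 0, 0, 0, 0, 1]

precision, recall = precision_recall(labels, predictions)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision reflects how many flagged transactions were truly laundering, while recall reflects how many laundering transactions were caught; both matter when positives are this rare.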


Format: PostgreSQL, Snowflake, or .hyper (Tableau extract).

If interested, please CONTACT us for more details.