Fraud Prevention With Neo4j: A 5-Minute Overview
Fraudsters are becoming more sophisticated and using increasingly complex schemes. What can be done to thwart those schemes?
Fraud is becoming harder to discover and prevent as fraudsters employ more complex techniques and more advanced technologies.
Who Are Today’s Fraudsters?
Today, fraudsters are organized in groups and operate under synthetic, manufactured, or in many cases stolen identities. They also use hijacked devices without the owners’ knowledge. All of this makes modern fraud very difficult to detect.
We’re talking about many different kinds of fraud, like:
- Credit card fraud.
- Rogue merchants.
- Fraud rings.
- E-commerce fraud.
- Insurance fraud.
- Fraud that goes undetected because it looks like routine transactions.
Old Fraud Prevention Techniques Don’t Cut It Anymore
Standard techniques and technologies, including relational databases, just don’t cut it anymore. Fraud patterns are getting increasingly complex, and detecting them is all about pattern analysis and discovery, a task that a graph database like Neo4j is built for.
In Neo4j, transactions are stored as a graph in which related pieces of data are connected, making it easy to traverse those relationships in real time and find fraudulent patterns quickly.
Traditional fraud detection methods use discrete analysis, which can only relate two to three pieces of information at a time, like common IP addresses, common shipping addresses, or a common bank account.
But in the complex fraud schemes employed today, some patterns span tens of attributes, and uncovering the fraudulent activity can mean examining tens of thousands of transactions. This calls for connected analysis, which depends on connected data like that stored in Neo4j.
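The difference between discrete and connected analysis can be sketched in a few lines. The example below is plain Python, not Neo4j or Cypher, and every account name, field, and value in it is invented for illustration. It links accounts to the attribute values they use and then traverses those links, so two accounts that share nothing directly can still turn up in the same group.

```python
from collections import defaultdict, deque

# Hypothetical account records; all names and values are made up.
ACCOUNTS = {
    "acct_1": {"ip": "10.0.0.5", "phone": "555-0101", "address": "12 Oak St"},
    "acct_2": {"ip": "10.0.0.5", "phone": "555-0102", "address": "9 Elm St"},
    "acct_3": {"ip": "10.0.0.9", "phone": "555-0102", "address": "4 Pine St"},
    "acct_4": {"ip": "10.0.0.7", "phone": "555-0199", "address": "77 Lake Rd"},
}


def build_graph(accounts):
    """Link each account to the attribute values it uses (a bipartite graph)."""
    graph = defaultdict(set)
    for account, attrs in accounts.items():
        for kind, value in attrs.items():
            attr_node = (kind, value)  # attribute nodes are tuples
            graph[account].add(attr_node)
            graph[attr_node].add(account)
    return graph


def connected_accounts(graph, start):
    """Breadth-first traversal: every account reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    # Account nodes are strings; attribute nodes are tuples.
    return sorted(n for n in seen if isinstance(n, str))


graph = build_graph(ACCOUNTS)
ring = connected_accounts(graph, "acct_1")
print(ring)
```

Here acct_1 and acct_3 have no attribute in common, so a pairwise comparison would not relate them. The traversal connects them through acct_2, which shares an IP address with one and a phone number with the other.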
Example: Credit Card Testing
Let’s say a fraudster gets hold of credit card information through rogue merchant skimming or a data breach. They want to test the card to make sure it’s still usable. They go to nearby merchants and test the card two or three times for increasing dollar amounts, starting with a two- or three-dollar purchase at a Starbucks, for instance.
Once they’re certain the credit card works, they then make a big purchase and move on to the next card. Graph databases can help find these testing patterns among the sea of normal transactions to help stop bigger transactions from taking place.
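As a rough illustration of the testing pattern described above, the following Python sketch scans each card's history for a run of small, strictly increasing purchases followed by a large one. It is not how Neo4j does this. The `Txn` type, the thresholds, the time window, and the sample data are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    card: str
    merchant: str
    amount: float
    minute: int  # minutes since an arbitrary start; a simplification


# All thresholds below are assumed values, not taken from the article.
SMALL_LIMIT = 10.0   # ceiling for a "test" purchase
LARGE_LIMIT = 500.0  # floor for the "big" purchase
MIN_TESTS = 2        # minimum number of rising test purchases
WINDOW = 120         # how long a test purchase stays relevant, in minutes


def find_card_testing(txns):
    """Return cards showing rising small charges followed by a large one."""
    by_card = {}
    for t in sorted(txns, key=lambda t: t.minute):
        by_card.setdefault(t.card, []).append(t)

    flagged = []
    for card, history in by_card.items():
        run = []  # current streak of strictly increasing small purchases
        for t in history:
            # Forget test purchases that are too old to matter.
            run = [r for r in run if t.minute - r.minute <= WINDOW]
            if t.amount <= SMALL_LIMIT:
                if run and t.amount <= run[-1].amount:
                    run = []  # amounts stopped rising; start a new streak
                run.append(t)
            elif t.amount >= LARGE_LIMIT and len(run) >= MIN_TESTS:
                flagged.append(card)
                break
            else:
                run = []  # an ordinary purchase breaks the streak
    return flagged


SAMPLE = [
    Txn("card_A", "coffee_shop", 2.50, 0),
    Txn("card_A", "newsstand", 3.75, 10),
    Txn("card_A", "bakery", 6.00, 25),
    Txn("card_A", "electronics", 1200.00, 40),
    Txn("card_B", "coffee_shop", 4.20, 5),
    Txn("card_B", "grocery", 58.10, 300),
    Txn("card_C", "coffee_shop", 3.00, 0),
    Txn("card_C", "furniture", 900.00, 20),
]

suspicious = find_card_testing(SAMPLE)
print(suspicious)
```

A real deployment would also need to consider merchant proximity and the cardholder's normal behavior, which this sketch leaves out.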
Example: Determining Fraud Origination
Another example is determining the origin of a fraud ring.
Let’s say John reports a $2,000 fraudulent computer purchase. The credit card company can look back at his earlier, smaller transactions and find additional fraudulent ones, starting with small purchases at a gas station.
The credit card company suspects that additional fraud emanated from this gas station, so it looks at all of the transactions there within a time window.
One of the other cardholders who used the gas station in that window is Sheila. Looking at her transactions, the company finds a similarly unusual large purchase at a jewelry store. Sheila confirms this is fraud as well, along with other fraudulent transactions. A similar pattern then emerges with others, like Karen in the image above.
With this graph-based pattern analysis, the credit card company can quickly determine where the fraud originated and uncover unreported cases of credit card fraud, helping it shut the fraudster down before more damage is done.
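The two steps in this example, finding who else used the suspect merchant and then checking which merchant the confirmed victims have in common, can be sketched as follows. This is plain Python over an invented purchase list, not a Neo4j query. The merchant names, day numbers, and the cardholders "omar" and "lee" are hypothetical additions to the John, Sheila, and Karen of the story.

```python
from collections import Counter

# Hypothetical (card, merchant, day) purchase records.
PURCHASES = [
    ("john", "gas_station_17", 1),
    ("john", "computer_store", 3),
    ("sheila", "gas_station_17", 2),
    ("sheila", "jewelry_store", 4),
    ("karen", "gas_station_17", 2),
    ("karen", "grocery", 3),
    ("omar", "gas_station_17", 2),
    ("omar", "grocery", 2),
    ("lee", "grocery", 1),
]


def cards_seen_at(purchases, merchant, first_day, last_day):
    """Cards that bought at `merchant` within the inclusive day window."""
    return sorted({
        card
        for card, m, day in purchases
        if m == merchant and first_day <= day <= last_day
    })


def rank_common_merchants(purchases, fraud_cards):
    """Count how many confirmed-fraud cards visited each merchant."""
    visits = Counter()
    counted = set()
    for card, merchant, _day in purchases:
        if card in fraud_cards and (card, merchant) not in counted:
            counted.add((card, merchant))  # count each card once per merchant
            visits[merchant] += 1
    return visits.most_common()


# Step 1: John reports fraud, so list everyone who used the same gas station.
to_review = cards_seen_at(PURCHASES, "gas_station_17", 1, 2)
print(to_review)

# Step 2: once more victims are confirmed, see which merchant they share.
ranking = rank_common_merchants(PURCHASES, {"john", "sheila", "karen"})
print(ranking[0])
```

In this toy data "omar" also appears in the review list even though he has not reported anything, which is the kind of unreported exposure the analysis is meant to surface.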
How Neo4j Fits Into a Fraud Prevention Architecture
So, how does Neo4j fit into an overall fraud prevention architecture?
Most large financial services companies are already employing big data analysis to detect fraud. In such a scenario, multiple data sources including transaction data, merchant data, etc. are streamed into a data lake where data scientists analyze macro data patterns and find instances of fraud.
Today, it takes weeks to months to find all the instances of a particular fraud pattern. Also, since teams are unable to visualize the pattern easily in a data lake, there are often false positives and overlooked fraud.
A subset of the data in the data lake retaining the pattern being evaluated can be loaded into Neo4j. This then enables graph visualization of the data and helps fine-tune the fraud patterns visually.
Neo4j can also find all instances of the particular fraud pattern within the entire dataset very quickly, often within seconds to minutes, saving enormous amounts of time and making data scientists more productive.
In addition, Neo4j can operationalize its dataset for real-time fraud prevention by matching incoming events against the patterns already stored in Neo4j, making this a real-time, sense-and-respond solution. With this analysis operationalized and the manual effort removed, the data science team can focus on detecting new patterns of fraud and rapidly operationalizing them.
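The sense-and-respond idea can be pictured with a small event handler. The sketch below is an assumption-laden toy in plain Python, not Neo4j's API. The `RealTimeScreen` class, its decision strings, and the amount threshold are all hypothetical. It marks any card used at a merchant already identified as a likely point of compromise, and sends a later large purchase on that card for review.

```python
class RealTimeScreen:
    """Toy event screen built on a known pattern (hypothetical design)."""

    def __init__(self, flagged_merchants, large_amount=500.0):
        self.flagged_merchants = set(flagged_merchants)
        self.large_amount = large_amount  # assumed threshold
        self.exposed_cards = set()

    def on_event(self, card, merchant, amount):
        """Return "review" or "allow" for one incoming transaction."""
        if merchant in self.flagged_merchants:
            # The card may have been skimmed here; remember it.
            self.exposed_cards.add(card)
            return "allow"
        if card in self.exposed_cards and amount >= self.large_amount:
            return "review"
        return "allow"


screen = RealTimeScreen({"gas_station_17"})
events = [
    ("card_A", "gas_station_17", 35.00),
    ("card_B", "grocery", 820.00),
    ("card_A", "jewelry_store", 1500.00),
    ("card_A", "coffee_shop", 4.00),
]
decisions = [screen.on_event(*event) for event in events]
print(decisions)
```

Each decision uses only the incoming event and a small amount of stored state, which is what lets a check like this run as transactions arrive instead of weeks later.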
Neo4j is being used by multiple Fortune 100 financial services firms, major government institutions, and top-tier retailers for connected graph analysis. To hear a more in-depth presentation on fraud and how Neo4j fits in, please watch our hour-long fraud webinar, which includes a demo.
Published at DZone with permission of Ryan Boyd, DZone MVB. See the original article here.