The Power Behind the Paradise Papers

Neo4j has powered the Paradise Papers release. Read on to see how journalists leveraged a graph database to quickly sift through the mountain of data they uncovered.

Once again, the International Consortium of Investigative Journalists (ICIJ) has shaken the world with a far-reaching, in-depth investigation into the shadowy world of offshore finance: the Paradise Papers.

Using Neo4j, the ICIJ has built upon its Pulitzer Prize-winning 2016 investigation, the Panama Papers, and has begun to add politicians featured in early Paradise Papers reports to its Offshore Leaks Database.

The new 1.4 TB of data, comprising 13.4 million documents, includes information leaked from trust company Asiaciti and from Appleby, a 100-year-old offshore law firm specializing in tax havens. The files were obtained by German newspaper Süddeutsche Zeitung and shared with the Washington, D.C.-headquartered ICIJ, a network of independent reporting teams around the world.

As in previous investigations, Neo4j plays a key role in revealing the connections between the wealthy, their money, and the taxation-friendly countries in which it resides.

The reason? Graph databases excel at managing highly connected data and complex queries.

Instead of using tables the way a relational database does, graph databases define and store data as nodes, relationships, and properties. That structure makes them highly proficient at analyzing the interconnections within the data and lets journalists “follow the money” more easily than ever.
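To make the idea of nodes, properties, and relationships concrete, here is a minimal in-memory sketch in Python. It is not Neo4j and not the ICIJ's data model: the `PropertyGraph` class, the node labels, the relationship names, and every record in `build_sample_graph` are invented for illustration. It only shows how storing connections directly lets a simple breadth-first search link two people through a shared address.

```python
from collections import deque


class PropertyGraph:
    """A toy property graph: labeled nodes, typed relationships, properties on both."""

    def __init__(self):
        self.nodes = {}          # node_id -> {"label": str, "props": dict}
        self.relationships = []  # (source_id, rel_type, target_id, props)
        self.adjacent = {}       # node_id -> list of connected node ids

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}
        self.adjacent.setdefault(node_id, [])

    def add_relationship(self, source, rel_type, target, **props):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.relationships.append((source, rel_type, target, props))
        # Traversal below ignores direction, so record the link both ways.
        self.adjacent[source].append(target)
        self.adjacent[target].append(source)

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of node ids, or None if unconnected."""
        if start not in self.nodes or goal not in self.nodes:
            return None
        previous = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for neighbor in self.adjacent[current]:
                if neighbor not in previous:
                    previous[neighbor] = current
                    queue.append(neighbor)
        return None


def build_sample_graph():
    """Builds a small graph of entirely fictitious records."""
    graph = PropertyGraph()
    graph.add_node("officer_a", "Officer", name="Person A")
    graph.add_node("officer_b", "Officer", name="Person B")
    graph.add_node("entity_x", "Entity", name="Company X", jurisdiction="Example Island")
    graph.add_node("entity_y", "Entity", name="Company Y", jurisdiction="Example Island")
    graph.add_node("entity_z", "Entity", name="Company Z", jurisdiction="Elsewhere")
    graph.add_node("address_1", "Address", street="1 Sample Street")
    graph.add_relationship("officer_a", "OFFICER_OF", "entity_x", role="director")
    graph.add_relationship("officer_b", "OFFICER_OF", "entity_y", role="shareholder")
    graph.add_relationship("entity_x", "REGISTERED_AT", "address_1")
    graph.add_relationship("entity_y", "REGISTERED_AT", "address_1")
    return graph


if __name__ == "__main__":
    sample = build_sample_graph()
    # How is Person A connected to Person B?
    print(" -> ".join(sample.shortest_path("officer_a", "officer_b")))
```

In a relational schema, the same question would typically take one join per hop through the link tables. Here each hop is a direct lookup in `adjacent`, which is the property the paragraph above attributes to graph databases.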

Unprecedented Volumes of Highly Connected Data

Pierre Romera, chief technology officer of the ICIJ, tells Business Insider:

“Most of the leaks we get are not structured since they are raw documents. With the Paradise Papers, those documents represented 1.4 TB of data and were gathered from different sources. Putting them in a single database was a challenge for us. With Neo4j and [visualization tool] Linkurious, and after a few weeks of research, we were able to propose to our 382 journalists a way to explore the data and also to share visualizations from stories they were working on. It’s surprising how intuitive a graph database can be for non-tech savvy people. Thanks to this approach, we could both investigate and prepare the future releases.”

According to Mar Cabra, the ICIJ’s Data and Research Unit Editor, Neo4j was the only solution available that met her team’s requirements when the ICIJ broke the Panama Papers investigation last year.

“It’s a revolutionary discovery tool that’s transformed our investigative journalism process,” she says, “because relationships are all important in telling you where the criminality and secrecy lies, who works with whom, and so on. Understanding relationships at huge scale is where graph techniques excel. At least 11.5m documents, and far larger than any data leaks we have investigated before, we needed a technology that could handle these unprecedented volumes of highly connected data quickly, easily and efficiently.” 

She adds:

“We also needed an easy-to-use and intuitive solution that didn’t require the intervention of any data scientist or developers, so that journalists around the globe would work with the data, regardless of their technical abilities. Linkurious Enterprise was the best platform to explore this data and to share insights in a secure way. Using the Linkurious graph visualization platform with Neo4j is a powerful combination.” 

According to Neo4j Co-Founder and CEO, Emil Eifrem:

“Whatever else we can be sure of as the Paradise Papers’ investigation unfolds, it’s only with world-class tools like Neo4j and Linkurious that world-class investigation of vast and complex datasets like this can happen in our Age of Connections. Graph databases are the only option when trying to make sense of the vast terabytes of connected data that we are producing more and more of, and they are an essential tool for international agencies, governments, financial services and security firms trying to uncover the truth.”

Stay Tuned for More Coverage of the Paradise Papers

In the coming days and weeks, the Neo4j team will continue to unveil how graph technology powered the Paradise Papers investigation, including an in-depth look at the ICIJ data model with example queries, graph visualizations, and more.

In the meantime, continue to follow the ICIJ’s Paradise Papers coverage exploring the political and economic dimensions of the investigation as they continue to unfold.

Topics:
graph database, data analytics, paradise papers, big data

Published at DZone with permission of Bryce Merkl Sasaki, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
