Following the Mediator scandal, France adopted a Sunshine Act in 2011. For the first time, we have data on the gifts and contracts awarded to health care professionals by pharmaceutical companies. Can we use graph visualization to understand these dangerous ties?
Pharmaceutical companies in France and in other countries use gifts and contracts to influence the prescriptions of health care professionals. This has posed ethical problems in the past.
In France, 21 people are currently being prosecuted for their role in the scandal surrounding Mediator, a drug that was recently banned. Some of them are accused of having helped the drug manufacturer obtain an authorization to sell its drug, and later fight its ban, in exchange for money.
In the US, GlaxoSmithKline agreed to pay $3 billion in the largest health-care fraud settlement in US history. Before the settlement, GlaxoSmithKline had paid various experts to fraudulently market the benefits of its drugs.
Such problems arose in part because of a lack of transparency in the ties between pharmaceutical companies and health-care professionals. With open data now available, can we change this?
Moving the Data to Neo4j
Regards Citoyens, a French NGO, parsed various sources to build the first database documenting the financial relationships between health care providers and pharmaceutical manufacturers.
That database covers the period from January 2012 to June 2014. It contains 495 951 health care professionals (doctors, dentists, nurses, midwives, pharmacists) and 894 pharmaceutical companies. The contracts and gifts represent a total of 244 572 645 €.
The original data can be found on the Regards Citoyens website.
The data is stored in one large CSV file. We are going to use graph visualization to understand the network formed by the financial relationships between pharmaceutical companies and health care professionals.
First we need to move the data into a Neo4j graph database: https://gist.github.com/jvilledieu/e435a568cf1d3f8dd000#file-sunshine_import-cql
Now the data is stored in Neo4j as a graph (download it here). It can be searched, explored and visualized through Linkurious.
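To make the modelling step concrete, here is a minimal Python sketch of how flat payment rows can be turned into nodes and weighted relationships. It is not the import script linked above: the column names (`company`, `beneficiary_id`, `profession`, `amount`) and the sample rows are assumptions made up for illustration, and the real CSV may be structured differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; the real file's column names and values may differ.
SAMPLE_CSV = """company,beneficiary_id,profession,amount
Sanofi,id-001,Doctor,120.50
Sanofi,id-002,Student,30.00
Novartis Pharma,id-001,Doctor,75.00
Sanofi,id-001,Doctor,19.50
"""

def build_graph(csv_text):
    """Turn flat payment rows into company nodes, beneficiary nodes and weighted edges."""
    companies = set()
    beneficiaries = {}            # beneficiary id -> profession
    edges = defaultdict(float)    # (company, beneficiary id) -> total amount
    for row in csv.DictReader(io.StringIO(csv_text)):
        companies.add(row["company"])
        beneficiaries[row["beneficiary_id"]] = row["profession"]
        # Several rows for the same pair are merged into one weighted edge.
        edges[(row["company"], row["beneficiary_id"])] += float(row["amount"])
    return companies, beneficiaries, dict(edges)

companies, beneficiaries, edges = build_graph(SAMPLE_CSV)
print(sorted(companies))
print(edges[("Sanofi", "id-001")])
```

Merging repeated rows into a single edge with a total is one possible design choice; keeping one relationship per declaration would preserve more detail at the cost of a denser graph.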
Unfortunately, names in the data have been anonymized by Regards Citoyens following pressure from the CNIL (the French Commission nationale de l’informatique et des libertés).
Who is Sanofi Giving Money To?
Let’s start our data exploration with Sanofi, the biggest French pharmaceutical company. If we search for Sanofi in Linkurious, we can see that it is connected to 57 765 professionals. Let’s focus on the 20 Sanofi contacts that have the most connections.
Sanofi’s top 20 connections.
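Picking a company's most connected contacts amounts to ranking its neighbors by degree. The sketch below shows the idea on a toy edge list; the pairs, the `top_contacts` helper and the tie-breaking rule are assumptions for illustration, not what Linkurious does internally.

```python
from collections import Counter

# Hypothetical (company, beneficiary id) pairs, each pair listed once.
EDGES = [
    ("Sanofi", "a"), ("Sanofi", "b"), ("Sanofi", "c"),
    ("Lilly", "a"), ("Lilly", "b"),
    ("AstraZeneca", "a"),
]

def top_contacts(edges, company, n):
    """Return the n contacts of `company` with the most connections overall."""
    # Degree of a beneficiary = number of companies it is linked to.
    degree = Counter(beneficiary for _, beneficiary in edges)
    contacts = {beneficiary for c, beneficiary in edges if c == company}
    # Sort by descending degree, then by id so that ties are resolved deterministically.
    ranked = sorted(contacts, key=lambda b: (-degree[b], b))
    return [(b, degree[b]) for b in ranked[:n]]

print(top_contacts(EDGES, "Sanofi", 2))
```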
Among these entities there are 19 general practitioners and one student. We can quickly grasp which professions Sanofi is targeting by coloring the health care professionals according to their profession:
19 doctors among Sanofi’s top 20 connections.
In a click, we can filter the visualization to focus on the doctors. We are now going to color them according to their region of origin.
Region of origin of Sanofi’s 19 doctors.
Indirectly, the health care professionals Sanofi connects to via gifts also tell us about its competitors. Let’s look at who else has given gifts to the health care professionals befriended by Sanofi.
Sanofi’s contacts (highlighted in red) are also in touch with other pharmaceutical companies.
Zooming in, we can see Sanofi is at the center of a very dense network next to Bristol-Myers Squibb, Pierre Fabre, Lilly or AstraZeneca for example. According to the Sunshine dataset, Sanofi is competing with these companies.
We can also see an interesting node. It is a student who has received gifts from 104 pharmaceutical companies, including companies that are not direct competitors of Sanofi.
A successful student.
Why has he received so much attention? Unfortunately all we have is an ID (02b0d3726458ef46682389f2ac7dc7af).
Sanofi could identify the professionals its competitors have targeted and perhaps target them too in the future.
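The idea of spotting competitors through shared contacts can be sketched as a simple count: for every other company, how many of Sanofi's contacts does it also fund? The edge list and the `shared_contacts` helper below are hypothetical and only illustrate the reasoning, assuming that a large overlap hints at competition.

```python
from collections import Counter

# Hypothetical (company, beneficiary id) pairs; not taken from the Sunshine data.
EDGES = [
    ("Sanofi", "a"), ("Sanofi", "b"), ("Sanofi", "c"),
    ("Lilly", "a"), ("Lilly", "b"),
    ("AstraZeneca", "a"),
    ("Pierre Fabre", "d"),
]

def shared_contacts(edges, company):
    """Count, for every other company, how many contacts it shares with `company`."""
    own = {b for c, b in edges if c == company}
    shared = Counter()
    for other, beneficiary in set(edges):  # set() drops duplicate pairs
        if other != company and beneficiary in own:
            shared[other] += 1
    return shared

overlap = shared_contacts(EDGES, "Sanofi")
for other, count in overlap.most_common():
    print(other, count)
```

In this toy example, a company with no contact in common with Sanofi does not appear in the result at all.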
Who has received the most money from pharmaceutical companies in France?
Neo4j includes a graph query language called Cypher. With Cypher we can run complex graph queries and get results in seconds.
We can for example identify the doctor who has received the most money from pharmaceutical companies:
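// Doctor who has received the most money (MATCH, RETURN and LIMIT are reconstructed; add the doctor label used by the import script)
MATCH (a)
WHERE a.totalDECL IS NOT NULL
RETURN a
ORDER BY a.totalDECL DESC
LIMIT 1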
The doctor behind the ID 2d92eb1e795f7f538556c59e48aaa7c1 has received 77 480€ from 6 pharmaceutical companies.
The relationships are colored according to the money they represent. St Jude Medical has given 70 231€ to Dr 2d92eb1e795f7f538556c59e48aaa7c1.
The next time they receive a prescription from Dr 2d92eb1e795f7f538556c59e48aaa7c1, his patients might like to know about his relationship with St Jude Medical. Unfortunately, the Sunshine data is anonymous today.
We can also find the most generous pharmaceutical company.
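// Company which has distributed the most money (MATCH and LIMIT are reconstructed; assumes relationships point from the company to the recipient)
MATCH (a)-[r]->()
RETURN a, sum(r.totalDECL) AS total
ORDER BY total DESC
LIMIT 1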
Novartis Pharma has awarded 12 595 760€ to various entities.
The 5 entities receiving the most money from Novartis.
When we look closer, we can see that the 5 entities which have received the most money from Novartis Pharma are 5 NGOs.
24f3287da6ab125862249416bc91f9c4 has received 75 000€.
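Both questions in this section boil down to the same operation: group the payments by one end of the relationship, sum the amounts, and keep the largest total. Here is a minimal Python sketch of that aggregation; the records, the company names and the helper names are invented for illustration and do not come from the dataset.

```python
from collections import defaultdict

# Hypothetical (company, beneficiary id, amount in euros) records; not real figures.
PAYMENTS = [
    ("Company A", "x", 500),
    ("Company B", "x", 200),
    ("Company A", "y", 300),
    ("Company B", "z", 1000),
]

def totals_by(payments, index):
    """Sum amounts grouped by the field at `index` (0 = company, 1 = beneficiary)."""
    totals = defaultdict(int)
    for record in payments:
        totals[record[index]] += record[2]
    return dict(totals)

def top(totals):
    """Return the (key, total) pair with the largest total."""
    return max(totals.items(), key=lambda item: item[1])

print(top(totals_by(PAYMENTS, 0)))  # most generous company
print(top(totals_by(PAYMENTS, 1)))  # beneficiary who received the most
```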
The Sunshine dataset offers a rare glimpse into the practices of pharmaceutical companies and how they use money to influence the behavior of health care professionals. Unfortunately for citizens looking for transparency, the data is anonymized. Perhaps this will change in the future.