# Inter-relationships in a Matrix

### Using car crash data to examine and display inter-relationships between points of data in a matrix.

Last week, I wanted to display inter-relationships between data in a matrix. My friend Fleur, from AXA, mentioned an interesting possible application: car accidents. In car-against-car accidents, it might be interesting to see which parts of the cars were involved.

On https://www.data.gouv.fr/fr/, we can find such a dataset, with a lot of information on car accidents involving bodily injuries (in France, a police report is necessary, and all of them are collected in one big dataset; actually several datasets, with information on the people involved, the cars, the locations, etc.). For 2014 claims, the dataset is

```
> base = read.csv("https://www.data.gouv.fr/s/resources/base-de-donnees-accidents-corporels-de-la-circulation-sur-6-annees/20150806-153355/vehicules_2014.csv")
```

Let us keep only claims involving two vehicles,

```
> T=table(base$Num_Acc)
> idx=names(T)[which(T==2)]
```
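As a minimal sketch of this filtering step, the same idea can be checked by hand on invented data. The `toy` data frame and all its values are hypothetical; only the column names `Num_Acc` and `choc` come from the real file.

```r
# Toy stand-in for the vehicle table: one row per vehicle, keyed by accident id.
# All values are invented for illustration.
toy <- data.frame(
  Num_Acc = c(101, 101, 102, 103, 103, 103, 104, 104),
  choc    = c(1, 4, 3, 1, 1, 7, 2, 8)
)

# Count vehicles per accident, then keep the ids that appear exactly twice.
counts  <- table(toy$Num_Acc)
two_ids <- names(counts)[counts == 2]

# Keep only the rows that belong to those two-vehicle accidents.
toy_two <- toy[toy$Num_Acc %in% two_ids, ]

print(two_ids)        # "101" "104"
print(nrow(toy_two))  # 4
```

Accident 102 (one vehicle) and accident 103 (three vehicles) are dropped, so only pairs remain.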

For 2014, we have 32,222 claims involving exactly two vehicles.

```
> length(idx)
[1] 32222
```

In this dataset, the variable `choc` records where each car was hit, coded from 1 to 8 (the labels are given in `nom` below), plus ‘9’ for multiple hits (in rollover accidents); ‘0’ should be missing information.

`> nom=c("NA","Front","Front R",'Front L',"Back","Back R","Back L","Side R","Side L","Multiple")`
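Since the codes start at 0 and R vectors are indexed from 1, code k corresponds to position k + 1 in `nom`, which is why the loops below add 1 to `choc`. A small sketch with hypothetical codes (the `codes` vector is invented):

```r
nom <- c("NA", "Front", "Front R", "Front L", "Back",
         "Back R", "Back L", "Side R", "Side L", "Multiple")

# Hypothetical impact codes for five vehicles (0 = missing, 9 = multiple).
codes <- c(1, 4, 0, 9, 7)

# Shift by one to turn a 0-based code into a 1-based position in nom.
labels <- nom[codes + 1]
print(labels)  # "Front" "Back" "NA" "Multiple" "Side R"
```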

Now, we simply have to go through our dataset, and get the matrix. My first idea was to get a symmetric one:

```
> B=base[base$Num_Acc %in% idx,]
> B=B[order(B$Num_Acc),]
> M=matrix(0,10,10)
> for(i in seq(1,nrow(B),by=2)){
+ a=B$choc[i]+1
+ b=B$choc[i+1]+1
+ M[a,b]=M[a,b]+1
+ M[b,a]=M[b,a]+1
+ }
> rownames(M)=nom
> colnames(M)=nom
```
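The same counts can probably be obtained without an explicit loop. The sketch below is an alternative, not the code used for the figures here. Like the loop, it assumes that rows are sorted so that each consecutive pair belongs to one accident, and it runs on invented data (`toy`, `first`, `second`, `A` and `M_toy` are hypothetical names).

```r
nom <- c("NA", "Front", "Front R", "Front L", "Back",
         "Back R", "Back L", "Side R", "Side L", "Multiple")

# Hypothetical two-vehicle accidents, sorted so that rows (1,2), (3,4), (5,6)
# are the two vehicles of one accident each.
toy <- data.frame(
  Num_Acc = c(101, 101, 104, 104, 105, 105),
  choc    = c(1, 4, 1, 1, 4, 1)
)

# Impact code of the first and of the second vehicle in each pair.
first  <- toy$choc[seq(1, nrow(toy), by = 2)]
second <- toy$choc[seq(2, nrow(toy), by = 2)]

# Cross-tabulate with fixed levels 0..9 so the result is always 10 x 10.
A <- table(factor(first, levels = 0:9), factor(second, levels = 0:9))
A <- matrix(as.vector(A), 10, 10, dimnames = list(nom, nom))

# Adding the transpose mirrors the two increments M[a,b] and M[b,a] in the loop.
M_toy <- A + t(A)
print(M_toy["Front", "Back"])   # 2
print(M_toy["Front", "Front"])  # 2
```

Note that, as in the loop, a Front – Front accident adds 2 to the diagonal cell, because both increments hit the same entry.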

The problem with asking for a symmetric chord diagram is that Front – Front claims cannot be shown, since values on the diagonal are removed:

```
> library(circlize)
> chordDiagramFromMatrix(M,symmetric=TRUE)
```
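To see what is lost, here is a rough sketch on an invented 3 x 3 matrix. It assumes, as described in the circlize documentation for the `symmetric` argument, that only the lower triangle without the diagonal is used; the sketch mimics that with base R and does not call circlize itself. The names `parts`, `M_toy`, `kept` and `lost` are hypothetical.

```r
# Invented symmetric counts for three impact zones (not the 2014 figures).
parts <- c("Front", "Back", "Side R")
M_toy <- matrix(c(40, 25, 10,
                  25,  4,  3,
                  10,  3,  2),
                nrow = 3, byrow = TRUE,
                dimnames = list(parts, parts))

# Mimic symmetric = TRUE: zero out the diagonal and the upper triangle.
kept <- M_toy
kept[upper.tri(kept, diag = TRUE)] <- 0

# Same-zone counts (Front - Front, Back - Back, ...) that would not be drawn.
lost <- sum(diag(M_toy))

print(kept)
print(lost)  # 46
```

In this toy matrix the diagonal carries more weight than all the off-diagonal pairs together, which is the kind of situation where dropping it hides most of the story.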

So let’s pretend that the dataset distinguishes between the first and the second row of each claim. Say the first one is the ‘responsible’ driver or, for an insurer, the first one is your insured. This is just to avoid the symmetry problem:

```
> M=matrix(0,10,10)
> for(i in seq(1,nrow(B),by=2)){
+ a=B$choc[i]+1
+ b=B$choc[i+1]+1
+ M[a,b]=M[a,b]+1
+ }
> rownames(M)=paste("A",nom,sep=" ")
> colnames(M)=paste("B",nom,sep=" ")
```

If we visualize the chord diagram, this time it is more complex to analyze:

`> chordDiagram(M)`

In the diagram, the first row (say our driver, letter A) is at the bottom, and the second row (say the other driver, letter B) is on top.

In bodily injury claims, we observe a large proportion of Front – Front claims, as well as Front – Back. And as expected, Back – Back claims are not that common.
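Such proportions can also be read directly from the matrix rather than from the picture. A minimal sketch, using invented counts on a reduced 3 x 3 matrix (the numbers and the names `parts`, `M_toy`, `shares` and `front_back` are hypothetical, not the 2014 results):

```r
# Invented counts: rows are driver A, columns are driver B.
parts <- c("Front", "Back", "Side R")
M_toy <- matrix(c(50, 30, 10,
                  25,  2,  3,
                  12,  4,  4),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste("A", parts), paste("B", parts)))

# Share of each (A, B) combination among all accidents.
shares <- prop.table(M_toy)
print(round(shares, 3))

# Front - Back share, regardless of which driver is A or B.
front_back <- shares["A Front", "B Back"] + shares["A Back", "B Front"]
print(front_back)
```

Applied to the real `M`, the same two lines would give the share of each combination of impact zones.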

Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.
