Brisking Through Survivability Analysis


This article defines survivability analysis and walks through a quick worked example.


In this horrific time of disease outbreak, with the hospitalizations and lockdowns that come with it, most people have one thing on their mind: to survive and get on with life.

Here is a quick way to mathematically model it: 

First, we need to define the event to be examined. For instance, a hospital's management would like to know what happens to COVID patients who are admitted to its facilities.

In this case, the event is the treatment outcome for an admitted patient. The potential outcomes are survival or succumbing to the disease.

In addition, there can be cases where a patient decides to discontinue treatment and end the imposed isolation, so the outcome cannot be known within the given time frame. These are typically called "right censored" cases. From the time they leave, such patients must be removed from the count of surviving patients still under observation, just like the patients who do not survive.

Another class of cases is termed "left censored". One example in this context is patients who had recovered from the same disease in the past and could therefore have some immunity arising from the previous infection. Another example is patients admitted to the hospital who succumbed before the start of this study.

In data modeling, every censoring event must be recorded. Typically, this is modeled using a combination of two indicator variables: an outcome code and a censor flag. (Even one is enough, but the second adds convenience.)

1. Record Survival or Otherwise 

| Patient ID | Time | Outcome (Survived-0, Succumbed-1, Left treatment midway-3) | Censor Event |
|---|---|---|---|
| 1 | Day 1 | 0 | 0 |
| 1 | Day 2 | 0 | 0 |
| 1 | ... Day 243 | 1 | 0 |
| 2 | Day 1 | 0 | 0 |
| 2 | ... Day 35 | 3 | 1 |

So, a data logger can record one observation per patient per day as above, which can then be aggregated by Patient ID as below.

| Patient ID | Time | Outcome (Survived-0, Succumbed-1) | Censor Event |
|---|---|---|---|
| 1 | 243 Days | 1 | 0 |
| 2 | 45 Days | 0 | 1 |
| 3 | 63 Days | 1 | 0 |

and so on.
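The aggregation step described above can be sketched in Python. This is a minimal illustration, assuming the daily log is available as a list of (patient_id, day, outcome) tuples that use the outcome codes from the first table. The names `daily_log` and `aggregate` are hypothetical, and the rows are made-up sample data. It also assumes that any patient whose last recorded row is not a death is treated as right censored.

```python
# Outcome codes as in the daily log table: 0 = survived the day,
# 1 = succumbed, 3 = left treatment midway (right censored).
SURVIVED, SUCCUMBED, LEFT_TREATMENT = 0, 1, 3


def aggregate(daily_log):
    """Collapse per-day rows into one record per patient.

    daily_log: iterable of (patient_id, day, outcome) tuples (hypothetical layout).
    Returns a dict: patient_id -> (days_observed, succumbed_flag, censor_flag).
    """
    last_row = {}
    for patient_id, day, outcome in daily_log:
        # Keep only the latest observation for each patient.
        if patient_id not in last_row or day > last_row[patient_id][0]:
            last_row[patient_id] = (day, outcome)

    records = {}
    for patient_id, (day, outcome) in last_row.items():
        succumbed = 1 if outcome == SUCCUMBED else 0
        # Assumption: a patient whose last row is not a death is right censored.
        censored = 0 if succumbed else 1
        records[patient_id] = (day, succumbed, censored)
    return records


# Illustrative rows only (made-up, abbreviated data).
daily_log = [
    (1, 1, 0), (1, 2, 0), (1, 243, 1),
    (2, 1, 0), (2, 35, 3),
]
records = aggregate(daily_log)
print(records)
```

Keeping the succumbed flag and the censor flag as separate fields mirrors the two indicator variables in the tables, even though one can be derived from the other in this sketch.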

2. Define Survivability

Let's say there are 100 patients. What is the likelihood that a patient survives rather than succumbs to the disease? We can look at this in two ways: over the long term and over the short term.

Over the long term, we need the likelihood of surviving beyond a given "critical" period of time. Computing it alongside the short-term measure lets us contrast the two.

When analyzing survivability in the short term, the question is how likely a given patient is to succumb at the next instant of time.

This is essentially an evaluation of the threat, or hazard, posed to long-term survivability at any instant of time.

Hazard faced by whom? By the patients who are still alive and under treatment, so we also need a running count of the surviving patient population. Refer to the table further below.

The threat faced by the patients who have survived so far is the rate at which (1 - likelihood of their long-term survival) grows over time.

It so happens that the short-term metric is a derivative of the long-term one.

So, if long-term survivability is denoted by F(t) as a function of time, then the short-term threat to it (aka mortality) is d/dt(1 - F(t)) = -F'(t).

The ratio that contrasts the two is called the hazard function, hz(t), which takes the form -F'(t)/F(t).

Calculus tells us that the long-term survivability function F(t) equals the negative exponential of the cumulative threat (hazard) function HZ(t), where HZ(t) is hz(t) accumulated from time 0 to t,

i.e. F(t) = exp(-HZ(t))
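The calculus step behind this relation can be written out as follows. It is a short derivation in the article's own symbols, assuming F is differentiable and F(0) = 1 (every patient is alive at admission).

```latex
\begin{align*}
hz(t) &= -\frac{F'(t)}{F(t)} = -\frac{d}{dt}\ln F(t) \\
HZ(t) &= \int_0^t hz(u)\,du = -\ln F(t) + \ln F(0) = -\ln F(t) \\
F(t)  &= \exp\bigl(-HZ(t)\bigr)
\end{align*}
```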

These general mathematical relations are used in two well-known techniques of survivability analysis, namely Kaplan-Meier and Nelson-Aalen.

Both techniques can be computed at once as below:

| Patient ID | Time | Accumulating Count (Patients Succumbing to Disease) | Censor Indicator (Indicating Discontinuation of Treatment) | Surviving Population Subject to Threat from Succumbing to Disease | Cumulative Hazard/Threat Function (Estimate of Vulnerability of Patients) | Survivability Function as the Joint Probability Distribution for All Cases |
|---|---|---|---|---|---|---|
| 1 | 128 Days | 1 | 0 | 100 | (1/100) | exp(-(1/100)) |
| 91 | 135 Days | 1 | 0 | 99 | (1/100 + 1/99) | exp(-(1/100 + 1/99)) |
| 57 | 137 Days | 2 | 0 | 98 | (1/100 + 1/99 + 1/98) | exp(-(1/100 + 1/99 + 1/98)) |

and so on...

Notice that the list is sorted by time in ascending order, since the running sums only make sense in that order.
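The running computation in the table can be sketched in Python as below. This is a minimal illustration, not a definitive implementation: the function name `survival_estimates` and the (time, succumbed, censored) record layout are assumptions, and the sample records are made up. The cumulative hazard is the Nelson-Aalen running sum of deaths divided by the population at risk. The Kaplan-Meier estimate is computed here in its usual product form, the running product of (1 - deaths / at risk); exp(-HZ(t)) stays close to it as long as the population at risk is large.

```python
import math


def survival_estimates(records):
    """Return rows of (time, at_risk, cumulative_hazard, exp_neg_hazard, kaplan_meier).

    records: list of (time, succumbed, censored) tuples, one per patient
    (a hypothetical layout mirroring the aggregated table).
    """
    rows = []
    at_risk = len(records)   # everyone is at risk at the start
    cum_hazard = 0.0         # Nelson-Aalen: running sum of deaths / at_risk
    km = 1.0                 # Kaplan-Meier: running product of (1 - deaths / at_risk)

    # Walk through the distinct times in ascending order.
    for t in sorted({time for time, _, _ in records}):
        deaths = sum(1 for time, d, _ in records if time == t and d == 1)
        leaving = sum(1 for time, _, _ in records if time == t)
        if deaths:
            cum_hazard += deaths / at_risk
            km *= 1.0 - deaths / at_risk
            rows.append((t, at_risk, cum_hazard, math.exp(-cum_hazard), km))
        # Deaths and censored patients both leave the population at risk.
        at_risk -= leaving
    return rows


# Illustrative records only (made-up data): four deaths and one censored patient.
records = [(128, 1, 0), (135, 1, 0), (137, 1, 0), (140, 0, 1), (150, 1, 0)]
rows = survival_estimates(records)
for row in rows:
    print(row)
```

Censored patients produce no row of their own in this sketch. They only shrink the population at risk for the events that follow, which is how the censor indicator enters the estimates.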

3. Visualize the Survivability Function Spread Over Treatment Time

Inversely, we can also visualize the cumulative hazard function, HZ(t) = -log(F(t)).

[Figure: cumulative HZ(t)]
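Both curves can be drawn with a few lines of Python. This is a sketch under stated assumptions: matplotlib is assumed to be installed, the helper names `cumulative_hazard_from_survival` and `plot_curves` are hypothetical, and the sample values are made up, not data from the study.

```python
import math


def cumulative_hazard_from_survival(survival):
    """Convert survival values F(t) into cumulative hazard -log(F(t)).

    Every value must be greater than 0, since log(0) is undefined.
    """
    return [-math.log(f) for f in survival]


def plot_curves(times, survival):
    """Draw F(t) and HZ(t) = -log(F(t)) as step curves side by side."""
    import matplotlib.pyplot as plt  # imported lazily; assumed to be installed

    hazard = cumulative_hazard_from_survival(survival)
    fig, (ax_f, ax_hz) = plt.subplots(1, 2, figsize=(10, 4))
    # Estimates change only at event times, hence step curves.
    ax_f.step(times, survival, where="post")
    ax_f.set_xlabel("Days in treatment")
    ax_f.set_ylabel("F(t)")
    ax_hz.step(times, hazard, where="post")
    ax_hz.set_xlabel("Days in treatment")
    ax_hz.set_ylabel("Cumulative HZ(t)")
    fig.tight_layout()
    plt.show()


# Illustrative values only (made-up numbers).
times = [0, 128, 135, 137]
survival = [1.0, 0.99, 0.98, 0.97]
hazard = cumulative_hazard_from_survival(survival)
```

Calling `plot_curves(times, survival)` opens the two panels. The survivability curve steps down from 1 while the cumulative hazard curve steps up from 0.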