Visualizing (Censored) Lifetime Distributions

Keeping track of every R package out there is hard, but this one is worth a look. Take some standard censored lifetime data and see how to turn it into a clear visual.

by Arthur Charpentier · Jun. 15, 17 · Big Data Zone · Tutorial

There are now more than 10,000 R packages available from CRAN, and many more if you include those available only on GitHub. So, to be honest, it's difficult to know all of them. But sometimes you discover a nice function in one of them, and that is really awesome. Consider, for instance, some (standard) censored lifetime data:

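set.seed(1)                        # arbitrary seed so the simulation is reproducible
n = 10000
idx = sample(1:4, size = n, replace = TRUE)
pd = LETTERS[idx]                  # product label: A, B, C or D
lambda = 1 + (idx - 1) / 3         # failure rate, increasing from product A to D
lifetime = rexp(n, lambda)         # true lifetimes (not always observed)
censoring = rexp(n)                # independent censoring times
observed = lifetime <= censoring   # Surv() expects TRUE for an observed failure
y = pmin(lifetime, censoring)      # recorded time: whichever comes first
df = data.frame(time = y, status = observed, product = pd)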

Yes, the data are simulated here. Consider the Kaplan-Meier estimator of the survival function:

library(survival)
km.base = survfit( Surv(time,status) ~ 1  , data = df )
plot(km.base)
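
To see what survfit() computes, here is a minimal sketch of the product-limit (Kaplan-Meier) estimate done by hand on a tiny, made-up dataset and compared with the package result. The data frame toy and its values are hypothetical, chosen only for illustration.

```r
library(survival)

# Hypothetical toy data: six units; status 1 = failure observed, 0 = censored
toy <- data.frame(time   = c(1, 2, 3, 4, 5, 6),
                  status = c(1, 0, 1, 1, 0, 1))

# At each distinct time: units still at risk and failures observed
times   <- sort(unique(toy$time))
at_risk <- sapply(times, function(s) sum(toy$time >= s))
events  <- sapply(times, function(s) sum(toy$time == s & toy$status == 1))

# Product-limit estimate: running product of the conditional survival fractions
km_hand <- cumprod(1 - events / at_risk)

# Same quantity from the survival package
km_fit <- survfit(Surv(time, status) ~ 1, data = toy)
same <- isTRUE(all.equal(km_hand, km_fit$surv))
same
```

Censored units contribute no failure of their own, but they leave the risk set, which is why the later drops in the curve are larger.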

Recently, Anat (currently finishing the Data Science for Actuaries program) helped me discover a nice R function, ggsurvplot() from the survminer package, that adds information to that graph (strictly speaking not that graph, since it is a ggplot version, but the same survival distribution plot):

library(ggplot2)
library(survminer)
# data = df is passed explicitly; title replaces the older main argument
ggsurvplot(km.base, data = df, title = "", color = "blue", censor = FALSE,
           xlim = c(0, 3), break.time.by = 1, xlab = "", ylab = "",
           risk.table = TRUE, risk.table.col = "blue", risk.table.height = 0.2,
           risk.table.title = "", legend.labs = "All", legend.title = "")

[Figure: Kaplan-Meier survival curve for the pooled data, with the number-at-risk table underneath]
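
The risk table under the curve lists how many units are still under observation at each time break. As a rough sketch, the same counts can be read off a survfit object with summary(). The simulated data frame toy, its rates, and the seed below are assumptions made for this illustration, not the data used above.

```r
library(survival)

set.seed(42)                          # arbitrary seed, only for reproducibility
n <- 1000
lifetime  <- rexp(n, rate = 1.5)      # hypothetical failure times
censoring <- rexp(n, rate = 1)        # hypothetical censoring times
toy <- data.frame(time   = pmin(lifetime, censoring),
                  status = lifetime <= censoring)

fit <- survfit(Surv(time, status) ~ 1, data = toy)

# Units still at risk at t = 0, 1, 2, 3; extend = TRUE keeps times
# that fall after the last recorded observation
risk <- summary(fit, times = 0:3, extend = TRUE)
data.frame(time = risk$time, n.risk = risk$n.risk)
```

The count at time 0 is the full sample size, and it can only go down as units fail or are censored.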

This is more interesting when lifetimes differ across groups, here across the four products:

km.prod = survfit(Surv(time, status) ~ product, data = df)
# one curve and one risk-table row per product
ggsurvplot(km.prod, data = df, title = "", censor = FALSE,
           xlim = c(0, 3), break.time.by = 1, xlab = "", ylab = "",
           risk.table = TRUE, risk.table.col = "strata", risk.table.height = 0.3,
           risk.table.title = "", legend.labs = LETTERS[1:4], legend.title = "")

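[Figure: Kaplan-Meier survival curves for products A to D, with the number-at-risk table by product underneath]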
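
Since the data are simulated, the estimated curves can be checked against the truth: with exponential lifetimes of rate lambda, the survival function is exp(-lambda * t). The sketch below regenerates data in the same way (assuming the status flag marks an observed failure, that is, lifetime <= censoring time) and compares the estimates at t = 1. The names sim, lifetime and censoring, and the seed, are illustrative choices.

```r
library(survival)

set.seed(2017)                        # arbitrary seed, only for reproducibility
n <- 10000
idx <- sample(1:4, size = n, replace = TRUE)
lambda <- 1 + (idx - 1) / 3           # rates 1, 4/3, 5/3, 2 for products A to D
lifetime  <- rexp(n, lambda)
censoring <- rexp(n)
sim <- data.frame(time    = pmin(lifetime, censoring),
                  status  = lifetime <= censoring,
                  product = LETTERS[idx])

fit <- survfit(Surv(time, status) ~ product, data = sim)

# Kaplan-Meier estimate of S(1) per product next to the true value exp(-lambda)
est   <- summary(fit, times = 1)$surv
truth <- exp(-(1 + (0:3) / 3))
data.frame(product = LETTERS[1:4], estimate = est, truth = truth)
```

The estimates should sit close to the true values, with product A surviving longest and product D shortest.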

Or a different time granularity:
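
A possible version with half-unit time breaks is sketched below, on simulated data of the same kind. The value break.time.by = 0.5, the seed, and the names sim and p are assumptions for this sketch.

```r
library(survival)
library(survminer)

set.seed(2017)                        # arbitrary seed, only for reproducibility
n <- 10000
idx <- sample(1:4, size = n, replace = TRUE)
lifetime  <- rexp(n, 1 + (idx - 1) / 3)
censoring <- rexp(n)
sim <- data.frame(time    = pmin(lifetime, censoring),
                  status  = lifetime <= censoring,
                  product = LETTERS[idx])

fit <- survfit(Surv(time, status) ~ product, data = sim)

# break.time.by sets the spacing of the x-axis ticks and risk-table columns
p <- ggsurvplot(fit, data = sim, censor = FALSE, xlim = c(0, 3),
                break.time.by = 0.5, xlab = "", ylab = "",
                risk.table = TRUE, risk.table.col = "strata",
                risk.table.height = 0.3, risk.table.title = "",
                legend.labs = LETTERS[1:4], legend.title = "")
p
```

A smaller step gives more columns in the risk table, so the table may need more vertical room (risk.table.height) to stay readable.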

Nice, isn't it?

Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.
