R: String to Date or NA

I’ve been trying to clean up a CSV file in which some rows contain dates and some don’t. I only want to keep the cells which do have dates, so I’ve been trying to work out how to do that.

My first thought was to find a function which would convert the contents of a cell into a date if it was in date format and return NA if not. I could then filter out the NA values using the is.na function.

I started out with the as.Date function…

> as.Date("2014-01-01")
[1] "2014-01-01"
> as.Date("foo")
Error in charToDate(x) : 
  character string is not in a standard unambiguous format

…but, called without a format, that throws an error if we have a non-date value, so it’s not so useful in this case.

Instead we can make use of the strptime function, which returns NA rather than an error when the string doesn’t match the format:

> strptime("2014-01-01", "%Y-%m-%d")
[1] "2014-01-01 GMT"
> strptime("foo", "%Y-%m-%d")
[1] NA
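
As a side note, strptime returns a date-time (POSIXlt) rather than a Date, and it works on whole vectors as well as single strings. The sketch below is only an illustration under assumptions: the sample values and the variable names (values, parsed, dates, dates_direct) are made up, and tz = "UTC" is set just to keep the result independent of the local time zone. It also shows that as.Date returns NA instead of an error once an explicit format is supplied.

```r
# Hypothetical sample values; not taken from the original CSV.
values <- c("2014-01-01", "foo", "2014-03-15")

# strptime is vectorised and returns a POSIXlt object,
# with NA wherever the string does not match the format.
parsed <- strptime(values, "%Y-%m-%d", tz = "UTC")

# as.Date converts the POSIXlt result into a Date vector, keeping the NAs.
dates <- as.Date(parsed)

# With an explicit format, as.Date also yields NA rather than an error.
dates_direct <- as.Date(values, format = "%Y-%m-%d")
```

Either route should give a Date vector with NA in the second position, so which one to use is mostly a matter of taste.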

We can then feed those values into is.na…

> library(magrittr)
> strptime("2014-01-01", "%Y-%m-%d") %>% is.na()
[1] FALSE
> strptime("foo", "%Y-%m-%d") %>% is.na()
[1] TRUE

…and we have exactly the behaviour we were looking for.
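
To tie this back to the original goal of keeping only the cells that hold dates, here is a minimal sketch applied to a whole column. The data frame df, its columns id and value, and the result name with_dates are hypothetical stand-ins for the real CSV, which isn’t shown here.

```r
# Hypothetical data frame standing in for the CSV contents.
df <- data.frame(
  id = 1:4,
  value = c("2014-01-01", "foo", "2014-03-15", ""),
  stringsAsFactors = FALSE
)

# Parse the whole column at once; non-dates become NA.
parsed <- strptime(df$value, "%Y-%m-%d", tz = "UTC")

# Keep only the rows where parsing succeeded.
with_dates <- df[!is.na(parsed), ]

# Optionally store the surviving values as proper Date objects.
with_dates$value <- as.Date(with_dates$value, format = "%Y-%m-%d")
```

One caveat worth checking against your own data: the parser only reads as much of the string as the format needs, so a value with trailing text after a valid date would probably still count as a date here.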

Published at DZone with permission of
