Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

R: dplyr - "Variables not shown"

DZone's Guide to

R: dplyr - "Variables not shown"

· Big Data Zone ·
Free Resource

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

I recently ran into a problem where the result of applying some operations to a data frame wasn’t being output the way I wanted.

I started with this data frame:

words = function(numberOfWords, lengthOfWord) {
  w = c(1:numberOfWords)  
  for(i in 1:numberOfWords) {
    w[i] = paste(sample(letters, lengthOfWord, replace=TRUE), collapse = "")
  }
  w
}
 
numberOfRows = 100
df = data.frame(a = sample (1:numberOfRows, 10, replace = TRUE),
                b = sample (1:numberOfRows, 10, replace = TRUE),
                name = words(numberOfRows, 10))

I wanted to group the data frame by a and b and output a comma separated list of the associated names. I started with this:

> df %>% 
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5)
 
Source: local data frame [5 x 4]
Groups: a
 
   a  b  n
1 19 90 10
2 24 36 10
3 29 20 10
4 29 80 10
5 62 54 10
Variables not shown: words (chr)

Unfortunately the words column has been excluded and I came across this Stack Overflow post which suggested that the print.tbl_df function was the one responsible for filtering columns.

Browsing the docs I found a couple of ways to overwrite this behaviour:

> df %>% 
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5) %>%
    print(width = Inf)

or

> options(dplyr.width = Inf)
> df %>% 
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5)

And now we see this output instead:

Source: local data frame [5 x 4]
Groups: a
 
   a  b  n                                                                                                         words
1 19 90 10 dfhtcgymxt,zpemxbpnri,rfmkksuavp,jxaarxzdzd,peydpxjizc,trdzchaxiy,arthnxbaeg,kjbpdvvghm,kpvsddlsua,xmysfcynxw
2 24 36 10 wtokzdfecx,eprsvpsdcp,kzgxtwnqli,jbyuicevrn,klriuenjzu,qzgtmkljoy,bonbhmqfaz,uauoybprrl,rzummfbkbx,icyeorwzxl
3 29 20 10 ebubytlosp,vtligdgvqw,ejlqonhuit,jwidjvtark,kmdzcalblg,qzrlewxcsr,eckfgjnkys,vfdaeqbfqi,rumblliqmn,fvezcdfiaz
4 29 80 10 wputpwgayx,lpawiyhzuh,ufykwguynu,nyqnwjallh,abaxicpixl,uirudflazn,wyynsikwcl,usescualww,bkvsowfaab,gfhyifzepx
5 62 54 10 beuegfzssp,gfmegjtrys,wkubhvnkkk,rkhgprxttb,cwsrzulnpo,hzkvjbiywc,gbmiupnlbw,gffovxwtok,uxadfrjvdn,aojjfhxygs

Much better!

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}