Publishing an R package in the Journal of Statistical Software

I’ve been an editor of the Journal of Statistical Software (JSS) for the last few years, and as a result I tend to get email from people asking me about publishing papers describing R packages in JSS. So for all those wondering, here are some general comments.

JSS prefers to publish papers about packages where the package is on CRAN and has been there long enough to have matured (i.e., obvious bugs ironed out and a few active users). This is partly because we receive so many submissions that it helps to filter some out, and this approach provides some basic quality checks. So I suggest you begin by developing the package for CRAN. This is a preference rather than a requirement, and it is not stated anywhere in the JSS rules. A paper describing a package that has only recently been put on CRAN will still be considered, but the probability of it getting through the reviewing process is lower.

We prefer substantial packages rather than very specific but small packages. That is, a package that solves a very specific problem is less likely to be published than a package that provides a general toolkit for a discipline area, or one that implements a number of useful approaches to a problem. Think about making your package as ambitious as you can in scope and functionality. Here is an excerpt from a rejection letter I wrote:

This paper/package does what it claims quite well, but it could do so much more. It lacks ambition. As it stands, the package re-implements a popular algorithm in R. To be publishable in JSS, I would want to see it aim higher and provide more general facilities for xxxx…

Descriptions of smaller and more focused packages may still be acceptable as papers in the “Code Snippets” section of the journal. But in that case, the paper should be suitably shorter.

JSS now has the highest impact factor of any statistics/mathematics journal (which reflects what a silly measure IF is, but that’s another story). As a result we are flooded with submissions. Consequently, the standards required for publication have increased fairly rapidly in the last couple of years. Do not think you can get away with a quick description of an R package you have written and have it published in JSS. Spend as much time as you would for any other journal in providing a context in terms of the existing literature, explaining the relevant background material, and describing the innovative features of your work.

JSS papers involve a review of the software as well as the paper. You need to make sure the R code in the package is of high quality, and that the help files are complete, correct and informative. A well-written paper that describes a poor R package is not acceptable. In particular, please follow standard R coding conventions, and spend lots of time writing good help files. Just because your package passes a CRAN check does not mean the code is well written or that the help files contain no errors.

Make sure the package is actively maintained and developed. I’ve seen papers describing packages that have not changed for more than two years. Surely in that time you will have found at least one bug, or thought of at least one new feature to add.

Make sure the output in the paper exactly matches the results obtained from the version of the package provided, and state in the introduction which version of the package was used for this paper.
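
One possible way to keep the stated version honest is to query it from R when the paper's examples are run, rather than typing it by hand. The following is only a minimal sketch: "stats" is a stand-in for the package being described, and the variable names pkg and pkg_version are made up for illustration.

```r
# Assumption: "stats" stands in for the package described in the paper.
pkg <- "stats"

# packageVersion() reports the version of the installed package
pkg_version <- as.character(packageVersion(pkg))

# A sentence that could be adapted for the introduction of the paper
cat("Results were obtained with", pkg, "version", pkg_version,
    "under", R.version.string, "\n")
```

Generating this line at the same time as the output shown in the paper makes it harder for the reported version and the reported results to drift apart.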

Here are a few further comments taken directly from reports I have written on JSS papers.

  • The examples use the object name “data”, which clashes with the data() function. Please avoid object names that already exist as functions.
  • The examples in the man files and in the paper would be easier to read if you put spaces around the assignment operator <-.
  • Many symbols are not in math mode when not part of an equation. Please be consistent in putting mathematical symbols in math mode.
  • Much better error reporting is needed. It should return something interpretable whenever the user specifies inappropriate arguments.
  • Please consider using S3 methods and classes to allow much simpler printing and plotting.
  • The help page is unnecessarily cluttered. There seems no good reason to put all functions on the one page.
  • Given that plotting is one of the main features of this package, the available plotting facilities are very primitive. For example, it is not possible to change fonts, labels, line types, titles, or anything else. At least add an “...” argument so that plotting parameters can be used, and make sure that the plotting parameters passed in this way do not cause clashes.
  • You specify default values for most arguments, but then re-specify them in your examples. It would be simpler to leave out the arguments in the examples if you want to use the default values. Otherwise, why have defaults at all?
  • Use FALSE and TRUE, not F and T, in examples.
  • The code for xxxx would be simpler if you used match.arg() to do argument matching.
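
Several of the comments above can be illustrated together in a few lines of R. This is a rough sketch rather than a template taken from any reviewed package: the function loc_summary(), its class name, and its arguments are all hypothetical. It tries to show match.arg() for argument matching, an interpretable error for bad input, S3 print and plot methods, and a “...” argument that forwards plotting parameters.

```r
# Hypothetical function: summarise a numeric vector by a chosen location measure.
loc_summary <- function(x, method = c("mean", "median", "trimmed"), na.rm = FALSE) {
  # Interpretable error when the user supplies an inappropriate argument
  if (!is.numeric(x)) {
    stop("'x' must be a numeric vector, not an object of class '", class(x)[1], "'")
  }
  # match.arg() uses the first choice as the default and rejects unknown values
  method <- match.arg(method)
  value <- switch(method,
    mean    = mean(x, na.rm = na.rm),
    median  = median(x, na.rm = na.rm),
    trimmed = mean(x, trim = 0.1, na.rm = na.rm)
  )
  # Returning a classed object lets print() and plot() dispatch via S3
  structure(list(x = x, method = method, value = value), class = "loc_summary")
}

print.loc_summary <- function(x, ...) {
  cat("Location summary (", x$method, "): ", format(x$value, ...), "\n", sep = "")
  invisible(x)
}

plot.loc_summary <- function(x, main = "Location summary", ...) {
  # 'main' is a named argument, so it cannot clash with anything in '...';
  # the remaining graphical parameters (col, xlab, ...) are forwarded to hist()
  hist(x$x, main = main, ...)
  abline(v = x$value, lwd = 2)
  invisible(x)
}

# Usage: the object is called 'obs' rather than 'data', defaults are not
# re-specified, and logical values are written as TRUE/FALSE, not T/F.
obs <- c(2, 4, 4, 5, 40)
fit <- loc_summary(obs, method = "median")
print(fit)
```

Calling plot(fit, col = "grey", xlab = "Observations") would then pass col and xlab through to hist(), while an unknown method such as "mode" is rejected by match.arg() with a message listing the valid choices.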


Published at DZone with permission of Rob J Hyndman, DZone MVB.
