Makefiles for R/LaTeX Projects

Make is a marvellous tool used by programmers to build software, but it can be used for much more than that. I use make whenever I have a large project involving R files and LaTeX files, which means I use it for almost all of the papers I write, and almost all of the consulting reports I produce.

If you are using a Mac or Linux, you will already have make installed. If you are using Windows and have Rtools installed, then you will also have make. Otherwise, Windows users will need to install it. One implementation is in GnuWin.

A typical project of mine will include several R files containing code that fits some models and generates tables and graphs. I try to set things up so I can re-create all the results by simply running the R files. Then I will have a LaTeX file which contains the paper or report I am writing. The tables and graphs produced by R are pulled into the LaTeX file. Consequently, all I need to do is run all the R files, and then process the tex file, and the paper/report is generated.

Make relies on a Makefile to determine what it must do. Essentially, a Makefile specifies what files must be generated first, and how to generate them. So I need a Makefile that specifies that all the R files must be processed first, and then the LaTeX file.
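As a rough sketch of what such a specification looks like, every rule names a target, the files it depends on, and the command that builds it. The file names below are hypothetical and only illustrate the shape of a rule.

```make
# General shape of a rule (the recipe line must begin with a tab):
#
#   target: prerequisites
#   	recipe
#
# Hypothetical example: sorted.txt is rebuilt only when it is missing
# or older than raw.txt.
sorted.txt: raw.txt
	sort raw.txt > sorted.txt
```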

The beauty of a Makefile is that it will only process the files that have been updated. It is smart enough not to re-run code if it has already been run. So if nothing has changed, running make does nothing. If only the tex file changes, running make will re-compile the tex document. If the R code has changed, running make will re-run the R code to generate the new tables and graphs, and then re-compile the tex document. All I do is type make and it figures out what is required.

A Makefile for LaTeX

It is easy to tell if the LaTeX document needs compiling: make simply has to check whether the pdf version of the document is older than the tex version. Here is a simple Makefile that will just handle a LaTeX document.

TEXFILE= paper
$(TEXFILE).pdf: $(TEXFILE).tex
	latexmk -pdf -quiet $(TEXFILE)

The first line specifies the name of my file, in this case paper.tex. The second line specifies that the pdf file must be created from the tex file, and the last line explains how to do that. MiKTeX users might prefer pdftexify instead of latexmk.

To use the above Makefile, copy the code into a plain text file called Makefile and store it in the same directory as your tex file. Make sure the indented line starts with a tab character rather than spaces, as make requires. Change the first line so the name of your tex file (without the extension) is used. Then type make from a command prompt within the same directory as the tex file, and it should do whatever is necessary to convert your tex to pdf.
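If a directory holds several tex files, a pattern rule is one possible variation on the Makefile above. This is only a sketch, assuming GNU make; the file name report.tex in the note below is hypothetical.

```make
# Any foo.pdf can be built from the matching foo.tex.
# $* expands to the stem matched by % (for example "report").
%.pdf: %.tex
	latexmk -pdf -quiet $*
```

With only this rule in the Makefile there is no default target, so the file to build has to be named: typing make report.pdf would compile report.tex, provided that file exists.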

Of course, you wouldn’t normally bother with a Makefile if that is all it did. But throw in a whole lot of R files, and it becomes very worthwhile.

A Makefile for R and LaTeX

We need a way for make to tell whether an R file has been run. If the R files are run using

R CMD BATCH file.R

then the output is saved as file.Rout. Then make only has to check if file.Rout is older than file.R.
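Written out for a single file, that check could look like the rule below. The name file.R is the same placeholder as in the command above.

```make
# file.Rout is rebuilt when it is missing or older than file.R.
# $< expands to the first prerequisite, here file.R.
file.Rout: file.R
	R CMD BATCH $<
```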

I also like to strip out all the white space from the pdf figures created in R before I put them in a LaTeX document. There is a nice command pdfcrop which does that. (You should already have it on a Mac or Linux, and also on Windows provided you are using MiKTeX.) So I also want my Makefile to crop all images if they have not already been cropped. Once an image is cropped, an empty file of the form file.pdfcrop is created to indicate that file.pdf has already been cropped.
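For a single figure, the marker-file idea might be sketched as follows. The path figs/plot1.pdf is hypothetical.

```make
# Crop figs/plot1.pdf in place, then create the empty marker file.
# $< is the prerequisite (the pdf); $@ is the target (the marker).
# touch only runs if pdfcrop succeeds, so a failed crop is retried next time.
figs/plot1.pdfcrop: figs/plot1.pdf
	pdfcrop $< $< && touch $@
```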

OK, now we are ready for my marvellous Makefile.

# Usually, only these lines need changing
TEXFILE= paper
RDIR= .
FIGDIR= ./figs
 
# list R files
RFILES := $(wildcard $(RDIR)/*.R)
# pdf figures created by R
PDFFIGS := $(wildcard $(FIGDIR)/*.pdf)
# Indicator files to show R file has run
OUT_FILES:= $(RFILES:.R=.Rout)
# Indicator files to show pdfcrop has run
CROP_FILES:= $(PDFFIGS:.pdf=.pdfcrop)
 
all: $(TEXFILE).pdf $(OUT_FILES) $(CROP_FILES)
 
# May need to add something here if some R files depend on others.
 
# RUN EVERY R FILE
$(RDIR)/%.Rout: $(RDIR)/%.R $(RDIR)/functions.R
	R CMD BATCH $<
 
# CROP EVERY PDF FIG FILE
$(FIGDIR)/%.pdfcrop: $(FIGDIR)/%.pdf
	pdfcrop $< $< && touch $@
 
# Compile main tex file and show errors
$(TEXFILE).pdf: $(TEXFILE).tex $(OUT_FILES) $(CROP_FILES)
	latexmk -pdf -quiet $(TEXFILE)
 
# Run R files
R: $(OUT_FILES)
 
# View main tex file
view: $(TEXFILE).pdf
	evince $(TEXFILE).pdf &
 
# Clean up stray files
clean:
	rm -fv $(OUT_FILES) 
	rm -fv $(CROP_FILES)
	rm -fv *.aux *.log *.toc *.blg *.bbl *.synctex.gz
	rm -fv *.out *.bcf *blx.bib *.run.xml
	rm -fv *.fdb_latexmk *.fls
	rm -fv $(TEXFILE).pdf
 
.PHONY: all clean R view

Download the file here. For most projects I copy this file into the main directory of my project, then all I have to do is modify the first few lines. RDIR specifies where the R files are kept and FIGDIR specifies where the figures are kept. Normally I keep these together, but sometimes they might be in separate directories.

Now make will do everything necessary: run the R files, crop the pdf graphics, and process the LaTeX document. But it won’t do any steps that don’t need doing.

make R will only process the R files.

make view will run the pdf viewer, after updating the pdf file if necessary.

make clean will delete all the files generated by LaTeX or by make, so that the entire process must be run again at the next make command.

Notice that my R files all depend on functions.R. This is a file that contains project-specific functions. If this file is updated, all the other R files will need to be re-run.
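If a project has no functions.R, the pattern rule as written will not match, because make can neither find nor build that prerequisite. One possible workaround, sketched here assuming GNU make, is to include the file only when it exists. The variable name FUNCS is hypothetical.

```make
# Expands to $(RDIR)/functions.R if that file exists, otherwise to nothing.
FUNCS := $(wildcard $(RDIR)/functions.R)

# Same rule as before, but the extra dependency is now optional.
$(RDIR)/%.Rout: $(RDIR)/%.R $(FUNCS)
	R CMD BATCH $<
```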

For many projects, some R files will depend on some others having already run. For example, read.R may read in the data and reformat it for analysis, while plot.R might produce some graphs assuming that read.R has already run. To ensure make knows about this dependency, we need to add a line

$(RDIR)/plot.Rout: $(RDIR)/plot.R $(RDIR)/functions.R $(RDIR)/read.R
	R CMD BATCH $<

This should be inserted where I have the comment # May need to add something here if some R files depend on others.
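A possible alternative, assuming GNU make, is to add only the extra prerequisite and let the pattern rule supply the recipe. Depending on read.Rout rather than read.R also asks make to run read.R before plot.R. This is a sketch, not part of the original Makefile.

```make
# A rule with no recipe just adds prerequisites to the target.
# The recipe still comes from the %.Rout pattern rule, where $< remains plot.R.
$(RDIR)/plot.Rout: $(RDIR)/read.Rout
```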

This Makefile works on Linux. Mac and Windows users will need to replace evince by whatever pdf viewer they prefer.
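One way to avoid editing the rule on each machine is to put the viewer in a variable. This is a sketch; the variable name VIEWER is hypothetical, and open is the standard macOS command for opening a file with its default application.

```make
# Default viewer; ?= only assigns if VIEWER is not already set.
VIEWER ?= evince

view: $(TEXFILE).pdf
	$(VIEWER) $(TEXFILE).pdf &
```

On a Mac, make view VIEWER=open would then use the default pdf application.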

Published at DZone with permission of {{ articles[0].authors[0].realName }}, DZone MVB. (source)
