To many people, R is like the Everglades. They’ve heard of it, and they know it’s big and has amazing treasures deep inside. Articles in the media can make it look irresistible. But after a personal or even second-hand experience, people also learn that R can be a big swamp where you are all but guaranteed to get soggy boots and mosquito bites before you’re done. And there is always the distinct possibility of getting lost and falling into a ‘gator hole’. Indeed, if you go in without a guide, hoping to get in and out quickly, you probably won’t enjoy it much. This post contains a script that shows you some of the sights without letting you fall in. If you like to learn by example, read on.
The rest of this post is the verbatim script with graphics embedded in the appropriate places. You can also download the script and run it yourself. The comments in this script capture a session of working with and thinking about a dataset. This script doesn’t try to cover everything. On the contrary, it pedantically reuses the same few techniques to show that you can do a lot with a little.
This script also demonstrates how to be systematic with respect to commenting, variable naming, setting graphical parameters, etc. One of the keys to working successfully with R is writing scripts that explain what they are doing and contain consistent, readable, verging-on-predictable code.
I look forward to any suggestions for corrections/improvements.
(Curator's note: For the R code in text format, see this article's source.)
OK, perhaps we shouldn’t expect a job offer from the Florida Department of Tourism any time soon. But I hope this short tour through the R swamp has shed some light on how just a few techniques can help you begin interrogating large datasets and telling interesting stories.