R Projects & Tips
R is a remarkably powerful open source tool for data analysis, and I use it on a regular basis. Below are some tips and packages I have found useful when working in R.
Packages I've authored:
- jhelvyr: An R package that loads a handful of useful custom functions and settings that I frequently use when working in R.
- logitr: An R package for estimating both multinomial (MNL) and mixed logit (MXL) models in the preference and willingness to pay (WTP) spaces. [Disclaimer: this package hasn't been updated in 2 years and needs a full overhaul.]
Packages and tips I've found useful:
- Hmisc: A grab bag of many useful functions for data analysis. One of my favorites is the "Cs" function, which turns unquoted names into a character vector so you don't have to type all the annoying quotation marks. Ex: Cs(one, two, three) produces c("one", "two", "three").
- Use ggplot2 or ggvis for making figures. The learning curve is a little steep at first, but I find the ggplot approach to plotting data far more flexible and powerful than the standard R plotting functions.
- Read the ggplot Cookbook for a very detailed and useful guide for plotting with ggplot.
- Keep your data tidy. I wholeheartedly subscribe to Hadley Wickham's guidelines for working with tidy data.
- Use the tidyr and dplyr packages to manipulate datasets and keep them tidy. Together they provide intuitive functions for reshaping, filtering, and summarizing data that are easy to use and remarkably powerful.
- Use the data.table package for faster manipulation of larger datasets. It is less user-friendly than dplyr, but wicked fast.
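
To make the tidy-data tips above a bit more concrete, here is a minimal sketch of the workflow. The `wide` data frame, its column names, and its values are made up purely for illustration, and the reshaping step assumes tidyr 1.0.0 or later (for `pivot_longer`). The same summary is computed once with dplyr and once with data.table so the two styles can be compared side by side.

```r
library(tidyr)
library(dplyr)
library(data.table)

# Hypothetical "wide" data: one row per car, one mpg column per year.
wide <- data.frame(
  car      = c("A", "B", "C"),
  type     = c("suv", "suv", "sedan"),
  mpg_2015 = c(22, 24, 31),
  mpg_2016 = c(23, 26, 33)
)

# Tidy form: one row per car-year observation, one column per variable.
long <- wide %>%
  pivot_longer(
    cols         = starts_with("mpg_"),
    names_to     = "year",
    names_prefix = "mpg_",
    values_to    = "mpg"
  ) %>%
  mutate(year = as.integer(year))

# dplyr: mean mpg by vehicle type, sorted by type.
by_type <- long %>%
  group_by(type) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  arrange(type)

# data.table: the same summary; keyby groups and sorts by type.
long_dt    <- as.data.table(long)
by_type_dt <- long_dt[, .(mean_mpg = mean(mpg)), keyby = type]

print(by_type)
print(by_type_dt)
```

On a toy dataset like this the two approaches are interchangeable; the speed advantage of data.table only starts to matter once the data get large.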