Basic plots with ggplot2
One reason R has become so popular among data scientists is that it has powerful tools for producing publication-quality graphics. For many users, the base graphics system (“base” because it comes with the base install of R) is all that is needed to produce the desired output. Aside from the official R manuals, there are many tutorials for using base graphics. Rob Kabacoff’s Quick-R provides a short intro on producing basic graphs in R, and Josef Fruehwald has written a guide to plotting with the base graphics system.
Here, I’ll be using the ggplot2 package (Wickham, 2009). ggplot2 is a newer graphical system for R that implements Leland Wilkinson’s Grammar of Graphics (Wilkinson, Wills, Rope, et al., 2006) and takes the best parts of R’s base and lattice graphics systems to produce multilayered graphs.
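To make “multilayered” concrete, here is a minimal sketch of how a ggplot2 plot is assembled from layers. It uses the mtcars data set that ships with R rather than the data in this post, and the choice of variables (wt and mpg) is only for illustration.

```r
library(ggplot2)

# Start with the data and the aesthetic mapping (which columns go on which axes)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1: draw the raw observations as points
  geom_point() +
  # Layer 2: overlay a fitted straight line on the same axes
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Printing the object draws the plot
print(p)
```

Each `+` adds another layer on top of the same data and mapping, so a plot can be built up piece by piece.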
So, let’s get started exploring some aspects of historical demographic data for Island Caribs (or Kalinago) who lived in the Carib Territory within the Commonwealth of Dominica. Today, the Carib Territory in northeast Dominica is home to the largest population of Island Caribs (2145 individuals, according to preliminary results of the 2011 census). For more, Honychurch (1995) provides a detailed history of Dominica.
First, download and load the data. The R code below accesses a subset (n = 500) of the original data frame.
download.file("https://www.dropbox.com/s/i8dcx7o4lo36tb2/dnica2.txt?dl=1", "dnica2.txt")
dnica2 <- read.table("dnica2.txt", header = TRUE, sep = "\t")
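Before plotting, it can help to confirm that the file loaded as expected. The sketch below is one way to do that; `check_ages` is a hypothetical helper of my own (not part of any package), and the toy data frame only stands in for dnica2 so the example runs without the download.

```r
# Hypothetical helper: summarise an age column before plotting it.
# Assumes the column is numeric and named "age" unless told otherwise.
check_ages <- function(df, col = "age") {
  stopifnot(is.data.frame(df), col %in% names(df))
  ages <- df[[col]]
  list(
    n_rows    = nrow(df),                 # number of records
    n_missing = sum(is.na(ages)),         # records with no age recorded
    range     = range(ages, na.rm = TRUE) # youngest and oldest ages
  )
}

# Toy data frame standing in for dnica2 (values made up for illustration)
toy <- data.frame(age = c(0.5, 34, NA, 71))
res <- check_ages(toy)
print(res)
```

With the real data, `check_ages(dnica2)` should report 500 rows if the download matches the subset described above.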
The data consist of parish death records from 1917 to 1971. Let’s say we want to explore the age-at-death distribution for the entire sample. A common way to view this is with a histogram. I’ve decided to override ggplot2’s default color themes using the ggthemr package.
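library(ggplot2)
library(ggthemr)
# Apply the "grape" palette to all subsequent plots
ggthemr("grape")
# Histogram of age at death, using the default binning
ggplot(dnica2, aes(x = age)) + geom_histogram()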
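By default, geom_histogram() splits the data into 30 bins and prints a message suggesting you pick a better value with binwidth. The sketch below sets the bin width explicitly and labels the axes. The helper name `plot_age_hist` and the width of 5 are my own illustrative choices, and the toy data frame is made up so that the example runs on its own.

```r
library(ggplot2)

# Hypothetical helper: histogram of an `age` column with an explicit bin width.
plot_age_hist <- function(df, binwidth = 5) {
  ggplot(df, aes(x = age)) +
    # boundary = 0 makes the bin edges fall on multiples of the bin width
    geom_histogram(binwidth = binwidth, boundary = 0) +
    labs(x = "Age at death", y = "Number of deaths")
}

# Toy data standing in for dnica2 (values made up for illustration)
toy <- data.frame(age = c(1, 3, 22, 45, 47, 68, 80))
p <- plot_age_hist(toy)
print(p)
```

With the real data, `plot_age_hist(dnica2)` would draw the same kind of plot. It is worth trying a few bin widths, since the apparent shape of an age-at-death distribution can change noticeably with the choice.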
References
[1] L. Honychurch. The Dominica Story: A History of the Island. Macmillan, 1995.
[2] H. Wickham. ggplot2: Elegant Graphics for Data Analysis. New York: Springer, 2009.
[3] L. Wilkinson, D. Wills, D. Rope, et al. The Grammar of Graphics. New York: Springer, 2006.