## R : Graphics Tutorial Series ( Part 4 )

- 05/01/2017

**Published In**

- Big Data
- Analytics
- Artificial Intelligence


In the first part of this tutorial series, *R : Graphics Tutorial Series ( Part 1 )*, we learnt the basics of R base graphics. In the second and third parts, *R : Graphics Tutorial Series ( Part 2 )* and *R : Graphics Tutorial Series ( Part 3 )*, we saw various graphical methods for displaying relationships between two variables (bivariate relationships) and between many variables (multivariate relationships).

This part extends Parts 2 and 3. We look at bubble plots and line charts, and then at the use of **correlograms** for visualizing correlations and mosaic plots for visualizing multivariate relationships among categorical variables.

### Bubble plots

In the previous section, we displayed the relationship between three quantitative variables using a 3D scatter plot. Another approach is to create a 2D scatter plot and use the size of the plotted point to represent the value of the third variable. This approach is referred to as a **bubble plot**.

You can create a bubble plot using the symbols() function. This function can be used to draw circles, squares, stars, thermometers, and box plots at a specified set of (x,y) coordinates.

For plotting circles, the format is

    symbols(x, y, circles=radius)

If you want the areas, rather than the radii, of the circles to be proportional to the values of a third variable, use the form below. Recall that the area of a circle is pi * r^2, so a circle of area z has radius sqrt(z/pi):

    symbols(x, y, circles=sqrt(z/pi))

where z is the third variable to be plotted.
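To see why the square root matters, here is a small sketch (not part of the original tutorial; the variable names are illustrative). It checks, on the `mtcars` displacement values, that radii computed as `sqrt(z/pi)` give circle areas equal to `z`, and compares how strongly the largest bubble would dominate the smallest under each scaling.

```r
# Third variable to encode as bubble size (engine displacement from mtcars)
z <- mtcars$disp

# Radii chosen so that the circle AREA is proportional to z
r <- sqrt(z / pi)
area <- pi * r^2

# The implied areas reproduce z (up to floating-point error)
print(isTRUE(all.equal(area, z)))

# Largest-to-smallest area ratio under the two scalings
ratio_area_scaled   <- max(area) / min(area)  # radius = sqrt(z/pi): same ratio as z
ratio_radius_scaled <- (max(z) / min(z))^2    # radius = z: the ratio of z, squared

print(c(area_scaled = ratio_area_scaled, radius_scaled = ratio_radius_scaled))
```

Using the raw values as radii exaggerates large observations, because the eye tends to compare areas rather than radii.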

Let’s apply this to the mtcars data, plotting car weight on the x-axis, miles per gallon on the y-axis, and engine displacement as the bubble size.

    attach(mtcars)
    r <- sqrt(disp/pi)
    # inches is a scaling factor that controls the size of the circles
    # (the default is to make the largest circle 1 inch).
    symbols(wt, mpg, circles=r, inches=0.30,
            fg="white",
            bg="lightblue",  # assumed fill color: the original argument was lost here
            main="Bubble Plot with point size proportional to displacement",
            ylab="Miles Per Gallon",
            xlab="Weight of Car (lbs/1000)")
    text(wt, mpg, rownames(mtcars), cex=0.5, col="red")
    detach(mtcars)

### Line charts

    opar <- par(no.readonly=TRUE)
    par(mfrow=c(1,2))
    t1 <- subset(Orange, Tree==1)
    # Scatterplot
    plot(t1$age, t1$circumference,
         ylab="Circumference (mm)")
    # Line graph (type="b" is an assumption: the original type argument was lost)
    plot(t1$age, t1$circumference,
         ylab="Circumference (mm)", type="b")
    par(opar)

    # Compare the available values of the type argument
    opar <- par(no.readonly=TRUE)
    par(mfrow=c(2,4))
    types <- c("p","l","o","b","c","s","S","h")
    for (i in seq_along(types)) {
      plot(t1$age, t1$circumference, col=i,
           main=paste("Type : ", types[i], sep=""), type=types[i])
    }
    par(opar)

Let us see an example of a complex line chart. Here we will plot the growth of all five orange trees over time. Each tree will have its own distinctive line.

    par(mfrow=c(1,1))
    # Convert the factor to numeric for convenience. This yields the factor's
    # level codes (1 to 5), which need not match the original tree labels.
    Orange$Tree <- as.numeric(Orange$Tree)
    ntrees <- max(Orange$Tree)
    xrange <- range(Orange$age)
    yrange <- range(Orange$circumference)
    # Set up the graph with axis labels and ranges, but plot no data (type="n")
    plot(xrange, yrange, type="n",
         xlab="Age (days)", ylab="Circumference (mm)")
    colors <- rainbow(ntrees)            # rainbow() returns a set of colors
    linetype <- 1:ntrees
    plotchar <- seq(18, 18+ntrees-1, 1)  # one plotting symbol per tree
    # lines() adds a separate line and set of points for each orange tree
    for (i in 1:ntrees) {
      tree <- subset(Orange, Tree==i)
      lines(tree$age, tree$circumference, type="b", lwd=2,
            lty=linetype[i], col=colors[i], pch=plotchar[i])
    }
    # Add the title and subtitle
    title("Tree Growth", "example of line plot")
    # Add the legend
    legend(xrange[1], yrange[2], 1:ntrees, cex=0.8, col=colors,
           pch=plotchar, lty=linetype)
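As a hedged alternative to the explicit loop (this is not the author's method), base R's `matplot()` draws one line per column of a matrix. The sketch below builds an age-by-tree matrix with `tapply()`; the names `orange`, `growth`, `ages` and `ntree` are illustrative, and it reads a fresh copy through `datasets::Orange` so it does not depend on the factor conversion done above.

```r
# Work on a fresh copy of the built-in data set
orange <- datasets::Orange

# Build an age x tree matrix of circumferences (one measurement per cell)
growth <- tapply(orange$circumference,
                 list(age = orange$age, tree = orange$Tree),
                 mean)
ages  <- as.numeric(rownames(growth))
ntree <- ncol(growth)
symbols_used <- 18 + seq_len(ntree) - 1   # plotting characters 18, 19, ...

# matplot() draws one line (with points) per column of the matrix
matplot(ages, growth, type = "b", lwd = 2,
        lty = 1:ntree, pch = symbols_used, col = rainbow(ntree),
        xlab = "Age (days)", ylab = "Circumference (mm)",
        main = "Tree Growth")
legend("topleft", legend = colnames(growth), title = "Tree", cex = 0.8,
       lty = 1:ntree, pch = symbols_used, col = rainbow(ntree))
```

Here the legend shows the original tree labels, since the factor is never converted to level codes.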

### Correlograms

    library(knitr)
    data(mtcars)
    options(digits=2)  # round off to 2 digits after the decimal point in the kable() output
    kable(cor(mtcars))
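A table of 121 coefficients (11 variables by 11) is hard to scan by eye, which is the motivation for the graphical displays in this section. As a small illustration (a sketch with illustrative variable names, not part of the original tutorial), the pair of variables with the largest absolute correlation can be pulled out of the matrix like this:

```r
M <- cor(mtcars)

# Blank out the diagonal and the lower triangle so each pair is counted once
M_pairs <- M
M_pairs[!upper.tri(M_pairs)] <- NA

# Locate the cell with the largest absolute correlation
idx <- which(abs(M_pairs) == max(abs(M_pairs), na.rm = TRUE), arr.ind = TRUE)
strongest <- c(rownames(M)[idx[1, "row"]], colnames(M)[idx[1, "col"]])

print(strongest)
print(M[strongest[1], strongest[2]])
```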

You can display the same correlation matrix graphically using the corrgram() function in the corrgram package:

    library(corrgram)

    corrgram(mtcars, order=TRUE, lower.panel=panel.shade,
             upper.panel=panel.pie, text.panel=panel.txt)

    corrgram(mtcars, order=TRUE, lower.panel=panel.ellipse,
             upper.panel=panel.pts, text.panel=panel.txt,
             diag.panel=panel.minmax)

    # Remove the upper panel
    corrgram(mtcars, lower.panel=panel.shade, upper.panel=NULL,
             text.panel=panel.txt)

    # We can control the colors used by the corrgram() function. To do so,
    # specify four colors in the colorRampPalette() function
    corrgram(mtcars, order=TRUE, lower.panel=panel.shade,
             upper.panel=panel.pie, text.panel=panel.txt,
             col.regions=colorRampPalette(c("darkgoldenrod4", "burlywood1",
                                            "darkkhaki", "darkgreen")))

Another package, corrplot, can also be used to create attractive correlograms. Let us see a few examples using the corrplot package.

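    library(corrplot)

    par(mfrow=c(2,2))
    M <- cor(mtcars)
    # The method values below are assumptions: the original arguments were lost.
    # "circle", "pie", "color" and "number" are all valid corrplot methods.
    corrplot(M, method="circle")
    corrplot(M, method="pie")
    corrplot(M, method="color")
    corrplot(M, method="number", number.cex=0.5)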

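The type argument of corrplot() controls which part of the matrix is drawn:

- "full" (default): display the full correlation matrix
- "upper": display the upper triangle of the correlation matrix
- "lower": display the lower triangle of the correlation matrix

For example, to draw the two triangles side by side:

    par(mfrow=c(1,2))
    corrplot(M, type="upper")
    corrplot(M, type="lower")
    par(mfrow=c(1,1))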

Reordering the correlation matrix: the correlation matrix can be reordered according to the correlation coefficients. We saw above that the corrgram package uses Principal Component Analysis for reordering (order=TRUE). Let us use hierarchical clustering in the corrplot package instead. Reordering helps to reveal hidden structure and patterns in the matrix.
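The idea behind reordering by hierarchical clustering can be sketched in base R: treat 1 - correlation as a distance, cluster the variables, and permute the rows and columns by the resulting leaf order. This is an illustration under that assumed distance and the default linkage of `hclust()`, not necessarily the exact procedure corrplot runs internally; the names `d`, `hc`, `ord` and `M_reordered` are illustrative.

```r
M <- cor(mtcars)

# Highly correlated variables get a small distance
d <- as.dist(1 - M)

# Cluster the variables (hclust() defaults to complete linkage)
hc <- hclust(d)

# Leaf order of the dendrogram, applied to both rows and columns
ord <- hc$order
M_reordered <- M[ord, ord]

print(colnames(M))
print(colnames(M_reordered))
```

After the permutation, variables that move together sit next to each other, so blocks of similar correlations become visible along the diagonal.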

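    par(mfrow=c(2,2))
    # Arguments lost from the original are restored from the comments where
    # possible (order="hclust", bg="lightblue").
    # Correlogram with hclust reordering
    corrplot(M, order="hclust")
    # Using a different color spectrum
    col <- colorRampPalette(c("red", "white", "blue"))(20)
    corrplot(M, order="hclust", col=col)
    # Change the background color to lightblue
    corrplot(M, order="hclust", col=c("black", "white"), bg="lightblue")
    # Change the colors of the correlogram using a palette from the RColorBrewer
    # package (the palette name "RdYlBu" is an assumption: the original was lost)
    library(RColorBrewer)
    corrplot(M, order="hclust", col=brewer.pal(n=8, name="RdYlBu"))
    par(mfrow=c(1,1))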

The text labels can also be controlled in corrplot using the options tl.col (for text label color) and tl.srt (for text label string rotation).

    corrplot(M, tl.col="black", tl.srt=45)

### Mosaic plots

    data(Titanic)
    head(Titanic)

    ## [1]  0  0 35  0  0  0

    str(Titanic)

    ##  table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
    ##  - attr(*, "dimnames")=List of 4
    ##   ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
    ##   ..$ Sex     : chr [1:2] "Male" "Female"
    ##   ..$ Age     : chr [1:2] "Child" "Adult"
    ##   ..$ Survived: chr [1:2] "No" "Yes"

    mosaicplot(Titanic, main="Survival on the Titanic", color=TRUE)

    mosaicplot(~ Sex + Age + Survived, data=Titanic, color=TRUE)

    mosaicplot(Titanic, main="Survival on the Titanic",
               # col= produces alternating coloured rectangles:
               # green for survivors and blue for non-survivors
               col=hcl(c(240, 120)),
               # off= squeezes out a little of the space between the blocks
               off=c(5, 5, 5, 5))

    # See the flat contingency table
    ftable(Titanic)

    ##                    Survived  No Yes
    ## Class Sex    Age
    ## 1st   Male   Child            0   5
    ##              Adult          118  57
    ##       Female Child            0   1
    ##              Adult            4 140
    ## 2nd   Male   Child            0  11
    ##              Adult          154  14
    ##       Female Child            0  13
    ##              Adult           13  80
    ## 3rd   Male   Child           35  13
    ##              Adult          387  75
    ##       Female Child           17  14
    ##              Adult           89  76
    ## Crew  Male   Child            0   0
    ##              Adult          670 192
    ##       Female Child            0   0
    ##              Adult            3  20
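The tile areas in a mosaic plot are proportional to cell counts like those in the flat table above. As a rough numeric companion (a sketch in base R with illustrative variable names, not part of the original tutorial), the four-way table can be collapsed to Class by Survived and converted to row proportions:

```r
# Collapse the 4-way Titanic table to Class x Survived counts
by_class <- margin.table(Titanic, margin = c(1, 4))

# Row-wise proportions: share of each class that did or did not survive
survival_rate <- prop.table(by_class, margin = 1)

print(by_class)
print(round(survival_rate, 2))
```

These proportions correspond to how each Class column of the mosaic is split between the "No" and "Yes" tiles when only those two variables are plotted.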

    library(vcd)

    mosaic(Titanic, shade=TRUE, legend=TRUE)

    mosaic(~Class+Sex+Age+Survived, data=Titanic, shade=TRUE, legend=TRUE)


#### Ankit Agarwal

Analytics Manager - Deloitte Advisory at Deloitte

Opinions expressed by Grroups members are their own.
