\documentclass[notitlepage]{article}
\usepackage{Math533}
\usepackage{listings}
\usepackage{numprint}
\lstloadlanguages{R}
\lstset{basicstyle=\ttfamily, numbers=none, literate={~} {$\sim$}{2}}

\begin{document}

\title{Using \texttt{knitr} to produce reports from \texttt{R} scripts}
\author{David A. Stephens}
\maketitle

\begin{abstract}
This document shows how to use the \texttt{knitr} package in \texttt{R} or \texttt{RStudio} to create PDF documents from \texttt{R} scripts.
\end{abstract}

The file that created this PDF document is \texttt{knit-01-lm.Rnw}, or the equivalent file \texttt{RStudio-knit-01-lm.Rnw} for use in \texttt{RStudio}.

\bigskip

\begin{itemize}
\item Using \texttt{R}: issuing the commands
\begin{lstlisting}
library(knitr)
knit('knit-01-lm.Rnw')
\end{lstlisting}
whilst running \texttt{R} in a folder containing the file \texttt{knit-01-lm.Rnw} produces a \LaTeX{} file \texttt{knit-01-lm.tex}, which can then be compiled using \texttt{pdflatex}.

\medskip

\item Using \texttt{RStudio}: you can open the \texttt{RStudio-knit-01-lm.Rnw} file directly in \texttt{RStudio}, and compile it using the `Compile PDF' button at the top of the R/Sweave script window that opens automatically when you click on the \texttt{.Rnw} file. Note that you should first set the file to be compiled with \texttt{knitr}: go to the
\[
\textit{Tools} \longrightarrow \textit{Global Options}
\]
pulldown menu, and click the Sweave icon in the left-hand sidebar. Then, from the top pulldown menu \textit{Weave Rnw files using}, select the option `\texttt{knitr}'.
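
\medskip

\item A possible one-step alternative in \texttt{R} (a sketch only, assuming a working \LaTeX{} installation that \texttt{R} can find, and that the style file loaded by \texttt{\textbackslash usepackage\{Math533\}} is available to it): the \texttt{knit2pdf} function in \texttt{knitr} runs \texttt{knit} and then compiles the resulting \texttt{.tex} file, so the commands
\begin{lstlisting}
library(knitr)
# knit the .Rnw file, then compile the resulting .tex file to pdf
knit2pdf('knit-01-lm.Rnw')
\end{lstlisting}
should leave the compiled pdf file alongside \texttt{knit-01-lm.tex}.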
\end{itemize}

\bigskip

Here is a plot of the rocket propulsion data from the textbook, as studied in class:

%<<>>=
%library(knitr)
%knit_hooks$set(source = function(x, options) {
%  paste("\\begin{lstlisting}[numbers=left, firstnumber=last]\n", x,
%        "\\end{lstlisting}\n", sep = "")
%})
%@

<<>>=
#Read in the data and rename the columns
data.source<-"http://www.math.mcgill.ca/dstephens/Regression/Data/2-1-RocketProp.csv"
RocketProp<-read.csv(file=data.source)
names(RocketProp)<-c('i','Strength','Age')
x<-RocketProp$Age
y<-RocketProp$Strength
#Scatterplot, with dashed lines at the sample means of x and y
plot(x,y,pch=19,cex=0.6,xlab='Age',ylab='Shear Strength')
xmean<-mean(x)
ymean<-mean(y)
abline(v=xmean,h=ymean,lty=2)
@

The mean of the $x$ values is \Sexpr{xmean}, and the mean of the $y$ values is \Sexpr{ymean}.

<<>>=
############################################
#Fit the simple linear regression using lm
fit.RP<-lm(y~x)
summary(fit.RP)
coef(fit.RP)
@

From this, we can deduce that the estimate of the intercept is $\widehat \beta_0=\Sexpr{coef(fit.RP)[1]}$, and the estimate of the slope is $\widehat \beta_1=\Sexpr{coef(fit.RP)[2]}$; that is, the line of best fit is
\[
y = \Sexpr{coef(fit.RP)[1]} + \Sexpr{coef(fit.RP)[2]} x.
\]
However, this reports the estimates to too many decimal places. We can reduce that by using the \texttt{round} function: for example, \texttt{round(coef(fit.RP)[1],3)} gives the result to 3 decimal places. We have
\[
y = \Sexpr{round(coef(fit.RP)[1],3)} + \Sexpr{round(coef(fit.RP)[2],3)} x.
\]
The fitted line appears to describe the data well (see the plot below).

\medskip

Here is an alternative way to use the \texttt{lm} function in \texttt{R}, using the \texttt{data=RocketProp} argument; this allows you to refer to the variables by name in the command
\[
\texttt{fit.RP<-lm(Strength\textasciitilde Age,data=RocketProp)}
\]

<<>>=
#Refit using the data argument, then overlay the fitted line in red
fit.RP<-lm(Strength~Age,data=RocketProp)
plot(x,y,pch=19,cex=0.6,xlab='Age',ylab='Shear Strength')
coef(fit.RP)
abline(coef(fit.RP),col='red')
title('Line of best fit for Rocket Propulsion Data')
@

\end{document}