---
title: "Using RMarkdown to Produce Reports from R Scripts"
output: pdf_document
---

This document shows you how to use \texttt{R Markdown} in \texttt{RStudio} to create pdf documents based on \texttt{R} scripts. You can include inline math, such as $x=1$, and display math, such as
\[
X=(x_1,x_2,\cdots,x_p)
\]
The file that created this pdf document is \texttt{1-lm.Rmd}, for use in \texttt{RStudio}.

## Create an R Markdown file

Using \texttt{RStudio}: you can open the \texttt{1-lm.Rmd} file directly in \texttt{RStudio}, and compile it using the "Knit" button at the top of the script window that opens automatically when you click on the \texttt{.Rmd} file.

Note that you can change the \texttt{R Markdown} settings via the *Tools $\rightarrow$ Global Options* pulldown menu, by selecting the R Markdown symbol in the left-hand sidebar.

## Load and plot the data

Here is a plot of the rocket propulsion data from the textbook, as studied in class:

```{r}
# Read the data from the course website and name the columns
data.source<-"http://www.math.mcgill.ca/yyang/regression/data/2-1-RocketProp.csv"
RocketProp<-read.csv(file=data.source)
names(RocketProp)<-c('i','Strength','Age')
x<-RocketProp$Age
y<-RocketProp$Strength
# Scatterplot, with dashed lines marking the sample means of x and y
plot(x,y,pch=19,cex=0.6,xlab='Age',ylab='Shear Strength')
xmean<-mean(x)
ymean<-mean(y)
abline(v=xmean,h=ymean,lty=2)
```

The mean of the $x$ values is `r xmean`, and the mean of the $y$ values is `r ymean`.

## Fit the regression

```{r}
############################################
#Fit the simple linear regression using lm
fit.RP<-lm(y~x)
summary(fit.RP)
coef(fit.RP)
```

From this output, we see that the estimate of the intercept is $\widehat \beta_0= `r coef(fit.RP)[1]`$, and the estimate of the slope is $\widehat \beta_1= `r coef(fit.RP)[2]`$, that is, the line of best fit is
\[
y = `r coef(fit.RP)[1]` + `r coef(fit.RP)[2]` x
\]
However, this reports the estimates to too many decimal places: we can reduce that by using the \texttt{round} function. For example, \texttt{round(coef(fit.RP)[1],3)} gives the result to 3 decimal places.
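As a minimal sketch of how \texttt{round} behaves next to the related base \texttt{R} functions \texttt{signif} and \texttt{format}, the chunk below uses made-up values: the vector \texttt{est} is hypothetical and does not contain the fitted coefficients.

```{r}
# Illustrative values only (NOT the rocket propulsion estimates)
est <- c(intercept = 1234.56789, slope = -12.3456789)

# Fixed number of decimal places
round(est, 3)

# Fixed number of significant digits
signif(est, 3)

# Character strings with exactly two decimal places
format(round(est, 2), nsmall = 2)
```

\texttt{round} fixes the number of decimal places, whereas \texttt{signif} fixes the number of significant digits, which may be preferable when the estimates differ greatly in magnitude.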
With rounding, we have
\[
y = `r round(coef(fit.RP)[1],3)` + `r round(coef(fit.RP)[2],3)` x.
\]
The line of best fit describes the data well (see the plot below).

Here is an alternative way to use the \texttt{lm} function in \texttt{R}, by supplying the \texttt{data=RocketProp} argument; this allows you to refer to the variables by name in the command
\[
\texttt{fit.RP<-lm(Strength\textasciitilde{}Age,data=RocketProp)}
\]

```{r}
# Refit using the formula interface with variable names from the data frame
fit.RP<-lm(Strength~Age,data=RocketProp)
plot(x,y,pch=19,cex=0.6,xlab='Age',ylab='Shear Strength')
coef(fit.RP)
# Add the fitted line (intercept, slope) to the scatterplot
abline(coef(fit.RP),col='red')
title('Line of best fit for Rocket Propulsion Data')
```
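The two ways of calling \texttt{lm} can be compared directly. The chunk below is a sketch under one assumption: it uses \texttt{cars}, a data set shipped with \texttt{R}, as a stand-in for \texttt{RocketProp}, so that it runs without downloading anything. The names \texttt{fit.vec}, \texttt{fit.df}, \texttt{b0} and \texttt{b1} are introduced here for illustration only.

```{r}
# 'cars' (columns speed, dist) stands in for RocketProp in this sketch
fit.vec <- lm(cars$dist ~ cars$speed)       # variables passed as vectors
fit.df  <- lm(dist ~ speed, data = cars)    # variables called by name

# The two calls give the same estimates; only the coefficient labels differ
all.equal(unname(coef(fit.vec)), unname(coef(fit.df)))

# The usual least-squares formulas, computed by hand for comparison
b1 <- with(cars, sum((speed - mean(speed)) * (dist - mean(dist))) /
                 sum((speed - mean(speed))^2))
b0 <- with(cars, mean(dist) - b1 * mean(speed))
c(b0, b1)
coef(fit.df)
```

Using \texttt{data=} keeps the variable names in the coefficient labels, which makes the printed output easier to read.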