Using a regression model to explore relationships: the “Gasoline Mileage” example
November 17, 2016
If you are into statistics, you probably already know how important regression analysis is to statistical modelling. If you are not, suffice it to say that it is a core tool for estimating the relationships among variables. There are many techniques and extensions for carrying out regression analysis, such as linear regression, multivariate linear regression (also known as the general linear model), variants such as Bayesian multivariate linear regression, estimation by least squares, and so on.
What these approaches have in common, in the simplest case of a single explanatory variable, is an equation of the form y = a + bx, where x is the explanatory variable and y is the dependent variable. The slope of the line is b, and a is the intercept (the value of y when x = 0).
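To make the roles of a and b concrete, here is a minimal sketch in R. It assumes simulated data (the numbers are invented for illustration and do not come from any real dataset) and compares the closed-form least-squares estimates with what R's built-in `lm()` returns.

```r
# Simulated data for illustration only (an assumption, not real measurements):
# the true intercept is 2 and the true slope is 3, plus random noise.
set.seed(42)
x <- seq(1, 20)
y <- 2 + 3 * x + rnorm(length(x), sd = 1.5)

# Closed-form least-squares estimates for y = a + b * x
b <- cov(x, y) / var(x)      # slope
a <- mean(y) - b * mean(x)   # intercept (the value of y when x = 0)

# The same line fitted with R's built-in lm()
fit <- lm(y ~ x)
coef(fit)                    # "(Intercept)" corresponds to a, "x" to b
```

Both routes give the same pair of numbers, because `lm()` fits the line by ordinary least squares.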
Harold V. Henderson and Paul F. Velleman provided a famous example of the use of a regression model in their paper “Building Multiple Regression Models Interactively”, published in 1981 in the journal Biometrics (if you are interested in reading the original, see http://www.mortality.org/INdb/2008/02/12/8/document.pdf).
There they used what is known as the “Gasoline Mileage Data”, which became a dataset used around the world for educational purposes. The data were extracted from the 1974 Motor Trend magazine and comprise gasoline mileage in miles per gallon (MPG) and ten aspects of automobile design and performance for 32 automobiles (1973-74 models). I explored these data using the version of the dataset that ships with R, called “mtcars”. Since I believe that any analysis needs a purpose, mine set out to determine whether an automatic or a manual transmission is better for MPG, and to quantify the MPG difference.
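As a rough sketch of how that question can be posed in R (not necessarily the analysis in the paper linked below), the snippet compares MPG by transmission type using `mtcars`, where the `am` column is coded 0 for automatic and 1 for manual. The names `cars`, `transmission`, `fit_simple` and `fit_adjusted` are illustrative, and adjusting for weight (`wt`) and horsepower (`hp`) is an assumption made here only to show what a multiple regression looks like.

```r
# mtcars ships with base R; work on a copy so the original stays untouched
cars <- mtcars

# am is coded 0 = automatic, 1 = manual; turn it into a labelled factor
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))

# Average MPG in each transmission group
group_means <- tapply(cars$mpg, cars$transmission, mean)

# Simple regression: the coefficient on "manual" is the raw MPG difference
fit_simple <- lm(mpg ~ transmission, data = cars)
mpg_difference <- unname(coef(fit_simple)["transmissionmanual"])

# Multiple regression: the same comparison, holding weight and horsepower fixed
# (the choice of wt and hp as covariates is an assumption for this sketch)
fit_adjusted <- lm(mpg ~ transmission + wt + hp, data = cars)
summary(fit_adjusted)$coefficients
```

In the simple model the coefficient equals the difference between the two group means, about 7.2 MPG in favour of manual cars. The adjusted model asks how much of that gap remains once other characteristics of the cars are taken into account.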
To that end, I wrote a paper that presents linear and multiple regression models, the R code used to fit them, and my own analysis. The paper can be accessed at http://rpubs.com/marcelo_tibau/228029