What is a model? How is a linear regression fit and assessed? How do we construct a model and make predictions with it?
In our case, a model is simply a simplified representation of reality expressed in mathematical language. This section deals with linear regression models, starting with a single variable (the predictor, or independent variable) used to predict another one (the response, or dependent variable). The relationship between the two variables is modeled with a line, and some properties of the normal distribution are used to fit the parameters that define that line. The model is then assessed to measure its predictive power and to check whether any of the assumptions behind the fit are violated. We will review the assumptions of normality of the residuals, linearity, constant variance (homoscedasticity) and independence. On that basis, we will add more variables, check the effects of multicollinearity and see how to deal with it.
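As a rough illustration of what fitting and assessing a line involves, here is a minimal Python sketch using ordinary least squares. It is not taken from the lectures: the function name `fit_simple_regression`, the simulated data and the true parameters (intercept 2.0, slope 0.5, noise standard deviation 1.0) are all assumptions made for the example.

```python
import random


def fit_simple_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper for illustration only.
    Returns (intercept, slope, residuals, r_squared).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Slope = sum of cross-deviations / sum of squared deviations of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx

    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean

    # Residuals: observed response minus fitted value.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # R squared: share of the variance of y explained by the line.
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot

    return intercept, slope, residuals, r_squared


# Simulated data (assumed values): a straight line plus normal noise.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

intercept, slope, residuals, r_squared = fit_simple_regression(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, R^2 = {r_squared:.3f}")
```

The returned residuals are the starting point for checking the assumptions: plotted against the fitted values they should show no curve (linearity) and no funnel shape (constant variance), and their histogram should look roughly normal.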
T3.1 Linear regression model [lecture]
T3.2 Model assessment [lecture]
T3.3 Multiple regression [lecture]
Slides from the 2019 lectures [PDF]
We suggest trying the following tasks to practice the concepts explained in those lectures:
How to do it?
Generate uniform random numbers between 0 and 1 in Excel: =RAND()
Generate numbers following a normal distribution with mean = 100 and standard deviation = 10: =NORMINV(RAND(),100,10)
For more instructions, google (as I do)!
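If you prefer to work outside Excel, the two formulas above can be approximated in Python with the standard library alone. This is only a sketch under assumptions: the seed, the sample size and the variable names are arbitrary choices for the example. `NormalDist.inv_cdf` plays the role of NORMINV, turning a uniform number between 0 and 1 into a normal one.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(2020)  # fixed seed (arbitrary) so the run is repeatable

# Equivalent of =RAND(): a uniform number in [0, 1).
u = rng.random()
print("uniform draw:", u)

# Equivalent of =NORMINV(RAND(),100,10): push a uniform number through
# the inverse cumulative distribution of a normal with mean 100, sd 10.
normal_100_10 = NormalDist(mu=100, sigma=10)


def normal_draw():
    p = rng.random()
    # inv_cdf is only defined for 0 < p < 1, so redraw in the rare case p == 0.
    while p == 0.0:
        p = rng.random()
    return normal_100_10.inv_cdf(p)


sample = [normal_draw() for _ in range(10_000)]
print("sample mean:", round(mean(sample), 2))
print("sample st dev:", round(stdev(sample), 2))
```

With a large sample, the mean and standard deviation should land close to 100 and 10. `rng.gauss(100, 10)` draws from the same distribution directly, without the inverse step.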
The objectives of this topic are:
Lecture days: third week, 2-6 November
During the lecture we will discuss the topics in the presentations, the exercises and any questions you may have.
Data and materials
Excel for task [xls]
Excel for practice [xls]
Simple regression exercises
Exercise pre-exam [PDF]
Exercise height-volume [PDF]
Exercise wheat production [PDF]
Multiple regression exercises
Exercise height-diameter-value [PDF]
Exercise barley yields [PDF]
Videos and tutorials
Simple linear regression [youtube]
How to make our own sandbox model [video]