When there is an identifiable trend in the data (i.e., at least a moderately strong correlation between x and y), we often want to model this relationship so that we can interpolate (estimate the value of y for any given value of x within the range of data we have) and extrapolate (predict the value of y for any given value of x beyond our range of data).

To model relationships, we can use a line or curve. The type of curve that best fits the data below is logistic.

Many different functions could be used to model the data, but the simplest is a line. Modeling the data with a line is called linear regression.

As you saw in Lesson 14, each x-value is denoted x_i and each y-value is denoted y_i. The line used to model the trend between the x_i's and y_i's is called the regression line or line of best fit.
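
To make the idea concrete, here is a minimal sketch of fitting a line to paired x- and y-values and then using it to interpolate and extrapolate. It assumes the line is chosen by the least-squares criterion, which is the usual choice but is not stated in this lesson. The function names `fit_line` and `predict` and the data values are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Assumption: "best fit" means minimizing the sum of squared vertical
    distances between the points and the line.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Illustrative (made-up) data with a roughly linear trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# x = 2.5 lies inside the observed range [1, 5]: interpolation.
y_interp = predict(slope, intercept, 2.5)

# x = 8 lies outside the observed range: extrapolation.
y_extrap = predict(slope, intercept, 8)

print(slope, intercept, y_interp, y_extrap)
```

The extrapolated value should be treated with more caution than the interpolated one, since nothing in the data confirms that the trend continues beyond x = 5.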

