In simple linear regression the computer learns a linear relationship between a single input \( x \) and a single output \( y \) by calculating two values, \( \theta_0 \) and \( \theta_1 \). These values define a line \( h_\theta(x) = \theta_0 + \theta_1 x \) that best fits the training examples.

Multivariate linear regression is an extension that finds a relationship between multiple inputs \( x_1, x_2, \ldots, x_n \) and an output \( y \). We say that the input is “\( n \)-dimensional.” The computer calculates \( n + 1 \) values \( \theta_0, \theta_1, \ldots, \theta_n \). These values define a plane (if \( n = 2 \)) or hyperplane (if \( n > 2 \)) that best fits the training examples.

To handle the many variables and values, it is convenient to use matrix notation.

Let \( \vec{\theta} = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} \). Its transpose is the row vector \( \vec{\theta}^T = [\theta_0, \theta_1, \ldots, \theta_n] \).

Let the inputs be represented as \( \vec{x} = \begin{bmatrix} x_0 = 1 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} \). (Below you’ll see why we set \( x_0 = 1 \).)
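As a concrete sketch of this convention (using NumPy; variable names are my own), each raw input vector gets a constant 1 prepended so that every example has the form \( [x_0 = 1, x_1, \ldots, x_n] \):

```python
import numpy as np

# Raw inputs: m = 3 training examples, n = 2 features each.
X_raw = np.array([[4.0, 5.0],
                  [6.0, 7.0],
                  [8.0, 9.0]])

# Prepend a column of ones so each row becomes [x_0 = 1, x_1, ..., x_n].
X = np.hstack([np.ones((X_raw.shape[0], 1)), X_raw])
print(X[0])  # [1. 4. 5.]
```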

Instead of \( h_\theta(x) = \theta_0 + \theta_1 x \) as for simple linear regression, we use the rules of matrix multiplication to get the model equation:

\[ h_\theta(\vec{x}) = \vec{\theta}^T \vec{x} = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n \]

Since \( x_0 = 1 \), the first term reduces to the intercept \( \theta_0 \) — this is why we introduced \( x_0 \).
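In code, the model equation is just a dot product. A minimal sketch in NumPy (the specific numbers are illustrative):

```python
import numpy as np

# Parameters [theta_0, theta_1, theta_2]; here n = 2.
theta = np.array([1.0, 2.0, 3.0])

# One input with the convention x_0 = 1 prepended: [x_0, x_1, x_2].
x = np.array([1.0, 4.0, 5.0])

# h_theta(x) = theta^T x = theta_0*x_0 + theta_1*x_1 + ... + theta_n*x_n
h = theta @ x
print(h)  # 1*1 + 2*4 + 3*5 = 24.0
```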

The cost function can also be expressed using matrix notation:

\[ J(\vec{\theta}) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(\vec{x}^{(i)}) - y^{(i)} \right)^2 \]

where \( m \) is the number of training examples and \( \vec{x}^{(i)}, y^{(i)} \) is the \( i \)-th example.
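The cost function vectorizes naturally if the training inputs are stacked as rows of a design matrix. A sketch under that assumption (function and variable names are my own):

```python
import numpy as np

def cost(X, y, theta):
    """J(theta) = (1/2m) * sum of squared residuals.

    X: (m, n+1) design matrix whose first column is all ones (x_0 = 1).
    y: (m,) targets.  theta: (n+1,) parameters.
    """
    m = len(y)
    residuals = X @ theta - y          # h_theta(x^(i)) - y^(i) for every i
    return residuals @ residuals / (2 * m)

# Tiny example: two training points on the line y = 1 + 2*x_1.
X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([3.0, 5.0])
print(cost(X, y, np.array([1.0, 2.0])))  # 0.0, a perfect fit
print(cost(X, y, np.array([0.0, 0.0])))  # (9 + 25) / (2*2) = 8.5
```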

The partial derivatives are as follows:

\[ \frac{\partial}{\partial \theta_j} J(\vec{\theta}) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(\vec{x}^{(i)}) - y^{(i)} \right) x_j^{(i)}, \qquad j = 0, 1, \ldots, n \]
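All \( n + 1 \) partial derivatives can be computed at once with a single matrix product. A sketch, continuing the design-matrix convention above (names are my own):

```python
import numpy as np

def gradient(X, y, theta):
    """dJ/dtheta_j = (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i), for all j at once."""
    m = len(y)
    return X.T @ (X @ theta - y) / m

# At the best-fit parameters every partial derivative is zero.
X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([3.0, 5.0])
print(gradient(X, y, np.array([1.0, 2.0])))  # [0. 0.]
```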

The model equation, cost function, and its partial derivatives should look familiar. In the special case where \( n = 1 \), they are exactly the equations for simple linear regression.
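Putting the pieces together, the parameters can be fit by gradient descent, repeatedly stepping each \( \theta_j \) against its partial derivative. A minimal sketch (learning rate and iteration count are illustrative choices, not prescribed by the text):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=2000):
    """Fit theta by repeatedly stepping against the gradient of J."""
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(iters):
        theta -= alpha * (X.T @ (X @ theta - y) / m)
    return theta

# Recover the line y = 1 + 2*x_1 from four noiseless examples.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = gradient_descent(X, y)
print(np.round(theta, 3))  # approximately [1. 2.]
```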