$$ y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i $$
$$ \mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon} $$
$$ X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} $$
$$ \hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y} $$
$$ \hat{\mathbf{y}} = X \hat{\boldsymbol{\beta}} $$
$$ \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} $$
$$ \hat{\sigma}^2 = \frac{1}{n-p-1} \, \mathbf{e}^\top \mathbf{e} $$
$$ \widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}}) = \hat{\sigma}^2 (X^\top X)^{-1} $$
$$ SE(\hat{\beta}_j) = \sqrt{ \left[ \hat{\sigma}^2 (X^\top X)^{-1} \right]_{jj} } $$
$$ t_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)} $$
$$ df = n - p - 1$$
$$ \hat{\beta}_j \pm t_{0.975,\, n-p-1} \cdot SE(\hat{\beta}_j) $$
$$ R^2 = 1 - \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2} {\sum_{i=1}^n (y_i - \bar{y})^2} $$
$$ R_{\mathrm{adj}}^2 = 1 - \left( \frac{n-1}{n-p-1} \right)(1 - R^2)$$
$$ RSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - p - 1}}$$
$$ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
$$ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_p x_{ip} $$
| No. | X1 | X2 | Y |
|---|---|---|---|
| 0 | 32 | 10 | 37.9 |
| 1 | 19 | 9 | 42.2 |
| 2 | 13 | 5 | 47.3 |
| 3 | 13 | 5 | 47.5 |
| 4 | 5 | 5 | 51.5 |
| 5 | 7 | 3 | 48.2 |
| 6 | 34 | 7 | 40.3 |
| 7 | 20 | 6 | 46.7 |
| 8 | 30 | 1 | 18.8 |
| 9 | 17 | 3 | 25.8 |
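import pandas as pd
from sklearn.linear_model import LinearRegression

# Ten observations from the table above: predictors x1, x2 and response y.
df = pd.DataFrame([
    [32, 10, 37.9],
    [19, 9, 42.2],
    [13, 5, 47.3],
    [13, 5, 47.5],
    [5, 5, 51.5],
    [7, 3, 48.2],
    [34, 7, 40.3],
    [20, 6, 46.7],
    [30, 1, 18.8],
    [17, 3, 25.8],
], columns=['x1', 'x2', 'y'])

X = df[['x1', 'x2']]  # predictors only; LinearRegression fits the intercept itself
y = df['y']           # response

# Fit ordinary least squares, then predict on the same (training) rows.
lg = LinearRegression()
lg.fit(X, y)
y_pred = lg.predict(X)

print('Predict', y_pred)
print('R^2 score', lg.score(X, y))  # R^2 computed on the training data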
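As a cross-check, the closed-form expressions for $\hat{\boldsymbol{\beta}}$, $\hat{\mathbf{y}}$, $\mathbf{e}$, $SSE$, $R^2$, $R_{\mathrm{adj}}^2$ and $RSE$ can be evaluated directly with NumPy on the same ten observations. The following is a minimal sketch, not the scikit-learn implementation; the variable names are illustrative, and here `X` denotes the full design matrix including the column of ones, unlike the predictor-only `X` passed to `LinearRegression`.

```python
import numpy as np

# Ten observations from the first table: columns x1, x2, y.
data = np.array([
    [32, 10, 37.9],
    [19,  9, 42.2],
    [13,  5, 47.3],
    [13,  5, 47.5],
    [ 5,  5, 51.5],
    [ 7,  3, 48.2],
    [34,  7, 40.3],
    [20,  6, 46.7],
    [30,  1, 18.8],
    [17,  3, 25.8],
])
x, y = data[:, :2], data[:, 2]
n, p = x.shape  # n observations, p predictors

# Design matrix: a leading column of ones carries the intercept beta_0.
X = np.column_stack([np.ones(n), x])

# Normal equations: solve (X^T X) beta = X^T y instead of forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat                 # fitted values
e = y - y_hat                        # residuals
sse = e @ e                          # sum of squared errors
sst = np.sum((y - y.mean()) ** 2)    # total sum of squares

r2 = 1 - sse / sst
r2_adj = 1 - (n - 1) / (n - p - 1) * (1 - r2)
rse = np.sqrt(sse / (n - p - 1))     # residual standard error

print("beta_hat:", beta_hat)
print("R^2:", r2, "adjusted R^2:", r2_adj)
print("RSE:", rse)
```

If the scikit-learn model `lg` has been fitted on the same data, `beta_hat[0]` should agree with `lg.intercept_`, `beta_hat[1:]` with `lg.coef_`, and `r2` with `lg.score(X, y)`, up to floating-point error.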
| y | X1 | X2 |
|---|---|---|
| 140 | 60 | 22 |
| 155 | 62 | 25 |
| 159 | 67 | 24 |
| 179 | 70 | 20 |
| 192 | 71 | 15 |
| 200 | 72 | 14 |
| 212 | 75 | 14 |
| 215 | 78 | 11 |
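One possible way to work through this second table is to apply the inference formulas: $\hat{\sigma}^2$, the standard errors, the $t$ statistics, and the 95% confidence intervals with $df = n - p - 1$. The sketch below assumes, from the column labels alone, that `y` is the response and `X1`, `X2` are the predictors. It uses `scipy.stats.t.ppf` for the quantile $t_{0.975,\, n-p-1}$, and all variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Eight observations from the second table: columns y, X1, X2.
data = np.array([
    [140, 60, 22],
    [155, 62, 25],
    [159, 67, 24],
    [179, 70, 20],
    [192, 71, 15],
    [200, 72, 14],
    [212, 75, 14],
    [215, 78, 11],
], dtype=float)
y, x = data[:, 0], data[:, 1:]
n, p = x.shape
dof = n - p - 1  # residual degrees of freedom

# Design matrix with an intercept column, and the OLS estimate.
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Residual variance estimate and the estimated covariance of beta_hat.
e = y - X @ beta_hat
sigma2_hat = (e @ e) / dof
cov_beta = sigma2_hat * XtX_inv

se = np.sqrt(np.diag(cov_beta))   # standard errors SE(beta_j)
t_stat = beta_hat / se            # t statistics for H0: beta_j = 0

# Two-sided 95% confidence intervals.
t_crit = stats.t.ppf(0.975, dof)
ci_lower = beta_hat - t_crit * se
ci_upper = beta_hat + t_crit * se

for name, b, s, t, lo, hi in zip(
    ["beta_0", "beta_1", "beta_2"], beta_hat, se, t_stat, ci_lower, ci_upper
):
    print(f"{name}: estimate={b:.4f}, SE={s:.4f}, t={t:.3f}, "
          f"95% CI=({lo:.4f}, {hi:.4f})")
```

With only eight observations and two predictors there are just five residual degrees of freedom, so the intervals are wide. A coefficient whose interval contains zero is not significant at the 5% level.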