Multiple Linear Regression


Recipe 28: Multiple Linear Regression

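1. When to use: predicting one dependent variable from several independent variables.
2. Analysis type: parametric analysis
3. Example data: Milu (咪路) measured the growth conditions and cell counts of a bacterial culture; the data are in the table below. Milu wants a regression equation that can be used to predict bacterial growth.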
item  Temp  pH   hour  ml    CFU (10^6)
1     6     5.7  1.6   2.12  9.9
2     1     6.4  3     3.39  9.3
3     2     5.7  3.4   3.61  9.4
4     11    6.1  3.4   1.72  9.1
5     1     6    3     1.8   6.9
6     2     5.7  4.4   3.21  9.3
7     5     5.9  2.2   2.59  7.9
8     1     6.2  2.2   3.25  7.4
9     1     5.5  1.9   2.86  7.3
10    3     5.2  0.2   2.32  8.8
11    11    5.7  4.2   1.57  9.8
12    9     6.1  2.4   1.5   10.5
13    5     6.4  3.4   2.69  9.1
14    3     5.5  3     4.06  10.1
15    1     5.5  0.2   1.98  7.2
16    8     6    3.9   2.29  11.7
17    2     5.5  2.2   3.55  8.7
18    3     6.2  4.4   3.31  7.6
19    6     5.9  0.2   1.83  8.6
20    10    5.6  2.4   1.69  10.9
21    4     5.8  2.4   2.42  7.6
22    5     5.8  4.4   2.98  7.3
23    5     5.2  1.6   1.84  9.2
24    3     6    1.9   2.48  7
25    8     5.5  1.6   2.83  7.2
26    8     6.4  4.1   2.41  7
27    6     6.2  1.9   1.78  8.8
28    6     5.4  2.2   2.22  10.1
29    3     5.4  4.1   2.72  12.1
30    5     6.2  1.6   2.36  7.7
31    1     6.8  2.4   2.81  7.8
32    8     6.2  1.9   1.64  11.5
33    10    6.4  2.2   1.82  10.4
Goal: find the multiple linear regression equation.
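
For reference, the model being fitted can be written out as below. This is only a notational sketch: the symbols \beta_0 ... \beta_4 (coefficients) and \varepsilon_i (error term) are introduced here and do not appear in the original post; the variable names are the table's column headers.

```latex
\[
\mathrm{CFU}_i = \beta_0 + \beta_1\,\mathrm{Temp}_i + \beta_2\,\mathrm{pH}_i
               + \beta_3\,\mathrm{hour}_i + \beta_4\,\mathrm{ml}_i + \varepsilon_i,
\qquad i = 1, \dots, 33
\]
```

Ordinary least squares chooses the \beta values that minimize the sum of squared \varepsilon_i over the 33 rows.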

4. Build the data
from pandas import DataFrame
dat = {'Temp': [6, 1, 2, 11, 1, 2, 5, 1, 1, 3, 11, 9, 5, 3, 1, 8, 2, 3, 6, 10, 4, 5, 5, 3, 8, 8, 6, 6, 3, 5, 1, 8, 10],
      'pH': [5.7, 6.4, 5.7, 6.1, 6, 5.7, 5.9, 6.2, 5.5, 5.2, 5.7, 6.1, 6.4, 5.5, 5.5, 6, 5.5, 6.2, 5.9, 5.6, 5.8, 5.8, 5.2, 6, 5.5, 6.4, 6.2, 5.4, 5.4, 6.2, 6.8, 6.2, 6.4],
      'hour': [1.6, 3, 3.4, 3.4, 3, 4.4, 2.2, 2.2, 1.9, 0.2, 4.2, 2.4, 3.4, 3, 0.2, 3.9, 2.2, 4.4, 0.2, 2.4, 2.4, 4.4, 1.6, 1.9, 1.6, 4.1, 1.9, 2.2, 4.1, 1.6, 2.4, 1.9, 2.2],
      'ml': [2.12, 3.39, 3.61, 1.72, 1.8, 3.21, 2.59, 3.25, 2.86, 2.32, 1.57, 1.5, 2.69, 4.06, 1.98, 2.29, 3.55, 3.31, 1.83, 1.69, 2.42, 2.98, 1.84, 2.48, 2.83, 2.41, 1.78, 2.22, 2.72, 2.36, 2.81, 1.64, 1.82],
      'CFU': [9.9, 9.3, 9.4, 9.1, 6.9, 9.3, 7.9, 7.4, 7.3, 8.8, 9.8, 10.5, 9.1, 10.1, 7.2, 11.7, 8.7, 7.6, 8.6, 10.9, 7.6, 7.3, 9.2, 7, 7.2, 7, 8.8, 10.1, 12.1, 7.7, 7.8, 11.5, 10.4]       
     }
df = DataFrame(dat,columns=['Temp','pH','hour','ml','CFU'])
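
Before fitting, it can be worth confirming that the data frame matches the table: 33 rows, 5 columns, and no missing values. The following is a minimal sketch of such a check; the names n_rows, n_cols, and missing are illustrative and not part of the original post.

```python
import pandas as pd

# Same data as in step 4, repeated so this check runs on its own.
dat = {'Temp': [6, 1, 2, 11, 1, 2, 5, 1, 1, 3, 11, 9, 5, 3, 1, 8, 2, 3, 6, 10, 4, 5, 5, 3, 8, 8, 6, 6, 3, 5, 1, 8, 10],
       'pH': [5.7, 6.4, 5.7, 6.1, 6, 5.7, 5.9, 6.2, 5.5, 5.2, 5.7, 6.1, 6.4, 5.5, 5.5, 6, 5.5, 6.2, 5.9, 5.6, 5.8, 5.8, 5.2, 6, 5.5, 6.4, 6.2, 5.4, 5.4, 6.2, 6.8, 6.2, 6.4],
       'hour': [1.6, 3, 3.4, 3.4, 3, 4.4, 2.2, 2.2, 1.9, 0.2, 4.2, 2.4, 3.4, 3, 0.2, 3.9, 2.2, 4.4, 0.2, 2.4, 2.4, 4.4, 1.6, 1.9, 1.6, 4.1, 1.9, 2.2, 4.1, 1.6, 2.4, 1.9, 2.2],
       'ml': [2.12, 3.39, 3.61, 1.72, 1.8, 3.21, 2.59, 3.25, 2.86, 2.32, 1.57, 1.5, 2.69, 4.06, 1.98, 2.29, 3.55, 3.31, 1.83, 1.69, 2.42, 2.98, 1.84, 2.48, 2.83, 2.41, 1.78, 2.22, 2.72, 2.36, 2.81, 1.64, 1.82],
       'CFU': [9.9, 9.3, 9.4, 9.1, 6.9, 9.3, 7.9, 7.4, 7.3, 8.8, 9.8, 10.5, 9.1, 10.1, 7.2, 11.7, 8.7, 7.6, 8.6, 10.9, 7.6, 7.3, 9.2, 7, 7.2, 7, 8.8, 10.1, 12.1, 7.7, 7.8, 11.5, 10.4]}
df = pd.DataFrame(dat, columns=['Temp', 'pH', 'hour', 'ml', 'CFU'])

# Illustrative checks: row/column counts and total number of missing cells.
n_rows, n_cols = df.shape
missing = int(df.isna().sum().sum())
print("rows:", n_rows, "columns:", n_cols, "missing cells:", missing)

# Per-column range, handy for spotting typing errors against the table.
print(df.describe().loc[['min', 'max']])
```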

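5. Run the regression
import statsmodels.api as sm

X = df[['Temp','pH','hour','ml']]  # predictors
Y = df['CFU']                      # response
X = sm.add_constant(X)  # add the intercept column; sm.OLS does not add one by default
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)  # fitted values for the observed rows
res = model.summary()
print(res)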

Result:
                            OLS Regression Results                           
==============================================================================
Dep. Variable:                    CFU   R-squared:                       0.252
Model:                            OLS   Adj. R-squared:                  0.146
Method:                 Least Squares   F-statistic:                     2.363
Date:                Mon, 29 Jul 2019   Prob (F-statistic):             0.0772
Time:                        15:00:43   Log-Likelihood:                -54.579
No. Observations:                  33   AIC:                             119.2
Df Residuals:                      28   BIC:                             126.6
Df Model:                           4                                        
Covariance Type:            nonrobust                                        
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         12.6254      4.135      3.053      0.005       4.155      21.096
Temp           0.1946      0.110      1.771      0.087      -0.030       0.420
pH            -0.8865      0.646     -1.373      0.181      -2.209       0.436
hour           0.2477      0.247      1.003      0.324      -0.258       0.753
ml            -0.0474      0.541     -0.088      0.931      -1.156       1.061
==============================================================================
Omnibus:                        0.100   Durbin-Watson:                   1.434
Prob(Omnibus):                  0.951   Jarque-Bera (JB):                0.282
Skew:                           0.097   Prob(JB):                        0.869
Kurtosis:                       2.591   Cond. No.                         153.
==============================================================================

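Only the intercept (const) has a p-value below 0.05 (p = 0.005). Temp is the closest of the four predictors (p = 0.087) but does not reach the 0.05 level, and the overall F-test is not significant either (Prob (F-statistic) = 0.0772).
Full fitted equation: CFU = 12.6254 + 0.1946*Temp - 0.8865*pH + 0.2477*hour - 0.0474*ml
Keeping only the Temp term: CFU = 12.6254 + 0.1946*Temp (these two coefficients come from the four-predictor fit; a Temp-only model has to be refit to get its own intercept and slope).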
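
The Temp-only equation above reuses the intercept and Temp coefficient from the four-predictor fit. A model that really contains only Temp will generally have different coefficients, so one way to check is to refit it. The sketch below does this with numpy.polyfit (degree 1), swapped in here for brevity; the same refit could be done with sm.OLS on the Temp column alone. The names slope, intercept, and r2 are illustrative.

```python
import numpy as np

# Temp and CFU columns from step 4.
temp = np.array([6, 1, 2, 11, 1, 2, 5, 1, 1, 3, 11, 9, 5, 3, 1, 8, 2, 3, 6, 10,
                 4, 5, 5, 3, 8, 8, 6, 6, 3, 5, 1, 8, 10], dtype=float)
cfu = np.array([9.9, 9.3, 9.4, 9.1, 6.9, 9.3, 7.9, 7.4, 7.3, 8.8, 9.8, 10.5, 9.1,
                10.1, 7.2, 11.7, 8.7, 7.6, 8.6, 10.9, 7.6, 7.3, 9.2, 7, 7.2, 7,
                8.8, 10.1, 12.1, 7.7, 7.8, 11.5, 10.4])

# Least-squares straight line; polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(temp, cfu, deg=1)

# R-squared of the Temp-only fit, for comparison with the full model's 0.252.
fitted = intercept + slope * temp
ss_res = np.sum((cfu - fitted) ** 2)
ss_tot = np.sum((cfu - cfu.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"CFU = {intercept:.4f} + {slope:.4f}*Temp   (R-squared = {r2:.3f})")
```

Because the Temp-only model is nested in the four-predictor model, its R-squared cannot exceed the full model's value; the refit intercept is the one to use if predictions are made from Temp alone.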
