Simple Linear Regression

Recipe 27: Simple Linear Regression

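1. When to use: to predict a dependent variable from a single independent variable, and to assess the direction (trend) and strength of the relationship between the two.
2. Analysis type: parametric analysis
3. Example data: Milu measured the age in days (day) and the wing length (cm) of hand-reared birds: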
X (day)    Y (wing length, cm)
1          1.4
3          1.5
4          2.2
6          2.4
9          3.1
10         3.2
12         3.5
15         3.9
17         4.1
19         4.5
23         4.7
25         5.0
27         5.2
Is there a linear relationship between X and Y?
4. Plot the data to inspect its distribution:
  import pandas as pd
  import seaborn as sb
  from matplotlib import pyplot as plt

  v1 = [1,3,4,6,9,10,12,15,17,19,23,25,27]
  v2 = [1.4,1.5,2.2,2.4,3.1,3.2,3.5,3.9,4.1,4.5,4.7,5,5.2]
  dat = {'Day':v1,'Length':v2}  # Day and Length are the column names
  df = pd.DataFrame(dat)
  # Scatter plot with a fitted regression line
  sb.lmplot(x = "Day", y = "Length", data = df)
  plt.show()
# The shaded band is the 95% confidence interval of the regression line
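As a numeric complement to the plot, the direction and strength of the linear relationship can be summarized with Pearson's correlation coefficient. The following is a minimal sketch in plain Python and is not part of the original recipe; the helper name pearson_r is hypothetical.

```python
import math

v1 = [1, 3, 4, 6, 9, 10, 12, 15, 17, 19, 23, 25, 27]
v2 = [1.4, 1.5, 2.2, 2.4, 3.1, 3.2, 3.5, 3.9, 4.1, 4.5, 4.7, 5, 5.2]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(v1, v2)
print(round(r, 4))
```

A value of r close to +1 indicates a strong positive linear trend, and a value close to -1 a strong negative one. In simple linear regression, r squared equals the R-squared of the fitted line.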
5. Compute the linear regression equation with Python
Method:
  import pandas as pd
  import statsmodels.api as sm

  v1 = [1,3,4,6,9,10,12,15,17,19,23,25,27]
  v2 = [1.4,1.5,2.2,2.4,3.1,3.2,3.5,3.9,4.1,4.5,4.7,5,5.2]
  dat = {'Day':v1,'Length':v2}  # Day and Length are the column names
  df = pd.DataFrame(dat)
  X = df['Day']     # independent variable
  y = df['Length']  # dependent variable
  X2 = sm.add_constant(X)  # add the intercept column (const)
  est = sm.OLS(y, X2)      # ordinary least squares model
  est2 = est.fit()
  print(est2.summary())
Result:
                            OLS Regression Results                           
==============================================================================
Dep. Variable:                 Length   R-squared:                       0.965
Model:                            OLS   Adj. R-squared:                  0.962
Method:                 Least Squares   F-statistic:                     307.2
Date:                Tue, 23 Jul 2019   Prob (F-statistic):           2.19e-09
Time:                        16:25:41   Log-Likelihood:                0.75314
No. Observations:                  13   AIC:                             2.494
Df Residuals:                      11   BIC:                             3.624
Df Model:                           1                                        
Covariance Type:            nonrobust                                        
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.5260      0.129     11.829      0.000       1.242       1.810
Day            0.1454      0.008     17.528      0.000       0.127       0.164
==============================================================================
Omnibus:                        1.649   Durbin-Watson:                   0.884
Prob(Omnibus):                  0.438   Jarque-Bera (JB):                1.191
Skew:                          -0.541   Prob(JB):                        0.551
Kurtosis:                       1.987   Cond. No.                         29.2
==============================================================================
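Reading the coef column, the fitted line is Length = 1.5260 + 0.1454 × Day, and the P>|t| value for Day indicates that the slope differs significantly from zero. As a cross-check, the same coefficients can be reproduced with the closed-form least-squares formulas. This is a minimal sketch in plain Python rather than the method used above; the helper names fit_line and predict are hypothetical.

```python
v1 = [1, 3, 4, 6, 9, 10, 12, 15, 17, 19, 23, 25, 27]
v2 = [1.4, 1.5, 2.2, 2.4, 3.1, 3.2, 3.5, 3.9, 4.1, 4.5, 4.7, 5, 5.2]

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(day, slope, intercept):
    """Predicted wing length (cm) for a given age in days."""
    return intercept + slope * day

slope, intercept = fit_line(v1, v2)

# R-squared = 1 - (residual sum of squares / total sum of squares)
mean_y = sum(v2) / len(v2)
ss_res = sum((b - predict(a, slope, intercept)) ** 2 for a, b in zip(v1, v2))
ss_tot = sum((b - mean_y) ** 2 for b in v2)
r_squared = 1 - ss_res / ss_tot

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R-squared = {r_squared:.3f}")
print(f"predicted wing length at day 20 = {predict(20, slope, intercept):.2f} cm")
```

The printed slope, intercept, and R-squared should agree with the Day, const, and R-squared entries of the summary table. Predictions are best restricted to ages inside the observed range (day 1 to day 27), since the line is not guaranteed to hold beyond it.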
