Three-Way ANOVA (Three-Factor Analysis of Variance)


Recipe 18: Three-Way ANOVA

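Three-way ANOVA applies to data with three influencing factors, that is, three independent variables. The test evaluates seven pairs of null (H0) and alternative (HA) hypotheses, where μ denotes a group mean and m, n, and o are the numbers of levels of factors 1, 2, and 3:
H0: μ(factor 1, level 1) = μ(factor 1, level 2) = … = μ(factor 1, level m). HA: at least one of these means differs.
H0: μ(factor 2, level 1) = μ(factor 2, level 2) = … = μ(factor 2, level n). HA: at least one of these means differs.
H0: μ(factor 3, level 1) = μ(factor 3, level 2) = … = μ(factor 3, level o). HA: at least one of these means differs.
H0: factor 1 and factor 2 do not interact. HA: factor 1 and factor 2 interact.
H0: factor 1 and factor 3 do not interact. HA: factor 1 and factor 3 interact.
H0: factor 2 and factor 3 do not interact. HA: factor 2 and factor 3 interact.
H0: factor 1, factor 2, and factor 3 have no three-way interaction. HA: factor 1, factor 2, and factor 3 have a three-way interaction.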
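
The count of seven follows from taking every non-empty subset of the three factors: three main effects, three two-way interactions, and one three-way interaction. A minimal sketch that enumerates them is shown here; the factor names A, B, and C are placeholders chosen for illustration, and the ":" separator mirrors how interaction terms are written in statsmodels formula output.

```python
from itertools import combinations

# Placeholder factor names (assumption for illustration only).
factors = ["A", "B", "C"]

# Every non-empty subset of the factors is one term to test:
# size 1 = main effect, size 2 = two-way interaction, size 3 = three-way.
terms = [
    ":".join(combo)
    for size in range(1, len(factors) + 1)
    for combo in combinations(factors, size)
]

print(len(terms))  # 2**3 - 1 = 7
print(terms)       # ['A', 'B', 'C', 'A:B', 'A:C', 'B:C', 'A:B:C']
```

Each term in the list corresponds to one H0/HA pair above.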

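1. When to use: to test whether three different factors affect the sample mean, and whether the factors interact with one another.
2. Type of analysis: parametric. Methods that compute the statistic directly from the data values are called parametric; methods that first sort the data and compute the statistic from the ranks are called non-parametric.
3. Assumptions: a parametric analysis requires the data to follow a normal distribution. ANOVA additionally requires the groups being compared to have equal variances.
4. Example data: Milu (咪路) studies the effect of temperature on the growth of guppies (Poecilia reticulata), cardinal tetras (Paracheirodon axelrodi), and zebrafish (Danio rerio). The fish were reared at 20°C, 24°C, and 28°C, and four males and four females were measured for each species at each temperature. The change in body length (cm) of the three species is as follows: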
Guppy (Poecilia reticulata)
20°C          24°C          28°C
Male  Female  Male  Female  Male  Female
0.9   0.8     1.15  1.2     1.45  1.5
0.8   0.7     1.05  1.35    1.4   1.55
0.6   0.4     1.0   1.2     1.7   1.5
0.4   0.5     1.0   1.3     1.6   1.35

Zebrafish (Danio rerio)
20°C          24°C          28°C
Male  Female  Male  Female  Male  Female
1.05  1.15    1.2   1.0     1.8   1.55
1.0   1.0     1.3   1.15    1.55  1.5
0.9   0.95    1.35  1.05    1.7   1.4
1.1   0.85    1.15  1.2     1.6   1.6

Cardinal tetra (Paracheirodon axelrodi)
20°C          24°C          28°C
Male  Female  Male  Female  Male  Female
0.55  0.7     1.0   1.2     1.45  1.6
0.6   0.5     1.05  1.3     1.4   1.45
0.5   0.65    0.95  1.15    1.5   1.4
0.7   0.6     1.1   1.1     1.55  1.45

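Hypotheses for this example:
H0: the three fish species do not differ. HA: the three fish species differ.
H0: temperature has no effect. HA: temperature has an effect.
H0: sex has no effect. HA: sex has an effect.
H0: fish species and sex do not interact. HA: fish species and sex interact.
H0: fish species and temperature do not interact. HA: fish species and temperature interact.
H0: sex and temperature do not interact. HA: sex and temperature interact.
H0: the three factors have no three-way interaction. HA: the three factors have a three-way interaction.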
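
Before plotting or testing, it can help to put the tables into long format (one row per fish measured) and confirm that the design is balanced. The following is a sketch under stated assumptions: the codes F1, F2, F3 stand for guppy, zebrafish, and cardinal tetra in the order of the tables above, and the helper names `growth`, `rows`, `cell_sizes`, and `cell_means` are introduced here for illustration only.

```python
from itertools import product

import pandas as pd

# Growth (cm) per fish code; within each list the order is
# t20 M, t20 F, t24 M, t24 F, t28 M, t28 F, four values per cell.
growth = {
    "F1": [0.9, 0.8, 0.6, 0.4, 0.8, 0.7, 0.4, 0.5,
           1.15, 1.05, 1.0, 1.0, 1.2, 1.35, 1.2, 1.3,
           1.45, 1.4, 1.7, 1.6, 1.5, 1.55, 1.5, 1.35],
    "F2": [1.05, 1.0, 0.9, 1.1, 1.15, 1.0, 0.95, 0.85,
           1.2, 1.3, 1.35, 1.15, 1.0, 1.15, 1.05, 1.2,
           1.8, 1.55, 1.7, 1.6, 1.55, 1.5, 1.4, 1.6],
    "F3": [0.55, 0.6, 0.5, 0.7, 0.7, 0.5, 0.65, 0.6,
           1.0, 1.05, 0.95, 1.1, 1.2, 1.3, 1.15, 1.1,
           1.45, 1.4, 1.5, 1.55, 1.6, 1.45, 1.4, 1.45],
}

rows = []
for fish, values in growth.items():
    # Temperature varies slowest, sex fastest, matching the list order.
    cells = product(["t20", "t24", "t28"], ["M", "F"])
    for i, (temp, sex) in enumerate(cells):
        for g in values[4 * i: 4 * i + 4]:
            rows.append({"Fish": fish, "Temp": temp, "Sex": sex, "Growth": g})

df = pd.DataFrame(rows)

# A balanced design has the same number of observations in every cell.
cell_sizes = df.groupby(["Fish", "Temp", "Sex"]).size()
cell_means = df.groupby(["Fish", "Temp", "Sex"])["Growth"].mean()

print(len(df), "rows in", len(cell_sizes), "cells")
print(cell_means.unstack("Sex"))
```

With 3 species × 3 temperatures × 2 sexes there are 18 cells of 4 fish each, 72 observations in total.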

5. Plot the data to inspect the distributions:
fi = ['F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3']
te = ['t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28']
se = ['M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F']
gr = [0.9,0.8,0.6,0.4,0.8,0.7,0.4,0.5,1.15,1.05,1,1,1.2,1.35,1.2,1.3,1.45,1.4,1.7,1.6,1.5,1.55,1.5,1.35,1.05,1,0.9,1.1,1.15,1,0.95,0.85,1.2,1.3,1.35,1.15,1,1.15,1.05,1.2,1.8,1.55,1.7,1.6,1.55,1.5,1.4,1.6,0.55,0.6,0.5,0.7,0.7,0.5,0.65,0.6,1,1.05,0.95,1.1,1.2,1.3,1.15,1.1,1.45,1.4,1.5,1.55,1.6,1.45,1.4,1.45]
dat = {'Fish':fi,'Temp':te,'Sex':se,'Growth':gr}
import pandas as pd
df = pd.DataFrame(dat)
import seaborn as sns
sns.catplot(x = "Temp", y = "Growth", hue = "Sex", col = "Fish", data = df, kind = "box", height = 4, aspect = .7)
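Result: a box plot of Growth against Temp, colored by Sex, with one panel per Fish (figure not reproduced here).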

6. Check whether the data are normally distributed (H0: the data follow a normal distribution):
dat1 = [0.9,0.8,0.6,0.4]
dat2 = [0.8,0.7,0.4,0.5]
dat3 = [1.15,1.05,1,1]
dat4 = [1.2,1.35,1.2,1.3]
dat5 = [1.45,1.4,1.7,1.6]
dat6 = [1.5,1.55,1.5,1.35]
dat7 = [1.05,1,0.9,1.1]
dat8 = [1.15,1,0.95,0.85]
dat9 = [1.2,1.3,1.35,1.15]
dat10 = [1,1.15,1.05,1.2]
dat11 = [1.8,1.55,1.7,1.6]
dat12 = [1.55,1.5,1.4,1.6]
dat13 = [0.55,0.6,0.5,0.7]
dat14 = [0.7,0.5,0.65,0.6]
dat15 = [1,1.05,0.95,1.1]
dat16 = [1.2,1.3,1.15,1.1]
dat17 = [1.45,1.4,1.5,1.55]
dat18 = [1.6,1.45,1.4,1.45]
import scipy.stats
scipy.stats.shapiro(dat1)
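Result: (0.9630723595619202, 0.798227071762085)
p = 0.798 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat2)
Result: (0.9497060179710388, 0.7142806649208069)
p = 0.714 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat3)
Result: (0.8274267315864563, 0.16119077801704407)
p = 0.161 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat4)
Result: (0.849402666091919, 0.22423146665096283)
p = 0.224 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat5)
Result: (0.9392699599266052, 0.6498799920082092)
p = 0.650 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat6)
Result: (0.839701235294342, 0.1945330798625946)
p = 0.195 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat7)
Result: (0.9713737964630127, 0.8499714732170105)
p = 0.850 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat8)
Result: (0.9815163612365723, 0.9108564257621765)
p = 0.911 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat9)
Result: (0.9497063159942627, 0.7142823338508606)
p = 0.714 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat10)
Result: (0.9497058987617493, 0.7142797708511353)
p = 0.714 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat11)
Result: (0.9630723595619202, 0.7982271313667297)
p = 0.798 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat12)
Result: (0.9713736176490784, 0.8499705791473389)
p = 0.850 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat13)
Result: (0.9713736176490784, 0.8499705791473389)
p = 0.850 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat14)
Result: (0.9713737964630127, 0.8499714732170105)
p = 0.850 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat15)
Result: (0.9929120540618896, 0.97187739610672)
p = 0.972 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat16)
Result: (0.9713737964630127, 0.8499715924263)
p = 0.850 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat17)
Result: (0.992912232875824, 0.971878170967102)
p = 0.972 > 0.05: do not reject H0; the data are consistent with a normal distribution.
scipy.stats.shapiro(dat18)
Result: (0.8397020101547241, 0.19453544914722443)
p = 0.195 > 0.05: do not reject H0; the data are consistent with a normal distribution.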
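
The 18 separate calls above can be collapsed into a single loop. The sketch below assumes the same 18 groups in the same order as dat1 to dat18; the names `cells` and `p_values` are introduced here for illustration. Note that with only four observations per group, a normality test has little power to detect departures from normality, so a large p-value is weak evidence at best.

```python
import scipy.stats

# The 18 Fish x Temp x Sex groups, in the same order as dat1..dat18.
cells = [
    [0.9, 0.8, 0.6, 0.4], [0.8, 0.7, 0.4, 0.5], [1.15, 1.05, 1, 1],
    [1.2, 1.35, 1.2, 1.3], [1.45, 1.4, 1.7, 1.6], [1.5, 1.55, 1.5, 1.35],
    [1.05, 1, 0.9, 1.1], [1.15, 1, 0.95, 0.85], [1.2, 1.3, 1.35, 1.15],
    [1, 1.15, 1.05, 1.2], [1.8, 1.55, 1.7, 1.6], [1.55, 1.5, 1.4, 1.6],
    [0.55, 0.6, 0.5, 0.7], [0.7, 0.5, 0.65, 0.6], [1, 1.05, 0.95, 1.1],
    [1.2, 1.3, 1.15, 1.1], [1.45, 1.4, 1.5, 1.55], [1.6, 1.45, 1.4, 1.45],
]

alpha = 0.05
p_values = []
for number, cell in enumerate(cells, start=1):
    # shapiro returns (W statistic, p-value); tuple unpacking works
    # whether SciPy returns a plain tuple or a named result object.
    w, p = scipy.stats.shapiro(cell)
    p_values.append(p)
    verdict = "do not reject H0" if p > alpha else "reject H0"
    print(f"dat{number}: W = {w:.4f}, p = {p:.3f}, {verdict}")

# True when no group gives evidence against normality at this alpha.
print(all(p > alpha for p in p_values))
```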

7. Check whether the groups have equal variances (H0: σ₁² = σ₂² = … = σ₁₈²):
Method: Levene's test for equal variances (parametric test)
dat1 = [0.9,0.8,0.6,0.4]
dat2 = [0.8,0.7,0.4,0.5]
dat3 = [1.15,1.05,1,1]
dat4 = [1.2,1.35,1.2,1.3]
dat5 = [1.45,1.4,1.7,1.6]
dat6 = [1.5,1.55,1.5,1.35]
dat7 = [1.05,1,0.9,1.1]
dat8 = [1.15,1,0.95,0.85]
dat9 = [1.2,1.3,1.35,1.15]
dat10 = [1,1.15,1.05,1.2]
dat11 = [1.8,1.55,1.7,1.6]
dat12 = [1.55,1.5,1.4,1.6]
dat13 = [0.55,0.6,0.5,0.7]
dat14 = [0.7,0.5,0.65,0.6]
dat15 = [1,1.05,0.95,1.1]
dat16 = [1.2,1.3,1.15,1.1]
dat17 = [1.45,1.4,1.5,1.55]
dat18 = [1.6,1.45,1.4,1.45]
import scipy.stats
scipy.stats.levene(dat1, dat2, dat3, dat4, dat5, dat6, dat7, dat8, dat9, dat10, dat11, dat12, dat13, dat14, dat15, dat16, dat17, dat18, center = 'mean')
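Result: LeveneResult(statistic=2.058529411764705, pvalue=0.023130449958502195)
p = 0.023 < 0.05: reject H0: σ₁² = σ₂² = … = σ₁₈²; the 18 groups do not all share the same variance. The equal-variance assumption listed in item 3 is therefore not met for these data, so the ANOVA p-values in item 8 should be read with caution.
# Equal variances are consistent with samples drawn from the same population; unequal variances indicate samples drawn from different populations.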
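
To see where the unequal variances come from, the per-group standard deviations can be listed next to the Levene result. This is a sketch rather than part of the original recipe: it reuses the same 18 groups, passes them to `scipy.stats.levene` by unpacking a list, and uses the illustrative names `cells` and `sds`.

```python
import statistics

import scipy.stats

# The 18 Fish x Temp x Sex groups, in the same order as dat1..dat18.
cells = [
    [0.9, 0.8, 0.6, 0.4], [0.8, 0.7, 0.4, 0.5], [1.15, 1.05, 1, 1],
    [1.2, 1.35, 1.2, 1.3], [1.45, 1.4, 1.7, 1.6], [1.5, 1.55, 1.5, 1.35],
    [1.05, 1, 0.9, 1.1], [1.15, 1, 0.95, 0.85], [1.2, 1.3, 1.35, 1.15],
    [1, 1.15, 1.05, 1.2], [1.8, 1.55, 1.7, 1.6], [1.55, 1.5, 1.4, 1.6],
    [0.55, 0.6, 0.5, 0.7], [0.7, 0.5, 0.65, 0.6], [1, 1.05, 0.95, 1.1],
    [1.2, 1.3, 1.15, 1.1], [1.45, 1.4, 1.5, 1.55], [1.6, 1.45, 1.4, 1.45],
]

# Same test as above: Levene's test centered on the group means.
stat, p = scipy.stats.levene(*cells, center="mean")
print(f"Levene: statistic = {stat:.4f}, p = {p:.4f}")

# Sample standard deviation (n - 1 denominator) of each group.
sds = [statistics.stdev(cell) for cell in cells]
for number, sd in enumerate(sds, start=1):
    print(f"dat{number}: sd = {sd:.4f}")

# Ratio of the largest to the smallest group standard deviation.
print(f"max/min sd ratio = {max(sds) / min(sds):.2f}")
```

The group with the widest spread is dat1 (guppy, 20°C, male), whose standard deviation is more than three times that of the tightest groups.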

8. Compute the three-way ANOVA in Python:

Method: statsmodels
fi = ['F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3']
te = ['t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28']
se = ['M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F']
gr = [0.9,0.8,0.6,0.4,0.8,0.7,0.4,0.5,1.15,1.05,1,1,1.2,1.35,1.2,1.3,1.45,1.4,1.7,1.6,1.5,1.55,1.5,1.35,1.05,1,0.9,1.1,1.15,1,0.95,0.85,1.2,1.3,1.35,1.15,1,1.15,1.05,1.2,1.8,1.55,1.7,1.6,1.55,1.5,1.4,1.6,0.55,0.6,0.5,0.7,0.7,0.5,0.65,0.6,1,1.05,0.95,1.1,1.2,1.3,1.15,1.1,1.45,1.4,1.5,1.55,1.6,1.45,1.4,1.45]
dat = {'Fish':fi,'Temp':te,'Sex':se,'Grow':gr}
import pandas as pd
df = pd.DataFrame(dat)
import statsmodels.api as sm
from statsmodels.formula.api import ols
mod = ols('Grow ~ Fish*Sex*Temp', data = df).fit()
sm.stats.anova_lm(mod, typ = 2)
Result:
                 sum_sq    df           F        PR(>F)
Fish           0.502986   2.0   20.913378  1.881431e-07  # p = 1.881e-7 < 0.05: Fish has an effect
Sex            0.000868   1.0    0.072185  7.892057e-01  # p = 0.7892 > 0.05: no evidence that Sex has an effect
Temp           7.248403   2.0  301.377286  5.066963e-30  # p = 5.067e-30 < 0.05: Temp has an effect
Fish:Sex       0.096736   2.0    4.022137  2.353345e-02  # p = 0.02353 < 0.05: Fish and Sex interact
Fish:Temp      0.352014   4.0    7.318094  8.748435e-05  # p = 8.748e-5 < 0.05: Fish and Temp interact
Sex:Temp       0.066736   2.0    2.774783  7.127039e-02  # p = 0.0713 > 0.05: no evidence that Sex and Temp interact
Fish:Sex:Temp  0.090347   4.0    1.878248  1.275738e-01  # p = 0.1276 > 0.05: no evidence of a three-way interaction
Residual       0.649375  54.0         NaN           NaN
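
Each F value in the table is the term's mean square (sum_sq / df) divided by the residual mean square, and PR(>F) is the upper-tail probability of the F distribution with the term's and the residual degrees of freedom. The sketch below recomputes both from the printed sums of squares; because those inputs are rounded to six decimals, the results agree with the table only up to rounding. The names `table`, `f_ratios`, and `p_values` are introduced here for illustration.

```python
from scipy.stats import f as f_dist

# (sum_sq, df) for each term, copied from the ANOVA table above.
table = {
    "Fish": (0.502986, 2),
    "Sex": (0.000868, 1),
    "Temp": (7.248403, 2),
    "Fish:Sex": (0.096736, 2),
    "Fish:Temp": (0.352014, 4),
    "Sex:Temp": (0.066736, 2),
    "Fish:Sex:Temp": (0.090347, 4),
}
ss_resid, df_resid = 0.649375, 54
ms_resid = ss_resid / df_resid  # residual mean square

f_ratios = {}
p_values = {}
for term, (ss, df_term) in table.items():
    # F = (term mean square) / (residual mean square)
    f_ratios[term] = (ss / df_term) / ms_resid
    # p = P(F_{df_term, df_resid} > observed F)
    p_values[term] = f_dist.sf(f_ratios[term], df_term, df_resid)
    print(f"{term:<14} F = {f_ratios[term]:10.4f}  p = {p_values[term]:.4g}")

# Terms whose p-value falls below the 0.05 threshold used in this recipe.
significant = [term for term, p in p_values.items() if p < 0.05]
print(significant)
```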

