Recipe 36: Box Plot using Python (使用Python畫盒鬚圖)
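1. When to use: when you first receive data and want to understand some of its basic characteristics (central tendency, spread, and whether there are outliers).
2. Type of analysis: descriptive statistics, data visualization, plotting with Python.
3. Example 1: one variable, one sample (data set):
Step 1: Data. Milu (咪路) surveyed the body length (cm) of mudskippers at the Tamsui River estuary. The data are: 14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6, 13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4.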
dat = [14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6,
       13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4]
dat  # display the data to check that it was entered correctly
Step 2: Import the seaborn package.
import seaborn as sns
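Step 3: Draw the plot.
sns.set(style="whitegrid")
# Pass the data as y so the box is drawn vertically; recent seaborn versions ignore orient="v" when only x is given
ax = sns.boxplot(y=dat, color="skyblue", width=0.2)  # draw the box plot
ax = sns.swarmplot(y=dat, color="red")  # overlay the individual data points
Result:
# The figure shows the individual data points (red dots) drawn on top of the box plot (blue).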
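The features listed under "When to use" (center, spread, outliers) can also be read off numerically. The sketch below is not part of the original recipe. It uses only Python's standard `statistics` module and assumes the usual box-plot convention: quartiles by linear interpolation, and points farther than 1.5 × IQR from the box flagged as outliers, which to my knowledge is also what seaborn draws by default. The variable names are illustrative.

```python
import statistics

# Mudskipper body lengths (cm) from Example 1
dat = [14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6,
       13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4]

# Quartiles by linear interpolation over the sample (method="inclusive")
q1, median, q3 = statistics.quantiles(dat, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range, the height of the box

# Assumed convention: values beyond 1.5 * IQR from the box count as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in dat if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}, IQR = {iqr:.3f}")
print(f"fences = ({lower_fence:.3f}, {upper_fence:.3f}), outliers = {outliers}")
```

With these data the only value outside the fences is 10.7 cm, which is the point a box plot drawn under this convention would place beyond the lower whisker.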
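4. Example 2: one variable, two samples (groups):
Step 1: Data. Milu surveyed the body weight (kg) of first-year senior high school students and first-year university students. The data are:
| Senior high, year 1 | 41 | 35 | 33 | 36 | 40 | 46 | 31 | 37 | 34 | 30 | 38 |
| University, year 1 | 52 | 57 | 62 | 55 | 64 | 57 | 56 | 55 | 60 | 59 | |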
wt = [41, 35, 33, 36, 40, 46, 31, 37, 34, 30, 38, 52, 57, 62, 55, 64, 57, 56, 55, 60, 59]
cl = ["H","H","H","H","H","H","H","H","H","H","H","U","U","U","U","U","U","U","U","U","U"]
dat = {'Weight':wt,'Class':cl}
import pandas as pd  # import the pandas package
df = pd.DataFrame(dat)  # combine the data into a data frame named df
df  # display the data to check that the format is correct
Step 2: Import the seaborn package.
import seaborn as sns
Step 3: Draw the plot.
sns.set(style="whitegrid")
ax = sns.boxplot(x = "Class", y = "Weight", data = df, width=0.2, palette="Set3")
ax = sns.swarmplot(x = "Class", y = "Weight", data = df, color = "red")
Result:
# The figure shows the individual data points (red dots) drawn on top of the box plots (blue and yellow).
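Typing the group labels in `cl` by hand is easy to get wrong when the groups have different sizes (11 and 10 here). A minimal sketch, assuming the two groups are first kept as separate lists (`high` and `univ` are hypothetical names, not from the original), builds the same long-format columns:

```python
# Body weights (kg) from Example 2, one list per group
high = [41, 35, 33, 36, 40, 46, 31, 37, 34, 30, 38]  # senior high, year 1
univ = [52, 57, 62, 55, 64, 57, 56, 55, 60, 59]      # university, year 1

# Concatenate the values and repeat each label once per observation
wt = high + univ
cl = ["H"] * len(high) + ["U"] * len(univ)
dat = {"Weight": wt, "Class": cl}

print(len(wt), cl.count("H"), cl.count("U"))
```

The resulting `dat` dictionary can be passed to `pd.DataFrame` exactly as in Step 1.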
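5. Example 3: one variable, several samples (groups):
Step 1: Data. Milu surveyed the body weight (g) of broiler chickens fed different feeds. The data are:
| Feed 1 | Feed 2 | Feed 3 | Feed 4 |
|--------|--------|--------|--------|
| 61.8 | 78.8 | 70.5 | 60.3 |
| 65.1 | 79.5 | 72.6 | 63.8 |
| 61.7 | 76.0 | 71.7 | 64.1 |
| 63.3 | 73.4 | 72.0 | 61.4 |
| | 77.3 | 71.1 | 60.9 |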
wt = [61.8,65.1,61.7,63.3,78.8,79.5,76.0,73.4,77.3,70.5,72.6,71.7,72.0,71.1,60.3,63.8,64.1,61.4,60.9]
cl = ["F1","F1","F1","F1","F2","F2","F2","F2","F2","F3","F3","F3","F3","F3","F4","F4","F4","F4","F4"]
dat = {'Weight':wt,'Feed':cl}
import pandas as pd  # import the pandas package
df = pd.DataFrame(dat)  # combine the data into a data frame named df
df  # display the data to check that the format is correct
Step 2: Import the seaborn package.
import seaborn as sns
Step 3: Draw the plot.
sns.set(style="whitegrid")
ax = sns.boxplot(x = "Feed", y = "Weight", data = df, width=0.2, palette="Set3")
ax = sns.swarmplot(x = "Feed", y = "Weight", data = df, color = "red")
Result:
# The figure shows the individual data points (red dots) drawn on top of the box plots.
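To read the numbers behind each box, the same data frame can be summarized per group. This is a small sketch rather than part of the original recipe. It assumes pandas is installed and uses `groupby` with `describe`, whose 25%, 50% and 75% columns correspond to the lower box edge, the median line and the upper box edge.

```python
import pandas as pd

# Broiler body weights (g) from Example 3 in long format
wt = [61.8, 65.1, 61.7, 63.3, 78.8, 79.5, 76.0, 73.4, 77.3,
      70.5, 72.6, 71.7, 72.0, 71.1, 60.3, 63.8, 64.1, 61.4, 60.9]
cl = ["F1"] * 4 + ["F2"] * 5 + ["F3"] * 5 + ["F4"] * 5
df = pd.DataFrame({"Weight": wt, "Feed": cl})

# One row per feed: sample size, extremes and quartiles
summary = df.groupby("Feed")["Weight"].describe()
print(summary[["count", "min", "25%", "50%", "75%", "max"]])
```

The `count` column also makes it obvious that Feed 1 has only four observations while the other feeds have five.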
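6. Example 4: data with two variables (factors):
Step 1: Data. Milu surveyed the potassium ion concentration in human plasma (mg/100 ml). The data are:
| No hormone, female | No hormone, male | Hormone, female | Hormone, male |
|------|------|------|------|
| 16.3 | 15.3 | 38.1 | 34.0 |
| 20.4 | 17.4 | 26.2 | 22.8 |
| 12.4 | 10.9 | 32.3 | 27.8 |
| 15.8 | 10.3 | 35.8 | 25.0 |
| 9.5 | 6.7 | 30.2 | 29.3 |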
co = [16.3,20.4,12.4,15.8,9.5,15.3,17.4,10.9,10.3,6.7,38.1,26.2,32.3,35.8,30.2,34,22.8,27.8,25,29.3]
ho = ["N","N","N","N","N","N","N","N","N","N","H","H","H","H","H","H","H","H","H","H"]  # N = no hormone, H = hormone injected
se = ["F","F","F","F","F","M","M","M","M","M","F","F","F","F","F","M","M","M","M","M"]  # F = female, M = male
dat = {'Conc':co,'Hor':ho, 'Sex':se}
import pandas as pd  # import the pandas package
df = pd.DataFrame(dat)  # combine the data into a data frame named df
df  # display the data to check that the format is correct
Step 2: Import the seaborn package.
import seaborn as sns
Step 3: Draw the plot.
sns.set(style="whitegrid")
ax = sns.boxplot(x = "Hor", y = "Conc", hue="Sex", data = df, palette = "Set3")
ax = sns.swarmplot(x = "Hor", y = "Conc", hue="Sex", data = df, dodge = True, palette = "Set1")
Result:
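With two grouping variables, each box corresponds to one combination of hormone treatment and sex. A sketch using only the standard library (the name `cells` is illustrative) groups the values by that combination and reports each median, which is the center line of the corresponding box:

```python
from collections import defaultdict
from statistics import median

# Plasma concentrations (mg/100 ml) from Example 4 in long format
co = [16.3, 20.4, 12.4, 15.8, 9.5, 15.3, 17.4, 10.9, 10.3, 6.7,
      38.1, 26.2, 32.3, 35.8, 30.2, 34, 22.8, 27.8, 25, 29.3]
ho = ["N"] * 10 + ["H"] * 10          # N = no hormone, H = hormone injected
se = (["F"] * 5 + ["M"] * 5) * 2      # F = female, M = male

# Collect the observations of each (hormone, sex) cell
cells = defaultdict(list)
for conc, hor, sex in zip(co, ho, se):
    cells[(hor, sex)].append(conc)

medians = {key: median(values) for key, values in cells.items()}
for key, value in medians.items():
    print(key, value)
```

Comparing the four medians gives a quick numerical counterpart to what the grouped box plot shows visually.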
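7. Example 5: data with three variables (factors):
Step 1: Data. Milu studied the effect of temperature on the growth of guppies (Poecilia reticulata), cardinal tetras (Paracheirodon axelrodi) and zebrafish (Danio rerio). The change in body length (cm) of the three species when reared at 20°C, 24°C and 28°C is as follows:
Guppy (Poecilia reticulata)
| 20°C male | 20°C female | 24°C male | 24°C female | 28°C male | 28°C female |
|------|------|------|------|------|------|
| 0.9 | 0.8 | 1.15 | 1.2 | 1.45 | 1.5 |
| 0.8 | 0.7 | 1.05 | 1.35 | 1.4 | 1.55 |
| 0.6 | 0.4 | 1.0 | 1.2 | 1.7 | 1.5 |
| 0.4 | 0.5 | 1.0 | 1.3 | 1.6 | 1.35 |
Zebrafish (Danio rerio)
| 20°C male | 20°C female | 24°C male | 24°C female | 28°C male | 28°C female |
|------|------|------|------|------|------|
| 1.05 | 1.15 | 1.2 | 1.0 | 1.8 | 1.55 |
| 1.0 | 1.0 | 1.3 | 1.15 | 1.55 | 1.5 |
| 0.9 | 0.95 | 1.35 | 1.05 | 1.7 | 1.4 |
| 1.1 | 0.85 | 1.15 | 1.2 | 1.6 | 1.6 |
Cardinal tetra (Paracheirodon axelrodi)
| 20°C male | 20°C female | 24°C male | 24°C female | 28°C male | 28°C female |
|------|------|------|------|------|------|
| 0.55 | 0.7 | 1.0 | 1.2 | 1.45 | 1.6 |
| 0.6 | 0.5 | 1.05 | 1.3 | 1.4 | 1.45 |
| 0.5 | 0.65 | 0.95 | 1.15 | 1.5 | 1.4 |
| 0.7 | 0.6 | 1.1 | 1.1 | 1.55 | 1.45 |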
# One row of each list per species, in table order: F1 = guppy, F2 = zebrafish, F3 = cardinal tetra
fi = ['F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1',
      'F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2',
      'F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3']
te = ['t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28',
      't20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28',
      't20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28']
se = ['M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F',
      'M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F',
      'M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F']
gr = [0.9,0.8,0.6,0.4,0.8,0.7,0.4,0.5,1.15,1.05,1,1,1.2,1.35,1.2,1.3,1.45,1.4,1.7,1.6,1.5,1.55,1.5,1.35,
      1.05,1,0.9,1.1,1.15,1,0.95,0.85,1.2,1.3,1.35,1.15,1,1.15,1.05,1.2,1.8,1.55,1.7,1.6,1.55,1.5,1.4,1.6,
      0.55,0.6,0.5,0.7,0.7,0.5,0.65,0.6,1,1.05,0.95,1.1,1.2,1.3,1.15,1.1,1.45,1.4,1.5,1.55,1.6,1.45,1.4,1.45]
dat = {'Fish':fi,'Temp':te,'Sex':se,'Growth':gr}
import pandas as pd  # import the pandas package
df = pd.DataFrame(dat)  # combine the data into a data frame named df
df  # display the data to check that the format is correct
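Because the three label columns are long, a quick check that every fish × temperature × sex combination has the same number of observations can catch typing mistakes before plotting. The sketch below is an illustration, not the author's code. It rebuilds the label columns by list repetition and counts each combination with `collections.Counter`.

```python
from collections import Counter

# Label columns for Example 5: 3 species x 3 temperatures x 2 sexes x 4 fish
fi = ["F1"] * 24 + ["F2"] * 24 + ["F3"] * 24
te = (["t20"] * 8 + ["t24"] * 8 + ["t28"] * 8) * 3
se = (["M"] * 4 + ["F"] * 4) * 9

# Count how many rows fall into each (fish, temperature, sex) cell
counts = Counter(zip(fi, te, se))
print(len(counts), set(counts.values()))
```

A balanced design shows up as 18 cells that all share a single count (here 4). Any other pattern points to a mislabeled row.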
Step 2: Import the seaborn package.
import seaborn as sns
Step 3: Draw the box plots, one panel per fish species (three panels side by side).
sns.catplot(x = "Temp", y = "Growth", hue = "Sex", col = "Fish", data = df, kind = "box", height = 4, aspect = .7)
Result: