Drawing Swarm Plots with Python

Recipe 38: Drawing Swarm Plots with Python

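1. When to use: when you first obtain a dataset and want to examine its basic features (central tendency, spread, and whether there are outliers). A swarm plot differs from a strip plot in that points with the same value are spread apart instead of overlapping, so it is easier to see which values occur repeatedly.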
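The difference from a strip plot is easiest to see by drawing both from the same values. The following is a minimal sketch, assuming seaborn and matplotlib are installed. The list `values` is made-up illustration data (it is not from this post), and the `Agg` backend is chosen only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the figure on screen
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical data with repeated values, for illustration only
values = [3, 3, 3, 4, 4, 5, 5, 5, 5, 6]

sns.set(style="whitegrid")
fig, axes = plt.subplots(1, 2, figsize=(6, 4), sharey=True)

# Strip plot without jitter: equal values are drawn on top of each other
sns.stripplot(y=values, jitter=False, color="gray", ax=axes[0])
axes[0].set_title("Strip plot")

# Swarm plot: equal values are shifted sideways so every point stays visible
sns.swarmplot(y=values, color="red", ax=axes[1])
axes[1].set_title("Swarm plot")

fig.canvas.draw()  # render the figure; fig.savefig("strip_vs_swarm.png") would write it to a file
```

In the strip plot the four points at 5 collapse into one marker, while the swarm plot shows all four next to each other.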

2. Type of analysis: descriptive statistics, data visualization, plotting with Python.

3. Example 1: one variable, one sample (data):
Step 1: Data. Milu (咪路) measured the body length (cm) of mudskippers at the Tamsui River estuary: 14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6, 13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4
  dat = [14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6, 13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4]
  dat   # display the data to check that its format is correct
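Step 2: Load the seaborn package.
  import seaborn as sns
Step 3: Draw the plot.
  sns.set(style="whitegrid")
  ax = sns.swarmplot(y = dat, color = "red")   # pass the data as y for a vertical plot (recent seaborn ignores orient = "v" when only x is given)
Result: [figure: swarm plot of mudskipper body length]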
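As a numeric companion to the plot, the features named under "When to use" (center, spread, outliers) can also be computed directly. This is a minimal sketch using only the standard library. The 1.5 x IQR rule for flagging possible outliers is one common convention and is an assumption here, not something the original recipe prescribes.

```python
import statistics

# Mudskipper body lengths (cm) from Example 1
dat = [14.3, 15.8, 14.6, 16.1, 12.9, 15.1, 17.3, 14.0, 14.5, 13.9, 16.2, 14.3, 14.6,
       13.3, 15.5, 11.8, 14.8, 13.5, 16.3, 15.4, 15.5, 13.9, 10.7, 14.8, 12.9, 15.4]

mean = statistics.mean(dat)      # center
median = statistics.median(dat)  # center, less sensitive to extreme values
sd = statistics.stdev(dat)       # spread (sample standard deviation)

# Quartiles and the 1.5 x IQR fences (assumed convention for "possible outlier")
q1, q2, q3 = statistics.quantiles(dat, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in dat if v < low_fence or v > high_fence]

print(f"n={len(dat)}, mean={mean:.2f}, median={median:.2f}, sd={sd:.2f}")
print(f"Q1={q1:.2f}, Q3={q3:.2f}, possible outliers: {outliers}")
```

Points flagged this way are the ones that sit visibly apart from the main cluster in the swarm plot.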
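4. Example 2: one variable, two samples (data):
Step 1: Data. Milu surveyed the body weight (kg) of first-year senior high school students and first-year university students:
Senior high, year 1 (H):  41, 35, 33, 36, 40, 46, 31, 37, 34, 30, 38
University, year 1 (U):   52, 57, 62, 55, 64, 57, 56, 55, 60, 59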

  wt = [41, 35, 33, 36, 40, 46, 31, 37, 34, 30, 38, 52, 57, 62, 55, 64, 57, 56, 55, 60, 59]
  cl = ["H","H","H","H","H","H","H","H","H","H","H","U","U","U","U","U","U","U","U","U","U"]
  dat = {'Weight':wt,'Class':cl}
  import pandas as pd   # load the pandas package
  df = pd.DataFrame(dat)   # combine the data into a data frame named df
  df   # display the data to check that its format is correct
Step 2: Load the seaborn package.
  import seaborn as sns
Step 3: Draw the plot.
  sns.set(style="whitegrid")
  ax = sns.swarmplot(x = "Class", y = "Weight", data = df)
Result: [figure: swarm plot of body weight by class]
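A swarm plot is often layered on top of a box plot so that the summary and the individual points appear in one figure. The following sketch is not part of the original recipe. It assumes seaborn, pandas, and matplotlib are installed, rebuilds the Example 2 data so that it runs on its own, and uses the `Agg` backend only to avoid needing a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the figure on screen
import pandas as pd
import seaborn as sns

# Body weight (kg) from Example 2: 11 senior high (H) and 10 university (U) students
wt = [41, 35, 33, 36, 40, 46, 31, 37, 34, 30, 38, 52, 57, 62, 55, 64, 57, 56, 55, 60, 59]
cl = ["H"] * 11 + ["U"] * 10
df = pd.DataFrame({"Weight": wt, "Class": cl})

sns.set(style="whitegrid")
# Draw the boxes first, in a neutral color, so they sit behind the points
ax = sns.boxplot(x="Class", y="Weight", data=df, color="white")
# Add the individual data points on the same axes
sns.swarmplot(x="Class", y="Weight", data=df, color="red", ax=ax)
ax.figure.canvas.draw()  # render; ax.figure.savefig("box_swarm.png") would write a file
```

Passing `ax=ax` to the second call is what places both layers on the same axes.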
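5. Example 3: one variable, several samples (data):
Step 1: Data. Milu surveyed the body weight (g) of broiler chickens fed different feeds:
Feed 1   Feed 2   Feed 3   Feed 4
61.8     78.8     70.5     60.3
65.1     79.5     72.6     63.8
61.7     76.0     71.7     64.1
63.3     73.4     72.0     61.4
         77.3     71.1     60.9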

   wt = [61.8,65.1,61.7,63.3,78.8,79.5,76.0,73.4,77.3,70.5,72.6,71.7,72.0,71.1,60.3,63.8,64.1,61.4,60.9]
   cl = ["F1","F1","F1","F1","F2","F2","F2","F2","F2","F3","F3","F3","F3","F3","F4","F4","F4","F4","F4"]
   dat = {'Weight':wt,'Feed':cl}
   import pandas as pd  # load the pandas package
   df = pd.DataFrame(dat)  # combine the data into a data frame named df
   df   # display the data to check that its format is correct
Step 2: Load the seaborn package.
   import seaborn as sns
Step 3: Draw the plot.
   sns.set(style="whitegrid")
   ax = sns.swarmplot(x = "Feed", y = "Weight", data = df)
Result: [figure: swarm plot of broiler weight by feed]
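With several groups, a per-group summary table is a useful cross-check on what the swarm plot shows. This is a minimal sketch, assuming pandas is installed. It rebuilds the Example 3 data so that it runs on its own, and the choice of summary statistics is illustrative.

```python
import pandas as pd

# Broiler weights (g) from Example 3, in feed order F1..F4
wt = [61.8, 65.1, 61.7, 63.3, 78.8, 79.5, 76.0, 73.4, 77.3,
      70.5, 72.6, 71.7, 72.0, 71.1, 60.3, 63.8, 64.1, 61.4, 60.9]
cl = ["F1"] * 4 + ["F2"] * 5 + ["F3"] * 5 + ["F4"] * 5
df = pd.DataFrame({"Weight": wt, "Feed": cl})

# One row per feed: sample size, center, spread, and range
summary = df.groupby("Feed")["Weight"].agg(["count", "mean", "std", "min", "max"])
print(summary.round(2))
```

The `count` column also confirms that Feed 1 has four observations while the other feeds have five, matching the data table.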
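6. Example 4: multi-variable samples (data):
Step 1: Data. Milu surveyed the potassium ion concentration in human plasma (mg/100 ml):
No hormone injected (N)    Hormone injected (H)
Female (F)   Male (M)      Female (F)   Male (M)
16.3         15.3          38.1         34.0
20.4         17.4          26.2         22.8
12.4         10.9          32.3         27.8
15.8         10.3          35.8         25.0
9.5          6.7           30.2         29.3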

   co = [16.3,20.4,12.4,15.8,9.5,15.3,17.4,10.9,10.3,6.7,38.1,26.2,32.3,35.8,30.2,34,22.8,27.8,25,29.3]
   ho = ["N","N","N","N","N","N","N","N","N","N","H","H","H","H","H","H","H","H","H","H"]
   se = ["F","F","F","F","F","M","M","M","M","M","F","F","F","F","F","M","M","M","M","M"]
   dat = {'Conc':co,'Hor':ho, 'Sex':se}
   import pandas as pd  # load the pandas package
   df = pd.DataFrame(dat)   # combine the data into a data frame named df
   df   # display the data to check that its format is correct
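Step 2: Load the seaborn package.
   import seaborn as sns
Step 3: Draw the plot.
   sns.set(style="whitegrid")
   ax = sns.swarmplot(x = "Hor", y = "Conc", hue = "Sex", data = df, palette = "Set2", dodge = True)
Result: [figure: swarm plot of potassium concentration by hormone treatment, colored by sex]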
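The effect of `dodge` in the call above can be seen by drawing the same data with both settings. This is a sketch, assuming seaborn, pandas, and matplotlib are installed. It rebuilds the Example 4 data so that it runs on its own, and the two-panel layout is chosen here only for comparison.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the figure on screen
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Plasma potassium concentration (mg/100 ml) from Example 4
co = [16.3, 20.4, 12.4, 15.8, 9.5, 15.3, 17.4, 10.9, 10.3, 6.7,
      38.1, 26.2, 32.3, 35.8, 30.2, 34, 22.8, 27.8, 25, 29.3]
ho = ["N"] * 10 + ["H"] * 10            # hormone: not injected / injected
se = (["F"] * 5 + ["M"] * 5) * 2        # sex within each hormone group
df = pd.DataFrame({"Conc": co, "Hor": ho, "Sex": se})

sns.set(style="whitegrid")
fig, axes = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

# dodge=False: both sexes share one swarm per hormone group, separated only by color
sns.swarmplot(x="Hor", y="Conc", hue="Sex", data=df, palette="Set2", dodge=False, ax=axes[0])
axes[0].set_title("dodge = False")

# dodge=True: each sex gets its own swarm, side by side within the hormone group
sns.swarmplot(x="Hor", y="Conc", hue="Sex", data=df, palette="Set2", dodge=True, ax=axes[1])
axes[1].set_title("dodge = True")

fig.canvas.draw()  # render; fig.savefig("dodge_compare.png") would write a file
```

With `dodge = True` the female and male values can be compared within each hormone group without the points intermingling.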
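7. Example 5: multi-variable samples (data):
Step 1: Data. Milu studied the effect of temperature on the growth of guppies (Poecilia reticulata), cardinal tetras (Paracheirodon axelrodi), and zebrafish (Danio rerio). The changes in body length (cm) of the three species reared at 20°C, 24°C, and 28°C are as follows (M = male, F = female):
Guppy (Poecilia reticulata), coded F1
20°C          24°C          28°C
M      F      M      F      M      F
0.9    0.8    1.15   1.2    1.45   1.5
0.8    0.7    1.05   1.35   1.4    1.55
0.6    0.4    1.0    1.2    1.7    1.5
0.4    0.5    1.0    1.3    1.6    1.35

Zebrafish (Danio rerio), coded F2
20°C          24°C          28°C
M      F      M      F      M      F
1.05   1.15   1.2    1.0    1.8    1.55
1.0    1.0    1.3    1.15   1.55   1.5
0.9    0.95   1.35   1.05   1.7    1.4
1.1    0.85   1.15   1.2    1.6    1.6

Cardinal tetra (Paracheirodon axelrodi), coded F3
20°C          24°C          28°C
M      F      M      F      M      F
0.55   0.7    1.0    1.2    1.45   1.6
0.6    0.5    1.05   1.3    1.4    1.45
0.5    0.65   0.95   1.15   1.5    1.4
0.7    0.6    1.1    1.1    1.55   1.45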
   fi = ['F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F1','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F2','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3','F3']
   te = ['t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28','t20','t20','t20','t20','t20','t20','t20','t20','t24','t24','t24','t24','t24','t24','t24','t24','t28','t28','t28','t28','t28','t28','t28','t28']
   se = ['M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F']
   gr = [0.9,0.8,0.6,0.4,0.8,0.7,0.4,0.5,1.15,1.05,1,1,1.2,1.35,1.2,1.3,1.45,1.4,1.7,1.6,1.5,1.55,1.5,1.35,1.05,1,0.9,1.1,1.15,1,0.95,0.85,1.2,1.3,1.35,1.15,1,1.15,1.05,1.2,1.8,1.55,1.7,1.6,1.55,1.5,1.4,1.6,0.55,0.6,0.5,0.7,0.7,0.5,0.65,0.6,1,1.05,0.95,1.1,1.2,1.3,1.15,1.1,1.45,1.4,1.5,1.55,1.6,1.45,1.4,1.45]
   dat = {'Fish':fi,'Temp':te,'Sex':se,'Growth':gr}
   import pandas as pd  # load the pandas package
   df = pd.DataFrame(dat)   # combine the data into a data frame named df
   df   # display the data to check that its format is correct
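Step 2: Load the seaborn package.
   import seaborn as sns
Step 3: Draw the plot. The three fish species are drawn in three panels, one per species, arranged in a single row.
   sns.catplot(x = "Temp", y = "Growth", hue = "Sex", col = "Fish", data = df, kind = "swarm", height = 4, aspect = .7, dodge = True)
Result: [figure: three swarm plot panels of growth by temperature, one per fish species, colored by sex]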
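Typing 72 labels by hand for each of `fi`, `te`, and `se` in Step 1 is error-prone. As one possible alternative, the label lists can be generated with `itertools.product` from the standard library. This is a sketch, not the original author's method. The names `fish`, `temps`, `sexes`, and `reps` are introduced here for illustration, and the code assumes the measurements in `gr` are entered in the same nested order as in Step 1 (species, then temperature, then sex).

```python
from itertools import product

fish = ["F1", "F2", "F3"]        # species codes used in Step 1
temps = ["t20", "t24", "t28"]    # temperature codes
sexes = ["M", "F"]               # sex codes, in the order used in Step 1
reps = 4                         # measurements per species/temperature/sex cell

# One (fish, temp, sex) row per measurement; the last factor varies fastest
rows = [combo for combo in product(fish, temps, sexes) for _ in range(reps)]

# Split the rows back into three parallel label lists
fi, te, se = (list(col) for col in zip(*rows))

print(len(fi), fi[:3], te[:3], se[:8])
```

The resulting lists can be combined with `gr` in a dictionary and passed to `pd.DataFrame` exactly as in Step 1.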
