This page covers how to specify the column order after a groupby aggregation.
Problem description
The ordering of my age, height and weight columns is changing with each run of the code. I need to keep the order of my agg columns static because I ultimately refer to this output file according to the column locations. What can I do to make sure age, height and weight are output in the same order every time?
import numpy as np
import pandas as pd

# input_file and output_file are paths defined elsewhere
d = pd.read_csv(input_file, na_values=[''])
df = pd.DataFrame(d)
df.index_col = ['name', 'address']
df_out = df.groupby(df.index_col).agg({'age': np.mean, 'height': np.sum, 'weight': np.sum})
df_out.to_csv(output_file, sep=',')
Recommended answer
I think you can select the columns as a subset after aggregating, which fixes their order:
# Parentheses allow the chained call to span several lines
df_out = (df.groupby(df.index_col)
            .agg({'age': np.mean, 'height': np.sum, 'weight': np.sum})
            [['age', 'height', 'weight']])
You can also use the pandas aggregation functions:
# 'mean' and 'sum' name the pandas aggregation functions
df_out = (df.groupby(df.index_col)
            .agg({'age': 'mean', 'height': 'sum', 'weight': 'sum'})
            [['age', 'height', 'weight']])
Sample:
import pandas as pd

df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})
print(df)
  address  age  height name  weight
0       a    7       1    q       5
1       a    8       3    q       3
2       s    9       5    a       6
3       s   10       7    a       8
df.index_col = ['name', 'address']
# Aggregate, then select the columns in the required order
df_out = (df.groupby(df.index_col)
            .agg({'age': 'mean', 'height': 'sum', 'weight': 'sum'})
            [['age', 'height', 'weight']])
print(df_out)
              age  height  weight
name address
a    s        9.5      12      14
q    a        7.5       4       8
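Because the question is ultimately about a CSV whose columns stay at fixed positions, the selection can be checked end to end. The following is only a sketch under assumptions, not part of the original answer: `value_cols` is a hypothetical name for the desired order, the dict passed to `agg` is deliberately shuffled, and an in-memory buffer stands in for the asker's `output_file`.

```python
import io

import pandas as pd

df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})

# Hypothetical name: a single list that holds the desired output order.
value_cols = ['age', 'height', 'weight']

# The dict order is shuffled on purpose; the final selection decides the order.
df_out = (df.groupby(['name', 'address'])
            .agg({'weight': 'sum', 'age': 'mean', 'height': 'sum'})
            [value_cols])

# An in-memory buffer stands in for output_file.
buf = io.StringIO()
df_out.to_csv(buf, sep=',')
header = buf.getvalue().splitlines()[0]
print(header)
```

Keeping the order in one list means the same list can be reused wherever the output columns are later referenced by position.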
Edit by suggestion: add reset_index if you also need the index values as columns. as_index=False does not work here, because the group keys would then be ordinary columns and the column selection would drop them:
# reset_index turns the group keys back into ordinary columns
df_out = (df.groupby(df.index_col)
            .agg({'age': 'mean', 'height': 'sum', 'weight': 'sum'})
            [['age', 'height', 'weight']]
            .reset_index())
print(df_out)
  name address  age  height  weight
0    a       s  9.5      12      14
1    q       a  7.5       4       8
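The difference between the two approaches can be seen by comparing the resulting column lists. This is a minimal sketch on the sample data; the names `keys`, `value_cols`, `aggs`, `lost` and `kept` are illustrative assumptions and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})

# Illustrative names, not from the original answer.
keys = ['name', 'address']
value_cols = ['age', 'height', 'weight']
aggs = {'age': 'mean', 'height': 'sum', 'weight': 'sum'}

# as_index=False: the keys are ordinary columns, so the selection drops them.
lost = df.groupby(keys, as_index=False).agg(aggs)[value_cols]
print(list(lost.columns))

# Default as_index=True: the keys stay in the index through the selection,
# and reset_index() moves them back in front of the value columns.
kept = df.groupby(keys).agg(aggs)[value_cols].reset_index()
print(list(kept.columns))
```

With as_index=False, selecting `keys + value_cols` instead of `value_cols` alone would presumably also keep the keys; the reset_index form simply avoids repeating them.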