
This article shows how to drop columns that hold the same value in every row of a pandas or Spark dataframe.

Problem description

Suppose I have data similar to the following:

  index id   name  value  value2  value3  data1  val5
    0  345  name1    1      99      23     3      66
    1   12  name2    1      99      23     2      66
    5    2  name6    1      99      23     7      66

How can we drop all columns such as value, value2 and value3, where every row has the same value, in one command or a couple of commands using Python?

Assume there are many such columns: value, value2, value3, ..., value200.

Output:

   index    id  name   data1
       0   345  name1    3
       1    12  name2    2
       5     2  name6    7

Recommended answer

We can use nunique to count the number of unique values in each column of the dataframe, then drop the columns that have only a single unique value:

In [285]:
nunique = df.nunique()
cols_to_drop = nunique[nunique == 1].index
df.drop(cols_to_drop, axis=1)

Out[285]:
   index   id   name  data1
0      0  345  name1      3
1      1   12  name2      2
2      5    2  name6      7
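
For readers who want to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample table in the question, and the helper name `drop_constant_columns` is hypothetical, not part of pandas. It passes `dropna=False` so that a missing value counts as a distinct value; that is an assumption about the desired behaviour, since the default `dropna=True` ignores NaN when counting.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "index": [0, 1, 5],
    "id": [345, 12, 2],
    "name": ["name1", "name2", "name6"],
    "value": [1, 1, 1],
    "value2": [99, 99, 99],
    "value3": [23, 23, 23],
    "data1": [3, 2, 7],
    "val5": [66, 66, 66],
})


def drop_constant_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame without columns that hold a single distinct value.

    Hypothetical helper for illustration; NaN is counted as a value here.
    """
    counts = frame.nunique(dropna=False)
    # Keep only the columns with more than one distinct value.
    return frame.loc[:, counts > 1]


result = drop_constant_columns(df)
print(result.columns.tolist())  # ['index', 'id', 'name', 'data1']
```

Because nunique looks at every column regardless of dtype, this version also removes constant string columns.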

Another way is to diff only the numeric columns, take the absolute values and sum them; a column whose sum is zero never changes:

In [298]:
import numpy as np  # provides np.number

cols = df.select_dtypes([np.number]).columns
diff = df[cols].diff().abs().sum()
df.drop(diff[diff == 0].index, axis=1)

Out[298]:
   index   id   name  data1
0      0  345  name1      3
1      1   12  name2      2
2      5    2  name6      7

Another approach uses the property that the standard deviation is zero for a column in which every value is the same:

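In [300]:
import numpy as np  # provides np.number

cols = df.select_dtypes([np.number]).columns  # numeric columns only
std = df[cols].std()
cols_to_drop = std[std == 0].index  # zero spread means a constant column
df.drop(cols_to_drop, axis=1)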

Out[300]:
   index   id   name  data1
0      0  345  name1      3
1      1   12  name2      2
2      5    2  name6      7
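
One difference between the approaches is worth illustrating: the diff and standard-deviation variants only inspect numeric columns, so a constant text column survives them. The sketch below uses a small hypothetical frame (the `unit` column is not in the original data) to compare the two. With floating-point data, rounding may also leave a tiny non-zero standard deviation, so an exact `== 0` test can be fragile there.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: `unit` is a constant string column, `value` a constant number.
df = pd.DataFrame({
    "id": [345, 12, 2],
    "unit": ["kg", "kg", "kg"],
    "value": [1, 1, 1],
})

# Standard-deviation approach: only numeric columns are inspected.
numeric_cols = df.select_dtypes([np.number]).columns
std = df[numeric_cols].std()
by_std = df.drop(std[std == 0].index, axis=1)

# nunique approach: every column is inspected, whatever its dtype.
nunique = df.nunique()
by_nunique = df.drop(nunique[nunique == 1].index, axis=1)

print(by_std.columns.tolist())      # ['id', 'unit']
print(by_nunique.columns.tolist())  # ['id']
```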

In fact, the standard-deviation approach can be written as a one-liner:

In [306]:
df.drop(df.std(numeric_only=True).loc[lambda s: s == 0].index, axis=1)

Out[306]:
   index   id   name  data1
0      0  345  name1      3
1      1   12  name2      2
2      5    2  name6      7
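
A further variant, not part of the answer above, compares every row with the first row and keeps a column only if some row differs. This is a sketch under the assumption that the frame has at least one row and no missing values (NaN never compares equal to itself, so an all-NaN column would be kept). The sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "index": [0, 1, 5],
    "id": [345, 12, 2],
    "name": ["name1", "name2", "name6"],
    "value": [1, 1, 1],
    "value2": [99, 99, 99],
    "value3": [23, 23, 23],
    "data1": [3, 2, 7],
    "val5": [66, 66, 66],
})

# True for a column when at least one row differs from the first row.
varying = df.ne(df.iloc[0]).any()

# Boolean column selection keeps only the varying columns.
result = df.loc[:, varying]
print(result.columns.tolist())
```

Like nunique, this works on non-numeric columns too, and it avoids counting distinct values when a simple comparison is enough.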
