Problem description
I have a list of lists that looks like:
c = [['470', '4189.0', 'asdfgw', 'fds'],
['470', '4189.0', 'qwer', 'fds'],
['470', '4189.0', 'qwer', 'dsfs fdv']
...]
c has about 30,000 interior lists. What I'd like to do is eliminate duplicates based on the 4th item of each interior list. So the list of lists above would look like:
c = [['470', '4189.0', 'asdfgw', 'fds'],['470', '4189.0', 'qwer', 'dsfs fdv'] ...]
Here is what I have so far:
d = [] #list that will contain condensed c
d.append(c[0]) #append first element, so I can compare lists
for bact in c: #c is my list of lists with 30,000 interior list
    for items in d:
        if bact[3] != items[3]:
            d.append(bact)
I think this should work, but it just runs and runs. I let it run for 30 minutes, then killed it. I don't think the program should take so long, so I'm guessing there is something wrong with my logic.
I have a feeling that creating a whole new list of lists is pretty stupid. Any help would be much appreciated, and please feel free to nitpick as I am learning. Also please correct my vocabulary if it is incorrect.
Recommended answer
I would do it like this:
seen = set()
cond = [x for x in c if x[3] not in seen and not seen.add(x[3])]
Explanation:
seen is a set which keeps track of the already encountered fourth elements of each sublist. cond is the condensed list. If x[3] (where x is a sublist in c) is not in seen, x will be added to cond and x[3] will be added to seen.
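For readers who find the one-liner dense, the same logic can be spelled out as an explicit loop. This is only a sketch: the function name dedupe_by_index and its index parameter are made up for illustration and are not part of the original answer.

```python
def dedupe_by_index(rows, index=3):
    """Keep the first row seen for each distinct value of row[index].

    Hypothetical helper: the name and the `index` parameter are
    illustrative assumptions, not part of the original answer.
    """
    seen = set()   # values of row[index] encountered so far
    cond = []      # condensed list, in original order
    for x in rows:
        key = x[index]
        if key not in seen:
            seen.add(key)
            cond.append(x)
    return cond


c = [['470', '4189.0', 'asdfgw', 'fds'],
     ['470', '4189.0', 'qwer', 'fds'],
     ['470', '4189.0', 'qwer', 'dsfs fdv']]

print(dedupe_by_index(c))
# [['470', '4189.0', 'asdfgw', 'fds'], ['470', '4189.0', 'qwer', 'dsfs fdv']]
```

Membership tests on a set take roughly constant time on average, so the whole pass is about linear in the number of rows, which should be fast for 30,000 sublists.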
seen.add(x[3]) will return None, so not seen.add(x[3]) will always be True, but that part will only be evaluated if x[3] not in seen is True, since Python uses short-circuit evaluation. If the second condition gets evaluated, it will always return True and have the side effect of adding x[3] to seen. Here's another example of what's happening (print returns None and has the "side effect" of printing something):
>>> False and not print('hi')
False
>>> True and not print('hi')
hi
True
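As a side note on why the loop in the question seems to run forever: it appends bact once for every element of d whose fourth item differs, and d grows while it is being iterated. The sketch below wraps that loop in a function to measure the effect on a small input. The names original_loop and fixed_loop are made up for illustration, and fixed_loop is just one possible correction that keeps the nested-loop shape.

```python
def original_loop(c):
    """The loop from the question, wrapped in a function (hypothetical name)."""
    d = [c[0]]
    for bact in c:
        for items in d:              # d grows while it is being iterated
            if bact[3] != items[3]:
                d.append(bact)       # one append per *differing* element
    return d


def fixed_loop(c):
    """One possible correction: append only if no element of d matches."""
    d = []
    for bact in c:
        if all(bact[3] != items[3] for items in d):
            d.append(bact)
    return d


# Ten rows whose fourth items are all distinct.
rows = [['a', 'b', 'c', str(i)] for i in range(10)]

print(len(original_loop(rows)))  # 512: d doubles for every new fourth item
print(len(fixed_loop(rows)))     # 10
```

With many distinct fourth items among 30,000 rows, that doubling makes d far too large to ever finish. The corrected loop terminates, but it still scans d for every row, so the set-based version above is likely to be much faster.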