Question
I have two lists, let's say:
keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']
How do I create a merged list without duplicates that preserves the order of both lists, inserting the missing elements where they belong? Like so:
merged = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
Note that the elements can be compared for equality but not ordered (they are complex strings). They can't be sorted by comparing them; their only order is the one given by their position in the original lists.
If the lists contradict each other (the common elements appear in a different order in the two inputs), any output containing all elements is valid. Bonus points, of course, if the solution shows 'common sense' in preserving most of the order.
Again (as some comments still argue about it), the lists normally don't contradict each other in terms of the order of the common elements. In case they do, the algorithm needs to handle that error gracefully.
I started with a version that iterates over the lists with .next() to advance just the list containing the unmatched elements, but .next() just doesn't know when to stop.
merged = []
L = iter(keys1)
H = iter(keys2)
l = L.next()
h = H.next()
for i in range(max(len(keys1), len(keys2))):
    if l == h:
        if l not in merged:
            merged.append(l)
        l = L.next()
        h = H.next()
    elif l not in keys2:
        if l not in merged:
            merged.append(l)
        l = L.next()
    elif h not in keys1:
        if h not in merged:
            merged.append(h)
        h = H.next()
    else:  # just in case the input is badly ordered
        if l not in merged:
            merged.append(l)
        l = L.next()
        if h not in merged:
            merged.append(h)
        h = H.next()
print merged
This obviously doesn't work, as .next() will cause an exception for the shortest list. Now I could update my code to catch that exception every time I call .next(). But the code already is quite un-pythonic and this would clearly burst the bubble.
Does anyone have a better idea of how to iterate over those lists to combine the elements?
Bonus points if I can do it for three lists in one go.
Recommended answer
What you need is basically what any merge utility does: it tries to merge two sequences while keeping the relative order of each sequence. You can use Python's difflib module to diff the two sequences and merge them:
from difflib import SequenceMatcher

def merge_sequences(seq1, seq2):
    sm = SequenceMatcher(a=seq1, b=seq2)
    res = []
    for (op, start1, end1, start2, end2) in sm.get_opcodes():
        if op == 'equal' or op == 'delete':
            # This range appears in both sequences, or only in the first one.
            res += seq1[start1:end1]
        elif op == 'insert':
            # This range appears in only the second sequence.
            res += seq2[start2:end2]
        elif op == 'replace':
            # There are different ranges in each sequence - add both.
            res += seq1[start1:end1]
            res += seq2[start2:end2]
    return res
Example:
>>> keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
>>> keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']
>>> merge_sequences(keys1, keys2)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
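To see where that result comes from, it can help to print the opcodes that SequenceMatcher produces for these two lists. The snippet below is only an illustrative sketch (Python 3 syntax; the variable name opcodes is mine, not part of difflib): each opcode names an operation and the slice of each list it covers, and merge_sequences simply concatenates those slices in order.

```python
from difflib import SequenceMatcher

keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']

# Each opcode is a tuple (tag, start1, end1, start2, end2) describing how to
# turn keys1[start1:end1] into keys2[start2:end2].
opcodes = SequenceMatcher(a=keys1, b=keys2).get_opcodes()

for op, start1, end1, start2, end2 in opcodes:
    # Show the tag together with the slice of each list it refers to.
    print(op, keys1[start1:end1], keys2[start2:end2])
```

For these inputs the tags come out as equal, delete, equal, insert, equal, replace: 'C', 'D' exist only in the first list, 'F', 'G' only in the second, and the trailing 'I' versus 'J', 'K' is a replace, which is why both sides end up in the merged result.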
Note that the answer you expect is not necessarily the only possible one. For example, if we change the order of sequences here, we get another answer which is just as valid:
>>> merge_sequences(keys2, keys1)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'I']
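For the bonus part (three or more lists) and for contradictory input, one possible extension is to fold merge_sequences over all the lists and then drop repeats. This is a sketch under assumptions, not something the difflib module provides: merge_many is a hypothetical helper name, the code is Python 3.7+ (it relies on dict keeping insertion order), and the elements must be hashable, which holds for strings. The de-duplication step matters because merge_sequences alone can repeat an element when the two lists disagree on order, for example ['A', 'B'] and ['B', 'A'].

```python
from difflib import SequenceMatcher
from functools import reduce


def merge_sequences(seq1, seq2):
    """Same logic as merge_sequences above, repeated so this block runs alone."""
    res = []
    for op, start1, end1, start2, end2 in SequenceMatcher(a=seq1, b=seq2).get_opcodes():
        if op in ('equal', 'delete', 'replace'):
            # Range from the first sequence (shared, or only in the first).
            res += seq1[start1:end1]
        if op in ('insert', 'replace'):
            # Range that appears only in the second sequence.
            res += seq2[start2:end2]
    return res


def merge_many(*seqs):
    """Hypothetical helper: merge any number of lists pairwise, left to right,
    then keep only the first occurrence of each element."""
    merged = reduce(merge_sequences, seqs, [])
    # dict.fromkeys keeps insertion order, so this removes later duplicates
    # that can appear when the inputs contradict each other.
    return list(dict.fromkeys(merged))
```

As with the two-list case, the result depends on the order in which the lists are passed, so merge_many(keys1, keys2, keys3) and merge_many(keys3, keys2, keys1) may both be valid but different.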