Problem description
I have authenticated users in my application who have access to a shared database of up to 500,000 items. Each user has their own public-facing website and needs to be able to prioritize the items displayed on their own site (think upvotes).
Out of the 500,000 items, a user may have at most 200 prioritized items; the order of the remaining items matters less.
Each of the users will prioritize the items differently.
I initially asked a similar MySQL question, "Mysql results sorted by list which is unique for each user", and got a good answer, but I believe a better option may be a non-SQL indexed solution.
Can this be done in Lucene? Is there another search technology that would be better suited to this?
P.S. Google implements a similar setup for its search results, where you can prioritize and exclude your own search results if you are logged in.
Update: I have re-tagged this with sphinx because I have been reading the documentation, and I believe it may be able to do what I am looking for with "per-document attribute values" stored in memory. I would be interested to hear feedback on this from Sphinx gurus.
Recommended answer
You'll definitely want to store the item's id in each document object when building your index. There are a few ways to do the next step, but an easy one is to take the prioritized items and add them to your search query, with something like this for each special item:
"OR item_id=%d+X"
where X is the amount of boost you'd like to use. You'll probably need to tweak this number empirically so that merely being "upvoted" doesn't put an item at the top of the results for a search on something totally unrelated.
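As a rough illustration of the idea, here is a minimal Python sketch that builds such a query string. It assumes the index has a field named `item_id` (as in the snippet above) and that the string is passed to Lucene's classic query parser, where `field:term^boost` raises a clause's weight. The function name `build_boosted_query` and the default boost of 2.0 are hypothetical, chosen only for this example.

```python
from typing import Sequence


def build_boosted_query(base_query: str,
                        prioritized_ids: Sequence[int],
                        boost: float = 2.0) -> str:
    """Append one boosted OR clause per prioritized item.

    Assumes an indexed field called `item_id` and Lucene classic
    query parser syntax (`field:term^boost`). Both the field name
    and the default boost are illustrative assumptions.
    """
    if not prioritized_ids:
        # Nothing to prioritize: leave the user's query untouched.
        return base_query
    clauses = " OR ".join(
        f"item_id:{item_id}^{boost}" for item_id in prioritized_ids
    )
    # Parenthesize the base query so the added clauses do not change
    # how its own operators are grouped.
    return f"({base_query}) OR {clauses}"


# Example: a user who has prioritized items 42 and 7.
print(build_boosted_query("title:lamp", [42, 7], boost=3.0))
```

Because the extra clauses are joined with OR, a prioritized item can match even when it does not satisfy the base query, which is why the boost value needs tuning. With at most 200 prioritized items per user, the generated query should stay below Lucene's default limit of 1024 boolean clauses.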
Doing it this way will at least spare you a lot of annoying postprocessing steps that would require iterating over the whole result set; ideally the proper sorting is already there when the results come back from the index.