Problem description
I'm creating a site that allows users to submit quotes. How would I go about creating a (relatively simple?) search that returns the most relevant quotes?
For example, if the search term was "turkey" then I'd return quotes where the word "turkey" appears twice before quotes where it only appears once.
(I would add a few other rules to help filter out irrelevant results, but that ranking is my main concern.)
Recommended answer
Everyone is suggesting MySQL fulltext search, but you should be aware of a HUGE caveat. In MySQL versions before 5.6, FULLTEXT indexes are only available for the MyISAM engine, not for InnoDB, which is the most commonly used engine thanks to its referential integrity and ACID compliance.
So you have a few options:
1. The simplest approach is outlined by Particle Tree. You can actually get ranked searches out of pure SQL (no fulltext, no nothing). The SQL query below searches a table and ranks results by the number of occurrences of each search string in the searched field:
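SELECT
    p.id,
    -- Occurrences of a term = (characters removed by REPLACE) / (length of the term):
    -- 'term' is 4 characters long, 'search' is 6.
    SUM(((LENGTH(p.body) - LENGTH(REPLACE(p.body, 'term', ''))) / 4) +
        ((LENGTH(p.body) - LENGTH(REPLACE(p.body, 'search', ''))) / 6))
        AS Occurrences
FROM
    posts AS p
GROUP BY
    p.id
ORDER BY
    Occurrences DESC;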
Edited their example to provide more clarity.
Variations on the above SQL query, such as adding a WHERE clause (WHERE p.body LIKE '%whatever%you%want%'), will probably get you exactly what you need.
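To make the counting trick concrete, here is a minimal sketch of option 1 for the "turkey" example from the question. It is written in Python against an in-memory SQLite database purely so that it runs anywhere; the table name quotes, the column names, and the helpers make_demo_db and ranked_search are assumptions for illustration, not part of the original answer. The same LENGTH/REPLACE expression should carry over to MySQL, though that is not verified here.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table of quotes (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO quotes (id, body) VALUES (?, ?)",
        [
            (1, "Turkey today, turkey tomorrow."),
            (2, "I like turkey."),
            (3, "No birds here."),
        ],
    )
    return conn


def ranked_search(conn, term):
    """Return (id, body, occurrences) rows, most occurrences first."""
    if not term:
        return []  # an empty term would mean dividing by a length of zero
    sql = """
        SELECT id,
               body,
               -- characters removed by REPLACE, divided by the term length
               (LENGTH(body) - LENGTH(REPLACE(LOWER(body), LOWER(:term), '')))
                   / LENGTH(:term) AS occurrences
        FROM quotes
        WHERE LOWER(body) LIKE '%' || LOWER(:term) || '%'
        ORDER BY occurrences DESC, id ASC
    """
    return conn.execute(sql, {"term": term}).fetchall()


if __name__ == "__main__":
    for row in ranked_search(make_demo_db(), "turkey"):
        print(row)
```

Two caveats with this sketch: it counts substrings, so "turkeys" would also count as a hit for "turkey", and any % or _ typed into the search term is treated as a LIKE wildcard. The term is passed as a bound parameter rather than pasted into the SQL string, which is worth keeping when moving this to PHP.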
2. You can alter your database schema to support fulltext. A common way to keep InnoDB's referential integrity, ACID compliance, and speed, without having to install plugins like the Sphinx fulltext search engine for MySQL, is to split the quote data into its own table. Basically you would have a Quotes table that is an InnoDB table and that, rather than holding your TEXT field "data", holds a reference "quote_data_id" pointing to the ID of a row in a Quote_Data table, which is a MyISAM table. You can run your fulltext search on the MyISAM table, join the IDs it returns with your InnoDB tables, and voila, you have your results.
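The table split in option 2 could look roughly like the sketch below. It again uses Python with in-memory SQLite so it can be run as-is, which means storage engines cannot actually be shown: in MySQL, quotes would be declared ENGINE=InnoDB and quote_data ENGINE=MyISAM with a FULLTEXT index on body, and the LIKE filter here stands in for a MATCH (body) AGAINST (...) condition. The author column and the helper names build_db, add_quote, and search_quotes are assumptions made for illustration.

```python
import sqlite3

# quote_data holds only the searchable text (the MyISAM table in MySQL).
# quotes holds everything else (the InnoDB table in MySQL). quote_data_id is
# a plain integer column: MySQL cannot enforce a foreign key from an InnoDB
# table to a MyISAM table, so the link is by convention only.
SCHEMA = """
CREATE TABLE quote_data (
    id   INTEGER PRIMARY KEY,
    body TEXT NOT NULL
);
CREATE TABLE quotes (
    id            INTEGER PRIMARY KEY,
    author        TEXT NOT NULL,
    quote_data_id INTEGER NOT NULL
);
"""


def build_db():
    """Create both tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_quote(conn, author, body):
    """Store the text first, then the quote row that points at it."""
    cur = conn.execute("INSERT INTO quote_data (body) VALUES (?)", (body,))
    conn.execute(
        "INSERT INTO quotes (author, quote_data_id) VALUES (?, ?)",
        (author, cur.lastrowid),
    )


def search_quotes(conn, term):
    """Search the text table, then join the matching IDs back to quotes."""
    sql = """
        SELECT q.id, q.author, d.body
        FROM quote_data AS d
        JOIN quotes AS q ON q.quote_data_id = d.id
        WHERE d.body LIKE '%' || ? || '%'  -- stand-in for MATCH ... AGAINST
        ORDER BY q.id
    """
    return conn.execute(sql, (term,)).fetchall()


if __name__ == "__main__":
    db = build_db()
    add_quote(db, "alice", "I like turkey.")
    add_quote(db, "bob", "No birds here.")
    print(search_quotes(db, "turkey"))
```

The point of the layout is that only the text lives in the table that needs the fulltext index, while users, votes, and anything else that benefits from transactions stays on the InnoDB side.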
3. Install Sphinx. Good luck.
Given what you described, I would HIGHLY recommend you take the 1st approach I presented, since you have a simple database-driven site. The 1st solution is simple and gets the job done quickly. Lucene will be a pain to set up, especially if you want to integrate it with the database, as Lucene is designed mainly to index files, not databases. Google custom site search just makes your site lose tons of reputation (it makes you look amateurish and hacked), and MySQL fulltext will most likely force you to alter your database schema.