Problem description
I have a form with a textarea. Users enter a block of text which is stored in a database.
Occasionally a user will paste text from Word containing smart quotes or em dashes. Those characters appear in the database as garbled sequences such as: a€", a€?, a€?, a€
What function should I call on the input string to convert smart quotes to regular quotes and em dashes to regular dashes?
I am working in PHP.
Update: Thanks for all of the great responses so far. The page on Joel's site about encodings is very informative: http://www.joelonsoftware.com/articles/Unicode.html
Some notes on my environment:
The MySQL database is using UTF-8 encoding. Likewise, the HTML pages that display the content are using UTF-8 (update: by explicitly setting the meta content-type).
On those pages the smart quotes and em dashes appear as a diamond with a question mark (the Unicode replacement character, U+FFFD).
Solution:
Thanks again for the responses. The solution was twofold:
- Make sure the database and HTML files were explicitly set to use UTF-8 encoding.
- Use htmlspecialchars() instead of htmlentities().
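
The second point can be sketched as follows. This is a minimal illustration, not the exact code used on the site: the helper name escape_for_html() is hypothetical, and it assumes PHP 7 or later and input that is already valid UTF-8. htmlspecialchars() escapes only &, <, >, " and ', so multi-byte characters such as smart quotes pass through untouched, whereas htmlentities() called with the wrong charset assumption can turn each byte of a UTF-8 character into a separate entity.

```php
<?php
// Hypothetical helper; the name is an assumption, not from the original post.
function escape_for_html(string $text): string
{
    // Escape only the HTML-significant characters and state the encoding
    // explicitly, so the result does not depend on the PHP version's default.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Sample input with curly double quotes and an em dash, written as
// Unicode escapes so the source file's own encoding does not matter.
$input = "\u{201C}Hello\u{201D} \u{2014} <b>world</b>";

// The curly quotes and the dash are left as-is; only <b> and </b> are escaped.
echo escape_for_html($input), PHP_EOL;
```

Passing 'UTF-8' explicitly matters mostly on older PHP versions, where the default charset for these functions was ISO-8859-1.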
This sounds like a Unicode issue. Joel Spolsky has a good jumping-off point on the topic: http://www.joelonsoftware.com/articles/Unicode.html
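
If the goal is still to literally replace the smart characters, as the question originally asked, one possible approach is a lookup table passed to strtr(). This is only a sketch under stated assumptions: the helper name ascii_punctuation() and the choice of mapped characters are illustrative, the input is assumed to be valid UTF-8, and the escape syntax requires PHP 7 or later. It does not repair text that was already garbled by an encoding mismatch.

```php
<?php
// Hypothetical helper; the name and the exact mapping are illustrative assumptions.
function ascii_punctuation(string $text): string
{
    // Keys are UTF-8 encoded characters written as Unicode escapes (PHP 7+).
    $map = [
        "\u{2018}" => "'",  // left single quotation mark
        "\u{2019}" => "'",  // right single quotation mark
        "\u{201C}" => '"',  // left double quotation mark
        "\u{201D}" => '"',  // right double quotation mark
        "\u{2013}" => '-',  // en dash
        "\u{2014}" => '-',  // em dash
    ];

    // With an array argument, strtr() replaces every key with its value
    // in a single pass over the string.
    return strtr($text, $map);
}

echo ascii_punctuation("\u{201C}It\u{2019}s done\u{201D} \u{2014} ok"), PHP_EOL;
// prints: "It's done" - ok
```

Storing the text as UTF-8 end to end, as described in the solution above, usually makes this conversion unnecessary; the mapping is mainly useful when the output must be plain ASCII.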