Problem description
Assume that I need to insert the following document:
{
title: 'Péter'
}
(note the é)
It gives me an error when I use the following PHP-code ... :
$db->collection->insert(array("title" => "Péter"));
... because it needs to be utf-8.
So I should use this line of code:
$db->collection->insert(array("title" => utf8_encode("Péter")));
Now, when I request the document, I still have to decode it ... :
$document = $db->collection->findOne(array("_id" => new MongoId("__someID__")));
$title = utf8_decode($document['title']);
Is there some way to automate this process? Can I change the character encoding of MongoDB? (I'm migrating a MySQL database that uses cp1252 West European (latin1).)
I already considered changing the Content-Type-header, problem is that all static strings (hardcoded) aren't utf8...
Thanks in advance! Tim
Recommended answer
JSON and BSON can only encode / decode valid UTF-8 strings. If your data (including input) is not UTF-8, you need to convert it before passing it to any JSON-dependent system, like this:
$string = iconv('UTF-8', 'UTF-8//IGNORE', $string); // or
$string = iconv('UTF-8', 'UTF-8//TRANSLIT', $string); // or even
$string = iconv('UTF-8', 'UTF-8//TRANSLIT//IGNORE', $string); // not sure how this behaves
Personally I prefer the first option; see the iconv() manual page. Other alternatives include:
mb_convert_encoding()
utf8_encode(utf8_decode($string))
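Since the question is about automating the conversion for data coming from a cp1252 source, the alternatives above could be wrapped in a small helper that walks a whole document before it is inserted. The following is only a sketch: to_utf8_deep() is a hypothetical name, it assumes the mbstring extension is loaded, and it assumes every string really is Windows-1252 (array keys are left untouched).

```php
<?php
// Hypothetical helper: recursively convert every string value in a document
// from Windows-1252 (cp1252) to UTF-8 before handing it to the driver.
// Assumption: all incoming strings are cp1252; array keys are not converted.
function to_utf8_deep($value, $from = 'Windows-1252')
{
    if (is_array($value)) {
        $out = array();
        foreach ($value as $key => $item) {
            $out[$key] = to_utf8_deep($item, $from);
        }
        return $out;
    }
    if (is_string($value)) {
        return mb_convert_encoding($value, 'UTF-8', $from);
    }
    return $value; // ints, floats, bools, null and objects pass through as-is
}

// "P\xE9ter" is "Péter" as cp1252 bytes (0xE9 = é).
$doc = to_utf8_deep(array(
    'title' => "P\xE9ter",
    'tags'  => array("caf\xE9"),
    'views' => 3,
));
// $doc['title'] now holds the UTF-8 bytes "P\xC3\xA9ter".
```

With a helper like this, the insert from the question would become $db->collection->insert(to_utf8_deep($row)), so no per-field utf8_encode() call is needed.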
You should always make sure your strings are UTF-8 encoded, even the user-submitted ones, however since you mentioned that you're migrating from MySQL to MongoDB, have you tried exporting your current database to CSV and using the import scripts that come with Mongo? They should handle this...
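One way to "make sure" a string is UTF-8, including user-submitted input, is to validate it first and convert only when validation fails. The sketch below is an illustration rather than a definitive approach: ensure_utf8() is a hypothetical name, it relies on the mbstring extension, and the Windows-1252 fallback is an assumption based on the database described in the question. The check is a heuristic, because some cp1252 byte sequences also happen to be valid UTF-8.

```php
<?php
// Hypothetical helper: return the string unchanged when it is already valid
// UTF-8; otherwise assume it is Windows-1252 and convert it.
function ensure_utf8($string, $fallback = 'Windows-1252')
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    return mb_convert_encoding($string, 'UTF-8', $fallback);
}

$fromForm   = ensure_utf8("P\xC3\xA9ter"); // already UTF-8, left alone
$fromLegacy = ensure_utf8("P\xE9ter");     // cp1252 bytes, converted to UTF-8
```

Both variables end up holding the same UTF-8 bytes, so mixed input (UTF-8 form data alongside hardcoded latin1 strings) is not double-encoded.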
I mentioned that BSON can only handle UTF-8, but I'm not sure this is exactly true: I have a vague idea that BSON uses UTF-16 or UTF-32 to encode / decode data, but I can't check now.