Problem description
Is there a way to keep json_encode() from returning null for a string that contains an invalid (non-UTF-8) character?
It can be a real pain to debug in a complex system. It would be much more useful to actually see the invalid character, or at least have it omitted. As it stands, json_encode() silently drops the entire string.
Example (UTF-8):
$string = array(
    utf8_decode("Düsseldorf"), // Deliberately produce a broken string (ISO-8859-1 bytes)
    "Washington",
    "Nairobi",
);
print_r(json_encode($string));
Result:
[null,"Washington","Nairobi"]
Desired result:
["D?sseldorf","Washington","Nairobi"]
Note: I am not looking to make broken strings work in json_encode(). I am looking for ways to make it easier to diagnose encoding errors. A null string isn't helpful for that.
Recommended answer
PHP does try to raise an error, but only if you turn display_errors off. This is odd, because the display_errors setting is only meant to control whether errors are printed to standard output, not whether an error is triggered. I want to emphasize that when display_errors is on, even though you may see all kinds of other PHP errors, PHP doesn't just hide this error; it never triggers it at all. That means it will not show up in any error log, nor will any custom error handler get called. The error simply never occurs.
Here's some code that demonstrates this:
error_reporting(-1); // report all errors
$invalid_utf8_char = chr(193);

ini_set('display_errors', 1); // display errors to standard output
var_dump(json_encode($invalid_utf8_char));
var_dump(error_get_last()); // nothing

ini_set('display_errors', 0); // do not display errors to standard output
var_dump(json_encode($invalid_utf8_char));
var_dump(error_get_last()); // json_encode(): Invalid UTF-8 sequence in argument
That bizarre and unfortunate behavior is related to this bug, https://bugs.php.net/bug.php?id=47494, and a few others, and it doesn't look like it will ever be fixed.
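Since error_get_last() is unreliable here, one way to detect the failure regardless of display_errors is to ask the JSON extension directly. This is only a minimal sketch: the variable names are mine, it assumes PHP 5.5 or later (where json_last_error_msg() exists), and on those versions the whole call returns false rather than putting a null in place of the bad element.

```php
<?php
// "D\xFCsseldorf" holds the same bytes utf8_decode("Düsseldorf") would produce.
$invalid = "D\xFCsseldorf";

$json = json_encode(array($invalid, "Washington"));

if ($json === false) {
    // json_last_error() returns a JSON_ERROR_* constant for the last call;
    // json_last_error_msg() gives a human-readable description of it.
    $error = json_last_error();
    echo "json_encode failed: ", json_last_error_msg(), "\n";
}
```

This only tells you that something in the input was malformed, not which string or which byte, so it helps with noticing the problem more than with locating it.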
Workaround:
Cleaning the string before passing it to json_encode() may be a workable solution.
$stripped_of_invalid_utf8_chars_string = iconv('UTF-8', 'UTF-8//IGNORE', $orig_string);
if ($stripped_of_invalid_utf8_chars_string !== $orig_string) {
// one or more chars were invalid, and so they were stripped out.
// if you need to know where in the string the first stripped character was,
// then see http://stackoverflow.com/questions/7475437/find-first-character-that-is-different-between-two-strings
}
$json = json_encode($stripped_of_invalid_utf8_chars_string);
http://php.net/manual/en/function.iconv.php
The manual says:
//IGNORE silently discards characters that are illegal in the target charset.
So by first removing the problematic characters, in theory json_encode() shouldn't receive anything it will choke on. I haven't verified that the output of iconv with the //IGNORE flag is perfectly compatible with json_encode()'s notion of what valid UTF-8 is, so buyer beware: there may be edge cases where it still fails. Ugh, I hate character set issues.
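If the goal is diagnosis rather than cleanup, another option is to check each string before encoding and report the ones that are not valid UTF-8. The sketch below is an assumption-laden illustration, not something from the manual: find_invalid_utf8_strings() is a hypothetical helper name, and it relies on the mbstring extension being loaded for mb_check_encoding().

```php
<?php
// Hypothetical helper (not part of PHP): returns the keys of all string
// values that are not valid UTF-8, mapped to a hex dump of their raw bytes.
// Assumes the mbstring extension is available.
function find_invalid_utf8_strings(array $data)
{
    $bad = array();
    array_walk_recursive($data, function ($value, $key) use (&$bad) {
        if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            // bin2hex() makes the offending byte visible (here: fc)
            $bad[$key] = bin2hex($value);
        }
    });
    return $bad;
}

$cities = array("D\xFCsseldorf", "Washington", "Nairobi");
$bad = find_invalid_utf8_strings($cities);
print_r($bad);
```

Note that array_walk_recursive() only passes the innermost key to the callback, so in deeply nested data two bad values with the same leaf key would overwrite each other in this sketch.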
Edit
In PHP 7.2+, there are some new flags for json_encode(): JSON_INVALID_UTF8_IGNORE and JSON_INVALID_UTF8_SUBSTITUTE.
There's not much documentation yet, but for now, this test should help you understand the expected behavior:
https://github.com/php/php-src/blob/master/ext/json/tests/json_encode_invalid_utf8.phpt
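Applied to the example from the question, the two flags would look roughly like this. Treat it as a sketch assuming PHP 7.2 or later; the variable names are mine, and the input string is written with an explicit \xFC byte to stand in for utf8_decode("Düsseldorf").

```php
<?php
$cities = array("D\xFCsseldorf", "Washington", "Nairobi");

// Drops the invalid byte entirely.
$ignored = json_encode($cities, JSON_INVALID_UTF8_IGNORE);
echo $ignored, "\n"; // ["Dsseldorf","Washington","Nairobi"]

// Replaces the invalid byte with U+FFFD (the Unicode replacement character),
// which json_encode() escapes as \ufffd by default.
$substituted = json_encode($cities, JSON_INVALID_UTF8_SUBSTITUTE);
echo $substituted, "\n"; // ["D\ufffdsseldorf","Washington","Nairobi"]
```

JSON_INVALID_UTF8_SUBSTITUTE is the closer match for the "D?sseldorf" output the question asks for, since the position of the bad byte stays visible.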
And in PHP 7.3+ there's the new flag JSON_THROW_ON_ERROR. See http://php.net/manual/en/class.jsonexception.php
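A minimal sketch of how that flag might be used, assuming PHP 7.3 or later (the variable names are only for illustration):

```php
<?php
$cities = array("D\xFCsseldorf", "Washington", "Nairobi");

try {
    $json = json_encode($cities, JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    // The exception code is the same JSON_ERROR_* constant that
    // json_last_error() would otherwise have reported.
    $code = $e->getCode();
    $message = $e->getMessage();
    echo "Encoding failed: ", $message, "\n";
}
```

With this flag the failure can no longer pass silently, whatever display_errors is set to, because an uncaught JsonException stops the script.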