Problem description
I'm trying to search a UTF8-encoded string using preg_match.
preg_match('/H/u', "\xC2\xA1Hola!", $a_matches, PREG_OFFSET_CAPTURE);
echo $a_matches[0][1];
This should print 1, since "H" is at index 1 in the string "¡Hola!". But it prints 2. So it seems the subject is not being treated as a UTF-8 encoded string, even though I'm passing the "u" modifier in the regular expression.
I have the following settings in my php.ini, and other UTF8 functions are working:
mbstring.func_overload = 7
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.http_input = pass
mbstring.http_output = pass
mbstring.encoding_translation = Off
Any ideas?
Recommended answer
Looks like this is a "feature", see http://bugs.php.net/bug.php?id=37391
The 'u' switch only means something to PCRE; PHP itself is unaware of it.
From PHP's point of view, strings are byte sequences, so returning a byte offset seems logical (I'm not saying "correct").
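If you need a character index rather than a byte offset, one possible workaround is to count the characters in the bytes that precede the match. The sketch below assumes the subject is valid UTF-8 and that the mbstring extension is loaded; utf8_char_offset is a hypothetical helper name, not a PHP built-in. It uses mb_strcut, which slices by bytes, because with mbstring.func_overload = 7 a plain substr call would be overloaded and would no longer be byte-based.

```php
<?php
// Hypothetical helper (not part of PHP): convert a byte offset reported by
// preg_match() into a character offset. Assumes $subject is valid UTF-8.
function utf8_char_offset(string $subject, int $byteOffset): int
{
    // mb_strcut() takes its start and length in bytes, so it stays
    // byte-based even when substr() is overloaded by mbstring.
    $prefix = mb_strcut($subject, 0, $byteOffset, 'UTF-8');

    // Count the characters in the bytes that precede the match.
    return mb_strlen($prefix, 'UTF-8');
}

$subject = "\xC2\xA1Hola!"; // "¡Hola!" encoded as UTF-8

if (preg_match('/H/u', $subject, $a_matches, PREG_OFFSET_CAPTURE)) {
    $byteOffset = $a_matches[0][1];                        // what preg_match reports
    $charOffset = utf8_char_offset($subject, $byteOffset); // what the question expected
    echo $byteOffset, ' ', $charOffset, "\n";              // prints "2 1"
}
```

Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions substr is always byte-based, and mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8') would give the same result.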