Problem description
I want to get the UCS-2 code points for a given UTF-8 string. For example, the word "hello" should become something like "0068 0065 006C 006C 006F". Please note that the characters could be from any language, including complex scripts like the East Asian languages.
So, the problem comes down to "convert a given character to its UCS-2 code point"
But how? Please, any kind of help will be very very much appreciated since I am in a great hurry.
Transcript of the asker's reply, posted as an answer
Thanks for your reply, but it needs to be done in PHP v 4 or 5 but not 6.
The string will be a user input, from a form field.
I want to implement a PHP version of utf8to16 or utf8decode, like:
function get_ucs2_codepoint($char)
{
    // calculation of ucs2 codepoint value and assign it to $hex_codepoint
    return $hex_codepoint;
}
Can you help me do this in PHP, and can it be done with the PHP versions mentioned above?
Recommended answer
Scott Reynen wrote a function to convert UTF-8 into Unicode. I found it while looking at the PHP documentation.
function utf8_to_unicode( $str ) {
    $unicode = array();
    $values = array();
    $lookingFor = 1;
    for ( $i = 0; $i < strlen( $str ); $i++ ) {
        $thisValue = ord( $str[ $i ] );
        if ( $thisValue < ord('A') ) {
            // exclude 0-9
            if ( $thisValue >= ord('0') && $thisValue <= ord('9') ) {
                // number
                $unicode[] = chr( $thisValue );
            }
            else {
                $unicode[] = '%'.dechex( $thisValue );
            }
        } else {
            if ( $thisValue < 128 )
                $unicode[] = $str[ $i ];
            else {
                if ( count( $values ) == 0 ) $lookingFor = ( $thisValue < 224 ) ? 2 : 3;
                $values[] = $thisValue;
                if ( count( $values ) == $lookingFor ) {
                    $number = ( $lookingFor == 3 ) ?
                        ( ( $values[0] % 16 ) * 4096 ) + ( ( $values[1] % 64 ) * 64 ) + ( $values[2] % 64 ):
                        ( ( $values[0] % 32 ) * 64 ) + ( $values[1] % 64 );
                    $number = dechex( $number );
                    $unicode[] = ( strlen( $number ) == 3 ) ? "%u0".$number : "%u".$number;
                    $values = array();
                    $lookingFor = 1;
                } // if
            } // if
        }
    } // for
    return implode( "", $unicode );
} // utf8_to_unicode
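Note that the function above returns URL-style escapes (%XX and %uXXXX) and only handles sequences of up to three bytes, so it does not directly produce the "0068 0065 006C 006C 006F" form asked for in the question. A minimal sketch of a decoder that does could look like the following. The name utf8_to_hex_codepoints is hypothetical (it is not from the original answer or from PHP itself), and the sketch assumes the input is already valid UTF-8; it does not verify continuation bytes. Code points above U+FFFF cannot be represented in UCS-2, so they are simply printed with more than four hex digits.

```php
<?php
// Hypothetical helper, not part of the original answer: decodes a UTF-8
// string and returns its code points as space-separated uppercase hex,
// zero-padded to at least four digits. Assumes valid UTF-8 input.
function utf8_to_hex_codepoints($str)
{
    $out = array();
    $len = strlen($str);
    $i = 0;
    while ($i < $len) {
        $byte = ord($str[$i]);
        if ($byte < 0x80) {                   // 0xxxxxxx: 1-byte (ASCII)
            $code = $byte;
            $extra = 0;
        } elseif (($byte & 0xE0) == 0xC0) {   // 110xxxxx: 2-byte sequence
            $code = $byte & 0x1F;
            $extra = 1;
        } elseif (($byte & 0xF0) == 0xE0) {   // 1110xxxx: 3-byte sequence
            $code = $byte & 0x0F;
            $extra = 2;
        } elseif (($byte & 0xF8) == 0xF0) {   // 11110xxx: 4-byte sequence
            $code = $byte & 0x07;
            $extra = 3;
        } else {
            // Stray continuation byte or invalid lead byte: skip it.
            $i++;
            continue;
        }
        if ($i + $extra > $len - 1) {
            break;                            // truncated sequence at end of input
        }
        // Each continuation byte contributes its low 6 bits.
        for ($j = 1; $j <= $extra; $j++) {
            $code = ($code << 6) | (ord($str[$i + $j]) & 0x3F);
        }
        $i += $extra + 1;
        $out[] = sprintf('%04X', $code);
    }
    return implode(' ', $out);
}

echo utf8_to_hex_codepoints("hello"), "\n"; // prints 0068 0065 006C 006C 006F
```

The sketch uses only ord, sprintf, strlen and implode, so it should not depend on any extension and is intended to run on the PHP 4 and 5 versions the asker mentions.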