Problem description
I have a script that combines a number of files into one, and it breaks when one of the files is UTF-8 encoded. I figure I should be using the utf8_decode() function when reading the files, but I don't know how to tell which ones need decoding.
My code is basically:
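// Concatenate every file in $files, one after another, separated by newlines.
$output = '';
foreach ($files as $filename) {
    $output .= file_get_contents($filename) . "\n";
}
// Write the combined contents to a single output file.
file_put_contents('combined.txt', $output);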
Currently, at the start of a UTF8 file, it adds these characters in the output: ???
Recommended answer
Try using the mb_detect_encoding function. It examines your string and attempts to "guess" what its encoding is. You can then convert the string as needed. As brulak suggested, however, you're probably better off converting to UTF-8 rather than from it, so that the data you're transmitting is preserved.
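As a rough sketch of that suggestion, the snippet below detects each file's encoding with mb_detect_encoding and converts anything that is not already UTF-8 with mb_convert_encoding. The helper names to_utf8 and combine_files are hypothetical, the candidate encoding list (UTF-8, ISO-8859-1) is only an assumption about what the input files might contain, and stripping a leading UTF-8 byte order mark is a guess at what the stray characters at the start of the file are. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: return $contents as UTF-8, whatever it was before.
function to_utf8(string $contents): string
{
    // Assumption: the stray leading characters are a UTF-8 byte order mark
    // (bytes EF BB BF), so drop it if present.
    if (strncmp($contents, "\xEF\xBB\xBF", 3) === 0) {
        $contents = substr($contents, 3);
    }

    // The candidate list is an assumption; adjust it to the encodings you
    // expect. Strict mode rejects candidates with invalid byte sequences.
    $encoding = mb_detect_encoding($contents, ['UTF-8', 'ISO-8859-1'], true);

    // Convert *to* UTF-8 only when something else was detected.
    if ($encoding !== false && $encoding !== 'UTF-8') {
        $contents = mb_convert_encoding($contents, 'UTF-8', $encoding);
    }

    return $contents;
}

// Hypothetical helper: combine files into one UTF-8 string.
function combine_files(array $files): string
{
    $output = '';
    foreach ($files as $filename) {
        $output .= to_utf8(file_get_contents($filename)) . "\n";
    }
    return $output;
}
```

With these in place, the original script would reduce to something like file_put_contents('combined.txt', combine_files($files)). Keep in mind that detection is a guess: short or ambiguous inputs can be misidentified, so if you know the source encoding of a file, passing it to mb_convert_encoding directly is more reliable.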