Problem description
I was writing some commented PHP classes and I stumbled upon a problem. My name (for the @author tag) ends with a ? (which is a UTF-8 character, ...and a strange name, I know).
Even though I save the file as UTF-8, some friends reported that they see that character totally messed up (è?). Adding the BOM signature makes the problem go away. But that troubles me a bit, since I don't know much about the BOM beyond what I saw on Wikipedia and in some similar questions here on SO.
I know that it adds something at the beginning of the file, and from what I understand it's not that bad, but I'm concerned because the only problematic scenarios I read about involved PHP files. And since I'm writing PHP classes to share, being 100% compatible is more important than having my name in the comments.
So I'm trying to understand the implications: should I use it without worrying, or are there cases where it might cause damage? If so, when?
Recommended answer
Indeed, the BOM is actual data sent to the browser. The browser will happily ignore it, but once it has been output you can no longer send headers.
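As a minimal sketch of what is going on (the helper names `hasUtf8Bom` and `stripUtf8Bom` are hypothetical, not part of PHP): the UTF-8 BOM is the three bytes EF BB BF, and when they sit in front of `<?php` they are emitted as output before any of your code runs. Something along these lines could be used to spot and remove it from file contents:

```php
<?php
// Hypothetical helpers for illustration; the names are assumptions, not a PHP API.

// The UTF-8 BOM is the byte sequence EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Returns true when the given file contents start with a UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Returns the contents without a leading UTF-8 BOM, if there was one.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

$withBom = UTF8_BOM . "<?php echo 'hi';";
var_dump(hasUtf8Bom($withBom));                          // bool(true)
var_dump(stripUtf8Bom($withBom) === "<?php echo 'hi';"); // bool(true)
```

In practice you would feed these the result of `file_get_contents()` for each file you intend to share.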
I believe the real problem is your and your friend's editor settings. Without a BOM, your friend's editor may not automatically recognize the file as UTF-8. He can try to set up his editor so that it expects files to be in UTF-8 (if you use a real IDE such as NetBeans, this can even be made a project setting that you transfer along with the code).
An alternative is to try some tricks: some editors try to determine the encoding using some heuristics based on the entered text. You could try to start each file with
<?php //úτƒ-8 encoded
and maybe the heuristic will get it. There's probably better stuff to put there, and you can either google for what kind of encoding detection heuristics are common, or just try some out :-)
All in all, I recommend just fixing the editor settings.
Oh wait, I misread the last part: for spreading the code anywhere, I guess you're safest either making all files contain only the lower 7-bit characters, i.e. plain ASCII, or just accepting that some people with ancient editors will see your name written funny. There is no fail-safe way. The BOM is definitely bad because of the "headers already sent" problem. On the other hand, as long as you only put UTF-8 characters in comments and the like, the only impact of an editor misunderstanding the encoding is weird characters. I'd go for spelling your name correctly and adding a comment targeted at heuristics so that most editors will get it, but there will always be people who see bogus characters instead.
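One way to audit files before sharing them, sketched under the assumption that you pass in the file contents as a string (`classifyEncoding` is a hypothetical helper, not a PHP function): report whether the contents are plain 7-bit ASCII, valid UTF-8 containing non-ASCII characters, or neither.

```php
<?php
// Hypothetical helper for illustration; not a built-in PHP function.
// Classifies file contents as 'ascii', 'utf-8', or 'other'.
function classifyEncoding(string $bytes): string
{
    // No byte above 0x7F: ASCII-compatible encodings all agree on these bytes.
    if (preg_match('/[\x80-\xFF]/', $bytes) === 0) {
        return 'ascii';
    }
    // With the /u modifier, preg_match() fails on subjects that are not valid UTF-8.
    if (preg_match('//u', $bytes) === 1) {
        return 'utf-8';
    }
    return 'other';
}

echo classifyEncoding("plain"), "\n";       // ascii
echo classifyEncoding("caf\xC3\xA9"), "\n"; // utf-8
echo classifyEncoding("caf\xE9"), "\n";     // other
```

Files reported as 'ascii' are the safe case described above; files reported as 'utf-8' are the ones where a misconfigured editor may show odd characters.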