Problem description
I'm trying to parse a String which contains XML content which conforms to the XML 1.1 spec. The XML contains character references which are not allowed in the XML 1.0 spec but which are allowed in the XML 1.1 spec (character references which translate to Unicode characters in the range U+0001–U+001F).
According to the Xerces2 website, the Xerces2 parser supports parsing XML 1.1 documents. However, I cannot figure out how to tell it that the XML we are trying to parse is XML 1.1.
I'm using a DocumentBuilder to parse the XML (something like this):
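    import java.io.IOException;
    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;

    public Element parseString(String xmlString) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = dbf.newDocumentBuilder();
            InputSource source = new InputSource(new StringReader(xmlString));
            // Throws org.xml.sax.SAXParseException because of the invalid character refs
            Document doc = documentBuilder.parse(source);
            return doc.getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Handle the error; rethrown here so that every path returns or throws
            throw new IllegalStateException("Could not parse XML string", e);
        }
    }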
I've tried setting the XML header to indicate the XML conforms to the 1.1 spec...
    xmlString = "<?xml version=\"1.1\" encoding=\"UTF-8\" ?>" + xmlString;
...but it is still parsed as 1.0 XML (still generates the invalid character reference exceptions).
How can I configure the Xerces parser to parse the XML as XML 1.1? Is there an alternative parser which provides better support for XML 1.1?
Recommended answer
See the Xerces documentation for a list of all the features supported by Xerces. The two features below may be the ones relevant here. Note that the second one is listed as read-only, so it can be queried but not switched on.
http://xml.org/sax/features/unicode-normalization-checking
True: Perform Unicode normalization checking (as described in section 2.13 and Appendix B of the XML 1.1 Recommendation) and report normalization errors.
False: Do not report Unicode normalization errors.
http://xml.org/sax/features/xml-1.1
True: The parser supports both XML 1.0 and XML 1.1.
False: The parser supports only XML 1.0.
Access: read-only
Since: Xerces-J 2.7.0
Note: The value of this feature will depend on whether the parser configuration owned by the SAX parser is known to support XML 1.1.
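Since the xml-1.1 feature is read-only, one way to use it is to ask the parser whether it reports XML 1.1 support before feeding it a 1.1 document. The following is only a rough sketch using the standard JAXP and SAX APIs. The class name Xml11Check and both helper methods are made up for illustration, and the result depends on which parser implementation JAXP picks up on your classpath. Also keep in mind that an XML declaration is only valid at the very start of the document, so prepending one to a string that already has a declaration (or leading whitespace) will not work.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of Xerces or JAXP.
final class Xml11Check {
    static final String XML_11_FEATURE = "http://xml.org/sax/features/xml-1.1";

    /** Returns true if the SAX parser located by JAXP reports XML 1.1 support. */
    static boolean supportsXml11() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        try {
            return reader.getFeature(XML_11_FEATURE);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // The parser does not know this feature, so treat it as "no XML 1.1 support".
            return false;
        }
    }

    /** Parses the given XML string and returns the text content of the root element. */
    static String parseRootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Parser reports XML 1.1 support: " + supportsXml11());

        // The declaration is the first thing in the string; &#x1; is only legal in XML 1.1.
        String xml = "<?xml version=\"1.1\"?><root>a&#x1;b</root>";
        String text = parseRootText(xml);
        System.out.println("Root text length: " + text.length());
    }
}
```

If supportsXml11() returns false, or the parse still fails with the declaration in place, it is worth checking which parser implementation is actually being loaded and whether the input string already starts with its own XML declaration.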