
Parsing XML file containing HTML entities in Java without changing the XML

Problem description

I have to parse a bunch of XML files in Java that sometimes -- and invalidly -- contain HTML entities such as &mdash;, &gt; and so forth. I understand the correct way of dealing with this is to add suitable entity declarations to the XML file before parsing. However, I can't do that as I have no control over those XML files.
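For reference, the declaration-based approach mentioned above could look roughly like the sketch below, if the file could be edited. It parses the same content as the test.xml shown later in the question, with an internal DTD subset added that declares the two entities. The class name DeclaredEntitiesDemo and the helper barText are made up for this illustration, and the sketch assumes the parser's default configuration, which accepts DOCTYPE declarations.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: not part of the original question.
public class DeclaredEntitiesDemo {
    // Same content as test.xml, plus an internal DTD subset declaring the entities.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE foo [\n"
        + "  <!ENTITY nbsp \"&#160;\">\n"
        + "  <!ENTITY mdash \"&#8212;\">\n"
        + "]>\n"
        + "<foo>\n"
        + "    <bar>Some&nbsp;text &mdash; invalid!</bar>\n"
        + "</foo>\n";

    // Parses XML with the standard DOM API and returns the text of the first <bar>.
    static String barText() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = dbf.newDocumentBuilder();
        Document doc = parser.parse(
            new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("bar").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The entity references are expanded to U+00A0 and U+2014 in the DOM text.
        System.out.println(barText());
    }
}
```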

Is there some kind of callback I can override that is invoked whenever the Java XML parser encounters such an entity? I haven't been able to find one in the API.

I'd like to use:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

DocumentBuilder parser = dbf.newDocumentBuilder();
Document        doc    = parser.parse( stream );

I found that I can override resolveEntity in org.xml.sax.helpers.DefaultHandler, but how do I use this with the higher-level API?
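As a point of reference, DocumentBuilder does accept a resolver through setEntityResolver, so the hook can be attached to the higher-level API as sketched below. The catch is that an EntityResolver is consulted for external entities (those with a public or system identifier), so it is not expected to help with an undeclared reference such as &nbsp;. The class name ResolverDemo and the helper parseFails are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: not part of the original question.
public class ResolverDemo {
    // Same content as test.xml: no DOCTYPE, two undeclared entity references.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<foo>\n"
        + "    <bar>Some&nbsp;text &mdash; invalid!</bar>\n"
        + "</foo>\n";

    // Returns true if parsing still fails with a SAXParseException.
    static boolean parseFails() throws Exception {
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Attach a resolver to the DOM-level API; it logs any call it receives.
        parser.setEntityResolver((publicId, systemId) -> {
            System.out.println("resolveEntity called for " + systemId);
            return null; // null tells the parser to use its default resolution
        });
        try {
            parser.parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXParseException e) {
            System.out.println("Parse failed: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("still fails: " + parseFails());
    }
}
```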

Here's a full example:

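import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Main {
    public static void main( String [] args ) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = dbf.newDocumentBuilder();
        // Fails on test.xml because &nbsp; and &mdash; are not declared
        Document        doc    = parser.parse( new FileInputStream( "test.xml" ));
    }
}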

with test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <bar>Some&nbsp;text &mdash; invalid!</bar>
</foo>

Produces:

[Fatal Error] :3:20: The entity "nbsp" was referenced, but not declared.
Exception in thread "main" org.xml.sax.SAXParseException; lineNumber: 3; columnNumber: 20; The entity "nbsp" was referenced, but not declared.

Update: I have been poking around in the JDK source code with a debugger, and boy, what an amount of spaghetti. I have no idea what the design is there, or whether there is one. Just how many layers of an onion can one layer on top of each other?

The key class seems to be com.sun.org.apache.xerces.internal.impl.XMLEntityManager, but I cannot find any code that either lets me add stuff into it before it gets used, or that attempts to resolve entities without going through that class.

Solution

I would use a library like Jsoup for this purpose. I tested the code below and it works, though I don't know if it fits your case. Jsoup can be downloaded here: http://jsoup.org/download

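import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class Main {
    public static void main(String[] args) {
        // Quotes inside the string literal must be escaped
        String html = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo>" +
                      "<bar>Some&nbsp;text &mdash; invalid!</bar></foo>";
        // Parse with Jsoup's XML parser instead of the default HTML parser
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());

        // Print every <bar> element
        for (Element e : doc.select("bar")) {
            System.out.println(e);
        }
    }
}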

Result:

<bar>
 Some&nbsp;text — invalid!
</bar>

Loading from a file can be found here:

http://jsoup.org/cookbook/input/load-document-from-file
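If adding a dependency is not an option, one possible alternative that leaves the files on disk untouched is to rewrite the named references in memory before handing the text to DocumentBuilder. The sketch below is not from the answer above. The class name EntityPreprocessor, its methods, and the small entity table are made up for this illustration, and the table would need to be extended to cover whatever entities actually occur in the files.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class: not part of the original answer.
public class EntityPreprocessor {
    // Deliberately tiny, illustrative table of HTML entity names to code points.
    static final Map<String, Integer> HTML_ENTITIES = new HashMap<>();
    static {
        HTML_ENTITIES.put("nbsp", 160);
        HTML_ENTITIES.put("mdash", 8212);
        HTML_ENTITIES.put("ndash", 8211);
        HTML_ENTITIES.put("copy", 169);
    }

    // Matches named references such as &nbsp; (numeric ones like &#160; do not match).
    static final Pattern NAMED_REF = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces known HTML entity names with numeric character references.
    // Names not in the table (including amp, lt, gt, apos, quot) are left as-is.
    static String replaceHtmlEntities(String xml) {
        Matcher m = NAMED_REF.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = HTML_ENTITIES.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Preprocesses the XML text in memory, then parses it with the standard DOM API.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(replaceHtmlEntities(xml))));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<foo><bar>Some&nbsp;text &mdash; invalid!</bar></foo>");
        System.out.println(doc.getElementsByTagName("bar").item(0).getTextContent());
    }
}
```

One limitation of this sketch: the replacement is purely textual, so a string like &nbsp; inside a CDATA section or a comment would be rewritten as well, even though it is literal text there.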
