Problem description
I'm writing a Java application using JDOM (import org.jdom.*;).
My XML is valid, but sometimes it contains HTML tags. For example, something like this:
<program-title>Anatomy & Physiology</program-title>
<overview>
<content>
For more info click <a href="page.html">here</a>
<p>Learn more about the human body. Choose from a variety of Physiology (A&P) designed for complementary therapies.&#160; Online studies options are available.</p>
</content>
</overview>
<key-information>
<category>Health & Human Services</category>
So my problem is with the <p> tags inside the overview.content node.
I was hoping that this code would work:
Element overview = sds.getChild("overview");
Element content = overview.getChild("content");
System.out.println(content.getText());
But it returns only whitespace.
How do I return all the text (nested tags and all) from the overview.content node?
Thanks
Recommended answer
content.getText() returns only the element's immediate text, so it is useful only for leaf elements that contain nothing but text.
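To make "immediate text" concrete, here is a minimal sketch of the idea. It deliberately uses the JDK's built-in org.w3c.dom API as a stand-in so that it runs without the JDOM jar; it is not JDOM code, and the class ImmediateTextSketch and its helpers parseRoot and immediateText are hypothetical names made up for this illustration. It collects only the text nodes that are direct children of an element and skips child elements, which is roughly what the paragraph above describes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; uses the JDK DOM API, not JDOM.
class ImmediateTextSketch {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Concatenates only the text nodes that are direct children of the
    // element; child elements such as <a> or <p> are skipped entirely.
    static String immediateText(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE
                    || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Mixed content: only the text outside <a> is returned.
        Element mixed = parseRoot(
                "<content>For more info click <a href=\"page.html\">here</a></content>");
        System.out.println("[" + immediateText(mixed) + "]"); // prints "[For more info click ]"

        // Only a <p> child: the immediate text is just the surrounding whitespace.
        Element onlyP = parseRoot(
                "<content>\n  <p>Learn more about the human body.</p>\n</content>");
        System.out.println("[" + immediateText(onlyP).trim() + "]"); // prints "[]"
    }
}
```

The second case shows why a call like this can appear to return nothing: when all of the visible text sits inside nested tags, the only immediate text left is the whitespace between them.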
The trick is to use org.jdom.output.XMLOutputter (with the CompactFormat text mode).
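import java.io.StringWriter;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class ContentDump { // class name is arbitrary
    public static void main(String[] args) throws Exception {
        // Parse the XML file into a JDOM document
        SAXBuilder builder = new SAXBuilder();
        String xmlFileName = "a.xml";
        Document doc = builder.build(xmlFileName);
        // Navigate to overview/content
        Element root = doc.getRootElement();
        Element overview = root.getChild("overview");
        Element content = overview.getChild("content");
        // Serialize the children of <content>, nested tags included
        XMLOutputter outp = new XMLOutputter();
        outp.setFormat(Format.getCompactFormat());
        //outp.setFormat(Format.getRawFormat());
        //outp.setFormat(Format.getPrettyFormat());
        //outp.getFormat().setTextMode(Format.TextMode.PRESERVE);
        StringWriter sw = new StringWriter();
        outp.output(content.getContent(), sw);
        System.out.println(sw.toString());
    }
}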
Output
For more info click<a href="page.html">here</a><p>Learn more about the human body. Choose from a variety of Physiology (A&P) designed for complementary therapies.&#160; Online studies options are available.</p>
Do explore the other formatting options and modify the code above to suit your needs.
From the Javadoc for org.jdom.output.Format:
"Class to encapsulate XMLOutputter format options. Typical users can use the standard format configurations obtained by getRawFormat() (no whitespace changes), getPrettyFormat() (whitespace beautification), and getCompactFormat() (whitespace normalization)."
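As a rough illustration of what those whitespace treatments mean for a single text node, here is a sketch using plain string operations. It is an approximation for intuition only, not JDOM's implementation: the class TextModeSketch and its methods preserve, trim and normalize are hypothetical names, and the sketch ignores the indentation and line breaks that the pretty format also adds.

```java
// Hypothetical illustration class; approximates the effect of the
// whitespace modes on one text node with plain string operations.
class TextModeSketch {
    // XML whitespace: space, tab, line feed, carriage return.
    private static final String WS = "[ \\t\\n\\r]+";

    // Raw: the text is left exactly as it is.
    static String preserve(String text) {
        return text;
    }

    // Trim: leading and trailing whitespace is removed, inner runs are kept.
    static String trim(String text) {
        return text.replaceAll("^" + WS + "|" + WS + "$", "");
    }

    // Normalize: trimmed, and each inner run of whitespace becomes one space.
    static String normalize(String text) {
        return trim(text).replaceAll(WS, " ");
    }

    public static void main(String[] args) {
        String node = "\n  For more info   click ";
        System.out.println("[" + preserve(node) + "]");
        System.out.println("[" + trim(node) + "]");      // prints "[For more info   click]"
        System.out.println("[" + normalize(node) + "]"); // prints "[For more info click]"
    }
}
```

Under this reading, normalization drops the trailing space after "click", which is consistent with the output above showing click<a with no space in between. If that space matters, the raw format (or the PRESERVE text mode commented out in the answer's code) is the option to try.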