Question
I have an XML document that looks like this:
<xml>
<web:Web>
<web:Total>4000</web:Total>
<web:Offset>0</web:Offset>
</web:Web>
</xml>
My question is: how do I access these elements using a library like BeautifulSoup in Python?
xmlDom.web["Web"].Total does not work.
Recommended answer
BeautifulSoup isn't a DOM library per se (it doesn't implement the DOM APIs). To make matters more complicated, you're using namespaces in that XML fragment. To parse that specific piece of XML, you'd use BeautifulSoup as follows:
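# Ported to BeautifulSoup 4 on Python 3; the original answer targeted BeautifulSoup 3
from bs4 import BeautifulSoup

xml = """<xml>
<web:Web>
<web:Total>4000</web:Total>
<web:Offset>0</web:Offset>
</web:Web>
</xml>"""

# html.parser lowercases tag names and keeps the prefix as part of the name
doc = BeautifulSoup(xml, "html.parser")
print(doc.find("web:total").string)
print(doc.find("web:offset").string)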
If you weren't using namespaces, the code could look like this:
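# Ported to BeautifulSoup 4 on Python 3; the original answer targeted BeautifulSoup 3
from bs4 import BeautifulSoup

xml = """<xml>
<Web>
<Total>4000</Total>
<Offset>0</Offset>
</Web>
</xml>"""

# Without prefixes, each (lowercased) tag name is a valid Python attribute name
doc = BeautifulSoup(xml, "html.parser")
print(doc.xml.web.total.string)
print(doc.xml.web.offset.string)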
The key here is that BeautifulSoup doesn't know (or care) anything about namespaces. Thus web:Web is treated as a tag named web:web rather than as a Web tag belonging to the web namespace. While BeautifulSoup adds web:web to the xml element's dictionary, Python syntax doesn't recognize web:web as a single identifier, so the dotted attribute shortcut cannot reach it and find() with the full name as a string is needed instead.
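To make the point above concrete, here is a minimal sketch that uses Python's built-in HTMLParser as a stand-in for a namespace-unaware parser. It is not BeautifulSoup itself, and the TagCollector class is a hypothetical helper written only for this illustration. It shows that the prefix and colon simply become part of a lowercased tag name, and that such a name is not a valid Python identifier.

```python
from html.parser import HTMLParser

XML = """<xml>
<web:Web>
<web:Total>4000</web:Total>
<web:Offset>0</web:Offset>
</web:Web>
</xml>"""


class TagCollector(HTMLParser):
    """Hypothetical helper: records tag names and text as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []      # tag names in document order
        self.text = {}       # tag name -> stripped text content
        self.current = None  # most recently opened tag, if still open

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)
        self.current = tag

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        # Ignore whitespace-only text between elements
        if self.current and data.strip():
            self.text[self.current] = data.strip()


collector = TagCollector()
collector.feed(XML)
collector.close()

print(collector.names)               # the prefix is just part of each name
print(collector.text["web:total"])   # string lookup works with the full name

# Attribute-style access needs a valid identifier, which a prefixed name is not
print("web:total".isidentifier())    # False
print("total".isidentifier())        # True
```

String-based lookup, such as find() in BeautifulSoup, sidesteps the identifier restriction because the colon never has to appear in Python syntax.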
You can learn more about it by reading the BeautifulSoup documentation.
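If you do need namespace-aware access, the standard library's xml.etree.ElementTree is one alternative. The sketch below rests on an assumption: the web prefix must be declared in the document. The fragment in the question does not declare it, so a strict XML parser would reject it as written. The URI used here is made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the question's fragment never declares one
WEB_NS = "http://example.com/web"

XML = f"""<xml xmlns:web="{WEB_NS}">
<web:Web>
<web:Total>4000</web:Total>
<web:Offset>0</web:Offset>
</web:Web>
</xml>"""

root = ET.fromstring(XML)

# Map the prefix used in the search paths to the namespace URI
ns = {"web": WEB_NS}

total = root.find("web:Web/web:Total", ns).text
offset = root.find("web:Web/web:Offset", ns).text
print(total, offset)

# A namespace-aware parser stores the URI, not the prefix, in the tag name
print(root[0].tag)
```

Unlike the BeautifulSoup approach, tag names here are case-sensitive and the prefix is resolved to its URI, so the lookup keeps working even if the document uses a different prefix for the same namespace.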