
How to efficiently parse 200,000 XML files in Java?

Problem description

I have 200,000 XML files that I want to parse and store in a database.

Here is an example: https://gist.github.com/902292

This is about as complex as the XML files get. This will also run on a small VPS (Linode), so memory is tight.

What I'd like to know is:

1) Should I use a DOM or a SAX parser? DOM seems easier and faster since each XML file is small.

2) Where can I find a simple tutorial on that parser (DOM or SAX)?
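For comparison with DOM, here is a minimal sketch of the SAX approach using the JDK's built-in `javax.xml.parsers` API. The class name `SaxTextExtractor`, the `extract` method, and the sample element names (`doc`, `title`, `year`) are made up for illustration and are not taken from the linked gist. It assumes each field of interest is a leaf element that contains only text.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams one XML document and collects leaf element text.
public class SaxTextExtractor {

    // Returns a map from element name to its trimmed text content.
    // If an element name repeats, the last occurrence wins.
    static Map<String, String> extract(InputStream in) throws Exception {
        Map<String, String> fields = new HashMap<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        parser.parse(in, new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                text.setLength(0); // discard whitespace seen before this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    fields.put(qName, value);
                }
                text.setLength(0);
            }
        });
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><title>Hello</title><year>2011</year></doc>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        Map<String, String> fields = extract(in);
        System.out.println(fields.get("title"));
        System.out.println(fields.get("year"));
    }
}
```

The handler never holds more than the current element's text, so memory use per file stays small regardless of which parser the rest of the pipeline uses.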

Thanks

Edit

I tried the DOM route even though everyone suggested SAX, mainly because I found an "easier" tutorial for DOM, and I thought that since the average file size is about 3k - 4k, DOM would easily be able to hold each file in memory.

However, I wrote a recursive routine to handle all 200k files, and it gets about 40% of the way through them before Java runs out of memory.
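The cause of the out-of-memory error is not established here. One thing worth ruling out is whether parsed documents or file lists stay reachable while the routine recurses. The sketch below is one possible way to process files strictly one at a time with DOM. The class name `DomBatchParser` and the `parseAll` method are hypothetical, and the database insert is left as a placeholder comment.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Iterator;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical batch driver: walks a directory tree and parses each .xml file in turn.
public class DomBatchParser {

    // Parses every *.xml file under root and returns how many were parsed.
    static int parseAll(Path root) throws Exception {
        // One builder is reused for all files instead of creating 200,000 of them.
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        int count = 0;

        // Files.walk is lazy, so the full file list is never held in memory.
        try (Stream<Path> paths = Files.walk(root)) {
            Iterator<Path> it = paths
                .filter(p -> p.toString().endsWith(".xml"))
                .iterator();

            while (it.hasNext()) {
                Path file = it.next();
                Document doc = builder.parse(file.toFile());

                // Placeholder: extract fields and insert them into the database here.
                String rootName = doc.getDocumentElement().getNodeName();
                if (rootName != null) {
                    count++;
                }

                // doc goes out of scope on the next iteration, so it can be collected.
                builder.reset();
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Parsed files: " + parseAll(root));
    }
}
```

If memory still grows with a loop like this, the leak is more likely in whatever accumulates results (for example, a list of parsed records awaiting insertion) than in DOM itself.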

Here is part of the project: https://gist.github.com/905550#file_xm_lparser.java

Should I ditch DOM now and just use SAX? It just seems that with such small files, DOM should be able to handle it.

Also, the speed is "fast enough": it takes about 19 seconds to parse 2000 XML files (before the Mongo insert).

Thanks
