Problem Description
I am exploring switching to Python and pandas as a long-time SAS user.
However, when running some tests today, I was surprised that Python ran out of memory when trying to pandas.read_csv() a 128 MB csv file. It had about 200,000 rows and 200 columns of mostly numeric data.
With SAS, I can import a csv file into a SAS dataset, and it can be as large as my hard drive.
Is there something analogous in pandas?
I regularly work with large files and do not have access to a distributed computing network.
Recommended Answer
In principle it shouldn't run out of memory, but there are currently memory problems with read_csv on large files, caused by some complex Python internal issues (this is vague, but it has been known for a long time: http://github.com/pydata/pandas/issues/407).
At the moment there isn't a perfect solution (here's a tedious one: you could transcribe the file row by row into a pre-allocated NumPy array or memory-mapped file, np.memmap), but it's one I'll be working on in the near future. Another solution is to read the file in smaller pieces (use iterator=True, chunksize=1000), then concatenate them with pd.concat. The problem comes in when you pull the entire text file into memory in one big slurp.
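A minimal sketch of the chunked approach (the csv path and chunk size here are placeholders, and the parsed result is assumed to fit in memory once loaded):

    import pandas as pd

    # Read the csv in pieces instead of one big slurp; each chunk is a DataFrame.
    reader = pd.read_csv("big_file.csv", iterator=True, chunksize=1000)

    # Concatenate the chunks at the end. If even the combined result is too
    # large, process each chunk inside a loop instead of concatenating.
    df = pd.concat(reader, ignore_index=True)

And a rough sketch of the tedious row-by-row transcription into a memory-mapped array, assuming a purely numeric csv with a header row and dimensions known in advance (the shape and file names are made up for illustration):

    import csv
    import numpy as np

    n_rows, n_cols = 200000, 200  # assumed known ahead of time

    # Pre-allocate a memory-mapped array backed by a file on disk.
    data = np.memmap("data.dat", dtype="float64", mode="w+", shape=(n_rows, n_cols))

    with open("big_file.csv", newline="") as f:
        rows = csv.reader(f)
        next(rows)  # skip the header row
        for i, row in enumerate(rows):
            data[i, :] = [float(x) for x in row]

    data.flush()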