
從 SQLlite 數(shù)據(jù)庫中讀取許多表并在 R 中組合

Reading in many tables from SQLite databases and combining in R

Problem Description



I'm working with a program that outputs a database of results. I have hundreds of these databases that are all identical in structure and I'd like to combine them into ONE big database. I'm mostly interested in 1 table from each database. I don't work with databases/sql very much, but it would simplify other steps in the process, to skip outputting a csv.


Previously I did this by exporting a csv and used these steps to combine all csvs:

library(DBI)
library(RSQLite)
library(dplyr)

csv_locs <- list.files(newdir, recursive = TRUE, pattern = "*.csv", full.names = TRUE)

pic_dat <- do.call("rbind", lapply(csv_locs,
  FUN = function(files) data.table::fread(files, data.table = FALSE)))
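As an aside, if the `rbind` step itself is slow, `data.table::rbindlist` is usually a faster drop-in for `do.call("rbind", ...)`, since it allocates the combined result once instead of growing it. A minimal sketch with toy CSVs (the file names here are made up for illustration):

```r
library(data.table)

# Toy CSVs standing in for the exported result files
csv_locs <- file.path(tempdir(), c("a.csv", "b.csv"))
fwrite(mtcars[1:2, 1:3], csv_locs[1])
fwrite(mtcars[3:5, 1:3], csv_locs[2])

# rbindlist() binds all frames in one pass, avoiding the
# repeated copying that do.call("rbind", ...) can incur
pic_dat <- rbindlist(lapply(csv_locs, fread))
nrow(pic_dat)  # 5
```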


How to do this with sql type database tables??

I'm basically pulling out the first table, then joining on the rest with a loop.

db_locs <- list.files(directory, recursive = TRUE, pattern = "*.ddb", full.names = TRUE)

# first table
con1 <- DBI::dbConnect(RSQLite::SQLite(), db_locs[1])
start <- tbl(con1, "DataTable")

# open connection to location[i], get table, union, disconnect; repeat.
for (i in 2:length(db_locs)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), db_locs[i])
  y <- tbl(con, "DataTable")
  start <- union(start, y, copy = TRUE)
  dbDisconnect(con)
}


This is exceptionally slow! Well, to be fair, it's large data and the csv route is also slow.

I think I honestly wrote the slowest possible way to do this :) I could not get the do.call/lapply option to work here, but maybe I'm missing something.

Recommended Answer

這看起來類似于迭代rbind幀",因?yàn)槊看文氵@樣做union,它會(huì)將整個(gè)表復(fù)制到一個(gè)新對(duì)象中(未經(jīng)證實(shí),但這是我的直覺).這可能對(duì)少數(shù)人有效,但擴(kuò)展性很差.我建議您將所有表收集到一個(gè)列表中,并在最后調(diào)用 data.table::rbindlist 一次,然后插入到一個(gè)表中.

This looks similar to "iterative rbinding of frames", in that each time you do this union, it will copy the entire table into a new object (unconfirmed, but that's my gut feeling). This might work well for a few but scales very poorly. I suggest you collect all tables in a list and call data.table::rbindlist once at the end, then insert into a table.


Without your data, I'll contrive a situation. And because I'm not entirely certain if you have just one table per sqlite3 file, I'll add two tables per database. If you only have one, the solution simplifies easily.

for (i in 1:3) {
  con <- DBI::dbConnect(RSQLite::SQLite(), sprintf("mtcars_%d.sqlite3", i))
  DBI::dbWriteTable(con, "mt1", mtcars[1:3,1:3])
  DBI::dbWriteTable(con, "mt2", mtcars[4:5,4:7])
  DBI::dbDisconnect(con)
}
(lof <- list.files(pattern = "*.sqlite3", full.names = TRUE))
# [1] "./mtcars_1.sqlite3" "./mtcars_2.sqlite3" "./mtcars_3.sqlite3"


Now I'll iterate over each of them and read the contents of each table:

allframes <- lapply(lof, function(fn) {
  con <- DBI::dbConnect(RSQLite::SQLite(), fn)
  mt1 <- tryCatch(DBI::dbReadTable(con, "mt1"),
                  error = function(e) NULL)
  mt2 <- tryCatch(DBI::dbReadTable(con, "mt2"),
                  error = function(e) NULL)
  DBI::dbDisconnect(con)
  list(mt1 = mt1, mt2 = mt2)
})
allframes
# [[1]]
# [[1]]$mt1
#    mpg cyl disp
# 1 21.0   6  160
# 2 21.0   6  160
# 3 22.8   4  108
# [[1]]$mt2
#    hp drat    wt  qsec
# 1 110 3.08 3.215 19.44
# 2 175 3.15 3.440 17.02
# [[2]]
# [[2]]$mt1
#    mpg cyl disp
# 1 21.0   6  160
# 2 21.0   6  160
# 3 22.8   4  108
### ... repeated

從這里開始,只需將它們組合在 R 中并寫入新數(shù)據(jù)庫.雖然您可以使用 do.call(rbind,...)dplyr::bind_rows,但您已經(jīng)提到了 data.table 所以我會(huì)堅(jiān)持下去:

From here, just combine them in R and write to a new database. While you can use do.call(rbind,...) or dplyr::bind_rows, you already mentioned data.table so I'll stick with that:

con <- DBI::dbConnect(RSQLite::SQLite(), "mtcars_all.sqlite3")
DBI::dbWriteTable(con, "mt1", data.table::rbindlist(lapply(allframes, `[[`, 1)))
DBI::dbWriteTable(con, "mt2", data.table::rbindlist(lapply(allframes, `[[`, 2)))
DBI::dbGetQuery(con, "select count(*) as n from mt1")
#   n
# 1 9
DBI::dbDisconnect(con)
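One small addition worth knowing about: if it matters which source database each row came from, `rbindlist`'s `idcol` argument records the originating list element as a column. A sketch with a made-up list standing in for the per-database frames:

```r
library(data.table)

# Toy named list standing in for one table collected per database
frames <- list(
  db1 = data.frame(mpg = c(21, 22.8), cyl = c(6, 4)),
  db2 = data.frame(mpg = 24.4,        cyl = 4)
)

# idcol = "src_db" adds a column holding each row's list-element name,
# i.e. which database it came from
combined <- rbindlist(frames, idcol = "src_db")
```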


In the event that you can't load them all into R at one time, then append them to the table in real-time:

con <- DBI::dbConnect(RSQLite::SQLite(), "mtcars_all2.sqlite3")
for (fn in lof) {
  con2 <- DBI::dbConnect(RSQLite::SQLite(), fn)
  mt1 <- tryCatch(DBI::dbReadTable(con2, "mt1"), error = function(e) NULL)
  if (!is.null(mt1)) DBI::dbWriteTable(con, "mt1", mt1, append = TRUE)
  mt2 <- tryCatch(DBI::dbReadTable(con2, "mt2"), error = function(e) NULL)
  if (!is.null(mt1)) DBI::dbWriteTable(con, "mt2", mt2, append = TRUE)
  DBI::dbDisconnect(con2)
}
DBI::dbGetQuery(con, "select count(*) as n from mt1")
#   n
# 1 9


This doesn't suffer the iterative-slowdown that you're experiencing.
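A further alternative, not from the answer above but worth a sketch: SQLite can combine the files without the rows ever passing through R, by `ATTACH`-ing each source database and running `INSERT ... SELECT` inside SQLite itself. Assuming each file holds a `DataTable` as in the original question (the file names below are toy stand-ins):

```r
library(DBI)
library(RSQLite)

# Set up two toy source databases (stand-ins for the real result files)
src_files <- file.path(tempdir(), c("res_1.sqlite3", "res_2.sqlite3"))
for (f in src_files) {
  con <- dbConnect(RSQLite::SQLite(), f)
  dbWriteTable(con, "DataTable", mtcars[1:3, 1:3])
  dbDisconnect(con)
}

# Combine server-side: ATTACH each source, copy its rows with
# INSERT ... SELECT, then DETACH
out <- dbConnect(RSQLite::SQLite(), file.path(tempdir(), "combined.sqlite3"))
for (f in src_files) {
  dbExecute(out, sprintf("ATTACH DATABASE '%s' AS src", f))
  if (!dbExistsTable(out, "DataTable")) {
    # clone the schema with an empty SELECT
    dbExecute(out, "CREATE TABLE DataTable AS SELECT * FROM src.DataTable WHERE 0")
  }
  dbExecute(out, "INSERT INTO DataTable SELECT * FROM src.DataTable")
  dbExecute(out, "DETACH DATABASE src")
}
n <- dbGetQuery(out, "SELECT COUNT(*) AS n FROM DataTable")$n
dbDisconnect(out)
n  # 6
```

This keeps memory use flat regardless of how many databases there are, at the cost of doing the combining in SQL rather than R.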

這篇關(guān)于從 SQLlite 數(shù)據(jù)庫中讀取許多表并在 R 中組合的文章就介紹到這了,希望我們推薦的答案對(duì)大家有所幫助,也希望大家多多支持html5模板網(wǎng)!

【網(wǎng)站聲明】本站部分內(nèi)容來源于互聯(lián)網(wǎng),旨在幫助大家更快的解決問題,如果有圖片或者內(nèi)容侵犯了您的權(quán)益,請(qǐng)聯(lián)系我們刪除處理,感謝您的支持!

相關(guān)文檔推薦

What SQL Server Datatype Should I Use To Store A Byte[](我應(yīng)該使用什么 SQL Server 數(shù)據(jù)類型來存儲(chǔ)字節(jié) [])
Interpreting type codes in sys.objects in SQL Server(解釋 SQL Server 中 sys.objects 中的類型代碼)
Typeorm Does not return all data(Typeorm 不返回所有數(shù)據(jù))
Typeorm .loadRelationCountAndMap returns zeros(Typeorm .loadRelationCountAndMap 返回零)
How to convert #39;2016-07-01 01:12:22 PM#39; to #39;2016-07-01 13:12:22#39; hour format?(如何將“2016-07-01 01:12:22 PM轉(zhuǎn)換為“2016-07-01 13:12:22小時(shí)格式?)
MS SQL: Should ISDATE() Return quot;1quot; when Cannot Cast as Date?(MS SQL:ISDATE() 是否應(yīng)該返回“1?什么時(shí)候不能投射為日期?)
主站蜘蛛池模板: 人人性人人性碰国产 | 国产成人网| 久久精品欧美一区二区三区不卡 | 操网站 | 久久精品国产一区 | 91观看 | 亚洲成人免费视频在线观看 | 国产无人区一区二区三区 | 欧美理论片在线观看 | 国产精品爱久久久久久久 | 欧美一区视频 | 久久一区二区三区电影 | 成人h视频在线 | 日本在线免费看最新的电影 | 国产成人综合久久 | 婷婷久久一区 | 97视频在线免费 | 欧美在线视频一区二区 | 91欧美精品成人综合在线观看 | 欧美天堂一区 | 黄瓜av| 久久y| 九热在线 | 国产精品成人av | 在线成人一区 | 午夜理伦三级理论三级在线观看 | 亚洲精品久久久一区二区三区 | 91福利网 | 精品久久亚洲 | 午夜爱爱毛片xxxx视频免费看 | av在线一区二区三区 | 日本色综合 | www国产亚洲精品 | 亚洲国产视频一区 | 夜色www国产精品资源站 | 91成人在线视频 | 古典武侠第一页久久777 | 日韩视频免费看 | 老司机精品福利视频 | 天天草草草 | 亚洲综合色 |