Problem description
I am using Apache Spark 1.5.1 and trying to connect to a local SQLite database named clinton.db. Creating a data frame from a table of the database works fine, but when I run operations on the created object, I get the error below, which says "SQL error or missing database (Connection is closed)". The funny thing is that I still get the result of the operation. Any idea what I can do to solve the problem, i.e., avoid the error?
Start command for spark-shell:
../spark/bin/spark-shell --master local[8] --jars ../libraries/sqlite-jdbc-3.8.11.1.jar --classpath ../libraries/sqlite-jdbc-3.8.11.1.jar
Reading from the database:
val emails = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:sqlite:../data/clinton.sqlite", "dbtable" -> "Emails")).load()
Simple count (fails):
emails.count
Error:
15/09/30 09:06:39 WARN JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
    at org.sqlite.core.DB.newSQLException(DB.java:890)
    at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
    at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:90)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
res1: Long = 7945
Recommended answer
I got the same error today, and the important line is just before the exception:
15/11/30 12:13:02 INFO jdbc.JDBCRDD: closed connection
15/11/30 12:13:02 WARN jdbc.JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
    at org.sqlite.core.DB.newSQLException(DB.java:890)
    at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
    at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
So Spark succeeded in closing the JDBC connection, and then failed to close the JDBC statement.
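The order of events in those two log lines can be mimicked without Spark or SQLite. The following is only a sketch: FakeConnection, FakeStatement and closeQuietly are hypothetical stand-ins, not the driver's or Spark's classes, and they assume (as the log suggests) that closing a statement fails once its connection is gone and that the failure is recorded rather than rethrown.

```scala
import java.sql.SQLException
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins; they only mimic the behaviour seen in the log.
class FakeConnection {
  var isClosed = false
  def close(): Unit = { isClosed = true }
}

class FakeStatement(conn: FakeConnection) {
  def close(): Unit = {
    // Complain when the owning connection has already been closed.
    if (conn.isClosed) throw new SQLException("Connection is closed")
  }
}

val warnings = ArrayBuffer.empty[String]

// Record a failure as a warning instead of letting it propagate.
def closeQuietly(label: String)(body: => Unit): Unit =
  try {
    body
  } catch {
    case e: SQLException =>
      warnings += s"Exception closing $label: ${e.getMessage}"
      ()
  }

val conn = new FakeConnection
val stmt = new FakeStatement(conn)

closeQuietly("connection")(conn.close()) // first cleanup: connection is closed
closeQuietly("statement")(stmt.close())  // later cleanup: statement close fails

warnings.foreach(println)
```

Because the failure is only recorded, whatever was computed before the cleanup is unaffected, which is consistent with the count still being returned in the question.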
Looking at the source, close() is called twice:
Line 358 (org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD, Spark 1.5.1)
context.addTaskCompletionListener{ context => close() }
Line 469
override def hasNext: Boolean = {
  if (!finished) {
    if (!gotNext) {
      nextValue = getNext()
      if (finished) {
        close()
      }
      gotNext = true
    }
  }
  !finished
}
If you look at the close() method (line 443)
def close() {
if (closed) return
you can see that it checks the variable closed, but that value is never set to true.
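To illustrate what the guard is meant to do, here is a minimal sketch of an idempotent close(). It is not the Spark source or the actual patch; CountingResource and GuardedCloser are hypothetical names used only to show that the guard works once the flag is assigned.

```scala
// Hypothetical stand-in for the JDBC resources; it just counts close() calls.
class CountingResource {
  var closeCalls = 0
  def close(): Unit = { closeCalls += 1 }
}

// Sketch of a guarded close(): the early return only helps because
// `closed` is set to true after the first cleanup.
class GuardedCloser(resource: CountingResource) {
  private var closed = false

  def close(): Unit = {
    if (closed) return
    try {
      resource.close()
    } finally {
      closed = true // the assignment the answer reports as missing
    }
  }
}

val resource = new CountingResource
val closer = new GuardedCloser(resource)

closer.close() // e.g. from hasNext once the iterator is exhausted
closer.close() // e.g. from the task completion listener

println(resource.closeCalls) // prints 1
```

With the flag set, the second call returns immediately, so the underlying resource is never closed a second time.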
If I read it correctly, this bug is still in master. I have filed a bug report.
- Source: JDBCRDD.scala (line numbers differ slightly)