問題描述
我目前正在使用這樣的代碼來檢測 SQL 服務(wù)器作業(yè)是否正在運行.(這是 SQL Server 2005,所有 SP)
I'm currently using code like this to detect if a SQL server job is running. (this is SQL Server 2005, all SP's)
return (select isnull(
(select top 1 CASE
WHEN current_execution_status = 4 THEN 0
ELSE 1
END
from openquery(devtestvm, 'EXEC msdb.dbo.sp_help_job')
where current_execution_status = 4 and
name = 'WQCheckQueueJob' + cast(@Index as varchar(10))
), 1)
)
沒有問題,一般來說,它工作得很好.
No problems there, and generally speaking, it works just fine.
但是....(總是一個但是)
But.... (always a but)
有時,我會調(diào)用它,返回作業(yè)未運行"結(jié)果,此時我將嘗試通過
On occasion, I'll invoke this, get back a "job is not running" result, at which point I'll try and start the job, via
exec msdb.dbo.sp_start_job @JobName
并且 SQL 將返回SQLAgent 拒絕啟動作業(yè),因為它已經(jīng)有一個掛起的請求".
and SQL will return that "SQLAgent has refused to start the job because it already has a pending request".
好的.也不是問題.可以想象,在這段代碼可以啟動目標(biāo)作業(yè)之前,有一個小窗口可以啟動它,但是在檢查它是否啟動之后.但是,我可以將其包含在 try catch 中并忽略錯誤,對嗎?
Ok. Also not a problem. It's conceivable that there's a slight window where the target job could get started before this code can start it, but after checking if it's started. However, I can just wrap that up in a try catch and just ignore the error, right?
begin try
if dbo.WQIsQueueJobActive(@index) = 0 begin
exec msdb.dbo.sp_start_job @JobName
break
end
end try begin catch
-- nothing here
end catch
問題來了.
10 次中有 9 次,這很好用.SQL 代理將引發(fā)錯誤,它被捕獲,并且處理繼續(xù)進(jìn)行,因為作業(yè)已經(jīng)在運行,沒有傷害沒有犯規(guī).
9 times out of 10, this works just fine. SQL agent will raise the error, it's caught, and processing just continues on, since the job is already running, no harm no foul.
但偶爾,我會在作業(yè)歷史記錄"視圖中收到一條消息(請記住上面的代碼來檢測特定作業(yè)是否正在運行,如果不是從另一個作業(yè)實際運行,則啟動它)說作業(yè)失敗,因為SQLAgent 拒絕啟動作業(yè),因為它已經(jīng)有一個待處理的請求".
But occasionally, I'll get a message in the Job History view (keep in mind the above code to detect if a specific job is running and start it if not is actually running from another job) saying that the job failed because "SQLAgent has refused to start the job because it already has a pending request".
當(dāng)然,這正是 TRY CATCH 應(yīng)該處理的錯誤!
Of course, this is the exact error that TRY CATCH is supposed to be handling!
當(dāng)發(fā)生這種情況時,正在執(zhí)行的作業(yè)就會終止,但據(jù)我所知不會立即終止,只是非常接近.我把日志記錄到處都是,沒有一致性.一次失敗,它會在 a 點,下一次在 b 點.在某些情況下,A 地和 B 地只有一個
When this happens, the executing job just dies, but not immediately from what I can tell, just pretty close. I've put logging all over the place and there's no consistency. One time it fails, it'll be at place a, the next time at place b. In some cases, Place A and place B have nothing but a
select @var = 'message'
在他們之間.很奇怪.基本上,該作業(yè)似乎被毫不客氣地轉(zhuǎn)儲,并且該作業(yè)中剩下的任何要執(zhí)行的內(nèi)容都完全 + 不 + 執(zhí)行.
in between them. Very strange. Basically, the job appears to be unceremoniously dumped and anything left to execute in the job is +not+ executed at all.
但是,如果我刪除exec StartJob"(或者當(dāng)我知道目標(biāo)作業(yè)無法運行時,它只被調(diào)用一次),一切正常,我在作業(yè)中的所有處理都會運行.
However, if I remove the "exec StartJob" (or have it invoked exactly one time, when I KNOW that the target job can't already be running), everything works perfectly and all my processing in the job runs through.
這一切背后的目的是讓作業(yè)作為觸發(fā)器的結(jié)果(除其他外)啟動,如果作業(yè)已經(jīng)啟動,則真的沒有必要重新啟動".
The purpose behind all this is to have a job started as a result of a trigger (among other things), and, if the job is already started, there's really no need to "start it again".
有人在使用 SQL Agent 的作業(yè)處理時遇到過這樣的行為嗎?
Anyone ever run into behavior like this with SQL Agent's Job handling?
目前的控制流程是這樣的:
Current flow of control is like so:
- 更改為表(更新或插入)...
- 觸發(fā)調(diào)用...的觸發(fā)器
- 一個調(diào)用...的存儲過程
- sp_Start_Job 其中...
- 開始一個特定的工作......
- 調(diào)用另一個存儲過程(稱為 CheckQueue)...
- 執(zhí)行一些處理并...
- 檢查幾個表格,根據(jù)它們的內(nèi)容可能...
- 在另一個作業(yè)上調(diào)用 sp_start_job 以啟動第二個同步作業(yè)處理額外的工作(第二個作業(yè)也調(diào)用 CheckQueue sproc但是這兩個調(diào)用對完全獨立的數(shù)據(jù)集進(jìn)行操作)
推薦答案
首先,你有沒有機會看一下 service broker?從你的描述來看,這似乎是你真正想要的.
First of all, have you had a chance to look at service broker? From your description, it sounds like that's what you actually want.
不同之處在于您將數(shù)據(jù)放入 SB 隊列而不是開始工作,SB 將異步調(diào)用您的處理過程,并且完全回避已經(jīng)運行的工作等問題.它會自動生成/終止其他線程和需求決定,它負(fù)責(zé)秩序等.
The difference would be instead of starting a job, you put your data into a SB queue and SB will call your processing proc asynchronously and completely side-step issues with already-running jobs etc. It will auto spawn/terminate additional threads and demand dictates, it takes care of order etc.
這是一個很好的(并且相關(guān)的)教程.http://www.sqlteam.com/article/centralized-asynchronous-審計與服務(wù)經(jīng)紀(jì)人
Here's a good (and vaguely related) tutorial. http://www.sqlteam.com/article/centralized-asynchronous-auditing-with-service-broker
讓我們假設(shè)無論出于何種原因您都不能使用 SB(但說真的,一定要這樣做!).
Let's assume that you can't use SB for whatever reason (but seriously, do!).
如何使用作業(yè) spid 的 context_info.
What about using the job spid's context_info.
- 您的工作調(diào)用了一個單獨執(zhí)行每個步驟的包裝程序.
包裝過程中的第一條語句是
- Your job calls a wrapper proc that execs each step individually.
The first statement inside the wrapper proc is
DECLARE @context_info VARBINARY(30)
SET @context_info = CAST('MyJob1' AS VARBINARY)
SET CONTEXT_INFO @context_info
當(dāng)您的 proc 完成時(或在您的 catch 塊中)
When your proc finishes (or in your catch block)
SET CONTEXT_INFO 0x0
當(dāng)你打算打電話給你的工作時,這樣做:
When you are looking at calling your job, do this:
IF NOT EXISTS (SELECT * FROM master..sysprocesses WITH (NOLOCK) WHERE context_info=CAST('MyJob1' AS VARBINARY))
EXEC StartJob
當(dāng)您的包裝過程終止或連接關(guān)閉時,您的 context_info 將消失.
When your wrapper proc terminates or the connection is closed, your context_info goes away.
您還可以使用全局臨時表(即##JobStatus),當(dāng)所有引用它的 spid 斷開連接或顯式刪除它時,它們將消失.
You could also use a global temp table (i.e. ##JobStatus) They will disappear when all spids that reference it disconnect or if it's explicitly dropped.
只是一些想法.
這篇關(guān)于如何準(zhǔn)確檢測 SQL Server 作業(yè)是否正在運行并處理已運行的作業(yè)?的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網(wǎng)!