Problem description
My query returns a date, a time (together these are basically the timestamp) and a field that calculates the comp per hour.
date       time  comp/H
---------- ----- ----------------------
2019-09-10 07:01 13640,416015625
2019-09-10 07:02 8970,3193359375
2019-09-10 07:03 6105,4990234375
2019-09-10 07:04 7189,77880859375
2019-09-10 07:08 2266,73657226563
2019-09-10 07:57 163,527984619141
I would like to fill the gaps between the timestamps and add a new record for each minute that has no data assigned to it (for example, add records for 07:05, 07:06 and 07:07). I would assign a value of 0 to the comp/h field of those records, but I have no idea how to do this.
The eventual goal is to make a line graph of the data above in which downtime is visible at a glance (hence the 0 values for the "empty" records).
Original query:
select cast(p_timestamp as date) as 'datum', CONVERT(VARCHAR(5), p_timestamp, 108) as 'time', avg(((AantalPCBperPaneel*(AantalCP+AantalQP))/deltasec)* 3600) as 'comp/h'
from Testview3
where p_timestamp > '2019-09-01'
group by CONVERT(VARCHAR(5), p_timestamp, 108), cast(p_timestamp as date)
order by cast(p_timestamp as date) asc , CONVERT(VARCHAR(5), p_timestamp, 108) asc
Recommended answer
You can try the following code:
Set up a mock-up scenario:
SET DATEFORMAT ymd;
DECLARE @mockTable TABLE([date] DATE,[time] TIME,[comp/H] DECIMAL(20,5));
INSERT INTO @mockTable VALUES
('2019-09-10','07:01',13640.416015625)
,('2019-09-10','07:02',8970.3193359375)
,('2019-09-10','07:03',6105.4990234375)
,('2019-09-10','07:04',7189.77880859375)
,('2019-09-10','07:08',2266.73657226563)
,('2019-09-10','07:57',163.527984619141);
--Filter to a single day (only to keep this simple...)
DECLARE @TheDate DATE='20190910';
--The query
WITH CountMinutes(Nmbr) AS
(
SELECT TOP((SELECT DATEDIFF(MINUTE,MIN([time]),MAX([time]))
FROM @mockTable
WHERE [date]=@TheDate)+1) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1
FROM master..spt_values
)
SELECT @TheDate AS [date]
,CAST(DATEADD(MINUTE,mc.Nmbr,(SELECT MIN(t.[time]) FROM @mockTable t WHERE t.[date]=@TheDate)) AS TIME) AS [time]
,t2.[comp/H]
FROM CountMinutes mc
LEFT JOIN @mockTable t2 ON t2.[date]=@TheDate AND t2.[time]=CAST(DATEADD(MINUTE,mc.Nmbr,(SELECT MIN(t.[time]) FROM @mockTable t WHERE t.[date]=@TheDate)) AS TIME);
The idea in short:
We need a tally table, which is just a list of running numbers. I use master..spt_values, which is nothing more than a pre-filled table with a lot of rows. You can pick any existing table with enough rows to cover the range. We do not need the rows' values, only a counter over the set. You can also read up on tally tables and how to create them from a combination of VALUES() and CROSS JOIN. The magic here is the combination of the computed TOP() clause and ROW_NUMBER().
So the CTE returns a list of numbers, one for each minute between the first and the last time of the day.
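If you would rather not depend on master..spt_values, the counter can be built from VALUES() and CROSS JOIN alone. The following is only a sketch of that idea: the names Digits and Numbers, the variable @minutes and the hard-coded start time '07:01' are assumptions made for illustration, not part of the original query.

```sql
-- Sketch: build the minute counter from VALUES() and CROSS JOIN
-- instead of reading master..spt_values.
-- @minutes is a hypothetical input: the number of minutes from 07:01 to 07:57, inclusive.
DECLARE @minutes INT = 57;

WITH Digits(d) AS
(
    SELECT d FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(d)
)
,Numbers(Nmbr) AS
(
    -- 10 x 10 x 10 x 10 = 10,000 candidate rows, more than the 1,440 minutes of a day.
    -- TOP() cuts the set down to exactly the number of rows we need.
    SELECT TOP(@minutes) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1
    FROM Digits a CROSS JOIN Digits b CROSS JOIN Digits c CROSS JOIN Digits e
)
SELECT Nmbr
      ,DATEADD(MINUTE, Nmbr, CAST('07:01' AS TIME)) AS [time]
FROM Numbers
ORDER BY Nmbr;
```

This returns the numbers 0 to 56 together with the times 07:01 to 07:57, which is the same gapless list the CTE in the answer produces.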
The SELECT uses this list together with DATEADD() to create a gapless list of time values. Then we LEFT JOIN your set onto that list, so data shows up wherever there is data and NULL appears everywhere else.
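The question asks for 0 rather than NULL in the filled-in rows. A minimal sketch of that last step, assuming the same mock-up table as above (shortened to two rows) and using COALESCE(); the helper variables @first and @last are introduced here only to keep the query readable:

```sql
SET DATEFORMAT ymd;
DECLARE @mockTable TABLE([date] DATE,[time] TIME,[comp/H] DECIMAL(20,5));
INSERT INTO @mockTable VALUES
 ('2019-09-10','07:04',7189.77880859375)
,('2019-09-10','07:08',2266.73657226563);

DECLARE @TheDate DATE='20190910';
-- Hypothetical helpers: first and last minute with data on that day
DECLARE @first TIME=(SELECT MIN([time]) FROM @mockTable WHERE [date]=@TheDate);
DECLARE @last  TIME=(SELECT MAX([time]) FROM @mockTable WHERE [date]=@TheDate);

WITH CountMinutes(Nmbr) AS
(
    SELECT TOP(DATEDIFF(MINUTE,@first,@last)+1) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1
    FROM master..spt_values
)
SELECT @TheDate AS [date]
      ,DATEADD(MINUTE,mc.Nmbr,@first) AS [time]
      ,COALESCE(t2.[comp/H],0) AS [comp/H] --minutes without data get 0 instead of NULL
FROM CountMinutes mc
LEFT JOIN @mockTable t2 ON t2.[date]=@TheDate AND t2.[time]=DATEADD(MINUTE,mc.Nmbr,@first)
ORDER BY mc.Nmbr;
```

With these two rows the result has five rows, 07:04 to 07:08, and the three minutes in between carry a comp/H of 0.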
In a comment below Konstantin Surkov's answer I stated that a counter function using a loop would be very slow, and Konstantin asked me to measure this:
Here I compare three approaches:
- Konstantin's loop TVF
- A simple tally-on-the-fly
- A table-based approach
Try it out:
USE master;
GO
CREATE DATABASE testCounter;
GO
USE testCounter;
GO
--Konstantin's multi-statement TVF using a WHILE loop
create function rangeKonstantin(@from int, @to int) returns @table table(val int) as
begin
while @from <= @to begin
insert @table values(@from)
set @from = @from + 1;
end;
return;
end;
GO
--An inline TVF using a tally-on-the-fly and ROW_NUMBER()
create function rangeShnugo(@from int,@to int) returns table as
return
with cte1 AS(SELECT Nr FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(Nr))
,cte2 AS(SELECT c1.Nr FROM cte1 c1 CROSS JOIN cte1 c2)
,cte3 AS(SELECT c1.Nr FROM cte2 c1 CROSS JOIN cte2 c2)
,cte4 AS(SELECT c1.Nr FROM cte3 c1 CROSS JOIN cte3 c2)
select TOP(@to-@from+1) ROW_NUMBER() OVER(ORDER BY(SELECT NULL))+@from-1 AS val FROM cte4;
GO
--And a simple static numbers table
--together with a function using this table
CREATE TABLE persistantNumbers(val INT NOT NULL UNIQUE);
GO
--let's fill it
INSERT INTO persistantNumbers SELECT val FROM rangeKonstantin(0,1500000) --1.5 mio rows
GO
create function rangeTable(@from int,@to int) returns table as
return
SELECT val FROM persistantNumbers WHERE val BETWEEN @from AND @to;
GO
--Here we can save the results
CREATE TABLE Result (ID INT IDENTITY,Measurement VARCHAR(100),TimeKonst INT, TimeShnugo INT, TimeTable INT, tmpCount INT)
GO
--You can use these lines to test the code cold, or leave them commented out to test the engine's ability to cache and use statistics.
--DBCC FREESESSIONCACHE
--DBCC FREEPROCCACHE
--DBCC DROPCLEANBUFFERS
--We need a DATETIME2 to capture the moment before the action
DECLARE @d DATETIME2;
--And a range with a variable part to avoid any bias through cached results
DECLARE @range INT=300 + (SELECT COUNT(*) FROM Result)
--Now let's start: simple counting up to range x range
SET @d=SYSUTCDATETIME();
SELECT * into tmp FROM rangeKonstantin(0,@range*@range);
INSERT INTO Result(Measurement,TimeKonst,tmpCount) SELECT 'a count to @range*@range',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
SET @d=SYSUTCDATETIME();
SELECT * into tmp FROM rangeShnugo(0,@range*@range);
INSERT INTO Result(Measurement,TimeShnugo,tmpCount) SELECT 'a count to @range*@range',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
SET @d=SYSUTCDATETIME();
SELECT * into tmp FROM rangeTable(0,@range*@range);
INSERT INTO Result(Measurement,TimeTable,tmpCount) SELECT 'a count to @range*@range',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
--And - more importantly - using APPLY to call a function with a row-wise changing parameter
SET @d=SYSUTCDATETIME();
select h.val hour, m.val minute into tmp from rangeKonstantin(0, @range) h cross apply rangeKonstantin(0, h.val) m;
INSERT INTO Result(Measurement,TimeKonst,tmpCount) SELECT 'c @range apply',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
SET @d=SYSUTCDATETIME();
select h.val hour, m.val minute into tmp from rangeShnugo(0, @range) h cross apply rangeShnugo(0, h.val) m;
INSERT INTO Result(Measurement,TimeShnugo,tmpCount) SELECT 'c @range apply',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
SET @d=SYSUTCDATETIME();
select h.val hour, m.val minute into tmp from rangeTable(0, @range) h cross apply rangeTable(0, h.val) m;
INSERT INTO Result(Measurement,TimeTable,tmpCount) SELECT 'c @range apply',DATEDIFF(millisecond,@d,SYSUTCDATETIME()),(SELECT Count(*) FROM tmp);
DROP TABLE tmp;
--We repeat the above 10 times with a simple GO 10
GO 10 --do the whole thing 10 times
--Now let's fetch the results
SELECT Measurement
,AVG(TimeKonst) AS AvgKonst
,AVG(TimeShnugo) AS AvgShnugo
,AVG(TimeTable) AS AvgTable
FROM Result
GROUP BY Measurement;
SELECT * FROM Result ORDER BY Measurement,ID;
--Clean up
USE master;
GO
DROP DATABASE testCounter;
The results for range=300, with caching and statistics, on a v2014 running on a strong machine (average times in milliseconds):
Measurement              AvgKonst AvgShnugo AvgTable
------------------------ -------- --------- --------
a count to @range*@range 626      58        34
c @range apply           357      17        56
We can see that the TVF with the WHILE loop is much slower than the other approaches.
For a real-world scenario the range used here (300, which counts to ~90k) is rather small. So I repeated the test with a @range of 1000 (the count goes over 1 million), which is still not very big...
Measurement              AvgKonst AvgShnugo AvgTable
------------------------ -------- --------- --------
a count to @range*@range 6800     418       321
c @range apply           3422     189       177
What we learn:
- For small-range counting the tally-on-the-fly seems best.
- Any counting approach scales badly when the set size increases.
- The table-based approach is best with large sets.
- The multi-statement TVF with a WHILE loop is not holding up.
On a medium laptop with SQL Server 2017 running locally I get the following for range=1000:
Measurement              AvgKonst AvgShnugo AvgTable
------------------------ -------- --------- --------
a count to @range*@range 10704    282       214
c @range apply           5671     1133      210
And we see that with larger sets the table approach clearly wins.
Also worth mentioning: the engine tries to predict row counts in order to find the best plan. A multi-statement TVF gets a fixed guess that ignores its actual output (one row under the legacy cardinality estimator, 100 rows under the estimator introduced with SQL Server 2014). A simple counter built on the fly tends to be misestimated as well. With the indexed table, however, the engine is able to predict the row count and find a better plan.
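If you go for the table-based approach, the numbers table does not have to be filled by the loop function. The following is a sketch only, with assumed names (dbo.Numbers, PK_Numbers, Digits): it fills a permanent numbers table in one set-based statement and gives it a clustered primary key, so the optimizer has an index and statistics to work with.

```sql
-- Hypothetical permanent numbers table; the names are illustrative.
CREATE TABLE dbo.Numbers(val INT NOT NULL CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED);
GO
-- Fill it set-based: 10^7 candidate rows from seven cross-joined digit sets,
-- cut down by TOP() to the values 0 through 1,500,000.
WITH Digits(d) AS
(
    SELECT d FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(d)
)
INSERT INTO dbo.Numbers(val)
SELECT TOP(1500001) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1
FROM Digits a CROSS JOIN Digits b CROSS JOIN Digits c CROSS JOIN Digits e
CROSS JOIN Digits f CROSS JOIN Digits g CROSS JOIN Digits h;
GO
-- A range lookup is now a plain seek on the clustered index.
SELECT val FROM dbo.Numbers WHERE val BETWEEN 0 AND 56;
```

The final SELECT can take the place of the CountMinutes CTE in the gap-filling query; the upper bound of 56 is just the minute count of the mock-up data.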