Problem description
I have a program that summarizes non-normalized data in one table and moves it to another. Because of bad data, we frequently get a duplicate key violation on the insert. I want to create a report for the users to help them identify the cause of the error.
For example, consider the following contrived, simple SQL, which summarizes data in the table Companies and inserts it into CompanySum, which has a primary key of State/Zone. For the INSERT not to fail, there cannot be more than one distinct Company/Code combination for any State/Zone primary key value. If there is, we want the insert to fail so that the data can be corrected.
INSERT INTO CompanySum
(
[State]
,[Zone]
,[Company]
,[Code]
,[Revenue]
)
SELECT
--Keys of target
[State]
,[Zone]
--We are expecting to have one distinct combination of these fields per key grouping
,[Company]
,[Code]
--Aggregate
,SUM([Revenue])
FROM COMPANIES
GROUP BY
[State]
,[Zone]
,[Company]
,[Code]
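The rule described above can be checked before the INSERT is run. The following is only a minimal sketch, assuming the dbo.Companies table from the DDL in this post; the alias names combos and ComboCount are illustrative and not part of the original program. It lists each State/Zone key that would produce more than one row in the summary:

```sql
-- List each State/Zone key that has more than one distinct Company/Code pair.
-- These are the keys that would cause the duplicate key violation on insert.
SELECT
     combos.[State]
    ,combos.[Zone]
    ,COUNT(*) AS ComboCount            -- illustrative alias
FROM (
    -- One row per distinct State/Zone/Company/Code combination
    SELECT DISTINCT [State], [Zone], [Company], [Code]
    FROM dbo.Companies
) AS combos
GROUP BY combos.[State], combos.[Zone]
HAVING COUNT(*) > 1;
```

With the sample data in this post, this should return the keys CT/B and MI/A, each with two combinations.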
I would like to create a report that helps the users easily identify and correct the data so that there is only one distinct Company/Code combination within a State/Zone. For each distinct State/Zone value, I would like to identify the distinct Company/Code combinations within it. If there is more than one Company/Code combination within a State/Zone, I would like all of the records in that State/Zone to be displayed in the output. For example, here is the sample input and desired output:
Data:
RecordNumber State Zone Company     Code  Revenue
------------ ----- ---- ----------- ----- -------
1            CT    B    State of CT 65453 10
2            CT    B    State of CT 65453 3
3            CT    B    Travelers   33443 20
4            CT    C    Cigna       45678 24
5            CT    C    Cigna       45678 234
6            MI    A    GM          48089 100
7            MI    A    GM          54555 200
8            MI    B    Chrysler    43434 44
Desired Output:
RecordNumber State Zone Company     Code  Revenue
------------ ----- ---- ----------- ----- -------
1            CT    B    State of CT 65453 10
2            CT    B    State of CT 65453 3
3            CT    B    Travelers   33443 20
6            MI    A    GM          48089 100
7            MI    A    GM          54555 200
Here is the DDL and DML needed to create this test scenario:
CREATE TABLE [dbo].[Companies](
[RecordNumber] [int] NULL,
[State] [char](2) NOT NULL,
[Zone] [varchar](30) NOT NULL,
[Company] [varchar](30) NOT NULL,
[Code] [varchar](30) NOT NULL,
[Revenue] [numeric](9, 1) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[CompanySum](
[State] [char](2) NOT NULL,
[Zone] [varchar](30) NOT NULL,
[Company] [varchar](30) NOT NULL,
[Code] [varchar](30) NOT NULL,
[Revenue] [numeric](9, 1) NULL,
CONSTRAINT [PK_CompanySum] PRIMARY KEY CLUSTERED
(
[State] ASC,
[Zone] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
DELETE FROM [dbo].[Companies]
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (1, N'CT', N'B', N'State of CT', N'65453', CAST(10.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (2, N'CT', N'B', N'State of CT', N'65453', CAST(3.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (3, N'CT', N'B', N'Travelers', N'33443', CAST(20.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (4, N'CT', N'C', N'Cigna', N'45678', CAST(24.0 AS Numeric(9, 1)))
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (5, N'CT', N'C', N'Cigna', N'45678', CAST(234.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (6, N'MI', N'A', N'GM', N'48089', CAST(100.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (7, N'MI', N'A', N'GM', N'54555', CAST(200.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (8, N'MI', N'B', N'Chrysler', N'43434', CAST(44.0 AS Numeric(9, 1)))
GO
This is a hopefully better reconstruction of a previous post of mine, "SQL to return unique combinations of non key columns within a set of key columns", in which I try to clarify the question and provide a simple working example that readers can use.
Please see this SQL Fiddle:
http://sqlfiddle.com/#!18/d0141/1
Recommended answer
Is this the solution?
Fiddle: http://sqlfiddle.com/#!18/12e9a0/9
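-- Return every row whose State/Zone key has more than one Company/Code pair
select c.*
from
    Companies c
    inner join (
        -- Keys that map to more than one distinct Company/Code pair
        select State, Zone
        from Companies
        group by State, Zone
        -- Concatenation is a shortcut: 'AB' + 'C' and 'A' + 'BC' count as one value
        having count(distinct Company + Code) > 1
    ) as dup_state_zone
    on(
        c.State = dup_state_zone.State
        and c.Zone = dup_state_zone.Zone
    )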
Edited: fixed the HAVING clause, with a little cheat...
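The cheat is presumably the string concatenation inside count(distinct ...), which can treat two different Company/Code pairs as the same value when their concatenated text happens to match. As a hedged alternative, not the answerer's method, the same report can be sketched with EXISTS, comparing the two columns separately. This assumes Company and Code are NOT NULL, as in the DDL from the question, and that the default collation's comparison rules are acceptable; the alias other is illustrative:

```sql
-- Return every row that shares its State/Zone key with at least one row
-- carrying a different Company or a different Code.
SELECT c.*
FROM dbo.Companies AS c
WHERE EXISTS (
    SELECT 1
    FROM dbo.Companies AS other        -- illustrative alias
    WHERE other.[State] = c.[State]
      AND other.[Zone]  = c.[Zone]
      -- Columns are compared one by one, so no concatenation is needed
      AND (   other.[Company] <> c.[Company]
           OR other.[Code]    <> c.[Code])
)
ORDER BY c.[RecordNumber];
```

With the sample data from the question, this should return records 1, 2, 3, 6 and 7, matching the desired output.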