Friday, August 31, 2018
T‐SQL: MERGE statement - Single statement for Insert + Update + Delete operation
1.
Background
MERGE is a new feature of SQL Server
2008 that provides us an efficient way to perform multiple DML operations like
INSERT, UPDATE & DELETE. It is commonly known as UPSERT (update/insert).
In previous versions of SQL Server, we had to write separate statements for
inserting, updating and deleting data based on certain conditions, but now, by
using MERGE statement we can include the logic of such data modifications in
one statement that even checks when the data is matched then just update it and
when unmatched then insert it.
2.
Feature of
Merge Statement?
One of the most important advantages of MERGE
statement is all the data is read and processed only once. In previous versions
of SQL Server, three different statements (INSERT, UPDATE or DELETE) has to be
written to process three different activity, however using MERGE statement all
update activity can be done in one pass of database table. This is quite an
improvement in performance of database query.
3.
How MERGE
statement works?
The
MERGE statement performs insert, update, and delete operations within the same
statement. We can specify a "Source" record set and a "Target"
table and JOIN the Source & Target table with condition. Then
specify the type of modification like if the row exists in the target table, UPDATE
one or more columns; otherwise, INSERT the data into a new row.
4.
What are
the advantages of MERGE statement?
·
All
the data is read and processed only once in MERGE statement. All the update
activities can be done in one pass of database table. It improves the
performance of database query.
·
It
reduces the complexity of your code or procedure.
·
After
finishing the execution, we can audit all the transactions completed on the
rows with using OUTPUT clause.
·
Synchronization
of transactional tables.
5.
Syntax of
MERGE statement
MERGE
[
TOP ( expression ) [ PERCENT ] ]
[
INTO ] target_table [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias]
USING <table_source>
ON <merge_search_condition>
[
WHEN MATCHED [ AND <clause_search_condition> ]
THEN <merge_matched> ]
[
WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
THEN <merge_not_matched> ]
[
WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
THEN <merge_matched> ]
[
<output_clause> ]
[ OPTION (
<query_hint> [ ,...n ] ) ];
The MERGE syntax consists of five
primary clauses:
·
The
MERGE clause specifies the table or
view that is the target of the insert, update, or delete operations.
·
The
USING clause specifies the data
source being joined with the target.
·
The
ON clause specifies the join
conditions that determine where the target and source match.
·
The
WHEN clauses (WHEN MATCHED, WHEN NOT MATCHED BY TARGET, and WHEN NOT MATCHED BY SOURCE) specify the actions to take
based on the results of the ON clause and any additional search criteria
specified in the WHEN clauses.
·
The
OUTPUT clause returns a row for each
row in the target that is inserted, updated, or deleted.
Source & Target Search Condition:
·
A
matched pair consisting of one row from the target and one from the source.
This is the result of the WHEN
MATCHED clause.
·
A
row from the source that has no corresponding row in the target. This is the
result of the WHEN NOT MATCHED BY TARGET clause.
·
A
row from the target that has no corresponding row in the source. This is the
result of the WHEN NOT MATCHED BY SOURCE clause.
6.
Example of
MERGE statement
--Creating tables Employee and Associate
CREATE TABLE dbo.Employee
( EmployeeID INT
,EmployeeName VARCHAR(50)
,CONSTRAINT
Employee_PK PRIMARY KEY(EmployeeID)
)
CREATE TABLE dbo.Associate
( AssociateID INT
,AssociateName VARCHAR(50)
,CONSTRAINT
Associate_PK PRIMARY KEY(AssociateID))
GO
--Inserting data into Employee table
INSERT dbo.Employee(EmployeeID, EmployeeName) VALUES(1, 'Charles');
INSERT dbo.Employee(EmployeeID, EmployeeName) VALUES(2, 'Belton');
INSERT dbo.Employee(EmployeeID, EmployeeName) VALUES(3, 'John David');
INSERT dbo.Employee(EmployeeID, EmployeeName) VALUES(4, 'Rick Bulmar');
INSERT dbo.Employee(EmployeeID, EmployeeName) VALUES(5, 'Allan Paul');
GO
--Inserting data into Associate table
INSERT dbo.Associate(AssociateID,AssociateName) VALUES(4,'Mark Bulmar');
INSERT dbo.Associate(AssociateID, AssociateName) VALUES(5, 'Steward Paul');
INSERT dbo.Associate(AssociateID, AssociateName) Values(6, 'Jonarthan');
INSERT dbo.Associate(AssociateID, AssociateName) Values(7, 'Vijit');
GO
--Checking the data into Employee table
SELECT * FROM Employee
--Checking the data into Associate table
SELECT * FROM Associate
Below is the MERGE with
Source
Table:
Associate
Target
Table: Employee
Join
condition : E.EmployeeID =
A.AssociateID
Objectives
Obj‐1: If EmployeeID and AssociateID is not
matched & Associate name starts with ‘J’ then insert the new record into
the Employee table.
Obj‐2: If matched, then update the
EmployeeName with the AssociateName.
Obj‐3: If EmployeeID and AssociateID is not matched & Associate name
starts with ‘B’ then delete the employeeNames that starts with ‘B’
-- MERGE statement with the join
conditions specified correctly.
BEGIN TRAN;
MERGE Employee AS E
USING Associate AS A
ON (E.EmployeeID = A.AssociateID)
WHEN NOT MATCHED BY TARGET AND A.AssociateName LIKE 'J%'
THEN INSERT(EmployeeID, EmployeeName) VALUES(A.AssociateID, A.AssociateName)
WHEN MATCHED
THEN UPDATE SET E.EmployeeName = A.AssociateName
WHEN NOT MATCHED BY SOURCE AND E.EmployeeName LIKE 'B%'
THEN DELETE
OUTPUT $action, inserted.*, deleted.*;
ROLLBACK TRAN; --For checking different
scenarios, Rollbacking the TRAN
--COMMIT TRAN;
GO
After executing
above MERGE statement. Following result comes with OUTPUT Clause.
Solution‐Obj‐1: As per the Obj‐1,
if JOIN condition NOT MATCHED and AssociateName starts with ‘J’. Insert the new
record into employee table. So “Jonarthan” is the only associate whose
name start with ‘J’ and his AssociateID is 6 that is not available in Employee
table. So MERGE inserted this record into the Employee table.
Solution‐Obj‐2: As per the Obj‐2, If JOIN
condition matched then updates all the EmployeeName with AssociateName.
AssocaiteID‐ 4 & 5 and EmployeeID 4 & 5 are available.
So EmployeeName “Rick Bulmar” has been replaced by
AssociateName “Mark Bulmar” and “Allan Paul” is replaced by “Steward
Paul”.
Solution‐Obj‐3:
According to Obj‐3, if condition not matched and any employee has name started
with ‘B’ then deletes the records. So “Belton” is the only employee
whose name start with ‘B’ & his EmployeeID is 2 that ID is not available in
Associate table. So MERGE deleted this record from the Employee table.
7.
Limitations
MERGE DML operation only implements with single target
table or view. For example you can't insert, delete or update multiple tables
using MERGE statement at once.
8.
Conclusion
After lot of research in
T‐SQL by Microsoft, they introduce MERGE
DML Statement. Using this we can combine INSERT, UPDATE, or DELETE
operations on a destination table Based on results of a source table. After
finishing the execution, we can audit all the transactions completed on the
rows with using OUTPUT clause. And if you are writing huge procedure
including lots of IF ELSE statements and validating always RECORDS EXIST OR NOT
then Apply MERGE command.
‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐End
of Document‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
SQL Server Integration service (SSIS) 2012 – High Availability & Automated Fault Tolerance Proposed Solution
1.
Objective
The Objective of
this article is proposed a solution for SQL Server Integration Service for high
availability of SSIS envrionment as Integration Services service is not a
clustered or cluster-aware service and does not support failover from one
cluster node to another. Therefore, in a clustered environment,
Integration Services should be installed and started as a stand-alone service
on each node in the cluster.
https://docs.microsoft.com/en-us/sql/integration-services/service/integration-services-ssis-in-a-cluster
2.
Proposed Solution
The proposed solution will use the
following native capabilities of SQL Server
I.
SQL Server Always On to ensure Database
instance(s) High Availability
II.
SQL Server Agent Capability for automated
triggering & switching from one envrionment to another
III.
SQL Server Database mail configuration for
alert/notification
IV.
SQL Server Database Engine/T-SQL programming for
communicating with SSIS Server and storing their response
3.
Proposed Architecture
Fig-1 shows proposed
Architecture – SSIS 2012 High availability & automated Fault Tolerance
4.
Steps to implement SSIS HA & Fault Tolerance
4.1.
Table(s): Proposed
solution consists of 2 table(s):
•
SSIS_Server_Health_Monitoring – To store SSIS
server health status which we get from stored procedure which will ping the
server and store the response in the table. Below is the table structure
SSIS Instance Name
|
Health_Flag
|
Timestamp
|
SSIS – Primary
|
0
|
|
SSIS – Secondary
|
1
|
•
Fault_Tolerance_Configuration - This
table will be configuration table in which we can put configuration for
automated fault tolerance between Primary/Secondary server(s).
SSIS Instance Name
|
SSIS Failover Instance Name
|
Triggering Job
|
SSIS – Primary
|
SSIS – Secondary
|
Fault Tolerance Job
|
4.2.
SQL Agent Job(s): Proposed
solution consists of 2 SQL Agent Job(s):
•
Monitor_SSIS_Server_Health Job: which
will communicates to both the primary & secondary SSIS server and store the
response in the DB
•
Automatic_Fault_Tolerence Job:
•
It will keep check the
SSIS_Server_Health_Monitoring table
•
Whenever there is a zero flag found for a
server, it will start the job into another server based on the Fault_Tolerance_Configuration
5.
SSIS Package Level Load Balancing
·
Package Level load balancing CAN BE PERFORMED by
using SSIS -Balanced Data Distributor/ performing parallel processing in the
package.
6.
SSIS Server Level Load Balancing
·
Sever level load balancing has been proposed by
adding more no of SSIS server(s) with Read committed Isolation snapshot.
·
Following are the issue(s) if we go for multiple
server level Load balancing with default
read committed isolation level:
o
Process of one SSIS instance has to wait for
another to complete its execution. e.g. If SSIS 1 is processing the a table
then SSIS 2 has to wait till SSIS 1 done with its execution.
o
Implementing Load Balancing in ETL operation may
result into query wait state, locking & Deadlock causes serious performance
issues on the reporting portal
o
To avoid the locking /deadlocking issue, we can
use Read committed Isolation Level
which ensures that Report will show only those data that were committed before
the start of the transaction.
7.
Restarting of Packages
·
If SSIS instance goes down during package
execution
o
Process will be start from starting Point.
o
ETL & SQL Error & transactional handling
will be used to start the process from starting point
·
If the package failed during execution then
o
Package will be restarted from the point of
failure by using Checkpoint capabilities of the SSIS.
o
ETL & SQL Error & transactional handling
will be used for starting the process from point of failure.
8.
Other Options
·
Using existing version of SQL Server - SSIS 2012
& perform Manual failover with the
help with support team
·
SSIS 2016 – Always on for SSIS Catalog. Need to
upgrade the existing SQL 2012 to SQL 2016
9.
Conclusion :
By using above solution, High Availability & Automated Fault
Tolerance can be implemented for SSIS 2012 instances.
-------------------------------------End of
Article---------------------------------