Tactics for Parallel Processing in TSQL

parallel processing with SQL Server

Table of Contents



Parallel Processing with TSQL

Generally speaking you should NOT be doing much parallel processing with TSQL. If you want to build a parallel procssing application wholly in the database environment, SQL Server Integration Services is built to support parallel processing. If you’re building an application layer outside the database environment, the combination of Microsoft’s C# with the .NET framework (in addition to a menagerie of other options) provides robust functionality for asynchronous parallel processing.

However, edge cases do come up. Maybe you have an ad hoc data processing task, don’t have access to SSIS / a .NET platform, or you’re just playing around seeing how far you can stretch the functionality of SQL Server. In this post I will show you some tactics that can be used to implement parallel processing with TSQL.


Preventing Blocking During Parallel Processing

Whether you are writing a script for an ad hoc data processing task or a more complex application the primary issue to over come for parallel processing in SQL will be to manage concurrent writes so they do not block one another. If there is significant blocking, even without deadlocks, then multiple concurrent sessions may yield practically no reduction in processing time.


Reveal Blocking with sp_whoisactive

Before you proceed you want to have installed sp_whoisactive. It is used to monitor session activity. In the case of this post it will be used to reveal session blocking. In my opinion, it is the most broadly useful utility for SQL Server and every environment should have it available.

For an example, create the table below using block 1, then run block 2 twice, once in two separate sessions.

begin -- block 1
drop table if exists dbo.TestTable;
create table dbo.TestTable (dummy bit);
insert into dbo.Test (dummy) values (NULL),(NULL);
end
go

begin -- block 2
-- rollback transaction
begin transaction
delete top(1) TT
from dbo.TestTable TT
end
go

begin -- block 3
exec sp_whoisactive @output_column_list = 
   '[session_id]
   [blocking_session_id]
   [status]
   [sql_text]
   [CPU]
   [reads]
   [physical_reads]
   [writes]
   [tempdb_allocations]
   [tempdb_current]'
end
go

Finally in a 3rd session run block 3 and you should see output similar to the image below. The sessions being blocked have the session Id of the blocking session displayed.

**NOTE**: DO NOT FORGET TO ROLLBACK THE TRANSACTIONS OR CLOSE THE SESSIONS CONTAINING BLOCK 2 by highlighting ‘rollback transaction’ and executing. Otherwise the transaction will remain open and dbo.TestTable will remain locked.

sp_whoisactive showing session blocking
sp_whoisactive showing session blocking


Index Seek

Any DML operation that you wish to perform on a table using parallel sessions will need to use indexes tailored to allow concurrent writes. Broadly speaking, that means query plans must use index seeks, rather than scans, so that they only lock the necessary pages and not the entire index or table.

Create a test table by running code below.

drop table if exists dbo.ParallelUpdate;
create table dbo.ParallelUpdate (
   Id int identity(1,1) not NULL
   ,ColA varchar(50)
);

with
   _n as (
      select 1 as n
      union all
      select n + 1
      from _n
      where n < 32767
   )
insert into dbo.ParallelUpdate (ColA)
select NULL
from _n
option(maxrecursion 32767);

Now in two separate sessions run the following script USING A DIFFERENT Id VALUE FOR EACH update. After you have executed both run sp_whoisactive. You should see one session blocking the other.

-- rollback transaction;
begin transaction;

   update PU
   set ColA = 'x'
   from dbo.ParallelUpdate PU
   where Id = --<=== fill in Id here
   ;

Now rollback your transaction from both sessions. Then exec the script below to create a primary key and thus a unique clustered index on the table. Repeat running two separate update sessions again. This time when you run sp_whoisactive you will notice there is no blocking. This is because the index is used to access only the pages needed for the update, rather than scanning the whole table and thus creating a table lock.

alter table dbo.ParallelUpdate
add constraint pk_ParallelUpdate primary key (Id)
;


Key Tables

In my experience by far the simplest and most common application for parallel processing in a database is using multiple sessions to process batches of writes. This is typically done because a single write would be too large to be handled by the SQL Server transaction log or you are writing against a production object and you don’t want to lock it for an extended period of time.

The operation of selecting a batch of rows for a session to process, by necessity, must lock the ENTIRE key set (to prevent other sessions from attempting to secure the same records). This obviously interferes with the desired concurrent writes on the data object. One way to overcome this is to use a key table: a separate table that stores the same keys as the data table and a small set of state columns.

-- create key table with unique clustered index
drop table if exists dbo.KeyTable_ParallelUpdate;
create table dbo.KeyTable_ParallelUpdate (
   Id int
   ,SessionId int
   ,IsUpdated bit
);
create unique clustered index UCIX_KeyTable_ParallelUpdate on dbo.KeyTable_ParallelUpdate (IsUpdated,SessionId,Id);

-- insert all keys from the data table as unprocessed records
insert into dbo.KeyTable_ParallelUpdate
select Id ,NULL ,0
from dbo.ParallelUpdate;


Table Hints

I mentioned in the last section that in order to select a batch of keys for a session to process the entire key set must be locked, which produces a consecutive synchronous operation. This stands in contrast to the general goal of a parallel processing application: concurrent asynchronous operations. So clearly we need a tool to exert explicit control over how operations are process and table hints provide us with that control.

The readcommitted table hint disallows rows with uncommitted writes from being read and also acquires a table lock, which will prevent other queries set to read only committed rows from proceeding until the original lock is released. The aptly named forceseek table hint is used to force an index seek rather than a scan. The index seek only locks specific pages while a scan locks the entire table or index.

*NOTE*: MANY sources will tell you to be aware of forceseek (or any query plan hint for that matter). As long as you don’t alter the indexes or add more columns to the table then there is no problem with a table hint. They can be counter productive when there is very little data in the table, but that isn’t usually an issue for a parallel processing application.

/*  This script will update all the rows in dbo.ParallelUpdate and
    can be run from multiple sessions concurrently.
    
    The arbitrary wait time will leave the transaction open for
    long enough to ensure the transactions overlap.
*/

set transaction isolation level read uncommitted
set nocount off

declare
   @BatchSize int = 200
   ,@ArbitraryWait varchar(8) = '00:00:20'

while 1 = 1
begin

   -- secure Ids for processing by session
   -- SYNCHRONOUS OPERATION
   update top(@BatchSize) KT
   set SessionId = @@SPID
   from dbo.KeyTable_ParallelUpdate KT with(readcommitted)
   where
      IsUpdated = 0
      and SessionId is NULL

   ;
   if @@ROWCOUNT = 0 break; -- loop exit here

   raiserror('Update started...',-1,-1) with nowait
   --rollback transaction
   begin transaction

      -- ASYNCHRONOUS OPERATION
      update TP
      set ColA = 'x'
      from
         dbo.ParallelUpdate TP with(forceseek)
         join dbo.KeyTable_ParallelUpdate gt
            on gt.IsUpdated = 0
            and gt.SessionId = @@SPID
            and gt.Id = TP.Id

      waitfor delay @ArbitraryWait

   commit transaction
   raiserror('Update completed!',-1,-1) with nowait

  -- ASYNCHRONOUS OPERATION
  update KT
  set
    IsUpdated = 1
    ,SessionId = NULL
  from dbo.KeyTable_ParallelUpdate KT with(forceseek)
  where
    IsUpdated = 0
    and SessionId = @@SPID

end
go

*NOTE*: If you are unfamiliar with raiserror, it can be used to output text in real time by using the nowait option as opposed to print, which waits until execution finishes to output the text. Setting the severity between -1 and 9 does not actually raise a system error.


Analyze Your Query Plan

If you are experiencing blocking in your application use sp_whoisactive to identify the exact SQL statements encountering blocking and then use the query plan to scrutinize individual steps of those statements. Look for key lookups, index / full table scans, or anything else that is not an index seek.

query plan index scan
query plan index scan


How to Programmatically Initiate Parallel Processing

For many ad hoc parallel processing tasks it isn’t necessary to initiate them via a programmatic method. Simply manually executing code in multiple sessions will do just fine. But there are situations where you may wish to programmatically initiate concurrent sessions. There are two options for doing this in TSQL: SSIS packages stored in a catalogue or SQL Server Agent jobs.


SSIS Packages

One obvious way to initiate parallel sessions from TSQL is to use SSIS packages installed in a catalogue in a SSISDB on your SQL Server.

This also highlights my earlier point about why you shouldn’t be using TSQL (if you have access to SQL Server Integration Services and are using its packages, perhaps your application should be built in SSIS), but ignoring that, if you really wish to, you can execute multiple concurrent sessions of an SSIS package asynchronously.

This solution requires Integration Services to be installed on your SQL Server, an SSISDB to be created, and the packages entered into a catalogue.

SSIS packages imported into the Integration Services catalogue
SSIS packages imported into the Integration Services catalogue

Once those components are in place you can use the TSQL stored procedures [catalog].[create_execution] and [catalog].[start_execution]. Note you can use [catalog].[set_execution_parameter_value] to pass a parameters, including one that can alter the package executions from asynchronous to synchronous.

declare
   @execution_id bigint
   ,@SessionCount int = 3

-- user loop to start 3 concurrent sessions
while 1 = 1
begin

  -- create execution
  exec [SSISDB].[catalog].[create_execution] 
    @package_name=N'TestPackage.dtsx'
    , @project_name=N'snaredrum-ssis'
    , @folder_name=N'snaredrum'
    , @use32bitruntime=False
    , @reference_id=NULL
    , @execution_id=@execution_id output
  
  -- start execution is asynchronous by default
  exec [SSISDB].[catalog].[start_execution] @execution_id
  
  -- exit loop if
  if @SessionCount <= 0 break
  
  -- reduce @SessionCount
  set @SessionCount -= 1

end
go


SQL Server Agent Jobs

The MSDB stored procedure sp_start_job is used to start SQL Server Agent jobs and has a unique attribute within SQL Server: functionally it executes asynchronously. Functionally meaning it does actually wait for the end of its own execution, but its only task is to start a job and then end. It doesn’t wait for the job it kicked off to do anything other than start, which makes it functionally asynchronous for most intents and purposes.

Unlike SSIS packages, SQL Server Agent jobs CANNOT run concurrent instances. If you attempt to execute a job that is already executing you will receive an error message. So the trick to programmatically executing multiple concurrent sessions with jobs, albeit not an elegant one, is to create multiple jobs and then initiate them using any number of methods (dynamic SQL, if statements with bitmasks, etc.)

create proc dbo.sp_StartParallelProcssingJobs
   @SessionCount int = 1
as
begin

   set @SessionCount = case
      when @SessionCount < 1 then 1
      when @SessionCount > 5 then 5
      else @SessionCount
   end
   declare @SessionBitMask bigint = power(2,@SessionCount) - 1
   
   -- turn the desired count into a bitmask
   if SessionBitMask  & 1 = 1 exec sys.sp_start_job  @job_name = 'TestJob1'
   if SessionBitMask  & 2 = 2 exec sys.sp_start_job  @job_name = 'TestJob2'
   if SessionBitMask  & 4 = 4 exec sys.sp_start_job  @job_name = 'TestJob3'
   if SessionBitMask  & 8 = 8 exec sys.sp_start_job  @job_name = 'TestJob4'
   if SessionBitMask  & 16 = 16 exec sys.sp_start_job  @job_name = 'TestJob5'

end
go


How to Coordinate Parallel Processing

Transaction blocking, is a method of coordinating the DML operations so they don’t block one another. However, there are many other challenges and solutions to controlling and coordinating parallel processing sessions.


CONTEXT_INFO

The varbinary session variable called CONTEXT_INFO can be viewed using the [sys].[dm_exec_requests] object. This provides a tool for parallel sessions to communicate with one another. It can be used to simply identify parallel sessions to one another or to indicate at what point in a logical process they have reached or other info about the state of the sessions.

-- run these statement in one session
set context_info 0x3FC23F9F3858457B

select
   @@SPID
   ,context_info()

-- w/o closing the first session open another and run this statement
select session_id, [context_info]
from [sys].[dm_exec_requests]


State Table

An application state table is a table meant to store application states. These could be configurations for the application such as procedure parameters, output from the application such as logging Ids last run time, etc.

drop table if exists dbo.AppState;
create table dbo.AppState (
	[Group] varchar(4000) not NULL
	,[Item] varchar(4000) not NULL
	,[State] varchar(4000)
)
go
create unique clustered index UCIX_AppState on dbo.AppState ([Group] ,[Item]);
go

insert into dbo.AppState ([Group] ,[Item] ,[State])
values
	('','Last Started LogId','0')
	,('','Last Completed LogId','0')
	,('','Last UTCTime Started',NULL)
	,('','Last UTCTime Completed',NULL)
	,('','CONTEXT_INFO','0x3FC23F9F3858457B')
;


Applock

Applocks can be used to prevent the progress of a session or transaction. If a session or transaction attempts to acquire a lock, which has already been acquired then it will wait until that lock is released.

Applocks can be acquired and released on named resources within a particular Db principal. There are different lock modes with the two most common being “shared” and “exclusive.” Multiple shared locks can be acquired on a resource, but only one exclusive lock can be acquired. Additionally, locks can be owned by the session or transaction meaning they will be automatically release when the owner session or transaction ends.

**NOTE**: Multiple concurrent locks can be acquired by the same lock owner. Each individual lock requires its own execution of sys.sp_releaseapplock to release.

/*  acquires a "shared" Applock on resource TestLock within 
    the dbo principal
*/
exec sys.sp_getapplock
  @Resource = 'TestLock'
  ,@LockMode = 'shared'
  ,@LockOwner = 'session'
  ,@DbPrincipal = 'dbo'
;

/*  fails to acquire an "exclusive" Applock on resource 
    because another lock already exists on it
*/
exec sys.sp_getapplock
  @Resource = 'TestLock'
  ,@LockMode = 'shared'
  ,@LockOwner = 'session'
  ,@DbPrincipal = 'dbo'
;

/*  Acquires an "exclusive" Applock on resource because 
    it within the public principal, not the dbo
*/
exec sys.sp_getapplock
  @Resource = 'TestLock'
  ,@LockMode = 'shared'
  ,@LockOwner = 'session'
  ,@DbPrincipal = 'public'
;

-- release both locks
while 'NoLock' <> (select APPLOCK_MODE('dbo','TestLock','session')
)
  exec sys.sp_releaseapplock
    @Resource = 'TestLock'
    ,@LockOwner = 'session'
    ,@DbPrincipal = 'dbo'
;


Application Hold

An application lock essentially is sessions that are “behind” waiting, ie making no further progress, until a session that is “ahead” releases a lock. A similar concept is an application hold, which is a session that is “ahead” being prevented from further progress until a session or sessions that are “behind” advance to some further point.

In the example below we will use the application control table to store checkpoint variables. Be sure to open two sessions and use the script below to set the context info for both.

-- create app control records for a checkpoint
insert into dbo.AppControl
values
   ('CheckPoints','A1','0')
   ,('CheckPoints','B1','0')
;

-- set session context
declare @CONTEXT_INFO varbinary
select @CONTEXT_INFO  = convert(varbinary(20),[State],1)
from dbo.AppControl
where
  [Group] = ''
  and [Item] = 'CONTEXT_INFO'
set CONTEXT_INFO @CONTEXT_INFO

The script below will check whether or not the other session has reached checkpoint 1 every 3 seconds and will not advance until this happens (or error if the other session is dead).

-- indicates path A has reached the checkpoint 1
update AP
set ControlValue = '1'
from dbo.AppControl AP
where
   ControlGroup = 'CheckPoints'
   and ControlItem = 'A1'
;

-- hold until path B have also reached checkpoint 1
while 1 = 1
begin

   -- check that path B is still active, if not raise error
   if not exists (
      select top(1) NULL
      from sys.dm_exec_requests
      where
        [context_info] = cast(0x3FE78047E9FF2A92 as varbinary(20))
        and session_id <> @@SPID
   )
      raiserror('Application died :(',16,1)
   ;

   -- if both A and B have reached checkpoint 1, exit loop
   if 2 = (
      select count(1)
      from dbo.AppControl AP with(readcommitted)
      where
         ControlGroup = 'CheckPoints'
         and ControlItem in ('A1','B1')
         and ControlValue = '1'
   )
      break
   ;

  -- check status every 3 seconds
  waitfor delay '00:00:03';

end
go


Example: Initializing a Parallel Processing Task Executed Manually

Below is an example, which uses multiple tools from this section to manually start a parallel processing session. Because this procedure is kicked of manually, there is no programmatic control over the set of parallel processing sessions. In this case only the first session to begin will decide what parameters are going to be used for the task. Every other session after the first must edit its input parameters to match the initial session.

drop proc [dbo].[sp_TestInitialization];
go
create proc [dbo].[sp_TestInitialization]
  @Parameter1 varchar(50)
  ,@Parameter2 int
  ,@Parameter3 date
as
begin try

  -- acquire lock, preventing other sessions from proceeding
  exec sys.sp_getapplock
    @Resource = 'initialization'
    ,@LockMode = 'exclusive'
    ,@LockOwner = 'session'
    ,@DbPrincipal = 'dbo'
  ;

  -- set CONTEXT_INFO
  declare @CONTEXT_INFO varbinary(20)
  select @CONTEXT_INFO = convert(varbinary(20),[state],1)
  from [dbo].[AppState]
  where
    Group = ''
    and Item = 'CONTEXT_INFO'
  set CONTEXT_INFO @CONTEXT_INFO
    
  /*  If no other session running with context, you are first
      and you set the variables, otherwise you read them.
  */
  if 1 = (
    select count(1)
    from sys.dm_exec_requests
    where [CONTEXT_INFO] = CONTEXT_INFO 
  )
  begin
  
    update APP
    set [State] = p.[State]
    from
      dbo.AppState APP with(forceseek)
      join (values
        ('Parameters','Param1',cast(@Parameter1 as varchar(255))
        ,('Parameters','Param2',cast(@Parameter2 as varchar(255))
        ,('Parameters','Param3',cast(@Parameter3 as varchar(255))
      ) p(Group,Item,[State])
        on p.[Group] = APP.[Group]
        and p.[Item] = APP.[Item]
        
  end
  else
    select
      @Parameter1 = cast(PivotData.[Param1] as int)
      ,@Parameter2 = cast(PivotData.[Param2] as varchar(50))
      ,@Parameter3 = cast(PivotData.[Param3] as date)
    from
      dbo.AppState source
      pivot (
        max([Value])
        for source.[item] in (
          [Param1]
          ,[Param2]
          ,[Param3]
        )
      ) as PivotData
    where
      source.[Group] = 'Parameters'
      and source.[Item] in ('Param1','Param2','Param3')
  
  -- release the applock
  while 'NoLock' <> (select APPLOCK_MODE('dbo','INITIALIZATION','session'))
    exec sys.sp_releaseapplock
      @Resource = 'initialization'
      ,@LockOwner = 'session'
      ,@DbPrincipal = 'dbo'
  ;
  
end try
begin catch

  -- clear context
  set CONTEXT_INFO NULL

  -- release the applock
  while 'NoLock' <> (select APPLOCK_MODE('dbo','INITIALIZATION','session')
  )
    exec sys.sp_releaseapplock
      @Resource = 'initialization'
      ,@LockOwner = 'session'
      ,@DbPrincipal = 'dbo'
  ;

end catch


Conclusion

Parallel processing, while not common to TSQL, is possible and I have covered the key concepts and tools available to do so. The most common parallel processing tasks in TSQL require concurrent writes to a single object, which means preventing blocking is critical. This can be accomplished using proper indexes and table hints to tightly control data access. Initiating the processes programmatically can be accomplished via SSIS package catalogues or predefined SQL Server Agent jobs. And finally, there are a myriad of other tools that can aid in parallel processing (CONTEXT_INFO, locks, control tables, etc.) with the commonality being the ability to communicate info between sessions and to block or hold the progress of sessions.

134 thoughts on “Tactics for Parallel Processing in TSQL”

  1. 各ドールは、最高品質の素材を使用して精密に作られており、そのリアルさと耐久性には感動すら覚えます.中国 えろ肌の質感は非常に柔らかく、リアルな触感があり、顔のディテールは精巧に作られていて、まるで生きているかのようです.

  2. ラブドール エロit is possible to click the backlink to examinThe explanation that dolls about the in stock dolls website page are costlier is due to the fact these dolls are already shipped to San Antonio Texasand inspected by among our workers members for tears and damag These in inventory dolls have been cleared as being in terrific ailment and when you will discover any important damagesdefects on any dolls about the in stock dolls page We’re going to Make sure you checklist it here that you should see about the solution web page of the doll you have an interest in.We’ll di

  3. さらに、comは安全で便利なショッピング体験を提供しています.ウェブサイトは最先端のセキュリティ機能を備えており、個人情報と支払い情報を厳重に保護しています.中国 えろ

  4. 自分のダッシュボード画面では投稿したイラストに対するリアクションが確認できるので、モチベーションにもつながりますよ。

  5. 「コスプレさせてしまえば気にならない」という方なら必要ありませんが、えろ 人形やはり肌触りがビニールというのが人間味を薄くさせてしまいます。

  6. I wonder if perhaps this is due to the fact that more than 50 percent of Townes’ respondents were from the South and my practice is located in the Northeast.ラブドール オナニーThe interracial couples who come in to see me or my associates for help frequently bring divergent lenses when it comes to beliefs,

  7. Pulmonary Embolism PE Medication order priligy online usa Patients diagnosed as having any malignancy, hepatic cirrhosis, pregnant women and patients on dialysis, and those with missing records were excluded from this study

  8. The creativity and originality you demonstrated in discussing [specific subtopic] were truly refreshing.ダッチワイフI was particularly impressed by how you introduced new ideas and perspectives that challenged conventional thinking and opened up new avenues for exploration.

  9. Releasing it is the easy part Just how do you build sexual tension with him? There are distinct times that you should be building sexual tension with him if you want to have really good sex later.When you are together with your manWhen you’re not togetherAs you’re leaving him1 Building sexual tension when you are together with your manRemember,エロ 人形

  10. we hope by age 13 all young people will have learned five principles about sex:Sex is good and from God.It is important that children from toddlerhood through adolescence learn that the feelings in their bodies are wonderful and designed by God to feel good.オナホ おすすめ

  11. They were punished if they cried or yelled at [for showing emotion],ラブドール 女性 用 and most men don’t get the opportunity to work through that. It’s not to excuse behaviour, it’s just to deepen the story. Misogyny, sexism and the patriarchy impacts everyone.’

  12. 形のいい美巨乳と色白ムッチリした体がエロい女の子オナドールです。ダッチワイフ可愛らしさとエロさの融合は強力です。明るく清純そうなルックスに、

  13. She says that while women should not be tasked withjydoll coddling their partners, they can still empathize, share their point of view, and be fully transparent with information about menstruation.

  14. Oh my goodness! Incredible article dude! Thank you so much, However I am having troubles with your
    RSS. I don’t understand the reason why I am unable to subscribe to it.

    Is there anybody else having the same RSS problems?
    Anyone that knows the answer will you kindly respond?

    Thanx!!エロ 下着

Leave a Comment

Your email address will not be published. Required fields are marked *