More about Calculating Concurrent Sessions
Itzik provides a solution to calculating concurrent sessions, including interval information.
November 23, 2009
Recently I wrote a two-part series in my T-SQL Black Belt column about Calculating Concurrent Sessions (Part I – November 2009, Part II – December 2009). I used the following code to create a table called Sessions and populate it with sample data:
SET NOCOUNT ON;
USE tempdb;
IF OBJECT_ID('dbo.Sessions', 'U') IS NOT NULL DROP TABLE dbo.Sessions;
CREATE TABLE dbo.Sessions
(
keycol INT NOT NULL,
app VARCHAR(10) NOT NULL,
usr VARCHAR(10) NOT NULL,
host VARCHAR(10) NOT NULL,
starttime DATETIME NOT NULL,
endtime DATETIME NOT NULL,
CONSTRAINT PK_Sessions PRIMARY KEY(keycol),
CHECK(endtime > starttime)
);
GO
CREATE INDEX idx_nc_app_st_et ON dbo.Sessions(app, starttime, endtime);
CREATE INDEX idx_nc_app_et_st ON dbo.Sessions(app, endtime, starttime);
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(2, 'app1', 'user1', 'host1', '20090212 08:30', '20090212 10:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(3, 'app1', 'user2', 'host1', '20090212 08:30', '20090212 08:45');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(5, 'app1', 'user3', 'host2', '20090212 09:00', '20090212 09:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(7, 'app1', 'user4', 'host2', '20090212 09:15', '20090212 10:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(11, 'app1', 'user5', 'host3', '20090212 09:15', '20090212 09:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(13, 'app1', 'user6', 'host3', '20090212 10:30', '20090212 14:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(17, 'app1', 'user7', 'host4', '20090212 10:45', '20090212 11:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(19, 'app1', 'user8', 'host4', '20090212 11:00', '20090212 12:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(23, 'app2', 'user8', 'host1', '20090212 08:30', '20090212 08:45');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(29, 'app2', 'user7', 'host1', '20090212 09:00', '20090212 09:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(31, 'app2', 'user6', 'host2', '20090212 11:45', '20090212 12:00');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(37, 'app2', 'user5', 'host2', '20090212 12:30', '20090212 14:00');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(41, 'app2', 'user4', 'host3', '20090212 12:45', '20090212 13:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(43, 'app2', 'user3', 'host3', '20090212 13:00', '20090212 14:00');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(47, 'app2', 'user2', 'host4', '20090212 14:00', '20090212 16:30');
INSERT INTO dbo.Sessions(keycol, app, usr, host, starttime, endtime)
VALUES(53, 'app2', 'user1', 'host4', '20090212 15:30', '20090212 17:00');
I presented a task to calculate the maximum number of concurrent sessions per each application and provided several solutions. I got great feedback from readers, including a beautiful new set-based solution that I will present in Part III in the series, which will probably be published in the March 2010 edition of SQL Server Magazine.
In the meanwhile I wanted to cover here a variation of the task that came up in a conversation I had with a fellow SQL Server MVP and SQL wizard, Peter Larsson (aka peso).
In Part II of the series I presented the following solution returning the maximum number of concurrent sessions per application:
-- Part 1
CREATE TABLE #Ends
(
app VARCHAR(10) NOT NULL,
endtime DATETIME,
n BIGINT
);
CREATE CLUSTERED INDEX idx_app_et ON #Ends(app, endtime);
INSERT INTO #Ends(app, endtime, n)
SELECT app, endtime,
RANK() OVER(PARTITION BY app ORDER BY endtime) AS n
FROM dbo.Sessions;
-- Part 2
WITH Counts AS
(
SELECT S.app, S.starttime,
ROW_NUMBER() OVER(PARTITION BY S.app ORDER BY S.starttime)
- A.n + 1 AS cnt
FROM dbo.Sessions AS S
CROSS APPLY (SELECT TOP (1) E.n
FROM #Ends AS E
WHERE E.app = S.app
AND E.endtime > S.starttime
ORDER BY E.endtime) AS A
)
SELECT app, MAX(cnt) AS mx
FROM Counts
GROUP BY app;
DROP TABLE #Ends;
As I described in the article, Part 1 of the solution populates a temporary table with ranked session end events. Part 2 assigns row numbers to session start events, and calculates the number of concurrent sessions at each start event by subtracting from the start event row number the rank of the next end event and adding 1. Finally, the outermost query groups the data by application, and returns the maximum count of concurrent sessions per each application.
In a conversation I had with Peter we discussed a variation of the problem where you also want to get back the start and end times of the actual intervals that appeared most often, and not just the max count. You can address this variation of the problem by applying a couple of revisions to the above solution to the original task. First, in the Counts CTE query return both the starttime and the endtime attributes so that you will be able to return information about the actual intervals in the outer query. Second, in the outer query use the option TOP (1) WITH TIES based on the ordering specification: ORDER BY RANK() OVER(PARTITION BY app ORDER BY cnt DESC). All intervals that occur most often per application will be ranked 1, hence this TOP specification will return all of them. Here’s the complete solution:
-- Part 1
CREATE TABLE #Ends
(
app VARCHAR(10) NOT NULL,
endtime DATETIME,
n BIGINT
);
CREATE CLUSTERED INDEX idx_app_et ON #Ends(app, endtime);
INSERT INTO #Ends(app, endtime, n)
SELECT app, endtime,
RANK() OVER(PARTITION BY app ORDER BY endtime) AS n
FROM dbo.Sessions;
-- Part 2
WITH Counts AS
(
SELECT S.app, S.starttime, A.endtime,
ROW_NUMBER() OVER(PARTITION BY S.app ORDER BY S.starttime)
- A.n + 1 AS cnt
FROM dbo.Sessions AS S
OUTER APPLY (SELECT TOP (1) E.endtime, E.n
FROM #Ends AS E
WHERE E.app = S.app
AND E.endtime > S.starttime
ORDER BY E.endtime) AS A
)
SELECT TOP (1) WITH TIES *
FROM Counts
ORDER BY RANK() OVER(PARTITION BY app ORDER BY cnt DESC);
DROP TABLE #Ends;
Note that in case there’s more than one interval that appears the maximum number of times, this solution will return all ties. If you want to return only one interval per application, use the ROW_NUMBER function instead of RANK, and add a tie breaker to the ROW_NUMBER’s ORDER BY list if you want the solution to be deterministic. As an example, if you want the earliest interval to win in case of ties, add starttime, endtime to the ORDER BY list, like so:
ORDER BY ROW_NUMBER() OVER(PARTITION BY app ORDER BY cnt DESC, starttime, endtime);
Cheers,
BG
About the Author
You May Also Like