Making date differences in SQL queries stand out

I had another employee here create a report, using SQL Server’s LEAD and LAG functions, that lists ledger transactions for customers where the previous balance doesn’t match the balance on the next transaction. To help quickly identify spans of time, I wanted something like ‘5s’, ’44m 5s’, or ‘2d 0h 45m 5s’ next to each timestamp. My eyes automatically go to the longer strings and identify those as larger spans of time.

-- User-defined function

CREATE FUNCTION [dbo].[Humanize_DateDiff]
(
	@t1 DATETIME2(7),
	@t2 DATETIME2(7),
	@showOnlyItems VARCHAR(50) = NULL
)
RETURNS VARCHAR(MAX)
AS
/*
	Pretty-prints the time difference between two dates.

	@showOnlyItems should be NULL to include all fields, or a comma-delimited list
	(with no spaces) of any combination of D,H,M,S,MS.

	This won't print any zero-unit items until at least one non-zero unit is printed.
	For example, if there are no days, then '0d ' won't be printed at all.

	SELECT dbo.Humanize_DateDiff('12/1/2016', '12/25/2016 3:45:12.678 AM', NULL);    -- 24d 3h 45m 12s 678ms
	SELECT dbo.Humanize_DateDiff('12/1/2016', '12/25/2016 3:45:12.678 AM', 'd,h,m'); -- 24d 3h 45m

	20170823 DKD

	For testing:
	DECLARE @t1 DATETIME2(7) = DATEADD(MILLISECOND, -((1000 * 60 * 60 * 24 * 3) + 4890976), SYSUTCDATETIME());
	DECLARE @t2 DATETIME2(7) = SYSUTCDATETIME();
	DECLARE @showOnlyItems VARCHAR(50) = 'D,H,M,S'; -- Comma delimited, no spaces between items, only use combos of ms,s,m,h,d; NULL = show everything
*/
BEGIN
	DECLARE @op VARCHAR(MAX) = '';
	DECLARE @t_total BIGINT = 0;
	DECLARE @t_ms BIGINT, @show_ms BIT = 1;
	DECLARE @t_s INT, @show_s BIT = 1;
	DECLARE @t_m INT, @show_m BIT = 1;
	DECLARE @t_h INT, @show_h BIT = 1;
	DECLARE @t_d INT, @show_d BIT = 1;

	IF (@showOnlyItems IS NOT NULL) BEGIN
		SET @showOnlyItems = ',' + LOWER(@showOnlyItems) + ','; -- LOWER() guards against case-sensitive collations
		IF (CHARINDEX(',d,', @showOnlyItems) = 0) SET @show_d = 0;
		IF (CHARINDEX(',h,', @showOnlyItems) = 0) SET @show_h = 0;
		IF (CHARINDEX(',m,', @showOnlyItems) = 0) SET @show_m = 0;
		IF (CHARINDEX(',s,', @showOnlyItems) = 0) SET @show_s = 0;
		IF (CHARINDEX(',ms,', @showOnlyItems) = 0) SET @show_ms = 0;
	END

	-- Note: DATEDIFF returns INT, which overflows past ~24.8 days of milliseconds;
	-- on SQL Server 2016+ you can use DATEDIFF_BIG instead.
	SET @t_total = DATEDIFF(MILLISECOND, @t1, @t2);

	-- Days
	SELECT @t_d = FLOOR(@t_total / (1000 * 60 * 60 * 24));
	SELECT @t_total = @t_total - (@t_d * 1000 * 60 * 60 * 24);
	--PRINT '@t_d = ' + COALESCE(CAST(@t_d AS VARCHAR(MAX)), '') + '; @t_total = ' + COALESCE(CAST(@t_total AS VARCHAR(MAX)), '');
	IF ((@op != '' OR @t_d > 0) AND @show_d = 1) SELECT @op = @op + ' ' + CAST(@t_d AS VARCHAR(MAX)) + 'd';

	-- Hours
	SELECT @t_h = FLOOR(@t_total / (1000 * 60 * 60));
	SELECT @t_total = @t_total - (@t_h * 1000 * 60 * 60);
	--PRINT '@t_h = ' + COALESCE(CAST(@t_h AS VARCHAR(MAX)), '') + '; @t_total = ' + COALESCE(CAST(@t_total AS VARCHAR(MAX)), '');
	IF ((@op != '' OR @t_h > 0) AND @show_h = 1) SELECT @op = @op + ' ' + CAST(@t_h AS VARCHAR(MAX)) + 'h';

	-- Minutes
	SELECT @t_m = FLOOR(@t_total / (1000 * 60));
	SELECT @t_total = @t_total - (@t_m * 1000 * 60);
	--PRINT '@t_m = ' + COALESCE(CAST(@t_m AS VARCHAR(MAX)), '') + '; @t_total = ' + COALESCE(CAST(@t_total AS VARCHAR(MAX)), '');
	IF ((@op != '' OR @t_m > 0) AND @show_m = 1) SELECT @op = @op + ' ' + CAST(@t_m AS VARCHAR(MAX)) + 'm';

	-- Seconds
	SELECT @t_s = FLOOR(@t_total / 1000);
	SELECT @t_total = @t_total - (@t_s * 1000);
	--PRINT '@t_s = ' + COALESCE(CAST(@t_s AS VARCHAR(MAX)), '') + '; @t_total = ' + COALESCE(CAST(@t_total AS VARCHAR(MAX)), '');
	IF ((@op != '' OR @t_s > 0) AND @show_s = 1) SELECT @op = @op + ' ' + CAST(@t_s AS VARCHAR(MAX)) + 's';

	-- Milliseconds
	SELECT @t_ms = @t_total;
	SELECT @t_total = @t_total - (@t_ms); -- Should == ZERO now
	--PRINT '@t_ms = ' + COALESCE(CAST(@t_ms AS VARCHAR(MAX)), '') + '; @t_total = ' + COALESCE(CAST(@t_total AS VARCHAR(MAX)), '');
	-- Unlike the other units, ms prints only when non-zero, or when nothing else
	-- printed at all, so the result is never an empty string.
	IF ((@op = '' OR @t_ms > 0) AND @show_ms = 1) SELECT @op = @op + ' ' + CAST(@t_ms AS VARCHAR(MAX)) + 'ms';

	SET @op = LTRIM(@op);

	RETURN @op;
END


Note that I didn’t have the code recalculate figures when a unit is excluded: if I exclude days, the hours won’t absorb the hours from the excluded days.
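For readers who want to poke at the decomposition outside of SQL Server, here’s a minimal Python sketch of the same logic. The names (`humanize_datediff`, `show_only`) are mine, not from the T-SQL function:

```python
# A quick Python sketch mirroring the T-SQL decomposition above.
from datetime import datetime

def humanize_datediff(t1, t2, show_only=None):
    td = t2 - t1
    # Exact millisecond count, avoiding float rounding.
    total_ms = td.days * 86_400_000 + td.seconds * 1_000 + td.microseconds // 1_000
    allowed = None if show_only is None else {u.lower() for u in show_only.split(",")}
    parts = []
    for suffix, ms_per_unit in [("d", 86_400_000), ("h", 3_600_000),
                                ("m", 60_000), ("s", 1_000), ("ms", 1)]:
        value, total_ms = divmod(total_ms, ms_per_unit)
        if allowed is not None and suffix not in allowed:
            continue
        if suffix == "ms":
            # Like the T-SQL version: ms prints only when non-zero, or when
            # nothing else printed at all, so the result is never empty.
            if value > 0 or not parts:
                parts.append(f"{value}{suffix}")
        elif parts or value > 0:  # suppress leading zero units
            parts.append(f"{value}{suffix}")
    return " ".join(parts)
```

For example, `humanize_datediff(datetime(2016, 12, 1), datetime(2016, 12, 25, 3, 45, 12, 678000))` matches the function’s first documented example.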


Grabbing a set and inserting ‘and’ before the last item in SQL (with and without the Oxford comma)

Our client wanted to display numbers that were drawn as follows:

“The numbers drawn were 01, 02, 03, 04 and 05, resulting in a win of $X.XX!”

Well, I had misinterpreted this: I thought he wanted all of the drawn numbers, but what he really wanted was just the numbers that caused the winnings to occur. The easy part was the INTERSECT statement to determine those numbers; the hard part was that, because this was being inserted into an SSRS report, I needed to comma-delimit the numbers with an ‘and’ before the last item in the set. I made this snippet to get the logic right.

DECLARE @oxfordComma BIT = 0;
WITH cte1 AS
(	-- Reconstructed source set; the real report fed in the INTERSECT results.
	SELECT '01, 02, 03, 04, 05' AS NumbersCommaDelimited
)
--SELECT '|' + cte1.NumbersCommaDelimited + '|' FROM cte1
, cte2 AS
(	SELECT REVERSE(NumbersCommaDelimited) AS Rev, NumbersCommaDelimited AS X FROM cte1
)
, cte3 AS
(	SELECT LEN(Rev) - CHARINDEX(',', Rev, 1) + CASE WHEN @oxfordComma = 1 THEN 2 ELSE 1 END AS Pos, X FROM cte2
)
, cte4 AS
(	SELECT STUFF(X, Pos, CASE WHEN @oxfordComma = 1 THEN 0 ELSE 1 END, ' and') AS Y, X, Pos FROM cte3
)
SELECT X, Pos, Y FROM cte4;
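Outside of SQL, the same “‘and’ before the last item” rule is much shorter. Here’s a Python sketch of the logic the CTEs implement (the function name is mine):

```python
# Python sketch of the "insert 'and' before the last item" logic,
# with and without the Oxford comma.
def join_with_and(items, oxford_comma=False):
    if len(items) < 2:
        return "".join(items)  # zero or one item: nothing to join
    separator = ", and " if oxford_comma else " and "
    return ", ".join(items[:-1]) + separator + items[-1]
```

For example, `join_with_and(["01", "02", "03", "04", "05"])` gives `01, 02, 03, 04 and 05`, matching the target sentence.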


Removing elements from the flow of the document in a SSRS report

We’re designing a report that is printed and sent out to customers that has details about lottery syndicate entries. The details include the following data:

  • Details about winnings earned from the previous month, if any
  • Details about termination notice, if one was supplied
  • Details about participation in the next month, if any

The key words used here are “if any”. There’s a message displayed on the report that says “Your termination notice was received on 4/28/2017 and your last month of play will be June 2017”. If the customer never called in to terminate their membership, this paragraph shouldn’t show up at all. However, this is where the trouble begins. On the paragraph containing the expression that generates the text quoted above, I can add a visibility expression such as =IIF(IsNothing(First(Fields!ReceivedTerminationNoticeOn.Value, "DataSet")), True, False), and the paragraph will be hidden. However, to borrow a CSS analogy, it behaves like visibility: hidden rather than display: none. In other words, the element is hidden but not taken out of the document flow, so it’s invisible yet still takes up space and leaves a big ugly gap. These documents are printed and snail-mailed to people, so they have to look good. So, after about 15 minutes perusing StackOverflow for answers, and about 30 minutes of fooling around with SSRS, I figured out how to trick SSRS into removing the element completely, shifting the other page elements up.

  1. Drag a new table onto the design surface. I’m using VS 2015 and SSDT-BI 17.1, but I believe it’s worked this way since the beginning of time: a new table is created with 3 empty columns and 1 row group containing 2 rows: the header, and the row ‘template’ that is repeated for each row in the dataset.
  2. Delete 2 of the columns. Now you have 1 column and 2 rows. We’re going to be adding something to the ‘header’ cell, since it’s only shown once, not once per item in the dataset.
  3. Any element you drag into a cell takes up all available space in that cell, meaning you can only have one element as the child of that cell, and the table will resize based on the size of the element added. If you want to add more elements, or position them how you like, drag a rectangle into the header cell first, then add your elements to it.
  4. Click in the header cell (but not in any element inside it), right-click the bar on the left, select ‘Row Visibility…’, and enter an expression that dictates whether the cell should be shown.
  5. Click in the ‘template’ row, then click the bar on the left and select ‘Row Visibility…’. Select ‘Hide’ instead of the default ‘Show’.
  6. Size your table as you need it. While you cannot delete the ‘template’ row, you can size it to 1px high; it won’t show anyway, since you set its row visibility to Hide.

That will now collapse the table if the visibility condition isn’t met, rather than just making it ‘invisible’ and taking up space.

Caveat: The ‘elephant in the room’ is datasets. You can totally add a child table into this one, as that’s how we were able to show the results of the items. The outer table will grow to accommodate the sizing of its children. However, the dataset used by the elements inside must be the same as the outer dataset. This can obviously cause some problems if you want to have 2 tables with different datasets, but there is a way… which, by the way, is also not pretty.

Let’s say you have 2 datasets that you want to hide but you can only have one dataset as per the rules above.

Dataset1: Let’s say it’s database table ‘TableA’ and has columns A, B, and C; row with values 1, 2, 3 and row with values 4, 5, 6

Dataset2: Database table ‘TableB’ has columns D, E, and F; row with values 11, 12, 13 and row with values 14, 15, 16.

Combine them into a single dataset with something like this SQL pseudocode (using the example columns above):

SELECT 'TableA' AS [Dataset], A, B, C, NULL AS D, NULL AS E, NULL AS F FROM TableA
UNION ALL
SELECT 'TableB', NULL, NULL, NULL, D, E, F FROM TableB;

On the inner table, you’ll have the columns you want to display, and you can then set a table filter to only show rows that match specific criteria, such as [Dataset] = 'TableA'. The other table would have the same thing, but it would = ‘TableB’. This way, you have multiple datasets under the guise of a single dataset. You would have to UNION all of the rows together and have NULLs in the columns that aren’t used for a particular dataset.

My dataset had the first column called ‘Dataset’, with one row for ‘Customer’, and rows for ‘Winnings_MajorLottery1’, ‘Winnings_MajorLottery2’, ‘NextDrawings_Overview’, and ‘NextDrawings_LinesPlayed’. I didn’t know of an easy way in SSRS to find the number of rows in a table when using filters (to hide the table when there were no rows), so for the case where there are no rows in the ‘Winnings_MajorLottery1’ “dataset”, I added a BIT column in the ‘Customer’ dataset called ‘HasMajorLottery1Winnings’; it was far easier to make a BIT field do my bidding than to finagle some crazy way to make it work right in SSRS.

Yes, this means you’re going to have a lot of columns – column 1 is the ‘Dataset’ column, then all of the columns from the first dataset, then all of the columns from the next dataset, which means that your query is going to have a lot of NULL entries for the columns that aren’t used for a particular dataset. You’re going to have to structure the SQL well so it’s readable in the sea of NULLs the query is going to have.

You may not have to go this overboard for simpler reports, but the one I was working on (still am working on, who am I kidding!) was anything but simple and required some outside-of-the-box thinking.

Happy programming!


Javascript Quick & Dirty Dynamic Numeric Evaluations From Dynamic Content

I’m writing a help page for a lottery order form to conform to specific gaming standards. The page has to show odds for a given set of games offered on the order form. I made the decision to pull the content from the database so that it can be edited from the admin site, rather than us going in and editing it. I also decided that Markdown syntax would restrict how the content is displayed, yet allow a degree of power in how it is formatted.

However, there definitely was a problem in displaying things like pay tables. I had to insert a couple of tables into this content to show odds and payout factors for the games that were displayed. Here’s the table in Markdown (GFM-flavor) format. For those who don’t know what Markdown is, it is a method of writing content that is still very readable in its un-rendered form.

# Payout Factors for Different Bet Types

Game Type | Combination | Example Bet | Payout
 ------------ | ------------------------------------------- | -------------------- | --------
 Pick 2 | Straight: Any 2 digit # from 00 to 99 | 12 (exact order) | 90:1
 Pick 2 | Boxed (2-way; 2 unique digits) | 12 (any order) | 45:1
 Pick 3 | Straight: Any 3 digit # from 000 to 999 | 123 (exact order) | 900:1
 Pick 3 | Boxed (6-way; 3 unique digits) | 123 (any order) | 150:1
 Pick 3 | Boxed (3-way, 2 unique digits) | 334 (any order) | 300:1
 Pick 4 | Straight: Any 4 digit # from 0000 to 9999 | 1234 (exact order) | 9000:1
 Pick 4 | 24-Way Box (4 digit # with unique digits) | 1234 (any order) | 375:1
 Pick 4 | 12-Way Box (4 digit #, 3 unique digits) | 1233 (any order) | 750:1
 Pick 4 | 6-Way Box (4 digit #, 2 sets of 2 digits) | 3344 or 4343 (any order) | 1500:1
 Pick 4 | 4-Way Box (4 digit, 3 digits the same) | 1112 (any order) | 2250:1

The above renders into an H1 tag for the title and an HTML table for the table portion.

That’s all fine, but see where it says 90:1 for a straight bet? I don’t want a human to edit that number. Additionally, I don’t want to use an example number, since that would force the users to mentally calculate the payout factors themselves, and could possibly violate the standards we are implementing. Therefore, the actual payout factor values should be pulled from the database. There are a number of different ways to do this, one being a replacement token, e.g. {2BallPF}:1. But then I’d have to do something like {2BallPF/2}:1 for the boxed bet, which I would have to evaluate on the server… sounds like a huge pain. I could just let the users edit it every time mass payout factors are changed…

 Pick 2 | Straight: Any 2 digit # from 00 to 99 | 12 (exact order) | 70:1
Pick 2 | Boxed (2-way; 2 unique digits) | 12 (any order) | 45:1
Pick 3 | Straight: Any 3 digit # from 000 to 999 | 123 (exact order) | 700:1
Pick 3 | Boxed (6-way; 3 unique digits) | 123 (any order) | 150:1
Pick 3 | Boxed (3-way, 2 unique digits) | 334 (any order) | 300:1

NO. I can’t entrust the operators of the system to make these changes any time a payout factor update is done. And I definitely don’t want to manage that crap. It always happens on a Friday night when I am relaxing, or any time on Saturday or Sunday. I really value my free time, even if it’s to do absolutely nothing.

Would a framework like angular.js, react.js, durandal, etc. work for what I’m trying to do? Possibly, but I wasn’t interested in the learning curve (at this moment, mind you, although learning at least one or two of these frameworks sounds like fun), or the weight. All I wanted to do was something like take a token, replace it with its value, and evaluate anything as an expression. I thought to myself “man, a ‘data-‘ style attribute would be really nice about now”. I then remembered that Markdown allows HTML to be inserted.

So this is what I did! I made some example output in jsbin:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>JS Bin</title>
  <script src="//code.jquery.com/jquery-1.11.1.min.js"></script>
</head>
<body>
  <input type="hidden" id="max2" value="90" />
  <input type="hidden" id="max3" value="900" />
  <input type="hidden" id="max4" value="9000" />
  <input type="hidden" id="mx" value="6" />
  This is a test to see how:<br /><br />
  <span data-fix="{max2}">x</span>
  / <span data-fix="{mx}">16</span> = <span data-fix="{max2} / {mx}">y</span>.<br><br>
  <span data-fix="{max3}">x</span>
  / <span data-fix="{mx}">16</span> = <span data-fix="{max3} / {mx}">y</span>.<br><br>
  <span data-fix="{max4}">x</span>
  / <span data-fix="{mx}">16</span> = <input type="text" data-fix="{max4} / {mx}" value="?" />.<br><br>
</body>
</html>

So for elements that I want to dynamically render a mathematical value, I enclose them in a span tag with an attribute of data-fix. This contains the formula, which is actually something that eval() can parse in Javascript. A bit dangerous, if someone were to get into the admin and change a value to something destructive, but then they could do a heck of a lot more than just add stuff for eval() statements to hit.

A little javascript magic goes a long way:

$("[data-fix]").each(function () {
	var regex = /\{[\w\d\-_]+?\}/gim;
	var that = $(this);
	var formula = that.attr("data-fix");
	//console.log("formula = " + formula);
	var rendered = formula.replace(regex, function (match) {
		var s = match.replace("{", "").replace("}", "");
		//console.log("-- s = " + s);
		var elem = document.getElementById(s);
		if (elem === null || elem === undefined)
			return "null";
		//console.log("-- tag = " + elem.tagName);
		var value = $(elem).val() || $(elem).html(); // || here, not the bitwise |
		return value;
	});
	//console.log("rendered = " + rendered);
	var final = eval(rendered);
	//console.log("final = " + final);
	if (this.tagName.toLowerCase() == "input" || this.tagName.toLowerCase() == "select")
		that.val(final);
	else
		that.html(final);
});

What’s going on here is that I’m grabbing all instances of any element that has the data-fix attribute, grabbing the “template” formula, doing any replacing in it as needed, doing an eval() on the final formula, and then inserting the value back into the element, one time.

Check out the JS Bin for more details and to play with it!
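The same replace-then-evaluate idea can be sketched server-side too. Here it is in Python (the names are mine; the real page does this client-side with jQuery and eval()):

```python
# Python sketch of the data-fix idea: swap {tokens} for known values,
# then evaluate the result as an arithmetic expression.
import re

TOKEN = re.compile(r"\{([\w\-]+?)\}")

def render_formula(formula, values):
    # Replace each {token} with its value, or "null" if unknown,
    # mirroring the JS version's getElementById fallback.
    rendered = TOKEN.sub(lambda m: str(values.get(m.group(1), "null")), formula)
    # eval() with no builtins: still quick and dirty, as in the post.
    return eval(rendered, {"__builtins__": {}}, {})
```

So `render_formula("{max2} / {mx}", {"max2": 90, "mx": 6})` evaluates the same `{max2} / {mx}` template the spans carry.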


Detecting changes in a table

Suppose I want to sync a bunch of information with my clients when the program starts and at specific intervals throughout the day. However, I don’t want the server to send over a huge list of all the information every single time it is requested, as that’s a waste of bandwidth and time. (I should mention that a lot of clients I work with have spotty internet connections, so it’s important that the apps function well offline and transfer only the data that needs to be transferred.)

How do I know that the client has the latest updated copy of data from the database? We do this by generating a checksum of the data that we would transfer. The client would send over its last known checksum value, or NULL if it does not have one stored, and the server only pushes data over if the checksum value passed back to the server is NULL or differs from the checksum the server generates. It’s ridiculously fast, at least for the data sets I send over – a few hundred to a few thousand records per set.

Here are some functions I created to get the data that needs to be synced:

CREATE FUNCTION FranchiseInfo_Franchises (@franchiseID INT) AS ...
CREATE FUNCTION FranchiseInfo_Items (@franchiseID INT) AS ...
CREATE FUNCTION FranchiseInfo_Staff (@franchiseID INT) AS ...
CREATE FUNCTION FranchiseInfo_Locations (@franchiseID INT) AS ...
CREATE FUNCTION FranchiseInfo_PrepaidCards (@franchiseID INT) AS ...

Et cetera.

Each function returns columns of data that I push to the client. You don’t specifically NEED to use table-valued functions, but in this scenario it definitely makes it easier to do the following steps. (Specifically, if I had used just stored procs, I would need to copy and paste the same SQL used for the procs in the checksum procedure, which reduces maintainability.)

What we want is some sort of a checksum that would change when the data in the table has changed.

CREATE PROCEDURE FranchiseInfo_GetChecksums
	@franchiseID INT
AS
SELECT 'Franchise' AS DataType, (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Franchises(@franchiseID)) AS [Checksum]
UNION SELECT 'Items', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Items(@franchiseID))
UNION SELECT 'Locations', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Locations(@franchiseID))
UNION SELECT 'PrepaidCards', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_PrepaidCards(@franchiseID))
UNION SELECT 'Staff', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Staff(@franchiseID))
;

(Later I changed it so each one returned the checksum value in a column, so the procedure would return only one row with the checksum values. It was much easier to do that, rather than parse the row values. I’ll leave this as an exercise to the reader.)

BINARY_CHECKSUM() takes in a list of columns – in this case, all of the columns returned from each function. Yes, using * for columns is frowned upon, but here I don’t care: it means that when a function’s column list changes, the checksum automatically picks up the change. CHECKSUM_AGG then aggregates the per-row checksums into one checksum.

In my client code, I provide a list of the checksums I already have – as a Nullable&lt;long&gt; / long? datatype; if I have a checksum stored, I pass it in, otherwise I leave the value as NULL. The server calls the procedure above and compares the checksums. If a checksum differs from what was passed in, I call the appropriate function to return updated data to the client. This way, I’m only passing down a data set if it has changed.
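Here’s a rough Python sketch of that handshake. The names are mine, and CRC32 with an XOR fold stands in for BINARY_CHECKSUM/CHECKSUM_AGG – the idea is the same: a cheap aggregate fingerprint of the rows.

```python
# Sketch of the checksum handshake. zlib.crc32 stands in for
# BINARY_CHECKSUM, and the XOR fold for CHECKSUM_AGG.
import zlib

def table_checksum(rows):
    agg = 0
    for row in rows:
        agg ^= zlib.crc32(repr(row).encode())  # per-row checksum, XOR-aggregated
    return agg

def rows_if_changed(rows, client_checksum):
    """Return (server_checksum, rows) when the client is stale, else (server_checksum, None)."""
    server_checksum = table_checksum(rows)
    if client_checksum is None or client_checksum != server_checksum:
        return server_checksum, rows
    return server_checksum, None
```

The client stores the returned checksum and sends it back on the next sync; a matching checksum means no payload is transferred.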


Using a CTE in SQL to update and delete anything over the Nth row for a given grouping of fields, a.k.a. Building a Token Management System

I learned a new trick for dealing with deleting rows after the nth index from a table. First, some backstory:

I’m in the gaming industry with clients overseas, and our client has an online casino and lottery house that runs games like slots, blackjack and poker. To launch these games, we use a token system. The customer launches a URL with a GUID token, which is immediately swapped once the game loads. The client continues to request tokens to perform actions behind the scenes, and the token must continually change to stop anyone who got hold of a token – by random guess, or worse – from using it.

Token rotation systems aren’t the simplest thing in the world – the token system has to handle multiple tokens for a given customer, because if the client application sends a request to our server that is processed successfully, changing the token, but the client never gets a response, the client is left with an old token and cannot reference that particular customer anymore. The system I set up for this vendor manages 5 tokens simultaneously, with them expiring after 10 minutes. My token table looks like this:

ID	CreatedOn	CustomerID	GamingVendorID	Token	CountOfUses	LastUsedOn	IsExpired
321	2014-02-13 12:09:48.337	1151	4	73F636BC-F6C4-4301-83A3-3A24744E0C65	1	2014-02-13 11:59:48.347	0
324	2014-02-13 11:59:48.363	1151	4	12C33AB2-44CA-4946-9CDC-AE0934FA8DBC	0	NULL	0

However, in production there will be tons of tokens, and each vendor is configured to allow a maximum number of tokens simultaneously – this vendor has 5. Without some sort of cleanup routine, there will be a lot more than 5 unexpired tokens in the table. I have a stored procedure to clean up these excess tokens, but since it has to run on every call to validate and rotate a token, I can’t have it look over the data by customer ID and vendor ID, as it would be too slow. I also didn’t want to clean up just the requesting vendor’s and customer’s tokens and leave everything else alone; I wanted the table to have as few rows as possible. So, how can it be done with a single batch and no looping, temporary tables or cursors?

Enter the Common Table Expression (CTE):

WITH cteTokens AS
(
	SELECT t.ID,
		ROW_NUMBER() OVER (PARTITION BY g.ID, t.CustomerID ORDER BY t.CreatedOn DESC) AS RowNumber,
		g.[AllowedNumberOfTokensKeptBesidesCurrentToken] AS CountToKeep
	FROM CustomerRotatingTokens t
	LEFT OUTER JOIN GamingVendors g ON (g.ID = t.GamingVendorID)
	WHERE t.CountOfUses >= g.[AllowedNumberOfTokenReuses]
		OR (g.[TokensExpireAfterNumberOfSeconds] IS NOT NULL AND DATEADD(SECOND, g.[TokensExpireAfterNumberOfSeconds], t.CreatedOn) < GETDATE())
),
cteTokensToExpire AS
(
	SELECT ID FROM cteTokens WHERE RowNumber > CountToKeep
)
UPDATE CustomerRotatingTokens SET IsExpired = 1 WHERE ID IN (SELECT ID FROM cteTokensToExpire);

Here’s an overview of the interesting parts of the code:

  • WITH cteTokens AS – this names our expression (think of it as a table alias)
  • ROW_NUMBER() OVER (PARTITION BY g.ID, t.CustomerID ORDER BY t.CreatedOn DESC) AS RowNumber – this adds a sequential row number to each row, starting at 1. The bare minimum syntax, ROW_NUMBER() OVER (ORDER BY t.CreatedOn DESC), would number every row in the result set. But we need to delete any tokens after the 5th-newest token per customer per vendor, so the tokens need to be “grouped” by the vendor’s ID and the customer ID. That’s where the PARTITION BY part comes in – think of it almost like a GROUP BY statement, so ROW_NUMBER() starts counting over whenever it reaches a new combination of those fields.
  • cteTokensToExpire AS – this next CTE selects a new table containing only those tokens whose index (the ROW_NUMBER()) is greater than what the vendor is allowed to keep – 5.
  • UPDATE… WHERE ID IN (SELECT ID FROM cteTokensToExpire) – expires any tokens that were selected by cteTokensToExpire.

Part of the client requirements is that we differentiate between invalid and expired tokens, so I can’t just delete them from the table immediately, but I have another CTE that blows out expired rows that are a bit older than when they expired.

The speed of the CTE is amazing. It’s most definitely faster than building a temporary table to aggregate the results and then expire and delete the rows, which would not have flown since this has to run a LOT.
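For the curious, the pruning rule itself – keep the N newest per (vendor, customer), expire the rest – looks like this in Python (the names are mine):

```python
# Python sketch of the CTE's rule: order tokens newest-first within each
# (vendor, customer) group and expire everything past the allowed count,
# mirroring ROW_NUMBER() OVER (PARTITION BY ... ORDER BY CreatedOn DESC).
from collections import defaultdict

def tokens_to_expire(tokens, keep_per_group):
    groups = defaultdict(list)
    for token in tokens:
        groups[(token["vendor_id"], token["customer_id"])].append(token)
    expired_ids = []
    for group in groups.values():
        group.sort(key=lambda t: t["created_on"], reverse=True)  # newest first
        expired_ids.extend(t["id"] for t in group[keep_per_group:])  # RowNumber > CountToKeep
    return expired_ids
```

The slice `group[keep_per_group:]` is exactly the `RowNumber > CountToKeep` condition from the CTE.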


Wow. I learned something very important about SQL today.

Consider this SQL statement, which calls an inline table-valued function (TVF):

SELECT *
FROM dbo.Games_StandardLotto_Summaries(
	dbo.Date(DATEADD(D, -1, GETDATE())),
	dbo.EndOfDay(DATEADD(D, 1, GETDATE())),
	NULL, NULL, NULL);

It’s taking 17 seconds to run, and I cannot figure out WHY. I copied the guts of the TVF from a SQL script I had been tweaking for the good part of an hour. It’s a single statement with CTEs that runs a fairly complex query in about a second or two – a vast improvement over getting the same data from temp tables in 15–20 times the time. But now, it’s back to square one, it seems.


DECLARE @sd DATETIME2(7), @ed DATETIME2(7);
SELECT @sd = dbo.Date(DATEADD(D, -1, GETDATE())),
    @ed = dbo.EndOfDay(DATEADD(D, 1, GETDATE()));
SELECT * FROM dbo.Games_StandardLotto_Summaries(@sd, @ed, NULL, NULL, NULL);

…runs in under 1 second for me.

So what SQL Server is telling me here is that for inline TVFs – and, I’m sure, more than that – the parameter expressions are substituted into the query rather than evaluated once up front: it executes the representation of the expression, rather than the value of the expression at the time the TVF is called.

The lesson learned here is that you should always declare and set variables and use those when calling the TVF, rather than calling it with complex expressions. As you can see above, I was calling GETDATE(), then adding or subtracting a day from that value, and then running a scalar function to make it midnight or 11:59:59.999 PM – twice per row, minimum. Storing the values in variables knocked the time down to next to nothing: changing the calling SQL resulted in a drop from 16.766 seconds to 0.275 seconds. That’s almost a 61x improvement!