Suppose I want to sync with my clients a bunch of information when the program has started and at specific intervals throughout the day. However, I don’t want the server to send over a huge list of all the information every single time it is requested, as it’s a waste of bandwidth and time. I should mention that a lot of clients I work with have spotty internet connections, so it’s important that the apps function well offline as well as transfer only the data that is needed to be transferred.)
How do I know that the client has the latest updated copy of data from the database? We do this by generating a checksum of the data that we would transfer. The client would send over its last known checksum value, or NULL if it does not have one stored, and the server only pushes data over if the checksum value passed back to the server is NULL or differs from the checksum the server generates. It’s ridiculously fast, at least for the data sets I send over – a few hundred to a few thousand records per set.
Here are some functions I created to get the data that needs to be synced:
CREATE FUNCTION FranchiseInfo_Franchises (@franchiseID INT) AS ...<br />CREATE FUNCTION FranchiseInfo_Items (@franchiseID INT) AS ...<br />CREATE FUNCTION FranchiseInfo_Staff (@franchiseID INT) AS ...<br />CREATE FUNCTION FranchiseInfo_Locations (@franchiseID INT) AS ...<br />CREATE FUNCTION FranchiseInfo_PrepaidCards (@franchiseID INT) AS ...
Each function returns columns of data that I push to the client. You don’t specifically NEED to use table-valued functions, but in this scenario it definitely makes it easier to do the following steps. (Specifically, if I had used just stored procs, I would need to copy and paste the same SQL used for the procs in the checksum procedure, which reduces maintainability.)
What we want is some sort of a checksum that would change when the data in the table has changed.
CREATE PROCEDURE FranchiseInfo_GetChecksums</p><p>(</p><p>@franchiseID INT<br />)<br />AS<br /> SELECT 'Franchise' AS DataType, (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Franchises(@franchiseID)) AS [Checksum]<br /> UNION SELECT 'Items', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Items(@franchiseID))<br /> UNION SELECT 'Locations', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Locations(@franchiseID))<br /> UNION SELECT 'PrepaidCards', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_PrepaidCards(@franchiseID))<br /> UNION SELECT 'Staff', (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.FranchiseInfo_Staff(@franchiseID))<br /> ;
(Later I changed it so each one returned the checksum value in a column, so the procedure would return only one row with the checksum values. It was much easier to do that, rather than parse the row values. I’ll leave this as an exercise to the reader.)
BINARY_CHECKSUM() takes in a list of columns, in this case, the columns returned from the stored procedure. Yes, using * for columns is frowned upon but I don’t care, as it makes it easier to update the functions and know that it’s including that in the aggregation. CHECKSUM_AGG aggregates the other checksums into one checksum.
In my client code, I provide a list of the checksums I already have – as Nullable<long> / long? datatype, and if I have the checksum stored, I pass it in, otherwise, I leave the value as NULL. The server calls the function above and compares the checksums returned. If the checksum is different than what was passed in, then I call the appropriate procedure to return updated data to the client. This way, I’m only passing down the data set only if it has changed.