Monday, December 22, 2014

Linked lists

http://www.sqlservercentral.com/articles/T-SQL/118863/

Introduction

Imagine you want to sort a dataset by something that isn’t stored in the database. Let me give you an example to better imagine the problem: I have a table with songs (i.e. song title, artist, etc.) that has a column called Song_ID as its primary key. I want to create playlists with some songs from that table.
To do that I create another table that contains the main data for these playlists including the names of the playlists. It also has a column as its primary key which is called Playlist_ID. I could easily create a table, say Playlist_Contents that consists of Playlist_ID and Song_ID to model the many-to-many-relationship between songs and playlists. But I like the songs within each playlist to be played in a specific order but there are no criteria I could use for sorting them. Instead the sequence of the songs will be determined by some criteria unknown to the database. My ER diagram would look like this:

Suppose the tables have the following contents:

**Playlists**
Playlist_ID	Playlist_Name
1	All time favourites
2	Christmas songs

**Songs**
Song_ID	Song	Artist
10	My girl	Temptations
20	Good vibrations	Beach Boys
30	Clocks	Coldplay
40	Velouria	Pixies
50	Day tripper	Beatles
60	Wonderwall	Oasis
70	Temptation	Heaven 17
80	Mercy	Duffy

Now I'd like to put 6 of the songs into the playlist "My favourite songs". I want to play the songs in this order:

Good vibrations - Beach Boys
Day tripper - Beatles
Wonderwall - Oasis
Clocks - Coldplay
Velouria - Pixies
My girl - Temptations

I can simply insert 6 rows into Playlist_Contents but I need to store the order as well. The table Playlist_Contents has to be extended to do this.

First attempt

A straightforward solution would be to add a column to Playlist_Contents that stores the position of the song in the playlist.

Playlist_Contents with Position column
Playlist_ID	Song	Position
1	20	1
1	50	2
1	60	3
1	30	4
1	40	5
1	10	6

But what if you want to edit the playlist, i.e. adding, deleting or moving songs? You can easily add a song at the end of a playlist or remove the last song. All other operations lead to multiple updates, esp. if you want the position to hold contiguous numbers starting with 1 onwards:

-- add "Mercy" right after "Good vibrations:
UPDATE Playlist_Contents SET Position = Position + 1 WHERE Position >= 2
INSERT Playlist_Contents VALUES (1, 80, 2)
 
-- remove "Clocks" from the playlist:
DELETE FROM Playlist_Contents WHERE Playlist_ID = 1 AND Song_ID = 30
UPDATE Playlist_Contents SET Position = Position - 1 WHERE Position > 4

-- move "Velouria" to the beginning of the list:
UPDATE Playlist_Contents SET Position = Position + 1 WHERE Position >= 1
UPDATE Playlist_Contents SET Position = 1 WHERE Song_ID = 40
UPDATE Playlist_Contents SET Position = Position - 1 WHERE Position > 5

Maybe some playlists get quite long, so performance could become an issue. So using a position column seems to be quite inefficient.

A better solution

What about alternatives? How can we reduce the necessary modifications to a minimum? My solution was to not have a position column in the Playlist_Contents table at all. Instead I tried to use a linked list.

CREATE TABLE Playlist_Contents(
 Playlist_ID int NOT NULL,
 Song_ID  int NOT NULL,
 Predecessor int NULL,
 CONSTRAINT PK_Playlist_Contents PRIMARY KEY CLUSTERED 
(
 Playlist_ID ASC,
 Song_ID ASC
))
GO

ALTER TABLE Playlist_Contents  WITH CHECK
 ADD CONSTRAINT FK_Playlist_Contents_Self FOREIGN KEY (Playlist_ID, Predecessor)
 REFERENCES Playlist_Contents (Playlist_ID, Song_ID)

ALTER TABLE Playlist_Contents CHECK CONSTRAINT FK_Playlist_Contents_Self

CREATE UNIQUE NONCLUSTERED INDEX AK_Playlist_Contents ON Playlist_Contents
(
 Playlist_ID ASC,
 Predecessor ASC
)

ALTER TABLE Playlist_Contents  WITH CHECK
 ADD CONSTRAINT CK_Playlist_Contents_Predecessor CHECK (Song_ID!=Predecessor)

ALTER TABLE Playlist_Contents CHECK CONSTRAINT CK_Playlist_Contents_Predecessor
GO

In order to do that I created a column “Predecessor” that is designed to point to the Song_ID of the preceding song (within the same playlist, of course). The first song does not have a predecessor, so I allowed NULL for this column. Since no two songs can have the same predecessor it must be unique within each playlist. Additionally a song’s predecessor must not point to itself. The playlist from above would now look like this:

Playlist_Contents with Predecessor column
Playlist_ID	Song	Predecessor
1	20	NULL
1	50	20
1	60	50
1	30	60
1	40	30
1	10	40

In order to have a position again I created a view which dynamically generates the position number. It is based on a recursive CTE:

CREATE VIEW [dbo].[vPlaylist_Contents] AS
WITH List AS (
 SELECT Playlist_ID, 1 AS Position, Song_ID, Predecessor
 FROM Playlist_Contents
 WHERE Predecessor IS NULL
 UNION ALL
 SELECT p.Playlist_ID, l.Position + 1, p.Song_ID, p.Predecessor
 FROM Playlist_Contents p INNER JOIN List l
   ON p.Playlist_ID = l.Playlist_ID AND
    p.Predecessor = l.Song_ID
 )
SELECT * FROM List

GO

Modifying data in a linked list

How can the sequence of songs in a playlist be changed? If we want to insert a new song or change a song’s position in the list, we just name the predecessor of the song or put NULL in it. Of course we cannot really execute the inserts, updates and deletes as they are. So we need a trigger to perform the necessary additional modifications on the table.
This trigger will be an instead-of-trigger for all three types of data modification. The trigger has to determine the additional modifications and execute those in a single statement in order to fulfill all constraints we have on the table, i.e. the primary key, the foreign key, the unique index, and the check constraint. Since we use the view to read the data we create the trigger on the view.

CREATE TRIGGER [dbo].[tr_vPlaylist_Contents]
   ON  [dbo].[vPlaylist_Contents] 
   INSTEAD OF INSERT,DELETE,UPDATE
AS 
BEGIN
 SET NOCOUNT ON;
 WITH Modifications AS (
  SELECT Playlist_ID, Song_ID, Predecessor
  FROM inserted
  UNION ALL
  SELECT d.Playlist_ID, d.Song_ID, d.Predecessor
  FROM inserted i FULL OUTER JOIN deleted d
    ON i.Playlist_ID = d.Playlist_ID AND
     i.Song_ID = d.Song_ID
  WHERE i.Playlist_ID IS NULL
  UNION ALL
  SELECT p.Playlist_ID,
    p.Song_ID,
    i.Song_ID
  FROM Playlist_Contents p
    INNER JOIN inserted i
    ON p.Playlist_ID = i.Playlist_ID AND
     (p.Predecessor = i.Predecessor OR 
      (p.Predecessor IS NULL AND i.Predecessor IS NULL))
  UNION ALL
  SELECT p.Playlist_ID,
    p.Song_ID,
    d.Predecessor
  FROM Playlist_Contents p
    INNER JOIN deleted d
    ON p.Playlist_ID = d.Playlist_ID AND
     p.Predecessor = d.Song_ID
 )
 MERGE INTO Playlist_Contents AS targettable
  USING Modifications AS sourcetable
  ON  targettable.Playlist_ID = sourcetable.Playlist_ID AND
    targettable.Song_ID = sourcetable.Song_ID
  WHEN MATCHED AND targettable.Predecessor = sourcetable.Predecessor OR
      (targettable.Predecessor IS NULL AND sourcetable.Predecessor IS NULL)
   THEN DELETE
  WHEN MATCHED AND (targettable.Predecessor != sourcetable.Predecessor) OR
      (targettable.Predecessor IS NULL AND sourcetable.Predecessor IS NOT NULL) OR
      (targettable.Predecessor IS NOT NULL AND sourcetable.Predecessor IS NULL)
   THEN UPDATE SET targettable.Predecessor = sourcetable.Predecessor
  WHEN NOT MATCHED BY TARGET
   THEN INSERT (Playlist_ID, Song_ID, Predecessor) 
   VALUES (sourcetable.Playlist_ID, sourcetable.Song_ID, sourcetable.Predecessor);
END

It uses a MERGE statement to perform all necessary modifications at once. The CTE produces all rows that have to be modified. The rows whose primary key values do not match a row in the table will be inserted. The rows which do match a row in the table but have a different predecessor value will be updated. Finally rows where all column values match a row in the table will be deleted.

Testing the solution

Finally let’s try to insert, update and delete some data in the table to test the trigger.

-- add "Mercy" right after "Good vibrations:
BEGIN TRAN
INSERT vPlaylist_Contents (Playlist_ID, Song_ID, Predecessor) VALUES (1, 80, 20)
SELECT * FROM vPlaylist_Contents
ROLLBACK
 
-- remove "Clocks" from the playlist:
BEGIN TRAN
DELETE FROM vPlaylist_Contents WHERE Playlist_ID = 1 AND Song_ID = 30
SELECT * FROM vPlaylist_Contents
ROLLBACK

-- move "Velouria" to the beginning of the list:
BEGIN TRAN
UPDATE vPlaylist_Contents SET Predecessor = NULL WHERE Song_ID = 40
SELECT * FROM vPlaylist_Contents
ROLLBACK

The difference between this solution and the first attempt is the number of rows that have to be modified. In this solution we have an upper bound of 2 for inserts, 2 for deletes and 3 for updates. These figures are independent of the number of rows in the linked list whereas the figures in our first attempt increase as the number of songs in the playlist increases.
Actually I use this construction in an application that modifies one row at a time. The trigger works fine for some statements that modify multiple rows but not for all. The solution can be extended to allow all possible multiple row modifications but that involves much more effort to make things work.

Restrictions

Apart from multiple row modifications this implementation does not allow to change the primary key values of a row, i.e. you cannot replace a song by another one or move a song from one playlist to another. If you try to do that the trigger will fail due to constraint violations or missing uniqueness within the MERGE statement. To avoid error messages like that you can simply test for changed primary key values at the beginning of the trigger. If moving songs between playlists or replacements of songs within one statement are required you could extend the table by a surrogate primary key and use this new PK in the statement inside the trigger instead.

Don't use hierarchyid for linked lists

A linked list like this can also be regarded as a degenerated tree structure that contains one branch only. I was thinking of using a hierarchyid column in the beginning but discarded this idea quickly. Using a hierarchyid column doesn't reduce the number of updates. It is about the same as with a position column. Also a hierarchy column would use much more space than a predecessor column of type int.

Using the MERGE Statement to Perform an UPSERT

http://www.databasejournal.com/features/mssql/using-the-merge-statement-to-perform-an-upsert.html

The term UPSERT has been coined to refer to an operation that inserts rows into a table if they don’t exist, otherwise they are updated. To perform the UPSERT operation Microsoft introduced the MERGE statement. The MERGE statement was included into the set of TSQL statements when SQL Server 2008 was introduced. Not only does the MERGE statement support the UPSERT concept, but it will also support deleting records. In this article I discuss how to use the MERGE statement to UPDATE, INSERT and DELETE records from a target table.

What is the Value of the MERGE Statement?

Prior to the introduction of SQL Server 2008 if you needed to write logic that inserted rows into a target table if they didn’t exist, or updated them if they did exist you needed a series of “if then else” logic and needed to perform both the UPDATE and INSERT statement. With the introduction of the MERGE statement with SQL Server 2008 you can perform either an INSERT or UPDATE statement using a single MERGE statement. Being able to use the MERGE statement to perform inserts or updates to a table makes it easier to code your UPSERT logic in TSQL.

Performing UPSERT Using MERGE Statement

The MERGE statement supports inserting and updating rows in a table with a single operation. In order to accomplish this the MERGE statement requires both a Source and Target table. The Source table is used to identify the rows that needed be inserted or update, and the Target table is the table that rows will be inserted or updated.
In order to demo the MERGE statement I will need a Source and Target table, which will be used in my MERGE example. I will use the following code to create my Source and Target table:

USE tempdb;
GO
SET NOCOUNT ON;
CREATE TABLE dbo.Product (
       Id int identity, 
       ProductName varchar(100),
       Qty int);
INSERT INTO dbo.Product (ProductName, Qty) VALUES('Magic Stripper', 5);
CREATE TABLE dbo.NewInventory(
       Id int identity, 
       ProductName varchar(100), 
       Qty int);
INSERT INTO dbo.NewInventory (ProductName, Qty) VALUES
       ('Sandpaper',20), 
       ('Paint Brush 1 inch',15),
       ('Magic Stripper',5);

With this code I created two tables. The first table I created was the Product table, which will be my Target table. This is the table that I will be updating or inserting rows using the MERGE statement. The other table, NewInventory, is the Source table. My Source table identifies the records that will be used to determine if a new record needs to be inserted into my Product table. The Source table is also used to determine if an UPDATE needs to be performed against any existing records in my Product table. These two tables will be used in the following MERGE statement, which performs an UPSERT operation:

USE tempdb;
GO
MERGE dbo.Product AS T  
USING dbo.NewInventory AS S 
ON T.ProductName = S.ProductName
WHEN MATCHED THEN  
  UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN  
  INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;

When I run this code I get the following output:

Id          ProductName                                                     Qty
----------- --------------------------------------------------------------- -----------
1           Magic Stripper                                                  10
2           Sandpaper                                                       20
3           Paint Brush 1 inch                                              15

Let me review the above code and explain how it worked. The table identified right after the MERGE statement, the one with the table alias of T is known as the Target. The Target table is the table in which I will be performing an UPDATE, or an INSERT. The table following the USING clause, with the table alias of S is known as the Source table. The Source table identifies the potential records that will be inserted or updated in the Target table. The ON clause identifies how to join the Target and Source tables. In my example above I’m joining the Source and Target table based on the ProductName column. Following the ON clause are two WHEN clauses. These clauses identify the conditions when a UPDATE or an INSERT will be performed based on the results of the join between the Target and Source tables.
The first WHEN clause says “WHEN MATCHED”. The “WHEN MATCHED” conditions means when the Target and Source tables are joined based on the “ON” clause if there is a match then the UPDATE statement will be performed. In the UPDATE statement I take the existing Qty value in the Target table and add to it the Qty value in the Source table to increase the amount of inventory I have on hand for a given ProductName.
The second WHEN clause has “WHEN NOT MATCHED”. The “WHEN NOT MATCH” condition means when joining the Target and Source if there is a ProductName in the Source table that is not found in the Target table then this condition will be met. When this condition is met based on the Source and Target table join operation the Source row will be inserted into the Product table. This condition allows me to insert a new row into the Product table when new ProductName’s are found in the Source table.
If you review the original rows I had in my Product table you will see that I only had the “Magic Stripper” product in my Product table and that product had a Qty of 5. After the MERGE statement ran, the “Magic Stripper” product now has a Qty of 10. This happened because the MERGE statement’s WHEN MATCH condition was met and therefore the Qty value from the Product table, which was 5 and the Qty value from the New Inventory, which was also 5 were summed together to UPDATE the matched row in the Target table. The other two Products “Sandpaper” and “Paint Brush 1 inch” where inserted into the Product table because these two products didn’t already exist in the Product table so they met the “WHEN NOT MATCHED” condition. When the “WHEN NOT MATCHED” condition is met the unmatched rows in the Source table were inserted into the Target table.

What about Deleting Rows Using the Merge Statement?

Can the MERGE statement be used to delete records from a table? Yes it can! To see how this works let’s suppose we have a business requirement to delete rows from the Target table of the MERGE statement when they don’t exist in the Source table. In order to show how this works I’m going to using my exiting Product table, which now after running the prior examples has records for the following three products:

Magic Stripper
Sandpaper
Paint Brush 1 Inch

But this time I’m going to use the NewInventory2 table that is created with the following TSQL code. This new table will be the Source table for the MERGE statement:

USE tempdb;
GO
CREATE TABLE dbo.NewInventory2(
       Id int identity, 
       ProductName varchar(100), 
       Qty int);
INSERT INTO dbo.NewInventory2 (ProductName, Qty) VALUES
       ('Sandpaper',5), 
       ('Paint Brush 1 inch',10);

To use the MERGE statement to perform a delete from my Product table I will use the following code:

USE tempdb;
GO
MERGE dbo.Product AS T  
USING dbo.NewInventory2 AS S
ON T.ProductName = S.ProductName 
WHEN NOT MATCHED BY SOURCE THEN  
  DELETE;
SELECT * FROM dbo.Product;

When I run this code I get the following output:

Id          ProductName                                                      Qty
----------- ---------------------------------------------------------------- -----------
2           Sandpaper                                                        20
3           Paint Brush 1 inch                                               15

By reviewing the result set above you can see that the “Magic Stripper” product got deleted from the Product table. This was accomplished by using the “WHEN NOT MATCHED BY SOURCE” condition of the above MERGE statement. This condition is met when there is no row in the Source table that matches a Target table row. When this occurs SQL Server deletes rows in the Target table that don’t have a corresponding matching row in the Source table. Note you can include the WHEN NOT MATCHED BY SOURCE, WHEN MATCHED and WHEN NOT MATCHED is the same MERGE statement.

Beware of Filtering Rows Using ON Clause

If you read Books Online (BOL) there is a cautionary note on Filtering Rows by associating additional constraints with the ON clause of the MERGE statement. Here is the actual text regarding trying to perform additional filtering using the ON clause that was found in the BOL:
It is important to specify only the columns from the target table that are used for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table. Do not attempt to improve query performance by filtering out rows in the target table in the ON clause, such as by specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results.
To show you an example of how adding additional filtering using the ON clause of a MERGE statement can cause unexpected results. I’m going to first run this code to recreate my Target and Source tables:

USE tempdb;
GO
DROP table Product, NewInventory
SET NOCOUNT ON;
CREATE TABLE dbo.Product (
       Id int identity, 
       ProductName varchar(100),
       Qty int);
INSERT INTO dbo.Product (ProductName, Qty) VALUES('Magic Stripper', 5);
CREATE TABLE dbo.NewInventory(
       Id int identity, 
       ProductName varchar(100), 
       Qty int);
INSERT INTO dbo.NewInventory (ProductName, Qty) VALUES
       ('Sandpaper',20), 
       ('Paint Brush 1 inch',15),
       ('Magic Stripper',5);

This code just sets my Target and Source back to what it was originally, where there is only 1 row in the Product table and 3 rows in the NewInventory table.
Suppose I only want to add a subset of the rows in my NewInventory table to my Product table. For this example, suppose I only want to add rows to my Product table if the Qty value in my NewInventory table is greater than 5.
To demonstrate how a MERGE statement can produce unexpected results if you add additional filtering criteria using the ON clause, let me run the following code:

USE tempdb;
GO
MERGE dbo.Product AS T  
USING dbo.NewInventory AS S 
ON T.ProductName = S.ProductName
   AND S.QTY > 5
WHEN MATCHED THEN  
  UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN  
  INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;

When I run this code I get this output:

Id          ProductName                                                   Qty
----------- ------------------------------------------------------------- -----------
1           Magic Stripper                                                5
2           Sandpaper                                                     20
3           Paint Brush 1 inch                                            15
4           Magic Stripper                                                5

If you look at this output you can see it doesn’t meet my requirement by only performing an UPSERT operation of the records in my Source table that have a Qty value greater than 5. Why did it do this? It did this because I added an additional filter rule that wasn’t an equality comparison between a column in the Target table and a column in the Source tables. This additional filter criteria is the “S.Qty > 5” condition. By adding this additional condition that only identified a source column I told SQL Server this was part of the matching criteria. This additional condition was used to determine which records match between the Source and Target tables. In this particular example no row in the Source table matched any rows in the Target table therefore all three records in my Source table where inserted into the Target table. This is why I now have two product records with a ProductName of “Magic Stripper”.
In order to meet my requirements I can write my MERGE statement like below:

USE tempdb;
GO
MERGE dbo.Product AS T  
USING dbo.NewInventory AS S 
ON T.ProductName = S.ProductName
WHEN MATCHED AND S.Qty > 5 THEN  
  UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED AND S.Qty > 5 THEN  
  INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;

In this code I place the additional filter rule on the WHEN clause. Now SQL Server first matches the rows between the Source and Target tables based on the ON condition and then when SQL Server executes the WHEN conditions it excludes the Source rows that have a Qty less than 6 from being inserted or updated. To test this for yourself you will first need to run the first code block in this section that creates the original Source and Target rows, and then run this code.

Duplicates in Source Table

Another condition that causes problems with the MERGE statement is when the Source table contains multiple records with the same column values associated with the ON clause. When this occurs SQL Server will throw a duplicate error message. To demonstrate let me create a new Source table “NewInventory3” and then run it through my MERGE statement. Here is the code that will throw the MERGE error:

USE tempdb;
GO
CREATE TABLE NewInventory3 (
       Id int identity, 
       ProductName varchar(100), 
       Qty int);
INSERT INTO NewInventory3 values 
       ('Magic Stripper',15),
       ('Magic Stripper',5); 
       USE tempdb;
GO
MERGE dbo.Product AS T  
USING dbo.NewInventory3 AS S 
ON T.ProductName = S.ProductName
WHEN MATCHED THEN  
  UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN  
  INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;

When I run this code I get this error:

Msg 8672, Level 16, State 1, Line 12

The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
This error was caused because I had two duplicate rows in my Source table that matched to a single Target row. To resolve this problem I need to eliminate the duplicate rows based on the matching criteria. Suppose I wanted both of those rows to actually add to my inventory of “Magic Stripper”. In this case, I could run the following MERGE statement to accomplish this:

MERGE dbo.Product AS T  
USING (SELECT ProductName, SUM(Qty) as Qty
      FROM dbo.NewInventory3
         GROUP by ProductName) AS S 
ON T.ProductName = S.ProductName
WHEN MATCHED THEN  
  UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN  
  INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;

Here I removed the duplicate row in my Source table by first aggregating all the similar ProductName rows in my Source table by using a sub-query when I identified the Source rows for my MERGE statement.

Summary

The MERGE statement allows you to write a single TSQL statement that allows you to INSERT, UPDATE, and/or DELETE records from a Target table. The MERGE statement controls whether an INSERT, UPDATE, or DELETE clause is executed by matching a set of rows in the Source table against the Target table. When a Source row matches a Target row the Target row is updated, with information from the Source row. If the Source row does not match a Target row then the Source row is inserted into the Target table. The MERGE statement can also DELETE rows from the Target table when Target table rows are not found within the Source table. Using the MERGE statement greatly simplifies the amount of code you would need to write using “if then else” logic to perform INSERT, UPDATE, and/or DELETE operations against a Target table. Next time you need to perform an UPSERT operation look into using the MERGE statement if you are on SQL Server 2008 and above.

Wednesday, December 3, 2014

Converting comma separated data in a column to rows for selection

http://www.codeproject.com/Tips/732596/Converting-comma-separated-data-in-a-column-to-row

Introduction
Suppose you have a table with two columns:

Collapse | Copy Code

FlatNo    Residents
   1      David Eddings,Terry Pratchett,Greg Bear
   2      Gregory Benford,Orson Scott Card,David Brin,Raymond E Feist

Suppose you want to return this as a single table, organised by flat and individual resident?
Well personally, I'd create a table that had a row for the flat number and resident name for each person rather than using a comma delimited list as a column, but sometimes we have to live with others mistakes and just get on with it... What we want is:

Collapse | Copy Code

Flat Resident
  1 David Eddings
  1 Terry Pratchett
  1 Greg Bear
  2 Gregory Benford
  2 Orson Scott Card
  2 David Brin
  2 Raymond E Feist

How do we do that?

Doing the work

It's not too bad - the code that does the hard work is something I've published as a Tip before: Using comma separated value parameter strings in SQL IN clauses[^] but slightly modified to keep the ID (in this case flat number) together with the resident. All you need to do is define an SQL Function which splits the string and creates a temporary table together with the ID value:

Collapse | Copy Code

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE FUNCTION [dbo].[CSVToTableWithID] (@ID INT, @InStr VARCHAR(MAX))
RETURNS @TempTab TABLE
   (tempid int IDENTITY(1,1) not null, 
   Id int not null,
   Data NVARCHAR(MAX))
AS
BEGIN
    ;-- Ensure input ends with comma
 SET @InStr = REPLACE(@InStr + ',', ',,', ',')
 DECLARE @SP INT
DECLARE @VALUE VARCHAR(1000)
WHILE PATINDEX('%,%', @INSTR ) <> 0 
BEGIN
   SELECT  @SP = PATINDEX('%,%',@INSTR)
   SELECT  @VALUE = LEFT(@INSTR , @SP - 1)
   SELECT  @INSTR = STUFF(@INSTR, 1, @SP, '')
   INSERT INTO @TempTab(Id, Data) VALUES (@ID,@VALUE)
END
 RETURN
END
GO

Using the code

Then all we have to do is feed each row value into the function, and select that into the result. Happily, we can do this very easily with a CROSS APPLY:

Collapse | Copy Code

SELECT ca.Id AS [Flat], ca.Data AS [Resident] FROM MyTable t
CROSS APPLY (SELECT * FROM [CSVToTableWithID](t.FlatNo, t.Residents)) ca

Points of Interest

Seriously: use separate rows if you can. Comma delimited data within SQL columns are a PITA to work with! It's a heck of a lot easier to reformat separate rows into comma separated values on the odd occasion you need it than to work with them all the time.

http://stackoverflow.com/questions/13873701/convert-comma-separated-column-value-to-rows

 SELECT A.[id],  
     Split.a.value('.', 'VARCHAR(100)') AS String  
 FROM  (SELECT [id],  
         CAST ('<M>' + REPLACE([ColumnName], ',', '</M><M>') + '</M>' AS XML) AS String  
     FROM  TableName) AS A CROSS APPLY String.nodes ('/M') AS Split(a);

GeneralSQL

Monday, December 22, 2014

Linked lists

Introduction

First attempt

A better solution

Modifying data in a linked list

Testing the solution

Restrictions

Don't use hierarchyid for linked lists

Using the MERGE Statement to Perform an UPSERT

What is the Value of the MERGE Statement?

Performing UPSERT Using MERGE Statement

Related Articles

What about Deleting Rows Using the Merge Statement?

Beware of Filtering Rows Using ON Clause

Duplicates in Source Table

Summary

Wednesday, December 3, 2014

Converting comma separated data in a column to rows for selection

Doing the work

Using the code

Points of Interest