Wednesday, March 21, 2012

Help with Recursive Stored Procedure

Hello,

I am having problem writing a recursive stored procedure for a Content Management System and am hoping that someone can give me a hand with it. I have posted (below) a script to recreate a table and data to test it with, and I've included my sample stored procedure that isn't working as expected. Any help would really be appreciated, I'm stuck.

I have a table which represents a hierarchical view of data. In this case, it is an example of classified ads. The hierarchy looks something like this:

............
Classifieds
--Animals
----Dogs
-----Golden Retrievers
-----Poodles
--Automobiles
............

Each of these above I call a NODE. Each NODE has a Parent Node, and since a NODE can have multiple Parent Nodes, each NODE also has a PriorParentID (which is its parent's parent node). (still with me here?) Each NODE supports many different Types of Content. What I need to do is walk the hierarchy from a specific NODE, up through each of it's parents to the top-most node. Throughout the recursion up the tree, I need to collect a distinct list of Content Types that the parents support.

So, suppose that the "Classifieds" node supports "ContentTypeID" of 3. Every ascendent node below it inherits that "ContentTypeID". So if I am looking at "Poodles", I want to walk the tree up through its parents, collecting the "ContentTypeID" all the way up, I'd end up with a "3" for a ContentTypeID for Poodles because one of it's parents in the hierarchy supports ContentTypeID "3".

And if the "Dogs" node supports a "ContentTypeID" of 1 and "Animals" supports a "ContentTypeID" of 1 and 2, then I need to get back 1, 2 and 3 in a resultset...because as you walk up the hierarchy from Poodles to Classifieds, you encounter a "1" and a "2" and a "3" for ContentTypeID's.

Here's a script to set up the data:

---------------

CREATE TABLE [tmpNodalHierarchy] (
[ChildNodeID] [int] NULL ,
[ParentNodeID] [int] NULL ,
[PriorParentNodeID] [int] NULL ,
[NodeLabel] nvarchar(50) NULL,
[ContentTypeID] [int] NULL)
GO

INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(1, 0, -1, 'Classifieds', 3)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(2, 1, 0, 'Animals', 1)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(2, 1, 0, 'Animals', 2)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(3, 1, 0, 'Automobiles', 1)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(4, 2, 1, 'Dogs', 1)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(5, 4, 2, 'Golden Retrievers', 1)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(5, 4, 2, 'Golden Retrievers', 2)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(6, 4, 2, 'Poodles', 1)
INSERT INTO tmpNodalHierarchy
(ChildNodeID, ParentNodeID, PriorParentNodeID, NodeLabel, ContentTypeID)
VALUES
(7, 4, 2, 'Chows', NULL)
---------------

CREATE PROC dbo.tmpGetRecursiveContentTypes
(
@.ParentNodeID int,
@.PriorParentNodeID int,
@.ContentTypeID int = 0,
@.InContentTypes nvarchar(500) = '',
@.OutContentTypes nvarchar(500) = null OUTPUT
)
AS
IF NOT @.PriorParentNodeID IS NULL
BEGIN
SET NOCOUNT ON
DECLARE @.ContentTypes nvarchar(500)
SELECT @.ContentTypes = @.InContentTypes
DECLARE @.curContentTypeID int, @.tmpPriorParentNodeID int, @.curNodeLabel nvarchar(100), @.curChildNodeID int, @.curParentNodeID int, @.curPriorParentNodeID int

WHILE 0=0
BEGIN
SELECT TOP 1 @.curContentTypeID = ContentTypeID, @.curParentNodeID = ParentNodeID, @.curPriorParentNodeID = PriorParentNodeID FROM dbo.tmpNodalHierarchy WHERE ChildNodeID=@.ParentNodeID and ParentNodeID = @.PriorParentNodeID and (ContentTypeID > @.ContentTypeID) ORDER BY ContentTypeID ASC
IF @.@.ROWCOUNT = 0
BEGIN
SELECT @.curContentTypeID = ContentTypeID, @.curParentNodeID = ParentNodeID, @.curPriorParentNodeID = PriorParentNodeID FROM dbo.tmpNodalHierarchy WHERE ChildNodeID=@.ParentNodeID and ParentNodeID = @.PriorParentNodeID and (ContentTypeID = @.ContentTypeID) ORDER BY ContentTypeID ASC
SET @.ContentTypeID = @.curContentTypeID
BREAK
END
SET @.ContentTypeID = @.curContentTypeID

IF NOT @.curContentTypeID IS NULL
SELECT @.ContentTypes = @.ContentTypes + ',' + CAST(@.curContentTypeID as nvarchar(10))
END
EXEC dbo.tmpGetRecursiveContentTypes @.curParentNodeID,@.curPriorParentNodeID,@.ContentTypeID,@.ContentTypes
END
SET @.OutContentTypes = @.ContentTypes
IF @.@.NESTLEVEL=1
SELECT @.ContentTypes
GO

----------------
--To execute the sproc as if we wanted
--to get the ContentTypeID's for "Poodles"
--
--
EXEC tmpGetRecursiveContentTypes 4, 2
--
----------------

I have been concatenating the results, but ideally I want the resultset to look like:

ContentTypeIDs
------
1
2
3

Thanks for any help you can give.

DanHave you ever thought to use XML to present these data?|||Search booksonline for 'Expanding hierarchies', pretty good example|||I thought of using XML, but I can't use it...it needs to return a recordset because it needs to easily adapt to other databases other than SQL Server.

Dan|||Thanks for the link Dutch. It inspired an idea that worked and resulted in an even simpler solution. Thanks a lot.

Dan|||Big, I'm not going to write your stored proc for you, but I can point you in a direction. The basic problem you have is getting all parents of a leaf node. if you have all the parents, a simple select will give you all the ContentTypeID's right? Ok, so let's forget about the ContentTypeIDs and focus on the real problem - getting that list of parents given a certain node.

First: get rid of the PriorParentID - and whack yourself on the head with a blunt object 10 times repeating 'I can get the prior parent Id be looking at the parent's ParentNodeID'

Then: Please normalise your tables! One node = Many ContentTypeIDs, so:

Table1:
ChildNodeID, ParentNodeID, NodeLabel

Table2:
ChildNodeID, ContentTypeID

So you'll have this in the tables:

Table1:
1, null, Classifieds
2, 1, Animals
3, 1, Automobiles
4, 2, Dogs
5, 4, Golden...
6, 4, Poodles
7, 4, Cows

Table2:
1, 3
2, 1
2, 2
3, 1
4, 1
5, 1
5, 2
6, 1

Now all you do is this:

1 Start at leaf node
2 Add the ContentTypeIDs for the node to your list - don't add ones in the list already
3 Get the parentID
4 if the parentID is null, finish
5 Else get the node where id = parentID
6 go to 2

Recursive, which is bad for performance. So: how do we do this without recursion? Textbook answer is by using a bridge table. What is a bridge table? It's a table that lists all parent-child relationships. Example - this is a condensed version of your table1: (ChildId, ParentId)

1, null
2, 1
3, 1
4, 2
5, 4
6, 4
7, 4

This is the brige table for it: (ChildID, ParentID)
2, 1
3, 1
4, 1
5, 1
6, 1
7, 1
4, 2
5, 2
6, 2
7, 2
5, 4
6, 4
7, 4

As you can see, with one select from the bridge table I can get all parents of a node. E.g. in your example, you looked at poodles, node 6, and you wanted all parents:
select ParentID
from Bridge
where ChildId = 6

and you get all the parent nodes: 1, 2, 4 (classifieds, animals and dogs) and now with one simple join, and perhaps a distinct, you can get all the ContentTypeIds

select distinct Table2.ContentTypeID
from Bridge
join Table2
on Bridge.ChildID = 6
and Bridge.ParentID = Table2.ID

How do you build / maintain the bridge table? Triggers on Table1.

Phew... Enough for now.

No comments:

Post a Comment