Joe Celko's Trees and Hierarchies in SQL for Smarties, (The Morgan Kaufmann Series in Data Management Systems)

Author: Joe Celko
4.2
All Stack Overflow 30
This Year Stack Overflow 1
This Month Stack Overflow 1

Comments

by anonymous   2019-07-21

The best information I have found on representing tree structures in a database is in Joe Celko's Trees and Hierarchies in SQL for Smarties.

You can probably find enough information on the web but I'd recommend getting the book, I found it very useful when implementing a nested set hierarchy.

You can either use an adjacency list or nested sets to model the tree in the database, I'm assuming you are using an adjacency list (where each entry has a parent attribute)

If you want to be able to move entire sub trees from one parent to anther then it should be as easy as changing the parent_id (or whatever PK you are using) to reference the new parent. A nested set model requires changes to all the nodes when a sub tree is moved.

However, other operations are easier on a nested set, such as selecting all the child nodes under a specific parent. This can be more difficult with the adjacency list model but it has become easier with the advent of recursive CTE.

If you are not worried about the order of the content I would avoid storing the chapter numbers with your data. Apply them when you select the data and then you avoid having to update every node when your tree changes.

by anonymous   2019-07-21

Hierarchies are complicated in MySQL. I don't think you can store a bunch of custom stuff in fql so you probably can't do it that way unless it's a built in feature.

Here's an article about MySQL hierarchies:
http://www.phpro.org/tutorials/Managing-Hierarchical-Data-with-PHP-and-MySQL.html

See also this question with good answers:
Hierarchical Data in MySQL

There are even books written on the subject:
http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/1558609202

Complicated, but doable.

by anonymous   2019-07-21

You could change your SQL table structure to use the nested sets model for a tree; it makes it much easier to test for inclusion of a child which may be deeply nested under a particular parent.

This page has a good description and comparison of the adjacency list model and nested sets.

You might find the following answer to a nested set question is also helpful: Help with writing a SQL query for Nested Sets

Pick up a copy of Joe Celko's Trees and Hierarchies in SQL for Smarties. I keep recommending this book because it helped me immensely when I was modelling a tree structure in SQL using nested sets.

by anonymous   2019-07-21

as @teabot says, MySQL doesn't ship with any inbuilt functionality to do this. However, a number of sources exist which will give you more than a half decent starting point for the queries you require, so you want have to entirely roll your own.

For examples, see Working with Graphs in MySQL , the MySQL docs and maybe even Joe Celko's book. Depending on your exact use case, you might also find that someone has done it for you (see django-mptt implementation).

by Bill Karwin   2017-08-20

There are several ways to store tree-structured data in a relational database. What you show in your example uses two methods:

  • Adjacency List (the "parent" column) and
  • Path Enumeration (the dotted-numbers in your name column).

Another solution is called Nested Sets, and it can be stored in the same table too. Read "Trees and Hierarchies in SQL for Smarties" by Joe Celko for a lot more information on these designs.

I usually prefer a design called Closure Table (aka "Adjacency Relation") for storing tree-structured data. It requires another table, but then querying trees is pretty easy.

I cover Closure Table in my presentation Models for Hierarchical Data with SQL and PHP and in my book SQL Antipatterns: Avoiding the Pitfalls of Database Programming.

CREATE TABLE ClosureTable (
  ancestor_id   INT NOT NULL REFERENCES FlatTable(id),
  descendant_id INT NOT NULL REFERENCES FlatTable(id),
  PRIMARY KEY (ancestor_id, descendant_id)
);

Store all paths in the Closure Table, where there is a direct ancestry from one node to another. Include a row for each node to reference itself. For example, using the data set you showed in your question:

INSERT INTO ClosureTable (ancestor_id, descendant_id) VALUES
  (1,1), (1,2), (1,4), (1,6),
  (2,2), (2,4),
  (3,3), (3,5),
  (4,4),
  (5,5),
  (6,6);

Now you can get a tree starting at node 1 like this:

SELECT f.* 
FROM FlatTable f 
  JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1;

The output (in MySQL client) looks like the following:

+----+
| id |
+----+
|  1 | 
|  2 | 
|  4 | 
|  6 | 
+----+

In other words, nodes 3 and 5 are excluded, because they're part of a separate hierarchy, not descending from node 1.


Re: comment from e-satis about immediate children (or immediate parent). You can add a "path_length" column to the ClosureTable to make it easier to query specifically for an immediate child or parent (or any other distance).

INSERT INTO ClosureTable (ancestor_id, descendant_id, path_length) VALUES
  (1,1,0), (1,2,1), (1,4,2), (1,6,1),
  (2,2,0), (2,4,1),
  (3,3,0), (3,5,1),
  (4,4,0),
  (5,5,0),
  (6,6,0);

Then you can add a term in your search for querying the immediate children of a given node. These are descendants whose path_length is 1.

SELECT f.* 
FROM FlatTable f 
  JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1
  AND path_length = 1;

+----+
| id |
+----+
|  2 | 
|  6 | 
+----+

Re comment from @ashraf: "How about sorting the whole tree [by name]?"

Here's an example query to return all nodes that are descendants of node 1, join them to the FlatTable that contains other node attributes such as name, and sort by the name.

SELECT f.name
FROM FlatTable f 
JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1
ORDER BY f.name;

Re comment from @Nate:

SELECT f.name, GROUP_CONCAT(b.ancestor_id order by b.path_length desc) AS breadcrumbs
FROM FlatTable f 
JOIN ClosureTable a ON (f.id = a.descendant_id) 
JOIN ClosureTable b ON (b.descendant_id = a.descendant_id) 
WHERE a.ancestor_id = 1 
GROUP BY a.descendant_id 
ORDER BY f.name

+------------+-------------+
| name       | breadcrumbs |
+------------+-------------+
| Node 1     | 1           |
| Node 1.1   | 1,2         |
| Node 1.1.1 | 1,2,4       |
| Node 1.2   | 1,6         |
+------------+-------------+

A user suggested an edit today. SO moderators approved the edit, but I am reversing it.

The edit suggested that the ORDER BY in the last query above should be ORDER BY b.path_length, f.name, presumably to make sure the ordering matches the hierarchy. But this doesn't work, because it would order "Node 1.1.1" after "Node 1.2".

If you want the ordering to match the hierarchy in a sensible way, that is possible, but not simply by ordering by the path length. For example, see my answer to MySQL Closure Table hierarchical database - How to pull information out in the correct order.

by Bill Karwin   2017-08-20

As @Macka writes, the left and right values are not foreign keys to tree nodes, so they don't have to be the same type. They can be integers while the tree node primary key is a GUID.

Celko also wrote "Trees and Hierarchies in SQL for Smarties" which goes into more detail about Nested Set Model, and other solutions. Reading this book will save you a lot of time and a lot of mistakes.

There are other solutions to storing hierarchical data in a database. See my answer here: What is the most efficient/elegant way to parse a flat table into a tree?

by Bill Karwin   2017-08-20

There is no SQL-92 solution for recursive queries.

The best option is to use one of the solutions for encoding hierarchical relationships so that you can query all descendants or ancestors, using standard SQL.

See a brief description here: "What is the most efficient/elegant way to parse a flat table into a tree?".

Or read "Trees and Hierarchies in SQL for Smarties" by Joe Celko.

by anonymous   2017-08-20

if you have any links that might help me

No problem. Joe Celko's Trees and Hierarchies in SQL for Smarties has got you covered. All the tree inserts, updates and deletes you can handle.

Also:
Populate TreeView from DataBase

by Bill Karwin   2017-08-20

There are more options than just the two you mention. There are:

  • Adjacency List (the "parent_id" one almost everyone uses)
  • Nested Sets
  • Path Enumeration
  • Closure Table (aka Adjacency Relation)

See my answer to "What is the most efficient/elegant way to parse a flat table into a tree?"

Or a couple of books: