Everyday Trees in Computer Science – [Everyday Code]

The goal of this article was to equip developers with everything they need to know about trees to use Instruction Steps Framework (IS) and Instruction Converter (IC) that we are currently working on in Axlessoft. While developing the article for internal use I realized that these are things that every developer should have under their sleeve to successfully use tree structures in their day to day tasks.

image of a tree that will make the article more interesting. Not a real programming tree

Some of us might be tempted to revise the heavy Knuth books when it comes to solving the next tree problem. But generally we don’t need to do it. You know the meme:

  • First year Computer Science – “algorithms, data structures, etc …”
  • Second year – “functional programming, object oriented, etc…”
  • Last year – “cryptography, network programming, design of languages, etc
  • First day at work – “move this button to the left”

So let’s start with a few things you need to know day to day.

You will see “parent->children” more often than “node->left,right”, eg. binary trees.

If you have a structure of parent and children this is a tree structure. You have three scenarios:

  1. The parent knows about the children and the children know about the parent. There is a parent.children and child.parent
  2. The parent knows about the children, but the children do not know about the parent. There is parent.children, but there is no child.parent
  3. The parent does not know about the children, but the children know about the parent. There is no parent.children, but there is child.parent

In many books, algorithms and generally articles you will have people talking about Binary Trees. Now, it might be because of my line of work, but:

  1. I can not remember the last time that I had to properly work with a binary tree – something more than “just find this node”.
  2. For every time I have to work with “binary trees” there are at least 1024 times I have to work with “parent->children” structures.

Yes, it is the same. But it is also not the same.

Take away:

All the three cases represent a tree. It is best to have case 1, but this is not always possible. You will deal with “parent->children” more often than with “left node”, “right node”.

If there is a library for it, use a library.

This should be a no-brainer, but a lot of times we like to code a few iterations here and there and develop things for basic iteration over a tree. Just don’t. If there is something tree related that you can not find a library to do for you in about 5 minutes internet search, then you are doing something worth of a Phd or you are searching with the wrong queries (or you are trying to do something stupid!).

There are of course exceptions. Because when we, as great developers, implement Pre and Post order iteration we are doing it better than anyone else. So this exception is acceptable. Of course.

Iteration -Pre, Post, In and visiting a node

Visiting a node means that you are processing it. For example printing to the screen.

There are a few basic ways. Just search for them on the internet. There are a lot of really nice resources of how you iterate over tress. The basic ways are:

  1. LRN – Left Right Node – first you visit the left child, then you visit the right child, then you visit the node. But as we say this is the theory. In a real life scenario with “parent->children” you visit children[0]..children[n] and then the parent
  2. LNR – Left Node Right – How do you do this for “parent->children” I don’t know.
  3. NLR – Node Left Right – first the parent then children[0]..children[n]
  4. NRL – Node Right Left – first the parent then children[n]..children[0]
  5. RLN – Right Left Node – first children[n]..children[0] then the parent.

Avoid recursion

Yes, you have trees. But avoid recursion. Use “traversers”. Almost any reasonably useful library has a “traverser” support. The idea is simple – somebody else is doing the recursion for you and this somebody else is the library.

In pseudo code:

rootNode = ... 
rootNode.traverse((node)=> {
   print node
})

The idea is simple. You have a rootNode. It has a method ‘traverse’ that will traverse the whole tree, iterate, recurse, do a lot of magic, but at the end it will traverse the whole tree and will call the “print node” for every node in the tree.

There is no recursion or iteration in your code. There is just traverse. Sometimes it is called “visitor” ( Visitors pattern, Gang of four, things like this). I kind of preferred “visitor” in early stage of my software development career, but now I more often call them “traversers”. Traversers are visitors, but not all visitors are traversers from my point of view. Would be happy to talk about this over a beer. Anyone..?

Take away:

If you are using a library that does not have traverse you should seriously consider using another library.

Avoid ‘getDescendants()’ and getDeepChildren()

Yes. These are some libraries that try to help. Imagine that you nave:

node 1
  node 1.1
    node 1.1.1
  node 1.2
    node 1.2.1
    node 1.2.2
  node 1.3

You need to get all the children of node 1 and all of their children and you need this in an array. Why? Because reasons. This is the feature request. There are libraries like BABYLON.js that try to help. They offer a “getDescendants()’ method. You can call:

anArray = node.getDescendants();
// here you have all the elements of the tree in anArray

// loop over the array
anArray.forEach((a)=> {
   print a
})

This method saves a lot. You do not need a “traverser”. You don’t need to know anything about order. You need all the elements of the whole tree so get an array. But there are drawbacks.

  1. It could be a large tree. So you are creating a copy, but this time in an array and memory and user experience are expensive.
  2. Even if it is not large you are still creating an array. Do you really need an array or you could just traverse the whole tree and do your job without previously creating an array?
  3. If you have to call getDescendants() more then once, you should refactor your code.

Consider the following:

anArray = node.getDescendants();
// here you have all the elements of the tree in anArray

// loop over the array
anArray.forEach((a)=> {
   if(a.name.includes("1")) {
      print a.getDescendants().length
   }
})

In the example above we would like to print the number of descendants for each node that includes the string “1” in its name. How many times is the tree traversed? How many arrays are created? The answer is “a lot”. You traverse once for the rootNode and then for every other node again and you create new arrays.

Take away:

If there is a helper method it is probably giving you, the developer, something, but it is at the expense of the user experience – more memory, slower execution. Consider the implications of using helper methods. They might help you, but are they helping the user using your software.