Drawing Trees (#40)

A common implementation of Heap data structures involves storing their tree in a single Array. If you add one extra element at the beginning (creating a base 1 Array), you can use easy math to find child nodes. This doesn't generally bother the user, because you don't usually access a Heap by index.

Using that system and given any index i, 2 * i is the left child node and 2 * i + 1 is the right child node. So the root of the tree is i = 1 and it has a child node at i = 2 and another at i = 3. Then the i = 2 node has child nodes at i = 4 and i = 5. Reversing the traversal, i / 2 (assuming integer division) will bring you to a parent node.

Now take a deep breath and relax because I'm not going to ask you to write a Heap. In fact, here's a basic pure Ruby implementation (not exhaustively tested) using the system I just described:

ruby
class Heap
def initialize( *elements, &comp )
@heap = [nil]
@comp = comp || lambda { |p, c| p <=> c }

insert(*elements)
end

def clear( )
@heap = [nil]
end

def extract( )
extracted = @heap[1]
@heap[1] = @heap.pop unless @heap.size.zero?
sift_down
extracted
end

def insert( *elements )
elements.each do |element|
@heap << element
sift_up
end
end

def size( )
@heap.size - 1
end

def inspect( )
@heap[1..-1].inspect
end

def to_s( )
# Write this!
end

private

def sift_down( )
i = 1
loop do
c = 2 * i
break if c >= @heap.size

c += 1 if c + 1 < @heap.size and @heap[c + 1] < @heap[c]
break if @comp[@heap[i], @heap[c]] <= 0

@heap[c], @heap[i] = @heap[i], @heap[c]
i = c
end
end

def sift_up( )
i = @heap.size - 1
until i == 1
p = i / 2
break if @comp[@heap[p], @heap[i]] <= 0

@heap[p], @heap[i] = @heap[i], @heap[p]
i = p
end
end
end

priority_queue = Heap.new
priority_queue.insert(12, 20, 15, 29, 23, 17, 22, 35, 40, 26, 51, 19)
puts priority_queue

priority_queue.insert(13)
puts priority_queue

priority_queue.extract
puts priority_queue

If you read that real closely, you saw the blank Heap.to_s() method. Implementing a Heap as an Array is one thing and it's easy to show like that (see Heap.inspect() above). However, we generally envision a Heap as a tree, not an Array. Wouldn't it be nice if Heap.to_s() would show it to us that way? I think so too.

This week's Ruby Quiz is to write Heap.to_s() so that it returns an ASCII art representation of a Heap as a tree. For example, here's what the first tree printed by the above code might look like:

12
|
+-------+-------+
| |
20 15
| |
+---+---+ +---+---+
| | | |
29 23 17 22
| | |
+-+-+ +-+-+ +-+
| | | | |
35 40 26 51 19

Your tree doesn't have to look like that. Maybe you would prefer to draw it horizontally. Maybe you make better looking ASCII art than me. Whatever. Anything that gives a feel of the structure is fine.

Heaps don't have to hold two-digit Integers of course, any Comparable should work...


Quiz Summary

Did you spot the errors in my Heap class? Dominik Bathon did, so let's fix those up before we go any further.

To start with, the extract() method would not remove the last item from the heap, because it would pop() it off but then reassign it to @heap[1]. Here's my fix:

ruby
def extract( )
case size
when 0
nil
when 1
@heap.pop
else
extracted = @heap[1]
@heap[1] = @heap.pop
sift_down
extracted
end
end

That handles the edge cases with a little more care.

The other issue was that I originally developed Heap using the normal comparison operators, but later decided to add the @comp block to allow customized ordering for any Heap you create. I thought I updated all of the comparisons, but the truth is that I missed one in sift_down(). Here's the corrected (by Dominik) method:

ruby
def sift_down( )
i = 1
loop do
c = 2 * i
break if c >= @heap.size

c += 1 if c + 1 < @heap.size and @comp[@heap[c + 1], @heap[c]] < 0
break if @comp[@heap[i], @heap[c]] <= 0

@heap[c], @heap[i] = @heap[i], @heap[c]
i = c
end
end

Enough about James's buggy code, let's talk trees. How do they look? Let's compare:

Brian Schroeder
===============
12
.----' `----.
20 15
.-' `-. .-' `-.
29 23 17 22
/ \ / \ /
35 40 26 51 19

Dave Burt
=========
12-15-22-
| | `-
| `-17-
| `-19
`-20-23-51
| `-26
`-29-40
`-35

Dominik Bathon
==============
12
|
+--+-----+
| |
20 15
| |
+--+--+ +-++
| | | |
29 23 17 22
| | |
+-++ +-++ +
| | | | |
35 40 26 51 19

Florian Frank
=============
12
+---20
| +---29
| | +---35
| | `---40
| `---23
| +---26
| `---51
`---15
+---17
| `---19
`---22

David Tran
==========
+-o 22
|
+-o 15
| |
| +-o 17
| |
| +-o 19
|
-o 12
|
| +-o 51
| |
| +-o 23
| | |
| | +-o 26
| |
+-o 20
|
| +-o 40
| |
+-o 29
|
+-o 35

There's some interesting variety in there. Which display you prefer is probably a matter taste, but hopefully there's something in there for everyone.

It may be interesting to note that this quiz grew out of an actual problem I was working with. I dove into the problem fully expecting to spend ten minutes on it and was surprised when I kept running into complication after complication. I had to restart twice as I realized that I was attacking it from the wrong angle.

Given that, I was looking through the solutions with an eye for elegance and simplicity. I found a couple of things I liked, but really I thought David Tran's solution is what I was searching for. Let's take a look at that code:

ruby
def to_s( )
return "[empty heap]" if @heap.size <= 1
result = ''
root = 1

if has_right?(root)
print_node(result, ' ', true, right_index(root))
result << " |\n"
end

result << "-o #{@heap[root]}\n"

if has_left?(root)
result << " |\n"
print_node(result, ' ', false, left_index(root))
end

result
end

Clearly that relies on a few helper methods we'll meet shortly, but it's surprisingly straight forward. The code checks for the special case of an empty heap, then initializes a String to hold the result and a root node for the tree.

From there, the method checks to see if the root has a right child node, and passes some information down to print_node() if it does. Note that an extra line containing "|\n" is added if after the call.

After that, the root node is added to the result and the left child node is forwarded to print_node(), if needed. This time though the "|\n" line is added before the call.

Finally, the completed result is returned.

You can see the the right side is handled first, then the node itself, and finally left. Put another way, the "right" side will be on top, the node in the middle, then the "left" is placed on the bottom. That may seem odd, but you need to remember that this tree is on its side. If you rotate it 90% clockwise in your mind's eye, everything should snap into place.

Here's the work we've seen so far:

print_node(..., right_index(root))
|
-o root
|
print_node(..., left_index(root))

The four index methods should be pretty obviously. They're the node jumping math right out of the quiz. Here they are:

ruby
private

def left_index( index ) ; index * 2 ; end
def right_index( index ) ; index * 2 + 1 ; end
def has_left?( index ) ; left_index(index) < @heap.size ; end
def has_right?( index ) ; right_index(index) < @heap.size ; end

I think pulling those out into methods is one of the things that kept this code so clean and approachable.

The last piece of this puzzle is the mysterious print_node() method. Let's dig inside that now:

ruby
def print_node( result, line, right, index )
if has_right?(index)
print_node(result, line + (right ? ' ' : '| '), true,
right_index(index))
result << "#{line}#{right ? ' ' : '|'} |\n"
end

result << "#{line}+-o #{@heap[index]}\n"

if has_left?(index)
result << "#{line}#{right ? '|' : ' '} |\n"
print_node(result, line + (right ? '| ' : ' '), false,
left_index(index))
end
end

We can see that this method takes four parameters. The first is the result String we are building up, which it appends to. The second is the line up to this branch of the tree, used in indenting new lines. The third parameter is true or false, depending if we're on the right side of the tree (true) or left (false). Finally, the method receives the index of the node to use as the root of the branch.

The rest of the method is very similar to the original to_s(), even in its recursive calls to itself to drill down into the tree. The output lines are slightly different, mainly to account for the line variable that must be prepended and also to deal with which way the pipe characters are flowing.

The only minus here is the near duplication of code between to_s() and print_node(). If we want, we can remove it though. The following code is my adaptation of David's methods to remove the duplication but produce identical results:

ruby
def to_s( )
return "[empty heap]" if @heap.size <= 1
print_node("", "", true, true, 1).gsub!(/^[+ ]/, "")
end

private

def left_index( index ) ; index * 2 ; end
def right_index( index ) ; index * 2 + 1 ; end
def has_left?( index ) ; left_index(index) < @heap.size ; end
def has_right?( index ) ; right_index(index) < @heap.size ; end

def print_node( result, line, right, left, index )
if has_right?(index)
print_node( result, line + (right ? ' ' : '| '), true, false,
right_index(index) )
result << "#{line}#{right ? ' ' : '|'} |\n"
end

result << "#{line}+-o #{@heap[index]}\n"

if has_left?(index)
result << "#{line}#{left ? ' ' : '|'} |\n"
print_node( result, line + (left ? ' ' : '| '), false, true,
left_index(index) )
end

result
end

The resulting tree may be a little unusual with it's hanging nodes, but shifting all data to the right freed the code of the tedious need to account for data width at each point. Florian Frank capitalized on the same trick using a slightly different tree format and that code is also pretty straight forward. It's amazing what a little rethinking of the problem can do.

My usual thanks to all the solvers. You again taught me (and hopefully others) new things about how code should be designed.

Ruby Quiz will take a one week break starting tomorrow, so I can enjoy a little summer vacation with my family. As always, I hope to return to an inbox full of new quiz material. Send those ideas along! The Quiz will return with a task to show how Ruby can help you cash in on the stock market. Stay tuned...