BST Implementation in Python
I've been going through John Washam's Google interview repo on GitHub and checking off concepts that I feel confident about. Because I don't use binary trees (BTs) in my code that often, I've felt that I need to brush up on them.
I wrote the following little implementation of the data structure in Python. While I'm more than comfortable with recursion, it's the base and edge-cases that can be tricky at times. For instance, when I was working with discrete convolution algorithms, recursively solving for the ranges over which to spatially iterate an image with each core in my laptop was more difficult to write than the convolution algorithm, and what I have there still isn't an optimal solution (e.g. what if the number of cores isn't a power of two?). [I think I'll attempt that again in the future.]
I'm still not positive I've weeded out all the errors. I'm writing unit tests though, and if I find any, I'll update it. I've written a bunch of unit tests with
EDIT (3/17): I was just reading the source of
What's more, I think it'd be fun to write a generalized BT. For example, a class that accepts any type and a function/operator for comparing two objects (for objects of type
Altogether, it's an interesting concept, and Python allows you to be very expressive in what you're trying to convey, both to the machine and others.
I wrote the following little implementation of the data structure in Python. While I'm more than comfortable with recursion, it's the base and edge-cases that can be tricky at times. For instance, when I was working with discrete convolution algorithms, recursively solving for the ranges over which to spatially iterate an image with each core in my laptop was more difficult to write than the convolution algorithm, and what I have there still isn't an optimal solution (e.g. what if the number of cores isn't a power of two?). [I think I'll attempt that again in the future.]
unittest
, and I'm pretty sure that this is more correct.EDIT (3/17): I was just reading the source of
typing
and found that Optional[<some type>]
can be used as a shorthand for Union[<some type>, None]
(where None
is replaced by type(None)
.#! /usr/bin/env python3.4 # -*- coding: utf-8 -*- """ A binary tree implementation to store data. I like splitting the BT data structure into two classes: one for the nodes, and one for the actual BT. """ from typing import Union, Tuple, Optional __all__ = ['BT'] class Node: """ A node type for the BT """ def __init__(self, value: Optional[int], lvalue: Optional[int] =None, rvalue: Optional[int] =None) -> None: self.value = value if lvalue: # lvalue != None self.lvalue = Node(lvalue) else: self.lvalue = None if rvalue: # rvalue != None self.rvalue = Node(rvalue) else: self.rvalue = None def get_value(self) -> Optional[int]: return self.value def get_l(self) -> Optional['Node']: """ Get the actual _object_ node """ return self.lvalue def get_r(self) -> Optional['Node']: """ Again, get the actual node """ return self.rvalue def get_lvalue(self) -> Optional['Node']: """ Every node points to another node or None """ if self.lvalue: return self.lvalue.get_value() else: return None def get_rvalue(self) -> Optional['Node']: """ Again, every node points to another node or None """ if self.rvalue: return self.rvalue.get_value() else: return None def set_lvalue(self, val: Union[int, 'Node', None]) -> None: """ Set the left child of the current node to a value """ if self.has_lvalue(): # Update what it points to raise ValueError('** Cannot update at the node level if already set') else: self.lvalue = Node(val) if type(val) is int or type(val) is None \ else val def set_rvalue(self, val: Union[int, 'Node', None]) -> None: """ Set the right child of the current node to a value """ if self.has_rvalue(): # Update it raise ValueError('** Cannot update at the node level if already set') else: self.rvalue = Node(val) if type(val) is int or type(val) is None \ else val # If you'd like to check if it points to an actual node and not None: def has_lvalue(self) -> bool: return self.lvalue is not None def has_rvalue(self) -> bool: return self.rvalue is not None class BT: """ A binary tree structure. """ def __init__(self, root: int) -> None: self.root = Node(root) # (association relationship) self._size = 1 def get_root_value(self) -> Optional[int]: return self.root.get_value() def get_size(self) -> int: return self._size def insert_iteratively(self, i: int) -> None: """ Creates a new node in the BT """ current_node = self.root # != None while True: # This will terminate for any finite tree if i <= current_node.get_value(): if current_node.has_lvalue(): current_node = current_node.get_l() else: current_node.set_lvalue(i) self._size += 1 break else: if current_node.has_rvalue(): current_node = current_node.get_r() else: current_node.set_rvalue(i) self._size += 1 break def insert_recursively(self, i: int, parent: Optional[Node] =None) -> None: """ Insert a value in the BT recursively """ if not parent: parent = self.root if i <= parent.get_value(): if parent.has_lvalue(): self.insert_recursively(i, parent.get_l()) else: parent.set_lvalue(i) self._size += 1 else: if parent.has_rvalue(): self.insert_recursively(i, parent.get_r()) else: parent.set_rvalue(i) self._size += 1 def lookup_from_node(self, i: int, n: Node) \ -> Union[Tuple[Node, None], Tuple[Node, Node], Tuple[None, Node]]: """ Make it easier to look up a value starting at some node """ return self.lookup(i, n) def lookup(self, i: int, n: Optional[Node] =None, parent: Optional[Node] =None) \ -> Union[Tuple[Node, None], Tuple[Node, Node], Tuple[None, Node]]: """ Return the instance of Node that contains a value """ if n is None: n = self.root # self.root cannot be None and has no parent parent = None nval = n.get_value() if i < nval: if n.has_lvalue(): return self.lookup(i, n.get_l(), n) else: return None, parent elif i > nval: if n.has_rvalue(): return self.lookup(i, n.get_r(), n) else: return None, parent else: return n, parentI also learned about type forward referencing by writing this script. You can read more about it in PEP 484, which introduces type hints. For example, if you try the following class without quotes around the return type
'Node'
, you get a NameError suggesting that the class A
hasn't been declared yet (I wrote this while I was trying to figure out why it wouldn't work in the Node
class above, which is why they look similar).from typing import Union class Node: def __init__(self, value: int, other: Union[int, None] =None) -> None: self.value = value self.l = Node(other) if type(other) is int else other def get_l(self) -> 'Node': return self.l if __name__ == '__main__': X = Node(15, 16) print(X.get_l())However, if you write them as string literals, i.e.
'A'
, then the type will be resolved later. This is nicely implemented in PyCharm as well.What's more, I think it'd be fun to write a generalized BT. For example, a class that accepts any type and a function/operator for comparing two objects (for objects of type
int
we have <, >, and ==); or, given that the submitted type has rich comparison methods defined, I don't think it'd be necessary to submit a function as the way they interact with one another is thus defined by the objects in the binary tree.Altogether, it's an interesting concept, and Python allows you to be very expressive in what you're trying to convey, both to the machine and others.