Trees, higher-order functions and maps
Contents:
# Binary trees
A binary tree is a tree wherein each node has at most 2 children. Its very similar to a singly linked list (support for which is the default in OCaml), except that it has two nodes instead of one.
(*
Recall how we'll make a match case for this:
match x with
| Leaf -> None
| Node (a, b, c) -> Some a (*maybe*)
*)
Its a recursively parameterized variant. A Leaf is an empty tree that contains nothing. Why does the Node have 3 elements in it? Recall what we said for:
we said that the Cons tag can be thought of as something having a head and a tail which is mylist again. In case of the tree, we have 2 branches not one, so we’ll have two of the recursive 'a list definitions, and the first alpha can be thought of as the branch that points TO the node.
Now understand this, to create a binary tree, I will have to define it like this:
let some_tree = Node ;;
We’re saying the some_tree is a Node, with a branch 1 that points to it (that is, the root of this node). It then creates two more nodes, one with a 2 and another with a 3. The tree ends here because every other branch is a Leaf.
What if we wanted to take the size of the tree? That is to say, how many nodes does a tree have, how will you do it?
0
1 + size left + size right
(*This works because left and right will return type of 'a tree! or in this case int tree*)
This is awesome, and pretty easy to define right! What if you wanted to sum up all the elements, again pretty simple:
0
root + sum left + sum right
Man functional programming makes it really easy to work with trees.
# Functions as higher order
In OCaml (and in functional programming in general), functions are higher order, as in they can be passed around and used just like anything else.
2 * x;;
4 * x;;
double ;;
x |> double |> double;;
What if we wanted to take the 4th power of a number? We can assume that we have the following primitives:
x * x;;
x |> square |> square;;
There is a notion here of “applying a function multiple times” (in this case, twice). What if we could write a function that applies a function twice?
f ;;
That’s it! This is a function that takes in as a function f and an argument x, applies f twice on x. Now we can do something like:
twice square x;;
But we can also have partial functions. Something like this:
let partial_forth = twice square;;
Now partial_forth is a function, which is awaiting an argument x which it inherited from twice.
# Map
Apparently Google’s MapReduce was inspired from Lisp and functional programming. Anyways…
The idea of a map is to give it a function and an iteratable element and it will apply that function to each element of that iteratable element.
Do note that, doing something like this:
let a = x;;
is the same as doing:
let a = some_function x;;
Do not get confused!
Alright, suppose we want to add one to every element of a list:
[
a+1::add_1 t;;
(*yes I know its not optimial and not tail recursive, but its okay for now*)
What if we wanted to concatenate a string “3110” to every element of the list?:
[
:: concat_3110 t;;
Now they are quite similar right? The similar patterns are:
- There is a pattern matching on the input
- There is returning of the empty list
- There is a pattern matching for head and tail of a non-empty list
- There is a cons operation
- To the right of the cons is a recursive call
- To the left of the cons is another operation
This screams that the entire thing can be abstracted away! Let’s try and factor out most of the code:
[
?? :: transform t;;
But now the question is, what do we push in place of the ??? So, how about we take in the function as an argument that tells me how to do the transformation!
[
f a :: transform ~f:f t;;
beautiful! Now we can do something like:
1+x;;
x ^ "3110";;
transform ~f:add_1 ;;
transform ~f:concat_3110 ;;
Pretty neat right! But we could also write the add_1 and concat_3110 functions using the transform function:
transform y;;
transform y;;
Guess what! transform is actually named a map. Its present in the standard library as List.map. You can use it in both of the ways we’ve defined above.
[
f a :: map ~f:f t;;
map lst;;
convert_string ;;
remember what we said about bad programming smells above. We just fell into that trap!
map string_of_int lst;;
This is the right way of defining it. Both of them work, but this is what separates the great from good. The idea is:
Factor out recurring code patterns, and reuse them, not re-write them