Maths for Hackers

This is the second in what I’m hoping will be a series of articles entitled ‘Maths for Hackers’. For a ‘manifesto’ of sorts, see the ‘About’ page.

I recommend reading the other article first, as this is a continuation of that work.

The Halting Problem

This is a very famous problem - probably the most famous thing that Alan Turing did, aside from the fabled Turing Test (we’ll get to that later).

The halting problem can be stated as follows:

The Halting Problem - we know that certain computer programs don’t HALT on certain inputs. Is there a computer program $\phi_{HALT}(P,x)$ that can decide, for any program $P$ and any given input $x$, whether $P$ will halt on $x$ or not?

Simple, right? Well, it’s actually quite a complex problem. But first, we need to deal with the hidden problem-elephant in the room - can one Turing machine simulate all the actions of another? Can we virtualise a Turing machine on a Turing machine? Is such a ‘yo dawg!’ moment really possible?

The Universal Turing Machine

This is where things get a little more interesting - so, here’s a tl;dr just in case: there exists at least one Universal Turing Machine (we’ll refer to it as a ‘UTM’) that takes two things as an input - a Turing program $P$, and an input $x$. This UTM will effectively run any program $P$ on input $x$, and the UTM’s behaviour will copy the behaviour of $P$: if $P(x)$ (read that as ‘$P$ on $x$’ - nothing to do with urination…) halts with output $y$, then $UTM(P,x)$ will halt with output $y$, and if $P(x)$ does not halt, then you guessed it, $UTM(P,x)$ will not halt.
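If that feels abstract, here’s a loose Python analogy (a sketch of my own - the names `utm`, `main` and `square_src` are made up, and `exec` is standing in for the tape-level simulation): a ‘universal’ function that treats another program $P$ as plain data and runs it on $x$, never returning if $P(x)$ never returns.

```python
# A loose analogy only: exec plays the role of the UTM's step-by-step
# simulation, treating the program P as data on the "tape".
def utm(P, x):
    env = {}
    exec(P, env)           # load the program text P
    return env['main'](x)  # run it on x; if P(x) loops forever, so do we

square_src = "def main(x):\n    return x * x"
print(utm(square_src, 7))  # → 49
```

The point is the shape of the thing: one fixed function, two pieces of data (a program and an input), behaviour identical to the program it’s fed.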

To all those who are slow clapping the ‘idiot mathematician teaching computerers how 2 du VMs’, let me just elaborate - before Turing, there was NO unified model of computing. ‘Computer’ meant a person, usually a woman (the WWII Bletchley Park Wrens who did the calculations were called ‘computers’), who sat doing calculations all day. The notion of a machine simulating another machine in full was as crazy to academics back then as Google’s AI that made its own crypto is to us now. We’ll look back at this in years to come and wonder how it was ever ‘inventive’ as a use of neural nets, I am quite sure.

Turing invented the virtual machine in 1936: 27 years before IBM built their IBM 360 emulators for backwards compatibility in 1963, and 30 years before the O-Code interpreters. And his genius does not end there…

But how do you prove that this can be done on a Turing Machine? Easy. You write a program that shuttles between the two input strings on the tape - the program $P$ and the input $x$. It seeks left to look up what $P$ says to do next, then seeks right and performs that action on the current form of $x$, and goes back and forth like this until the computation is done.

But what does this look like? Well, that’s actually the easy bit. We can add some symbols and states that $P$ doesn’t use but our UTM does. We will use these symbols to keep track of where we are in $P$ and $x$, and use special states to let us do this seeking back and forth. Remember, $P$’s states are now symbols so we can have a state for each symbol in $P$ as it is on the tape, and use this to track the internal state of $P$ in our computation.

If you’ve done anything with writing emulators or VM software, or even just tried your hand at qemu or vmware, this should all sound very familiar. That’s because it is, and it’s not a million miles away from how Turing originally proved the theorem.
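To make the seeking-back-and-forth concrete, here’s a minimal Turing machine simulator of my own (a toy sketch, not Turing’s original construction): the ‘machine’ being simulated is just data - a transition table - exactly as described above.

```python
def run_tm(program, tape, state='start', steps=1000):
    """Simulate a Turing machine given as a dict:
    (state, symbol) -> (symbol_to_write, move, new_state)."""
    tape = dict(enumerate(tape))  # sparse tape; '_' is the blank symbol
    head = 0
    for _ in range(steps):        # cap the steps: we can't decide halting!
        if state == 'halt':
            break
        sym = tape.get(head, '_')
        write, move, state = program[(state, sym)]
        tape[head] = write
        head += 1 if move == 'R' else -1
    return ''.join(tape.get(i, '_') for i in range(min(tape), max(tape) + 1))

# A tiny program-as-data: flip every bit, then halt on the first blank.
flipper = {
    ('start', '1'): ('0', 'R', 'start'),
    ('start', '0'): ('1', 'R', 'start'),
    ('start', '_'): ('_', 'R', 'halt'),
}
print(run_tm(flipper, '1011'))  # → 0100_
```

Note the `steps` cap: the simulator itself can’t promise to tell you whether the simulated machine ever halts - which is exactly where we’re headed.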

Magic.

But we still have to resolve the pesky issue of providing a…

Proof of the Halting Problem

Ok, so this is probably sounding a little intimidating, but if you got the proof in the previous section, you’ll get this no bother. It just takes a little bit of effort, but it’s well, well worth it!!

Ok, so let’s assume (because we’re mathematicians and we can because we’re cool and we really wanna go to the party and … Never mind) that the Halting Problem can be solved. We will show the opposite, but first we need to start somewhere. The easiest way is to assume it can be solved, and show that something weird/nonsensical/illogical-captain/contradictory happens.

So, if we can solve the Halting Problem, then we have a machine, call it ‘$Halt()$’, that takes 2 inputs $P$ and $x$ - and, thanks to the UTM, we know we can simulate all or part of $P$ acting on input $x$ if we have to (see why I introduced UTMs first?). But what will it do? Well, $Halt(P,x)$ will return true if $P(x) \downarrow$ (‘$P$ on $x$ halts’) and false otherwise.

COOL! So, we’re done, right? Answer: Sodium MonoBromate…

NaBrO…

Let’s make a counter program $foo()$ (like all good programmers) and have it do this:

def foo(P):
    # P is the source code of a program, passed in as a string
    if Halt(P, P):
        while True:  # loop forever
            pass
    else:
        return

So, if $Halt(P,P)$ is true, the function loops forever. If $Halt(P,P)$ is false - i.e. $P(P)$ doesn’t halt - then $foo()$ halts. $foo()$ is one contrariwise mofo.

Now, the following thought may have crossed your mind: So what?!?

Well, consider the following:

\[foo(foo)\]

…that is, run $foo$ with its own source code as its input.

If your reaction was WHHOOOOOOOAAAAAAAAAAAA!!!!!!! then come study Computability and let’s write some beautiful theorems together…

If not, let me explain. We’ve just created the computational equivalent of “This statement is false. True/false?”. Here’s how:

  • if $foo(foo)$ halts, then $Halt(foo,foo)$ said ‘true’, so by $foo$’s definition it loops forever…
    • So by this looping forever, $foo(foo)$ never halts…
      • So $Halt(foo,foo)$ said ‘false’, and $foo$ must halt here, as it always halts when told the input doesn’t halt…
        • Go back to the beginning…

This is an endless loop of contradictions. When a question falls into this trap, it is called undecidable - as in, no program can decide whether $foo(foo)$ halts or not.

So is $foo()$ broken? Well, no - look at it! It’s clearly a very simple program that is evidently computable (meaning ‘obvs… can totes be done on a computer, yo(lo)…’). So the problem isn’t with $foo()$ - it’s a lovely program.

This means, through logical deduction, that it must be $Halt()$ that is broken. If $Halt()$ can be so easily broken - if we can always create an instance where it gives the wrong answer, like we just did with $foo()$ - then the Halting Problem will always be inherently undecidable. We can always construct $foo()$-like functions to re-prove this theorem.

tl;dr - There can be no $Halt()$ program, because if $Halt()$ does what it says it does, then we can make it do the exact opposite. Therefore the very idea that $Halt()$ exists is false. There can never be any program that answers the Halting Problem.
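A toy way to see the tl;dr (my own illustration - `contrary` is a made-up helper, and no real $Halt()$ appears because none can exist): whatever verdict a candidate $Halt()$ gives about $foo$ run on itself, $foo$ does the opposite, so the verdict is wrong either way.

```python
def contrary(verdict):
    """What foo(foo) actually does, given Halt's verdict on foo(foo)."""
    return "loops forever" if verdict else "halts"

# Either answer a candidate Halt could give is contradicted:
print(contrary(True))   # Halt claimed "halts"        -> foo loops forever
print(contrary(False))  # Halt claimed "loops forever" -> foo halts
```

Two branches, both wrong: that’s the entire diagonal trick in miniature.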

But this isn’t the only major result in computability I want to introduce - there’s the rather weird and wonderful…

The Recursion Theorem

The ‘Recursion Theorem’ is a type of theorem called a ‘fixed point’ theorem. They’re ace. They tell you that, in any scenario of the kind described, some value stays put - a ‘fixed point’. What does that mean? Well, here’s an easy example:

The Mountain Path Fixed Point Theorem

So, to allay fears, this is not an allegory or rewrite of the recursion theorem - this is a completely different fixed point theorem! I want to use this example to show that fixed point theorems are a bit mad! This was initially taken from a Malcolm Stewart book of puzzles (they are totes amaze!). Here’s the conundrum:

A man leaves at 8am one day, and ascends a mountain to the very summit. He stays the night at the summit (meditating and such), and at 8am the next day he descends the mountain. Is there a time on both days where he is at the same height up the mountain?

Now, it’s fairly obvious that the man goes up and down the mountain at different rates - he is almost certain to descend a lot faster than he ascends, for example. But there is always a time, on both days, where he is at precisely the same height up the mountain.

Wat??

Yeah, that’s the brilliant thing about fixed point theorems - they demonstrate the existence (but not necessarily the construction) of values in our functions that have some fixed nature.

But is this true of Mountain Man? Yes. Draw a graph of the man’s height up the mountain on day one. Starting at 8am, it slowly works its way up until he reaches the summit. Now plot the same graph for the second day, starting at the summit at 8am and finishing at the bottom of the mountain at some point.

On this graph, the lines must cross… else the man didn’t ascend/descend the mountain all the way. At that point, time $t$ (running along the bottom) is the same, and the heights are the same, so we have our theorem! The same height up the mountain, at the same time on each day.
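The crossing-lines argument can even be played with numerically (a toy of my own, with made-up height profiles `ascent` and `descent`): subtract one curve from the other and bisect on the sign change that the graphs guarantee.

```python
def ascent(t):
    # day one: slow climb, reaching the summit (height 1.0) at t = 1.0
    return t ** 2

def descent(t):
    # day two: starts at the summit, gets to the bottom much faster
    return max(0.0, 1.0 - 2.0 * t)

def crossing_time(f, g, lo=0.0, hi=1.0, steps=60):
    """Bisect on f(t) - g(t), which changes sign between lo and hi."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(lo) - g(lo)) * (f(mid) - g(mid)) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

t = crossing_time(ascent, descent)
print(t, ascent(t), descent(t))  # same height at the same clock time
```

Whatever monotone profiles you pick, the difference starts negative (bottom vs summit) and ends positive (summit vs bottom), so the crossing is unavoidable - that’s the fixed point.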

Just magical!

The Recursion Theorem Statement

The Recursion Theorem is due to Kleene (that’s how mathematicians say “Kleene did this… blame him…”). What follows is the statement of a simplified form of the theorem - specifically, Rogers’ Fixed Point Theorem - taken from Barry Cooper’s book (Computability Theory).

Theorem: If $f$ is a computable function, then there exists some positive whole number $k$ for which

\[\phi_{f(k)} = \phi_{k}\]

on any input.

[NB - if the notation is unfamiliar, read part one of this series! M.]

Proof:

Well, this probably has people a little stumped as to what it says. Let me unpack it: it says that for any computable function $f$ (read: any Turing Machine) two things hold:

  1. Both $k$ and $f(k)$ are codes of computer programs.
  2. For at least one such $k$ (yes, there’s more than one), the program coded by $f(k)$ computes exactly the same function as the program coded by $k$.

mind_blown.gif

So how do you prove such a thing? Well, it’s actually quite easy.

Let’s build a weird little tool called $h(x)$. What does it do? Given $x$, it outputs the code of a new machine: one that, on any input $y$, first computes $\phi_{x}(x)$, and if this halts, goes on to compute $\phi_{\phi_{x}(x)}(y)$. In other words, $h$ is a computable function satisfying:

\[\phi_{h(x)} = \phi_{\phi_{x}(x)}\]

Now, $f(h(x))$ is a computable function of $x$, so there must be some $e$ such that $f(h(x)) = \phi_{e}(x)$ for each $x$ (think about it… it’s computable, remember? So there must be a number $e$ - a Godel number coding thingy - for this computation). Now set $x=e$ and realise that we get $f(h(e)) = \phi_{e}(e)$; popping both sides into the subscript (in helvetica, probably) and using the definition of $h$, this means that:

\[\phi_{f(h(e))} = \phi_{\phi_{e}(e)} = \phi_{h(e)}\]

…which is precisely what we’re after - take $k = h(e)$!! :-D $\square$

But what the hell does it mean?? Well, that’s actually up for debate. We know that the result holds, and we know that it holds for every computable $f$ that we choose to plug in. But beyond this, it’s a fairly mystical result.

Think about what it says (a useful question to ask of any mathematical statement): every computable function $f$ has some input $k$ for which the program coded by $f(k)$ behaves exactly like the program coded by $k$. Furthermore, this $k$ is the coding of some computer program itself. It states two very important and fascinating facts that are very useful to computability theorists…

Useful?!? Yes, the Recursion Theorem pops up a lot. So long as $f$ is computable, we can find the relevant input $k$ such that applying $f$ to it changes nothing about the program’s behaviour - we can use $k$ in place of $f(k)$. Veerrryyy handy, as you’ll hopefully come to see.

Tune in next time…

… for some bonus stuff aimed at those who are as excited about this as I am :P