Maths for Hackers - Enumeration and Insight
Bonus Material
This bit is like the extra bits on a DVD, but really, it’s just more maths. For those who care, WHICH SHOULD BE ALL OF YOU but I appreciated that
- Maths is not everyone’s cup of tea
- If this material is new, the following stuff may be too far down the rabbit hole
- People genuinely don’t care about this stuff
The last point is the only one I have an issue with - Turing asked the question ‘is the Halting problem the hardest problem for computers?’ If yes, we have a discovered the limit of what we can do computationally, which is a profound statement! If not, then we can always build on our computational complexity indefinitely.
This is important for AI work - AI engineers just don’t understand the notion of ‘information exchange’ when writing a program. I’ll explain more on this at the end.
Tl;dr - computation and computability of problems means we have to be sensitive and understand not just the what, but also the where about the information we are working with. This is a key, and oft misunderstood point in computer science, that computability theorists know so well!
Enumerable vs. Computable functions
We have so far talked only about computable sets - that is, a set (think ‘bag of things’) where we can decide what is in/out of the set (bag) using a Turing Machine. We can ask ‘Is 53 in set $A$?’ by asking $\phi_e(x)$ and getting a ‘1’ if it is in $A$ and a ‘0’ if not. The name for such a computer function is the Characteristic Function of the set, in case you see it in books/papers/wikipedia.
But there is an idea called ‘Computably Enumerable’ that is somewhat different from this - a set (bag) $A$ is enumerable if there is a computer program $\phi_e$ where if we ask it $\phi_e(294)$ then we will get the 294th member of $A$.
We can write this more ‘mathemtically’ by saying that we can represent $A$ as $A = \{ a_1, a_2, a_3, \ldots \}$ (this is actually the usual way of notating sets, with $\{$ and $\}$ brackets). Here, $a_1$ is the first member of $A$, $a_2$ the second member, and so on. If a set $A$ is enumerable, then we can use the computer program that gives its members to define the set as $A =\{ \phi_e(1), \phi_e(2), \phi_e(3), \ldots \}$.
But what use is this? Well, this gives us yet another way to talk about functions. The idea we want to show is that if a set it computable, then it is enumerable. This may not seem immediately doable, so let’s run through it.
Let $f()$ be the function that gives ‘yes/no’ as to whether the number we input is in our set, and let $g()$ be the function that gives you the member of our set at position $n$ (so $f(24)$ gives you the 24th member of our set). For ease, we’ll call our set $A$.
First, we want to get $g()$ from $f()$; so we have our $f$ and we construct a program like so:
Hopefully this is quite clear as to how it works - we run through all the numbers (positive integers) and every time we get a ‘hit’ from $f()$ (that is, $f$ returns a 1) we add one to a counter. When the counter equals the input number, then we output the number we got to, and we have made $g$ from $f$.
This is actually a theorem - and we just did a proof. WAT? yeah, really. We had an idea, we showed that it is consistent and always works, and then we demonstrated it, and now we have a proof. That’s maths, yo…
But we can also show something else that’s interesting. Take my little program above, only on line 6, replace if (f(numbers) == 1):
with if (f(numbers) == 0):
. What have we got now? Well we’ve made a version of $g()$ that outputs every element that isn’t in $A$. That is, they are in the bag of everything not in $A$ - we call this the compliment of $A$ (although, only desribing everything you’re not doesn’t seem that complimentary to me, but hey, it’s the word we use…), and we write this as $\bar{A}$
So what have we found here? Well, we can always do this action, so we’ve found that if we have a computer program that decides if a given number is in/out of $A$, then we always have a computer program that describes $\bar{A}$ (not-in-$A$). This is very useful - think of it the opposite way; as if we can describe everything not in a set with a computer program, we automatically get a program for everything in that set. So if describing $A$ is hard, maybe describe $\bar{A}$ instead, and you get the program for $A$ automatically! ;-)
Bonus Theorem
Theorem: If $A$ and $B$ are enumerable, then so is $A \cup B$.
The proof of this is quite easy - suppose $f_A$ gives all members of $A$, and likewise for $f_B$. We can run each program concurrently to get $A = \{ f_A(1), f_B(1), f_A(2), f_B(2), \ldots \}$, and remove any duplicates along the way. Thus, $A \cup B$ is enumerable.
Oracle Turing Machines - solving the Halting problem
I’ve read some articles (not going to embarrass people by pointing htem out) who have said “Turing Solved the Halting Problem!” (look in some old issues of New Scientist, for example). He didn’t. What Post, Kleene, Turing et al. showed was two things:
- The Halting Problem is enumerable
- The Halting problem isn’t the hardest problem for computers
The first one is quite straigtforward to see (the formal version can get a bit involved, so I’ll do it with words and ideas); you make a Super Universal Turing Machine that runs every program on every input. (If you aren’t convinced this can be done, just realise that every computer program has a number, $e$, and so we find every number, find every input, and run every input on every number as $\phi_e(x)$.
We then wait for things to Halt. As soon as something halts, we clear it’s computation from the tape, and leave the answer in its place. We’ll never get the full answer to the Halting Problem without infinite time spent on it, but we will get so many answers that we can use in a reasonable amount of time (within a few lifetimes). As such, the Halting Problem is enumerable.
But you showed it couldn’t be solved??
Ah yes, but this is a key idea, all computable functions are enumerable, but the converse is not true - not all enumerable functions are computable. The Halting Problem is not computable, but we can, in theory, with enough time, work out initial pieces of it. We can never have a computer program that just answers “Does a computer program $P$ halt on input $x$?”, but we can get answers for so many $P$’s on so many $x$’s (NB - $P$’ing on an $x$ is not a nice thing to do.)
Side note - we showed that the Halting Problem is undecideable. This argument shows that the Halting Probem is unsolveable; The argument for this is currently beyond the scope of this article.
Glass Ceiling
What Turing in particular wanted to know was this - is the Halting Problem it? If we solve it do we get all the computable things??
tl;dr No.
Turing asked the question - suppose we just ‘answered’ the Halting Problem. Just suppose we did. And we then wrote this answer to an infinite tape and gave that to our Turing Machines as an second tape. Kind of a ‘cheat sheet’ - like the ‘oracle at Delphi’ in ancient legends, it has answers, and we don’t care where it got them from.
This is the construction for Oracle Turing Machines - Turing Machines that have a second tape with answers to some problem already written there. But what use is this?
Well, Turing managed to show that there are, in fact, problems that are HARDER than the Halting Problem. ‘Harder’ is a very subjective term - dancing like I’ve not just OD’d on Ritalin is very hard for me, whereas getting Domain Admin through a website flaw is something I find very easy. (Don’t worry, it’s my day job… I’m a pentester remember?)
But which problems?
Well, the easiest one is $Fin$. We first need to introduce two ideas around functions - each function $f$ has two pieces, the domain, or, set of things that go in, and the range, or, set of things that come out. If $f(x)$ always equals one, then the range is just $range(f) = \{ 1 \}$, for example. With computable sets, we say that the domain of a computer program is the set of values $W_e$ such that any value $x \in W_e$ is a value on which $\phi_e(x) \downarrow$ (that is, $P_e$ halts on input $x$).
$Fin = \{ e | W_e \text{ is finite}\}$
What is this mystical notation? Well, this is a standard set prototype - we say that $Fin$ is every $e$ such that the set of values for which $e$ halts, $W_e$ is finite. The argument is too long to put here, and we’d need a lot more mathematical ‘machinery’ in order to do a full argument, but this is one of a small group of problems shown to be problems that the Halting Problem is ‘weaker’ than.
Turing Degrees and Information
I’m not going to even define the Turing Degrees, but just siffice to say that where you have a hierarchy such as I’ve described, mathematicians call these ‘degrees’. That is to say, the sets of all the sets (usually called ‘classes’) which have some equivalence under the halting problem (see Turing Reducibility on google) form a degree.
Intuitively, they all share some level of information that allows them to be reducible to the Halting Problem in some way. There are many ways to show this, as you can probably start to imagine.
The key point to all of this is as follows - computation and computability of problems means we have to be sensitive and understand not just the what, but also the where about the information we are working with. This is a key, and oft misunderstood point in computer science, that computability theorists know so well!
Halting Problem and Artificial Intelligence
This is the next article, wherein all that I’ve talked about will show a fundamental flaw in the way most people understand AI.