2016-12-27

I'm wary of using a fancy term to describe the method Veedrac and I used to find solutions to the enum-labeling problem, but I think it's accurate and potentially useful.

What does it mean? *Bespoke* means "custom-made" or "made-to-order", usually
with respect to clothing, but it's also descriptive of software. For example,
in an interview and in the book *Coders at Work*, Donald Knuth argues against
reusable code and in favor of bespoke data structures and libraries:

> I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit.

He's referring to the practice of copying code and adapting it to fit the application.

In the same vein, bespoke code generators are useful and common, but "reusable" code generators like ANTLR or yacc haven't lived up to their promise. I'll return to this topic in the future, but for now here are examples:

- Python has a bespoke ASDL implementation. (What is ASDL?)
- Oil now has one too. The generated code is different, and the schema language will also be different once I enhance it for Oil's requirements.
- The `pgen.c` parser generator in the Python source tree.
- The Lemon parser generator in the sqlite source tree.

*Superoptimization* is a technique for generating short code sequences using
brute-force search. It has the property that the optimized code may use a
**different algorithm** than the original. In our case, we turned a lookup
table into an expression of bitwise operators.

So a **bespoke superoptimizer** is one written for a specific application,
rather than one that's part of a compiler. Exploiting application-specific
knowledge can make a problem more tractable.

As mentioned in the problem description, `osh` now has 221 token types, which
broke my scheme of using 4 bits for the token `Kind` and 4 bits for the `Id`.

I switched to an explicit lookup table, but in the back of my mind I suspected
there might be a fancy trick to "compress" it into a mathematical expression.
We just want to look up an `enum Kind` value using an `enum Id` value, and we
don't care about the concrete integer values.
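To make the old scheme concrete, here's a minimal sketch of enum stuffing with 4 bits each. The helper names and constants are mine, not from the `osh` source:

```python
KIND_BITS = 4  # at most 16 Kinds, and at most 16 Ids per Kind

def MakeId(kind, index):
  # Stuff the Kind into the high bits of the Id's integer label.
  assert 0 <= kind < 16 and 0 <= index < 16
  return (kind << KIND_BITS) | index

def LookupKind(id_):
  # The "lookup table" is free: recovering the Kind is a single shift.
  return id_ >> KIND_BITS
```

Once some `Kind` has more than 16 `Id`s (or there are more than 16 `Kind`s), this packing breaks, which is what happened at 221 token types.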

This thread on bitwise operator tricks popped up on Reddit, and I thought it would be a good audience for my question. Veedrac quickly surprised me with two elegant solutions.

I was honestly puzzled by them, so I replicated his results. Here is the `func_search.py` program I used to do it. I've left it messy and slow so you don't have any illusions about how principled this process was.

I asked Veedrac about his thought process, and he mentioned an intuition about
the formula `x ^ -x`. This was important for getting started, but I don't
believe it was necessary in the end. Instead, what he taught me is that the
problem is less constrained than I thought.

Paradoxically, the key insight is that there's **no insight** necessary to
solve this problem. I thought that a mathematical expression could be derived
using some trick or theory like binary decision diagrams, but we didn't
need anything of the sort.

Not only can the problem be solved by brute force, a solution can be found in
a few seconds using **unoptimized, single-threaded Python**. Here is the
search strategy I replicated:

- Generate random functions of `id` using a few bitwise operators and a few single-byte constants:
  - The bitwise operators are selected at random and hard-coded, e.g. `f4_5()`.
  - The constants are exhaustively or semi-exhaustively searched, e.g. `Form4()`.
- Evaluate every candidate function on all inputs, 0 to 255, and make a histogram of output values using `Hist()`.
- Test if the output values match the desired distribution using `ScoreFunction()`.
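As a sketch of that loop: the expression shape below is my reconstruction, chosen to match the solution reported later (it may differ from the real `f4_5()`), and the constants are searched semi-exhaustively:

```python
import itertools
from collections import Counter

def f4_5(id_, i, j, k):
  # One hard-coded shape of bitwise expression over id; i, j, k are the
  # single-byte constants to be searched. (My guess at the shape, based on
  # the solution found below -- not the original code.)
  return (-id_ + i) & (id_ ^ j) & (id_ + k)

def Hist(func, i, j, k):
  # Histogram of output values over all 256 inputs.
  return Counter(func(id_, i, j, k) for id_ in range(256))

# Semi-exhaustive search over small constants for this one function form.
best = None
for i, j, k in itertools.product(range(16), repeat=3):
  hist = Hist(f4_5, i, j, k)
  # A candidate needs at least 21 distinct output values, all in [0, 255].
  if len(hist) >= 21 and all(0 <= v <= 255 for v in hist):
    best = (i, j, k)  # score it against the desired distribution here
```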

`ScoreFunction()` does three things:

- Throw out functions that don't have at least 21 distinct output values (the kinds).
- Throw out functions that give values outside the range [0, 255]. (This is too conservative, but it doesn't matter in the end.)
- Calculate the number of groups that are too small, i.e. a "deficit" to be fixed up as in the first two solutions here. If there are zero deficits, then the solution is **exact**, like the last two solutions.

```
def ScoreFunction(hist):
  # hist: list of (output value, frequency) pairs for one candidate function
  func_range_size = len(hist)
  if func_range_size < 21:  # need at least 21 distinct outputs, one per Kind
    return
  dist = [freq for _, freq in hist]
  func_range = [result for result, _ in hist]
  if any(v > 255 for v in func_range):
    return
  if any(v < 0 for v in func_range):
    return
  num_deficits = NumDeficits(dist, DESIRED_DIST)
  if num_deficits <= 1:
    print('Kind that are too small: %d' % num_deficits)
    d = Deficit(DESIRED_DIST, dist)
    Show('func range', func_range)
    Show('want dist', DESIRED_DIST)
    Show('have dist', dist)
    Show('deficit', d)
  return num_deficits
```
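`NumDeficits()` and `Deficit()` aren't shown here; a plausible reconstruction (my sketch, assuming both distributions are sorted in decreasing order, consistent with the output below) is:

```python
def Deficit(want, have):
  # How far short each output group falls of the desired Kind size
  # (0 when the group is already big enough).
  return [max(0, w - h) for w, h in zip(want, have)]

def NumDeficits(have, want):
  # Number of Kinds whose output group is too small.
  return sum(1 for d in Deficit(want, have) if d > 0)
```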

Using an arbitrary guess of the function form `f4_5()`, this program finds an
exact solution in less than 3 seconds. The search space I chose has only
2^15 = 32,768 functions.

```
$ time ./func_search.py search
...
Kinds that don't fit: 0
func range 0 1 32 3 8 9 11 33 35 4 64 19 20 36 67 65 128 68 129 131 132
want dist  43 25 20 16 15 13 12 12 11 10 10 8 8 4 4 3 3 1 1 1 1
have dist  45 27 20 19 16 16 16 12 12 10 10 8 8 8 6 6 5 4 3 3 2
deficit    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
num deficits 0, i = 12, j = 6, k = 14
ABORT: Found exact solution

real    0m2.805s
user    0m2.799s
sys     0m0.004s
```

The solution found is:

```
kind = (-id + 12) & (id ^ 6) & (id + 14)
```
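You can check this directly with plain Python integers, as the search did. Grouping the 256 `Id` values by output value shows one group per `Kind`:

```python
def KindFromId(id_):
  # The found expression, evaluated with plain Python integers.
  return (-id_ + 12) & (id_ ^ 6) & (id_ + 14)

# Group the 256 Ids by output value; each distinct output names one Kind.
groups = {}
for id_ in range(256):
  groups.setdefault(KindFromId(id_), []).append(id_)

assert len(groups) == 21                   # one group per Kind
assert all(0 <= k <= 255 for k in groups)  # all outputs in range
```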

The `func range` row shows every unique value that the function yields over the
integers 0 to 255. The second row is the distribution we want, and the third
row is the distribution we get.

Because this solution is exact, the last row is all zeros. We have a group of
output values big enough for every `Kind` we desire.

I formulated a problem in the `osh` parser called function-directed enum
labeling. Small instances can be trivially solved with what I call
**enum stuffing**: using the high bits of integer labels to store a lookup
table over the enum, thus storing it for free.

Veedrac from Reddit solved a harder instance of this problem with semi-exhaustive search. I reproduced his results with `func_search.py`, and published a runnable solution in `ik7.cc`. The shortest solution so far is:

```
kind = 175 & id & ((id ^ 173) + 11)
```
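The same kind of check applies to this expression. Note the outer `175` mask keeps every result in range (since the post states this is a valid solution, it must yield at least 21 distinct groups):

```python
def KindFromId(id_):
  # The shorter expression; (id ^ 173) + 11 is always positive for
  # id in [0, 255], and the 175 mask bounds the result.
  return 175 & id_ & ((id_ ^ 173) + 11)

outputs = {KindFromId(id_) for id_ in range(256)}
assert all(0 <= v <= 175 for v in outputs)  # bounded by the mask
assert len(outputs) >= 21                   # enough groups for every Kind
```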

So if you're very stingy about bits, you can use this **bespoke
superoptimization** technique to replace lookup tables with short functions
over relabeled enums. To introduce the term, I mentioned the idea of bespoke
data structures and bespoke code generators.

Tomorrow I will ask some questions that came out of this experience. I can't quite tell now whether it was merely a fun diversion, akin to recreational mathematics, or if it can be used to appreciably optimize real software.

Stochastic Superoptimization by John Regehr has good background material; it discusses speeding up the search with randomization and potential future directions for superoptimization.

1992 Synthesis OS -- An efficient Unix-like OS written in assembly, which uses some exotic techniques including runtime code generation. By the author of the first paper on superoptimization.

The ryg blog -- A cool blog I found while Googling for superoptimization. In retrospect, it shouldn't be surprising that demo scene programmers use techniques like this. Not many programmers are interested in removing 221-byte lookup tables. :)