In my first blog post, I explained why Oil is written in Python: so I have a chance of getting it done! I want to implement not just the bash-compatible OSH dialect, but also the Oil language, and that's a lot of work.
Bash alone is ~160K lines of C code, while OSH, which is nearly feature-complete, is ~16K lines of Python as of the last release.
Of course, there's a problem: Python is slower than C, and I wrote benchmarks to show that it matters. For example, the OSH parser is 40-50 times slower than the bash parser, even after some optimization.
So I'm now working on making it even faster and smaller. My plan involves OPy, a Python bytecode compiler written in Python.
This post shows what I've done with OPy, recaps what I wrote about it last year, and maps out future work. If you've implemented a VM, and especially if you've modified CPython, I'd love your feedback in the comments.
I've released Oil 0.5.alpha2, which you can download here:
It has the same features as OSH 0.4, but its bytecode is built with OPy.
OPy generates slightly different bytecode than CPython, but it appears that OSH is unaffected. For example, these benchmark results are roughly the same, at 6-7 lines/sec on a slow machine and 13-14 lines/sec on a fast machine:
(The 0.5.alpha1 release is built with the CPython bytecode compiler, like
all prior releases.)
However, the bytecode is larger:
I'm not sure why this is, but I'll look into it as I optimize for both size and speed.
oil/opy$ ./count.sh all
LEXER, PARSER GENERATOR, AND GRAMMAR
[ ... snip ... ]
579 pgen2/tokenize.py
827 pytree.py
2574 total
COMPILER2
[ ... snip ... ]
410 compiler2/symbols.py
764 compiler2/pyassem.py
1547 compiler2/pycodegen.py
1578 compiler2/transformer.py
4909 total
OPy is around 8,000 lines of Python code, which I consider small and malleable. This is why I think it's feasible to fork Python and optimize Oil.
Note that ~16K lines of Oil code and ~8K lines of OPy code is still a lot less than the ~160K lines of C code in bash.
Before explaining how I made this work, let's review what I wrote about OPy last year.
(A) The Riskiest Part of the Project. I listed six reasons why a shell shouldn't be a Python program. Two more reasons:
In addition to the fact that Python programs inherently allocate often, Python's garbage collector isn't "fork-friendly". Objects that are read-only at the Python level are mutated at the C level, in order to update their reference counts. This inhibits virtual memory page sharing. Ruby addressed this issue in 2012. It might not matter for some Python programs, but it matters for a shell.
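Here's a minimal sketch of why this matters (my own illustration, not code from Oil):

import sys

s = 'some string owned by the parent process'
print(sys.getrefcount(s))  # e.g. 2: one for 's', one for the call's argument

t = s                      # merely creating another reference ...
print(sys.getrefcount(s))  # ... increments the count stored in the object
                           # header itself, i.e. a write to its memory page

After a fork(), those header writes force the kernel to un-share pages under copy-on-write, even though the program never logically mutated anything.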
(B) Cobbling Together a Python Interpreter. I describe the components of a Python front end in Python: the tokenize module, the pgen2 parser generator (borrowed from the 2to3 conversion tool), and the compiler2 package.
(C) The OPy Front End is Working. I describe a couple of attempts to make these components work together. I abandoned Python 3 and ported Oil back to Python 2.
(D) OVM will be a Slice of the CPython VM. Rather than writing a small C or C++ VM to complement this front end, I decide to hack off a chunk of the Python interpreter and call it "OVM". This shortcut let me make the first release back in July.
(E) Rewriting Python's Build System From Scratch. Oil release binaries have two parts: native code for OVM, compiled from C, and bytecode compiled from the .py source code. This release compiles that source with the OPy bytecode compiler, rather than CPython's built-in compiler.
(F) How I Use Tests: Transforming OSH. In summary, the idea is to rely on tests in order to make big, mechanical changes to the code safely.
Also, it technically doesn't matter how fast the OPy compiler runs. I compile bytecode ahead of time rather than on-demand. This opens up more space for optimization.
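To illustrate, here's a minimal sketch of ahead-of-time bytecode compilation, using CPython's built-in compile() and marshal as stand-ins for OPy (the module name is hypothetical):

import marshal

# Build time: compile source to a code object and serialize it.
with open('osh_module.py') as f:             # hypothetical module
    code = compile(f.read(), 'osh_module.py', 'exec')
with open('osh_module.bytecode', 'wb') as f:
    marshal.dump(code, f)

# Run time: load and execute the bytecode; the compiler isn't needed.
with open('osh_module.bytecode', 'rb') as f:
    exec(marshal.load(f))

The real pipeline is more involved, but the division of labor is the same: the compiler runs at build time, so its speed never affects the shell's startup.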
(For those curious about details, the two appendices in this post may be interesting.)
Admittedly, this strategy is odd. I don't know of any other program that was almost unusably slow in its original implementation and was sped up only by writing a new compiler.
I was recently asked how I consistently get things done, and my answer may shed some light on this. Part of it was:
Use Python. Python lets me explore new problems quickly. If there were a C++ compiler in my edit-run cycle, many corners of the shell language would remain unexplored.
Being able to mold the language with metaprogramming was another unexpected benefit. I learned OCaml specifically to write compilers and interpreters, but I decided not to use it for Oil. In retrospect, I suspect this was a good decision. (We'll know more once I get further into OPy!)
Don't get stuck. I've made continuous progress for nearly two years, and this strategy of incrementally optimizing Oil also reduces the likelihood of getting stuck.
I'll also add: don't go backward. With tests, I have confidence making big changes, like completely changing the bytecode compiler. I know that the OPy compiler works because the spec tests for 0.5.alpha2 did not regress. The bottom of the page records the version:
$ _tmp/oil-tar-test/oil-0.5.alpha2/_bin/osh --version
Oil version 0.5.alpha2
Release Date: 2018-03-02 02:13:34+00:00
...
Bytecode: bytecode-opy.zip
So that's the reasoning. I'll also admit that I'd like to prove a point about high level languages vs. gobs of C++.
I was honestly surprised, though, by how slow the initial version turned out to be. Python is not a good language for writing efficient parsers, but perhaps OPy will be.
I had already done most of the work last year, and the main things I did in the last few weeks were:
- Building bytecode-opy.zip for oil instead of bytecode-cpython.zip.
- Running 2to3 --fix print to upgrade some of the Python 2 standard library.
I noted some differences between OPy and Python in the OPy README.md.
I have several dozen ideas for OPy. They fall roughly into these categories:
- The PYTHONHASHSEED change to Python 2.7.
These changes will lead to changes to OVM. For example, ASDL data structures can be represented more efficiently in memory. Unlike Python data types, ASDL types are statically declared.
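For example, here's a hypothetical sketch (not Oil's actual representation) of how a statically declared type can drop the per-instance dictionary that ordinary Python objects carry:

# Hypothetical ASDL declaration: token = (string val, int span_id)

class token(object):
    # The fields are known statically, so __slots__ can fix them at class
    # creation time: each instance stores two references and no __dict__.
    __slots__ = ('val', 'span_id')

    def __init__(self, val, span_id):
        self.val = val
        self.span_id = span_id

A compiler that reads the ASDL declarations could go further still, e.g. packing the fields into flat C structs.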
I released a version of Oil built with OPy, and showed benchmarks and metrics. Then I recapped what I wrote about OPy last year, and described recent progress.
It might take a long time to optimize Oil, but I have no doubt I'll learn a lot in the process. And I won't wait until it's fully optimized to release "carrots".
I've been asked these questions when I've written about OPy in the past.
Because I'm taking ownership of the code, Python 2 vs. Python 3 isn't a meaningful question from the user's point of view.
For those curious about the development process, Oil started off in Python 2, was ported to Python 3, then back to Python 2. (It was easy both times.)
Python 3 emphasizes Unicode strings, but in a shell, you almost never know what
the encoding of a string is. File system paths, argv, getenv(), stdin,
etc. are all bytes in Unix.
The bytes can of course be UTF-8-encoded. UTF-8 was designed to work with
many existing C functions like strstr(), rather than separate Unicode
versions.
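As a quick illustration, a byte-oriented search finds UTF-8 text correctly, because a valid UTF-8 sequence never matches the middle of another character's encoding. Here Python's find() stands in for strstr():

# -*- coding: utf-8 -*-
haystack = u'naïve café'.encode('utf-8')  # paths, argv, etc. arrive as bytes
needle = u'café'.encode('utf-8')

# Byte-level search, analogous to C's strstr().
print(haystack.find(needle))  # byte offset of the match, or -1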
This blog post discusses the issue of internal string encoding. It notes that Perl, Ruby, Go, and Rust use UTF-8 internally. Oil will follow that example, rather than the example of Python and bash, which use fixed-width multibyte characters.
This comment explains why manipulating UTF-8 text in memory is awkward with Python 3.
The other issues with Python 2 were:
I wasn't excited about PyPy, but I tried it anyway. OSH under PyPy is slower than OSH under CPython, not faster.
JIT speedups depend on the workload. My understanding is that string-heavy workloads are dominated by allocation, which the JIT doesn't touch. Even when it's faster, PyPy uses more memory than CPython, which is not a good tradeoff for a shell. My goal is for OPy to use less memory than CPython.
In summary, PyPy optimizes unmodified Python programs, which is very hard. In contrast, OPy optimizes just the subset of the language that Oil (and OPy itself) uses. I'm also free to change the semantics of the language, e.g. to make it more static.
Implementation trivia: OPy started from the same place that PyPy did. PyPy is also based on tokenize, pgen2, and compiler2. Writing a Python front end is a lot of work, so it's best to reuse existing code.
I didn't try Cython, but I don't see any evidence that it speeds up string-heavy workloads. I believe it also has the tradeoff of bloating the executable (which likely increases memory usage).