blog | oilshell.org
oil prototype, I used Python 3 rather than Python 2 to take advantage
of both type annotations and type checking with mypy. The
mypy type system seems fairly similar to the C++ type system, so these
annotations should help when porting the parser, whether by hand or
Type annotations would also describe the all-important AST more clearly. Though I lean on the side of dynamic languages, I admit that prior to type annotations, Python offered you essentially no help in the task of structuring and describing an AST. It looks particularly bad if you compare it to ML-stye languages.
Unfortunately, before mypy provided any value, I hit a snag:
$ mypy osh/lex.py osh/lex.py:51: error: "Id" has no attribute "Lit_EscapedChar" osh/lex.py:52: error: "Id" has no attribute "Ignored_LineCont" osh/lex.py:60: error: "Id" has no attribute "VSub_Name" ... more than 100 errors ...
This is because the core/id_kind.py file, which I wrote about
yesterday as the backbone of the AST, uses metaprogramming.
Specifically, it's a simple form of dynamic code generation: construct a
variable name as a string, and then call
setattr(Id, id_name, id_value)
setattr(Kind, kind_name, kind_value) to attach constant values to an
I use this technique not only for consistency and readability, but also:
[[operator tokens with operand types: e.g.
-ztakes a string;
-ftakes a path
I like that Python lets me do these things dynamically, using Python as
its own meta-language. Unfortunately,
mypy restricts Python and removes
It appears that I can silence these errors by generating mypy stubs. But I would prefer a solution with a better user experience, i.e. one that doesn't require modifying a working program purely to satisfy the type checker.
I found an interesting mechanism in Bob Nystrom's blog post Type Checking a Dynamic Language. In it, he writes:
Magpie is primarily a dynamic language, so it follows in Python’s footsteps. There is no special definition/statement dichotomy, and classes are created, extended, and modified imperatively at runtime.
Then he describes the evaluation model like this:
Once the interpreter has finished evaluating the scripts, it looks in the global scope to see if you’ve defined a function called
main(). Doing so is the trigger that says, "I want to type-check."
In other words, code run at the top level is dynamic metaprogramming, and
then code called from
main() is statically checked.
I like this approach a lot, and as far as I know, it's novel. I can also see other use cases for it. A few years ago, I wrote a toy protobuf library in Python that uses metaprogramming instead of textual code generation, and it would benefit from exactly this model.
I speculate that it would also be useful for typed data analysis in the style of the R language, where the names of fields and their types may come from data, and thus can't be known at "compile-time".
Shell is neither statically- nor dynamically-typed because it's untyped. To a first approximation, everything is a string.
Although oil borrows its semantics from shell, it will have at least these types:
Str-- a string
Array-- because arrays of strings are fundamental to the problem domain
Int-- because bash uses arithmetic expressions in nine places
Bool-- because the boolean expression language
[[is part of the shell language
And because it has the philosophy of catching errors early (static parsing), optional type checking is a natural fit.
In summary, this post showed a downside of static checking: state of the art type checkers can't understand an elementary metaprogramming pattern without help. I hope to use the idea from Magpie in the design of the oil language itself.