Oils 0.19.0 - Dicts, Procs, Funcs, and Places

2024-01-16

This is the latest version of Oils, a Unix shell. It's our upgrade path from bash to a better language and runtime:

Oils version 0.19.0 - Source tarballs and documentation.

We're moving toward the fast C++ implementation, so there are two tarballs:

The reference implementation in Python. See INSTALL.txt in oil-*.tar.gz.
The C++ translation. See README-native.txt in oils-for-unix-*.tar.gz.

If you're new to the project, see the Oils 2023 FAQ and posts tagged #FAQ.

Intro

This announcement should have happened weeks ago!

Version 0.19.0 was released on November 30th. And it contains almost 3 months of work — everything since version 0.18.0 in September.

What's happened lately? These blog posts may answer some of your questions:

Oils Winter Status Update (November)
Interactive Shell Screencasts (December)

In short, we've been deep in the nuts and bolts of YSH. It's been enhanced in fundamental ways, and this includes breaking changes, based on experience with the language.

This announcement is long, with many details and code samples. I hope it will make YSH less mysterious!

We also got a third grant from NLnet, and are looking for contributors. If the details in this post interest you, then you might be a good person to work on Oils.

Contributions

These contributions might give you a feel for the work we're doing, and where you can jump in. The codebase is more stable — taking its "final" form — though I still want to make the dev setup more portable.

Aidan Olsen:

Implement List => join()
- As of this release, we write pure methods as mylist => join() (fat arrow) and mutating methods as mylist->pop() (thin arrow). More on this below.
Implement Expression Literals ^[1 + 2]
- I mentioned this change in the appendix to the Oils Winter Status Update. It could be a good change to learn from. I also started a #help-wanted Zulip channel with more such changes.
Implement first cut of declarative arg parsing, in args.ysh
- This exposed some language holes, also mentioned in the Oils Winter Status Update
Implement read -N, with fixes to read -n
Fix crash when comparing non-comparable types
Fix build failure due to missing std::fmod

Melvin Walls:

Re-implement our garbage-collected Dict<K, V> as a real hash table!
- Prior to this release, it was an array of key-value pairs, with O(n) lookup. This was surprisingly OK for a while, but it's indeed slow for some CPU-bound workloads. The more common I/O-bound workloads don't appear to be affected.
- OSH has nearly doubled in speed on a synthetic Fibonacci benchmark. Details in the appendix.
- We're using Python's "Hettinger Dict", which conveniently required no changes to our GC. More on this below.
Optimize GC Rooting
- Split _Dispatch() into separate functions, so this hot function doesn't have so many roots.
- Unroll StackRoots bookkeeping, another speed increase.
Implement precise location info for errors during func calls

Ellen Potter:

Implement shopt -s nocasematch, used by Nix and others
Add spec tests for type -a - originally reported by Simon Michael
- Implementing spec tests is half the battle. I followed up with an implementation of type -a!
Add spec tests for GLOBIGNORE (not yet implemented)

The steady trickle of feedback continues to be useful:

#1731 - C++ tarball errors - reported by Peter Debelak. Surprisingly, duplicate arguments to tar create a tarball with errors, but only on some platforms.
#1759 - C++ build fails without GNU readline - reported by Andrey Andreyevich Bienkowski. Now fixed!
Great Prompt API feedback on #oil-help from Stephane Desarzens.
- This release contains a new prompt API, described below.

More acknowledgments:

Samuel Hierholzer, for great feedback on Zulip, including testing the YSH language design.
Simon Michael, for reporting issues based on real usage, and verifying the fixes.
bar-g for great testing and feedback.

You can also view the full changelog for Oils 0.19.0.

Ideas for Contributions and Feedback

Last year, a few people wanted to help implement the "standard library" for YSH.

Unfortunately, the code wasn't yet ready for that! We had to get rid of the "metacircular hack", and figure out a good style to implement builtin functions.

Now that this is done, please take a look at this list of Python-like Str, Int, Float, List, Dict methods, as well as free functions:

https://www.oilshell.org/release/0.19.0/doc/ref/toc-ysh.html#type-method

We need help with the ones with the red X! As usual, the first step is to write spec tests.

Tests alone are a big contribution, because they force design decisions. Usually we look at what Python and JavaScript do, e.g. with [].index() and Array.indexOf().

https://github.com/oilshell/oil/wiki/Contributing

We can also use feedback on all the changes below. In particular, the thin arrow -> vs. fat arrow => distinction is pretty unfamiliar, but I think it's justified. At least one person on Zulip likes it, but we can use more feedback.

#blog-ideas > How many YSH features don't appear in Shell, Python, or JS?

OSH

We're mostly working on YSH, but OSH still gets attention. Repeating some of the above, we implemented these bash features:

shopt -s nocasematch
read -N, with fixes to read -n
type -a (resulted in some nice refactoring)

Please test OSH on your shell scripts, and let us know what's missing or broken.

YSH

Now let's discuss the core of this release: big and breaking changes to YSH. If you want to refresh your memory about the language, these docs may help:

A Tour of YSH (long!)
YSH versus Shell Idioms

I've updated them for this release.

Procs and Funcs

The biggest feature is an overhaul of procs and funcs. We have a new doc, mentioned in the Winter Status Update:

Guide to Procs and Funcs

There's a big table of comparisons:

shell-like vs. Python like
I/O vs. computation
Exterior vs. Interior
Unix vs. math
...

And there's some practical advice: start with neither procs nor funcs. Then refactor to procs. Add funcs later if you need them.

Background

Why the big update to procs and funcs? Here's some background.

Until this year, YSH was called Oil, and it had a weak form of proc. The idea was to make a modest language that fixes the "warts" in shell. But:

This design was mismatched with our powerful GC data structures.
Users often ran into the issue of how to "return" a typed value.
Pure functions also have obvious use cases, like terminal escaping and HTML escaping.

In the summer and fall, Aidan and Melvin implemented func, and tested it by writing new functions in the standard library.

With this release, procs and funcs have become more powerful, and more consistent with each other, along all these dimensions:

Evaluation of actual args at the call site
- myproc word (x, named=42) and call f(x, named=42)
- Including splats ...pos and ...named
Evaluation of default args at the definition
- proc p(x='foo') and func f(x='foo')
- We disallow mutable default args, a well-known wart in Python
Binding args to params - for builtin procs and funcs
Binding args to params - for user-defined procs and funcs
Up to 4 kinds of args and params
1. Words that evaluate to strings
2. Positional-typed
3. Named-typed
4. A value.Command block
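
For readers who haven't hit the mutable default arg wart mentioned above, here is a minimal Python illustration. The function names append_bad and append_ok are made up for this sketch. It only shows the Python behavior that motivates the YSH restriction, not how YSH implements the check.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at definition time,
    # and is then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_ok(item, bucket=None):
    # Common workaround: use None as a sentinel and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)   # same list object as `first`, now holding both items
fresh_a = append_ok(1)
fresh_b = append_ok(2)   # independent list
```

Disallowing mutable defaults at the definition removes the need for the None sentinel idiom.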

So the language is now very rich! Procs and funcs match our GC data structures and data languages.

The design is largely motivated by the 16 use cases in Sketches of YSH Features (from June).

`&myvar` is a `value.Place`

A nice result of procs having typed params is that I got rid of 2 ugly special-case features.

Shell scripts can use dynamic scope to "return" values by mutating the locals of their caller. Bash goes further with declare -n "nameref" variables. The more minimal "Oil" tried to clean this up with:

Declared "out params" like proc p(:out) { ... }
A setref keyword

These are now gone in favor of value.Place, which is just another typed value. To create one, use an expression like &myline:

var myline           # optional declaration
my-read (&myline)    # call proc, passing it a Place
echo result=$myline  # => result=foo

The &myline should look familiar to C programmers, and possibly Rust programmers. To set a place, you use the setValue() method on the place:

proc my-read (; out_place) {
  call out_place->setValue('foo')
}

There could be a keyword like setplace, but I decided to keep the language simple for now.

You'll see more of value.Place in the section on read and json read. A motivating feature was to allow YSH users to write something like Bourne shell's read myvar.

In summary, value.Place generalizes these shell mechanisms:

Builtins like read and mapfile, which set "magic" variables.
Dynamic scope
- Procs disable dynamic scope because it's unfamiliar to Python and JS users. So we need an alternative.
declare -n aka nameref variables.
- Most shell users probably don't use namerefs, but I've seen them very often in foundational shell scripts.
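
One way to think about a place is as a reference to a named variable in a particular frame. The Python sketch below is only a mental model under that assumption: the Place class, its frame dict, and my_read are hypothetical names for illustration, not the Oils implementation.

```python
class Place:
    """Hypothetical model: a variable name plus the frame (dict) it lives in."""

    def __init__(self, frame, name):
        self.frame = frame   # the caller's local variables
        self.name = name

    def setValue(self, value):
        # Mutates the caller's variable explicitly, with no dynamic scope lookup.
        self.frame[self.name] = value


def my_read(out_place):
    # Analogous to: proc my-read (; out_place) { call out_place->setValue('foo') }
    out_place.setValue('foo')


caller_locals = {'myline': None}           # like: var myline
my_read(Place(caller_locals, 'myline'))    # like: my-read (&myline)
result = 'result=' + caller_locals['myline']
```

The callee never searches the call stack for a name. It can only write through the place it was handed.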

Rich `proc` call sites

The doc on procs and funcs shows that "simple commands" are now very rich. All of these are YSH commands:

cd /tmp

cd /tmp {
  echo $PWD
}

cd /tmp (myblock)

other-command ([42, 43], named=true)

other-command ([42, 43], named=false) {
  echo 'block arg'
}

This section describes related changes.

Breaking: `_` is now `call`

YSH has both commands and expressions, and _ was the expression evaluation "command":

var mylist = []
_ mylist->append('foo')  # method call, which is an expression

my-command append        # compare: shell-like command

I've changed it to a keyword call, which I think is more readable:

call mylist->append('foo')

(A discarded alternative was two colons, like :: mylist->append('foo') )

Procs have Lazy Arg Lists

We now have square brackets (shopt --set parse_bracket) to pass unevaluated expressions to procs:

ls8 /tmp | where [size > 10]  # if 'where' were a proc

The above is equivalent to passing a value.Expr quotation:

var cond = ^[size > 10]
ls8 /tmp | where (cond)  # one typed arg

This builds on top of Aidan's work implementing value.Expr, mentioned above:

var size = 42
var cond = ^[size > 10]
var result = evalExpr(cond)  # => true

Lazy arg lists aren't used much now, but I expect them to be common. In addition to filters on streams, they should allow assert [42 === x] to provide good error messages.

This subtle parsing took a couple tries, but I'm happy with the result!
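
As a rough analogy for readers who know Python: an unevaluated expression is something like a compiled code object that the callee evaluates later, against variables it chooses. The where function and the row dicts below are assumptions made up for this sketch, and YSH's evalExpr() does not necessarily work this way internally.

```python
# Compile the expression without evaluating it, a bit like the quotation ^[size > 10].
cond = compile('size > 10', '<cond>', 'eval')


def where(rows, cond):
    # Evaluate the stored expression once per row, with the row's fields as variables.
    return [row for row in rows if eval(cond, {}, row)]


rows = [
    {'name': 'a.txt', 'size': 5},
    {'name': 'b.txt', 'size': 42},
]
big = where(rows, cond)
```

The key point is that size is not looked up where the expression is written. It's looked up where the expression is evaluated.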

Unified block arg parsing

YSH commands that take a block literal can also take a value.Command object. These are now two syntaxes for the same thing:

cd /tmp {
  echo hi
}

var b = ^(echo hi)
cd /tmp (b)

So we have:

value.Command quotations ^(echo hi) - looks like shell's $(echo hi)
value.Expr quotations ^[size > 10] - looks like YSH $[size > 10]

The ^ forms won't be common in real YSH code, but they're useful for testing and metaprogramming. Usually, you'll pass literal expressions and blocks.

Fat arrow `=>`

Pure vs. Mutating Methods

In the summer, we settled on the thin arrow -> for method calls:

var last = mylist->pop()  # use the return value
call mylist->pop()        # throw away the return value

We now also accept => for methods, and I want to use it to distinguish pure methods that "transform" and methods that mutate.

This gotcha has always bugged me in Python:

mylist.sort()          # sort in place
mystr.strip()          # BAD: it throws away the result!
                       # Strings are immutable.

mystr = mystr.strip()  # probably what you meant

In other words, the same syntax is used for wildly different semantics. When I explain it to new programmers, I cringe a bit.

So I propose that in YSH, we have:

call mylist->sort()           # sort in place
var mystr = mystr => strip()  # transform

Right now -> and => are interchangeable, but I think we should enforce the distinction (and Samuel agreed). Feedback is welcome.

Free Function Chaining

Another thing that fell out pretty easily is using => to chain free functions.

Here's an excerpt from the commit that implemented this:

The expression obj => f attempts to create a value.BoundFunc, which you then call with obj => f().

First, do a method lookup on the type of obj.
- If it succeeds, we're done.
- If no method is found, look up a variable named f.
  - If it's found, and it's a BuiltinFunc or (user-defined) Func, we create a BoundFunc.
  - If there's no such variable, you get a normal "Undefined variable" error. (This error should be improved.)

So this behavior makes free functions chain like methods. An example from spec/ysh-methods.test.sh shows the benefit. If dictfunc() returns a dict with keys K1 and K2, then you could have written this code:

$ echo $[list(dictfunc()) => join('/') => upper()]
K1/K2

The new way is nicer and more consistent:

$ echo $[dictfunc() => list() => join('/') => upper()]
K1/K2

Because => can be used for both methods and free functions, it's like "uniform function call" syntax, which I've wanted for many years.
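
The lookup rule quoted from the commit can be modeled in a few lines of Python. This is a sketch under stated assumptions: the METHODS table, the fat_arrow helper, and the TypeError for a non-function variable are all hypothetical. Only the order of lookups (method first, then variable) comes from the description above.

```python
class BoundFunc:
    """A function paired with the object that becomes its first argument."""

    def __init__(self, obj, func):
        self.obj = obj
        self.func = func

    def __call__(self, *args):
        return self.func(self.obj, *args)


# Hypothetical method table, keyed by type.
METHODS = {
    list: {'join': lambda items, sep: sep.join(items)},
    str: {'upper': lambda s: s.upper()},
}


def fat_arrow(obj, name, variables):
    """Model of `obj => name`: method lookup first, then a variable holding a func."""
    method = METHODS.get(type(obj), {}).get(name)
    if method is not None:
        return BoundFunc(obj, method)
    if name not in variables:
        raise NameError('Undefined variable %r' % name)
    func = variables[name]
    if not callable(func):
        # Assumption: the source doesn't say what happens in this case.
        raise TypeError('%r is not a function' % name)
    return BoundFunc(obj, func)


variables = {'list': list}   # a free function, like list() in the example above
d = {'K1': 1, 'K2': 2}       # stands in for the result of dictfunc()

keys = fat_arrow(d, 'list', variables)()           # d => list()
joined = fat_arrow(keys, 'join', variables)('/')   # => join('/')
result = fat_arrow(joined, 'upper', variables)()   # => upper()
```

Here list is found as a variable because dicts have no such method in the table, while join and upper are found as methods. The chain reads left to right either way.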

Return Type Annotation

We also parse => in function return types, but these values aren't used yet:

func f(x Int) => List[Int] {
  return ([x, x + 1])  # parens required around expressions
}

Future Work

(1) We should probably enforce that funcs are really pure

I believe this can avoid the "function coloring" problem with proc vs. func, a design issue that I've written about. I think of these as Perlis-Thompson problems.
#language-design > func evaluator without redirects and $? - a pure evaluator should be faster.

(2) Clean up implementation of "closed" vs "open" procs

proc p () {  # closed, no params to bind
  echo
}
proc p {  # open, args are automatically bound
  echo
}

The difference can now be expressed with a rest param ...ARGV.

(3) Unify the runtime representation of value.LiteralBlock and value.Command.

(4) ARGV should be a regular variable, rather than using shell's separate "$@" stack.

New Prompt API - `func` and `value.IO`

The interactive shell and the YSH language are converging!

Now that we have functions, we can express a nicer prompt API than bash's $PS1, which has very "exciting" quoting rules:

$ PS1='\w\$ '          # custom PS1 language
$ PS1='$(echo \w)\$ '  # same thing, note single quotes
                       # and delayed $() evaluation

In contrast, YSH now uses a func that takes a value.IO instance. You can build up a plain old string, using methods like io->promptVal():

func renderPrompt(io) {
  var parts = []
  call parts->append(io->promptVal('w'))  # pass 'w' for \w
  call parts->append(io->promptVal('$'))  # pass '$' for \$
  call parts->append(' ')
  return (join(parts))
}

This is "normal code", and it should be better for complex prompts. But YSH still respects $PS1, so you can copy and paste from existing sources, or use that style if you prefer.

Help Topics:

More YSH Improvements and Breakages

Builtins

Several YSH builtins have been changed to use the new style of typed args to procs. These are all breaking changes.

`read` takes `value.Place`, with default var `_reply`

The read builtin has been simplified by optionally accepting a value.Place. There are now 2 ways to invoke it:

echo hi | read --line       # fill in _reply by default
echo reply=$_reply          # => reply=hi

echo hi | read --line (&x)  # fill in this Place, var x
echo x=$x                   # => x=hi

Likewise with the --all flag, which reads all of stdin:

echo hi | read --all
echo hi | read --all (&x)

(The --long-flag style lets you know that you're using YSH features.)

`json read` is consistent with `read`

The json builtin now follows the same convention:

echo {} | json read         # fill in _reply
echo {} | json read (&x)    # fill in this Place, var x

`append` builtin

The append builtin no longer takes an arg like :mylist. Instead, it simply takes a typed arg:

append README.md *.py (mylist)   # append strings to mylist

This is equivalent to calling methods on the value.List:

call mylist->append('README.md')
call mylist->append(glob('*.py'))

# Make it a nested list -- not possible with the command-style
call mylist->append(['typed', 'arg', 42])

`error` builtin

The syntax has been tweaked to reflect the new separation between word args and typed args. Old style:

error ("Couldn't find $filename", status=99)

The new style has a word arg, and an optional named arg:

error "Couldn't find $filename"
error "Couldn't find $filename" (status=99)

Method Name Changes

We're still tweaking the API names for consistency. There's a new YSH Style Guide as well.

snake_case() → capWords()
startswith() → startsWith()
strip() and family → trim() and family

I think this set of APIs:

trim()
trimLeft()       trimRight()
trimPrefix()     trimSuffix()

could be nicer than Python's:

strip()
lstrip()         rstrip()
removeprefix()   removesuffix()
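
One reason the Python names are easy to mix up: strip() and friends take a set of characters, while removeprefix() and removesuffix() take an exact string. The snippet below shows only the Python behavior (removesuffix() needs Python 3.9 or later). It makes no claim about what arguments the YSH methods accept.

```python
name = 'happy.py'

# rstrip() treats its argument as a set of characters: it keeps removing
# trailing '.', 'p', and 'y' until it hits a character outside the set.
stripped = name.rstrip('.py')

# removesuffix() removes the exact string '.py', at most once.
removed = name.removesuffix('.py')
```

Here stripped is 'ha' and removed is 'happy', which is the kind of surprise that clearer names can help avoid.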

Initializing and Setting Variables

`var` destructuring

You can now initialize multiple variables at once:

var flag, i = parseArgs(spec, ARGV)

I had disabled that feature because I thought it would be confusing, since it differs from JavaScript:

var x, y = 1, 2    # YSH
var x = 1, y = 2;  # JavaScript

But I think we can simply avoid that usage, writing this instead:

var x = 1
var y = 2

Implicit `null` initialization

Sometimes you want to initialize a variable after declaring it with var. Rather than

var x = null
echo hi | read --line (&x)

You can now leave off the right-hand side:

var x  # implicit null
echo hi | read --line (&x)

`const` must be at the top level

The YSH const keyword inherited its behavior from POSIX shell's readonly. This is a dynamic check, which works poorly in loops:

$ for x in 1 2; do readonly y=x; done
-bash: y: readonly variable

I decided that dynamic const is "weak sauce", and if anything, we should have a static const.

For now, we're de-emphasizing const, so it's illegal inside proc and func. You can only use var.

const can still be at the top level, since the dynamic check is still useful there: it can prevent source from clobbering variables. (We'll probably introduce namespaces / modules in the future, so that source doesn't have this pitfall.)

Thanks to Aidan for feedback on this.

The rest of augmented assignment

Previously we only had:

setvar x += 3

Now we have all of:

setvar x /= 2
setvar a[i] *= 3
setvar d.key -= 4

The augmented assignment operators are listed in the YSH Table of Contents under Assign Ops. (And now I notice a broken link to fix.)

Optional colon for type annotations

This is now valid syntax:

var x: Int = f()  # colon looks better

But again we don't do anything with the Int annotation. We may omit the colon in signatures, because they conflict with Julia-like semi-colons:

proc p (word; x Int, y Int; z Int) {  # no colons, a bit like Go
  echo hi
}

Compared with having both:

proc p (word; x: Int, y: Int; z: Int) {  # : and ; noisy?
  echo hi
}

Expressions

Eggex capture syntax is more explicit

This change came from using Egg expressions myself. It adds Python-like keywords, which I think makes capturing more readable.

Old syntax:

var pat = / <d+> /                   # positional capture

var pat = / <d+ : month> /           # named capture

New Syntax:

var pat = / <capture d+> /           # positional capture

var pat = / <capture d+ as month> /  # named capture

I also reserved syntax for type conversion functions, which are fully implemented in version 0.20.0 (the next release):

var pat = / <capture d+ as month: Int> /

This makes Eggex a bit like C's scanf()!

Range syntax is now `0 .. n`, not `0:n`

Originally I thought slices a[0:n] were like ranges 0:n, but they're different.

Slices are for indexing, and both end points are optional. A slice can only be evaluated relative to a Str or List.
Ranges are for iteration, and both end points are required. They stand alone.

Floats can't end with `.`

42. is no longer a valid float; an explicit 42.0 is required. This prevents ambiguity with ranges like 1..5.

Misc Fixes

Floats
- Handle overflow, i.e. Inf and -Inf
- Fix some crashes in C++
Fixed several parsing bugs related to how we switch from command mode to expression mode.
- For example, you can now write if(x > 0) (no space) in addition to if (x > 0), though it's not the recommended style.

Docs - New and Updated

I pointed out several docs in Oils Winter Status Update > Please Review These Docs.

I forgot to mention the nascent YSH Style Guide.
- YSH is a mix of shell, Python, and other languages, so it uses many naming styles. Feedback is welcome.
We're deep in the middle of creating and updating the Oils Reference, which now has around 500 topics.
I finally renamed "Oil" to YSH or "Oils" in all docs.
- Let me know if you notice any stragglers. We still need to rename the domain oilshell.org.

Designs That Took More Than One Try

While writing these notes, I noticed that we need iteration to get some features right.

This is a major reason YSH hasn't been fully documented: we need to try it first!

Here's a little retrospective:

setref and :out became value.Place in this release.
- I'm happy with how this turned out. It removed one item from the Warts doc. (And Oils 0.20.0 will remove the $'\n' wart.)
QSN will become J8 Notation.
- The problem was that QSN was inspired by C-like string syntax in bash and Rust, but it wasn't "harmonized" with JSON. Oils had two different string notations, which is bad.
- J8 Notation is fully compatible with JSON. I mentioned it back in June, and it's almost fully implemented in Oils 0.20.0.
Hay needs an update, with some breaking changes.
- Instead of the SHELL blocks, I think we can simply attach value.Proc to the Hay data structure.
- We should try to unify flag parsing specs with Hay.
"Tea" bootstrapping → "Yaks" (see below)

What other design issues are there?

Modules / Namespaces - we ran into this when implementing the standard library.
Objects and a meta-object protocol - a reflective language allows users to write libraries and tools (flag parsing, test frameworks, tree-shaking, etc.)

Related Zulip threads:

Performance / C++ / Under the Hood

Here are some details on the contributions in the first section.

Real Hash Table

As mentioned, Melvin implemented a real hash table, inspired by CPython's "Hettinger dict". Compared with the earlier Python dict, it's more compact in memory and preserves insertion order.

A primary motivation for YSH was to be able to round-trip JSON messages without shuffling the keys:

{"z": 99, "y": 42, "x": [3, 2, 1]}
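
To make the idea concrete, here is a small Python sketch of a compact, insertion-ordered hash table: a sparse index array that points into a dense array of entries. It is an illustration of the general technique only. The class name CompactDict, the linear probing, the 2/3 load factor, and the lack of deletion are choices made for this sketch, not a description of the Oils C++ code.

```python
class CompactDict:
    """Sketch of a compact dict: sparse index array + dense, ordered entries."""

    FREE = -1

    def __init__(self, index_len=8):
        # index_len must be a power of 2 so that masking can replace modulus.
        self.index = [self.FREE] * index_len   # slot -> position in entries
        self.entries = []                      # (hash, key, value), insertion order

    def _find_slot(self, key, h):
        # Linear probing over the sparse index.
        mask = len(self.index) - 1
        slot = h & mask
        while True:
            pos = self.index[slot]
            if pos == self.FREE:
                return slot, None              # key absent; slot is free
            entry_hash, entry_key, _ = self.entries[pos]
            if entry_hash == h and entry_key == key:
                return slot, pos               # key present at entries[pos]
            slot = (slot + 1) & mask

    def _grow(self):
        # Only the index is rebuilt; the entries array keeps its order.
        self.index = [self.FREE] * (len(self.index) * 2)
        for pos, (h, key, _) in enumerate(self.entries):
            slot, _ = self._find_slot(key, h)
            self.index[slot] = pos

    def set(self, key, value):
        h = hash(key)
        slot, pos = self._find_slot(key, h)
        if pos is not None:
            self.entries[pos] = (h, key, value)   # overwrite keeps the position
            return
        self.index[slot] = len(self.entries)
        self.entries.append((h, key, value))
        if len(self.entries) * 3 >= len(self.index) * 2:   # load factor under 2/3
            self._grow()

    def get(self, key):
        _, pos = self._find_slot(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def keys(self):
        return [key for _, key, _ in self.entries]


d = CompactDict()
for key, value in [('z', 99), ('y', 42), ('x', [3, 2, 1])]:
    d.set(key, value)
```

Because iteration walks the dense entries array, d.keys() comes back in insertion order (z, y, x), which is the property that keeps JSON keys from being shuffled.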

Some references we used:

[Python-Dev] More compact dictionaries with faster iteration (2012)
- http://code.activestate.com/recipes/578375/
Faster, more memory efficient and more ordered dictionaries on PyPy (2015)
Modern Dictionaries by Raymond Hettinger (YouTube, 2016)
#performance > "Compact Dict" Notes (our Zulip)

As a result of these optimizations, we're now beating bash on a couple cases of benchmarks/compute! I think this is pretty impressive, because our source language is typed and garbage-collected Python, while bash is written in C.

So I have more confidence we can be as fast as bash. It's not clear how much effort it will take, but it should be fun nonetheless :-)

GC Rooting details

After optimizing _Dispatch, Melvin added a report to flag functions with too many GC roots.
Unrolling StackRoots made the code faster, but made the executable much larger. This is a good tradeoff now.
- I think we still have some optimization left with a "hybrid rooting scheme".

More Progress

We're now running the perf tool in our Soil CI, and generating profiles automatically.
- It runs in a VM, not a container, because perf is tightly coupled to the Linux kernel version.
I tuned the List<T> growth policy — how big reallocations are.
I also tuned the Dict<K, V> growth policy, but the results weren't conclusive.
- We got rid of modulus % with power of 2 index_len_, but it didn't quite show up in wall time. In contrast, the hashing changes resulted in obvious improvements.
TODO: mylib.BufWriter() has some tuning work left, to avoid tiny allocations. We should add some micro-benchmarks.
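
The modulus change mentioned in the Dict<K, V> item above relies on a standard identity: when the table length is a power of 2, h % n equals h & (n - 1) for non-negative h. A quick Python check, with arbitrary sample values chosen for illustration:

```python
index_len = 16                 # a power of 2
mask = index_len - 1
hashes = [0, 7, 16, 12345, 2**40 + 3]

# For a power-of-2 length, masking picks the same slot as modulus.
same = all(h % index_len == h & mask for h in hashes)

# For a length that is not a power of 2, the trick does not hold.
differs = (13 % 12) != (13 & 11)
```

Both same and differs are true, which is why the table length has to stay a power of 2 for this to be valid.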

Code Cleanup - Removed Tea Experiment

A few years ago, I mentioned a "Tea" experiment for bootstrapping. I implemented a parser for Tea, reusing some of the "Oil" parser.

But this made the code more complex, and the parser now seems like the wrong place to start.

So I've deleted it, and started a #yaks experiment in a separate repo. Yaks is more about reusing the mycpp runtime in a "bottom-up" fashion, with an IR, rather than starting from a parser.

In any case, we no longer have this distraction in the code.

Summary

This was a huge release, with changes from September, October, and November!

I showed many code samples, and tried to justify each change. YSH is rapidly improving, but it's not done yet.

What's next? Oils 0.20.0 is well underway, with

A powerful and convenient Eggex API (balancing Perl and Python styles)
JSON and J8 Notation - fixing the JSON-Unix Mismatch
A few more breaking changes. But not as many as in this release!

Let me know what you think in the comments!

Appendix: Closed Issues

#1759	Str* raw_input(Str*): Assertion `0' failed
#1758	Implement command -V (POSIX compatibility)
#1732	Crash When Comparing Functions (and Other Values)
#1731	Oils 0.18.0 tarball gives errors when extracting with bsdtar
#1727	Error building 0.18.0 on MacOS: std::fmod not found
#1702	[breaking] Change _ prefix to 'call' keyword
#1289	append builtin can take typed args
#1112	Design for Python-like functions in Oil
#1024	Implement binding of typed params to procs
#957	Implement setvar x -= 1
#770	Support read -N, etc.
#498	Provide a prompt hook in bin/ysh
#259	type builtin doesn't handle -p/P/a

Appendix: Metrics for the 0.19.0 Release

These metrics help me keep track of the project. Let's compare this release with the previous one, version 0.18.0.

Spec Tests

OSH passes more tests due to the features mentioned above.

It also fails more tests, because at least one of the newly tested features is still unimplemented. But remember that adding failing spec tests is half the battle!

OSH spec tests for 0.18.0: 2100 tests, 1869 passing, 90 failing
OSH spec tests for 0.19.0: 2135 tests, 1893 passing, 99 failing

You can write Python, and everything "just works" in C++:

C++ spec tests for 0.18.0 - 1870 of 1872 passing
C++ spec tests for 0.19.0 - 1894 of 1896 passing

New YSH behavior is reflected in the spec tests:

YSH spec tests for 0.18.0: 630 tests, 571 passing, 59 failing
YSH spec tests for 0.19.0: 661 tests, 616 passing, 45 failing

Some of the new behavior doesn't work in C++, largely due to JSON. This has already been fixed in Oils 0.20.0!

YSH C++ spec tests for 0.18.0: 492 of 569 passing, delta 77
YSH C++ spec tests for 0.19.0: 526 of 616 passing, delta 90

Benchmarks

The parser is more efficient, I think due to the growth policy:

Parser Performance for 0.18.0: 16.3 thousand irefs per line
Parser Performance for 0.19.0: 15.7 thousand irefs per line

Small reduction in memory usage:

benchmarks/gc for 0.18.0: parse.configure-coreutils 1.83 M objects comprising 65.0 MB, max RSS 69.3 MB
benchmarks/gc for 0.19.0: parse.configure-coreutils 1.81 M objects comprising 63.9 MB, max RSS 68.8 MB

Huge speedup on Fibonacci due to Melvin's work on Dict<K, V> and GC rooting:

benchmarks/gc-cachegrind for 0.18.0 - fib takes 65.4 million irefs, mut+alloc+free+gc
benchmarks/gc-cachegrind for 0.19.0 - fib takes 33.1 million irefs, mut+alloc+free+gc
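
As a quick check on "nearly doubled", the ratio of the two numbers above works out as follows. Note that these are instruction counts from cachegrind, not wall-clock times, and the variable names are just for this calculation.

```python
irefs_0_18 = 65.4   # million irefs for fib in 0.18.0
irefs_0_19 = 33.1   # million irefs for fib in 0.19.0

speedup = irefs_0_18 / irefs_0_19          # how many times fewer instructions
reduction = 1 - irefs_0_19 / irefs_0_18    # fraction of instructions eliminated
```

That's roughly a 1.98x improvement, or about 49% fewer instructions.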

I/O bound workloads remain the same speed. But we still have to figure out the delta with bash here:

Runtime Performance for 0.18.0: 13.5 and 17.2 seconds running CPython's configure
Runtime Performance for 0.19.0: 13.4 and 17.1 seconds running CPython's configure
bash: 11.6 and 14.5 seconds running CPython's configure

Code Size

I improved the accounting of lines between OSH and YSH, which means that OSH went down in size:

cloc for 0.18.0: 21,025 significant lines of Python and C, 416 lines of ASDL
cloc for 0.19.0: 20,809 significant lines of Python and C, 442 lines of ASDL
- We're now measuring 4,681 significant lines in YSH. In Oils 0.12.0, I broke the codebase down further into three metrics: OSH, YSH, and data languages.

There's a bit more code in the oils-for-unix C++ tarball, much of which is generated:

oil-cpp for 0.18.0 - 104,155 physical lines
oil-cpp for 0.19.0 - 107,543 physical lines

The compiled binary got much bigger due to inlining GC rooting. This is the tradeoff for the speed increases above:

ovm-build for 0.18.0: 1.70 MB of native code (under GCC, on Debian 12)
ovm-build for 0.19.0: 1.90 MB of native code (under GCC, on Debian 12)

As mentioned, I have an idea for a "hybrid rooting scheme" to make the code both smaller and faster.