
Unix Shell: Philosophy, Design, and FAQs

2021-01-26

This post collects #comments on the philosophy and design of shell.

On the one hand, I want to give these big ideas a careful treatment. On the other, a dialogue can sometimes be the best way to explain something.

(Note: Why Create a New Unix Shell? addresses some of the same topics. This post may make more sense if you've read it first.)

Table of Contents
Recap
Philosophy
  Shell is a Language That Grows
  A Slogan / An Evolving Set of Tools
  The Lindy Effect / Designs That Have Lasted 50 Years
Design
  Language and Problem Diversity
  Shells Should "Shell Out"
  An Important Caveat
FAQs
  The Biggest Misconception: Shell XOR Python
  Situated Languages: PowerShell vs. Oil
  Transpiling to Bash?
Next

Recap

Let's first review where we are in this blog series. The last two posts were:

Philosophy

Shell is a Language That Grows

Here is a famous talk on language design by Guy Steele:

The top comment has a useful way of looking at shell:

Why do people swear by command-line invocations and bash? Bash is awful. But it's an environment that grows in a way that I've never seen with GUI apps. A small set of simple tools and a universal way to compose them is remarkably powerful ...

I agree, and this is the main point of Oil! We want to preserve the good parts of shell and a huge corpus of existing code.

Then we smoothly upgrade shell into a familiar and predictable language. It's not obvious that this is possible, but everything is working so far. Please try it and send feedback!

A Slogan / An Evolving Set of Tools

This comment has a related, pithy way to think about shell:

When you program in shell, gcc, git, pip, npm, markdown, rsync, diff, perf, strace, etc. are part of your "standard library".
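
To make the slogan concrete, here is a minimal sketch of a word-frequency counter. The function name top_words is my own for illustration; it isn't from the comment or from Oil. Every stage is an external program, and the shell only wires them together:

```sh
# top_words N: read text on stdin, print the N most frequent words.
# (Hypothetical helper for illustration only.)
top_words() {
  tr -cs 'A-Za-z' '\n' |   # split the input into one word per line
    tr 'A-Z' 'a-z' |       # normalize to lowercase
    sort |                 # bring identical words together
    uniq -c |              # count each run of identical words
    sort -rn |             # most frequent first
    head -n "$1"           # keep the top N
}
```

You'd call it like `top_words 10 < notes.txt`. None of tr, sort, uniq, or head is part of the shell language, yet the script reads as if they were library functions.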

A few decades ago this set of tools was different. For example, this 1986 article discusses pic and troff, which are rarely used today:

Aside: HTML also has this flavor. In the first couple decades, it embedded Flash and the JVM. Now it more often embeds mp4 videos, WebAssembly, and more. It's a language that grows.

The Lindy Effect / Designs That Have Lasted 50 Years

Here's another long term view of the shell language:

A related idea is the "Lindy Effect".

I’m working on a shell because it’s been around for 50 years now, so I expect it to be around in 50 more years.

There’s an unbroken chain from Thompson shell in 1970, to Bourne shell, to Korn shell, to bash, to Oil.

This viewpoint might seem self-centered, but it's clear that bash is the most popular shell today, and Oil is the most bash-compatible shell by a mile. Every release for over 3 years has included spec test results that show this.

There is wisdom you’re abandoning when you break things.

When foundations are stable, you can make progress on top. The alternative is what I call “motion without progress”.

I also think shell will be important in 50 years, not just extant. I'll explain this in future posts on distributed systems and what I call the "shell-YAML antipattern".

Design

Language and Problem Diversity

Here's a rant about diversity in programming. Summary: if a programmer doesn't use a particular language or feature, it doesn't mean that the problems it solves don't exist! Every language is domain specific.

I find this true and very common: programmers underestimate the diversity of software.

Shells Should "Shell Out"

Another quote from the same rant:

Some [alternative shells] don't even shell out conveniently to processes in a different language! In other words, the other languages are treated as "second class".

That defeats the whole purpose of shell and polyglot programming. The purpose of shell is to bridge diverse domains.

You could argue that a two-tiered design like "first class" PowerShell cmdlets and "second class" external processes is still useful, but I would argue against that.

That design creates problems of composition. I'll elaborate on this in future posts about O(M*N) problems.

Summary: Shell is a form of coarse-grained reuse. It's not always what you want, but having this option is invaluable from a systems perspective.

An Important Caveat

I want to acknowledge the downside mentioned in this comment from cle:

The Bash script actually has more dependencies, it relies on a number of external programs (ps, kill, mkdir, sort, ls, cp, echo). What versions are on your system? What versions on the systems of the people running the script?

This is the same issue I mentioned at the end of Shell Scripts Are Executable Documentation: scripts need a known environment. I think the best way to address this problem is with containers, which have become very popular in the last 5 years. I believe that Oil should have some notion of containers.
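
Short of containers, a script can at least fail early when a dependency is missing. Here's a minimal sketch; require_tools is a hypothetical helper, not something Oil provides. It relies only on the POSIX `command -v` builtin, and it checks presence, not versions, so it doesn't fully answer cle's question:

```sh
# require_tools NAME...: report every listed program that is not on PATH.
# Returns 0 if all are present, 1 otherwise.
# (Hypothetical helper; it says nothing about which versions are installed.)
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing dependency: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could start with `require_tools ps kill mkdir sort ls cp || exit 1`, which turns an obscure failure halfway through into a clear message at the top.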

Counterpoint: A recent bug reminded me that this caveat also applies to Python! The packaging tool itself is a dependency which can break. Our continuous build started failing because different versions of pip installed MyPy in /usr/local/bin/mypy or ~/.local/bin/mypy. Improving shell will also improve Python!
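
One way a script might cope with a tool that lands in different directories is to probe the candidates explicitly. This is only a sketch under that assumption, and not necessarily how we fixed our build; find_tool is a hypothetical name:

```sh
# find_tool NAME DIR...: print the first DIR/NAME that is an executable file.
# Returns 1 if none of the directories contains it.
# (Hypothetical helper for illustration.)
find_tool() {
  name=$1
  shift
  for dir in "$@"; do
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  return 1
}
```

For the bug above, that would look like `mypy=$(find_tool mypy /usr/local/bin "$HOME/.local/bin") || exit 1`.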

FAQs

This was the most popular post of the year:

Below are three replies to frequently asked questions.

The Biggest Misconception: Shell XOR Python

A debate about shell vs. Python seems to erupt in many threads about Oil. But this is a false dichotomy.

Excerpt from my comment:

I'm often asked this about Oil: Why do you want to write programs in shell?

That's not the idea of shell. The idea is that I write programs in Python, JavaScript, R, and C++ regularly, and about 10 different DSLs (SQL, HTML, etc.) And I work on systems written by others, consisting of even more languages ...

The Oil repo itself has dozens of examples of this. Most of those shell scripts invoke a Python program! That is working as intended.

Why not use only Python? Because I'd rather write 100 lines of shell + 100 lines of Python than 500 lines of Python. This is basically the Unix Philosophy, and I didn't understand it until after I'd been programming for many years.
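
As a tiny sketch of that division of labor (the name sum_field and the JSON-lines input format are my assumptions, not code from the Oil repo): the shell handles processes and pipes, and Python handles the structured data.

```sh
# sum_field FIELD: read one JSON object per line on stdin and print the sum
# of the numeric FIELD. Parsing is delegated to Python; the shell only
# starts the process and connects it to the pipeline.
# (Hypothetical helper; assumes python3 is on PATH.)
sum_field() {
  python3 -c '
import json, sys
field = sys.argv[1]
print(sum(json.loads(line)[field] for line in sys.stdin if line.strip()))
' "$1"
}
```

It composes like any other command, for example `grep -v '^#' events.jsonl | sum_field bytes`. Writing the JSON parsing in shell, or the pipeline plumbing in Python, would each take more code.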

It's also important to remember that Oil is designed for Python and JavaScript programmers who avoid shell. There seem to be a lot of you out there!

Situated Languages: PowerShell vs. Oil

PowerShell also comes up frequently. Excerpt of my comment on the difference:

PowerShell is natural on Windows, where the OS provides objects (either via the .NET VM, or COM and .DLLs, etc.)

A Unix shell like bash or Oil is natural on Unix, where the OS uses text files. And in distributed systems where data is JSON, YAML, XML, protobuf, msgpack, etc. not objects.

In other words, shell is a situated language. It interacts deeply with the operating system. A shell on Unix looks very different than a shell on Windows.

Three Comics For Understanding Unix Shell explains how a Unix shell is a thin layer over the kernel.

Porting shells between Windows and Unix is possible, but you lose a lot of value. This is why Microsoft developed WSL. It makes more sense to port an entire Linux distro to run on the Windows kernel, rather than just a shell! This is related to the caveat above about a script's "environment".

Other points:

This is OK! Shell is the language of diversity and heterogeneity. It's Unix-y to write a shell script that invokes a .NET VM running a PowerShell program :-)

Related: Notes on Postmodern Programming (PDF, 2002). For example, the section on "pervasive heterogeneity". TODO: There are a few papers like this that I'd like to comment on more thoroughly in the future.

Transpiling to Bash?

The question of whether Oil should be an interpreter or a compiler has also come up more than once. Excerpt from my response:

There are some things you can fix with a transpiler, and some things you can't.

...

You don't want to end up with the JavaScript problem: you have a dynamic language, but also a build process, which is basically the worst of both worlds.

Next

Whew! I've wanted to get this dense set of ideas out for a long time. Let me know if anything is unclear, and if you think a particular point should be expanded upon in a longer post.

I want to write about these topics next: