This post is part of the Summer Blog Backlog: Understanding and Using Shell. It has links and #comments about the history of shell.
The most important topic is Ken Thompson's paper on the first Unix shell. I quote it below and will refer to it in future posts.
It motivates the Perlis-Thompson Principle, an important idea in #software-architecture. This principle relates to both the Kubernetes-Multics analogy and the design of the Oil language.
I had been working on a shell for 5 years and had never seen this paper! (Though I have read The Unix Time-Sharing System (PDF) by Ritchie and Thompson many times.)
Apparently a scan of Thompson's paper was uploaded to the Internet Archive in February 2018, and susam circulated it last year. Thank you!
This paper was striking because of the "lectures" on software design at both the beginning and the end. From the introduction:
A program is generally exponentially complicated by the number of notions that it invents for itself. To reduce this complication to a minimum, you have to make the number of notions zero or one, which are two numbers that can be raised to any power without disturbing this concept. Since you cannot achieve much with zero notions, it is my belief that you should base systems on a single notion.
The "sermonette" in the conclusion:
Many familiar computing 'concepts' are missing from UNIX. Files have no records. There are no access methods. User programs contain no system buffers. There are no file types. These concepts fill a much-needed gap. I sincerely hope that when future systems are designed by manufacturers the value of some of these ingrained notions is reexamined. Like the politician and his 'common man', manufacturers have their 'average user'.
There is clunkiness in these quotes, but also profound truth with practical consequences. These ideas are difficult to explain, which is why I've been circling the issue for 6 months with posts tagged #software-architecture.
One way to explain them is with what I'm calling the Perlis-Thompson Principle. The definition will be something like this:
Consider using fewer concepts, data structures, and types in foundational software.
This style allows for more composition and ad hoc reuse. It evolves and scales more gracefully.
When introducing a new concept, define a way to reduce it to an existing concept.
This is a softer statement than the ones by Perlis and Thompson, but I think it captures the same fundamental truth. Importantly, this is a "soft" principle and tradeoff, not a hard rule.
The related Unix-y concepts I want to write about are:
My takeaway from this paper is that Thompson made intentional design decisions that contributed to the successful evolution of Unix. These decisions are under our noses and under our fingers every day, but we still misunderstand them.
My comment on text as a narrow waist was in response to a typical misunderstanding. The kernel does not need records or types. And they should be added to a shell with care. We don't want to break the compositional properties of shell.
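Here is a minimal sketch of what I mean by compositional properties, assuming a POSIX-style shell with the usual `sort`, `uniq`, and `awk`. The function names `words` and `top_words` and the sample data are hypothetical, made up for this illustration. The point is that none of these tools share a record type or schema; the only thing they agree on is lines of text.

```shell
# words is a hypothetical data source: one word per line.
words() {
  printf '%s\n' apple banana apple cherry banana apple
}

# top_words composes four unrelated tools over plain lines of text:
# group identical lines, count them, order by count, then swap the columns.
top_words() {
  words | sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

top_words
```

Each stage can be swapped out or reused in another pipeline without the others knowing. Adding records or types to the shell is fine as long as this kind of ad hoc reuse still works.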
I reached the stage where I felt that commands should be usable as building blocks for writing more commands, just like subroutine libraries. Hence, I wrote "RUNCOM", a sort of shell driving the execution of command scripts, with argument substitution. The tool became instantly most popular
I knew that the shell as a user space program didn't originate with Unix, but I wasn't familiar with this specific work by Louis Pouzin on the Multics shell.
Unix was both positively and negatively inspired by Multics, and that includes the shell. I recently got an e-mail from Multics engineer and historian Tom Van Vleck (regarding the Kubernetes-Multics analogy), and he notes these similarities:
- Both systems' text files are ASCII byte streams separated by NL characters.
- Both systems support I/O redirection. ([Bell Labs employees] designed much of the Multics I/O streaming system.)
- Both systems' command languages have options, active functions, pathnames, etc.
- Both systems have verb-modifiers-object syntax with items separated by spaces.
Shell pipelines, though, were notably a Unix invention. He gave an interesting reference to a 1987 design for adding Unix-like pipes to a Multics shell:
The appendix has more on this enlightening exchange.
The initial draft of this post started with fun #comments on shell history. But I ended up inserting serious comments regarding #software-architecture, because these historical references will play an important role in future posts.
We have to understand the past in order to build the future! Oil is not a retro-computing project :-)
As mentioned above, I got an e-mail from Tom Van Vleck about the Kubernetes-Multics analogy. I said that Kubernetes and Multics are both "serious, respectable, but overly complex" systems.
This is a coarse statement, and there is subtlety. Here's a list of Multics myths to balance the argument:
In particular, Multics had commercial customers for decades.
I knew that Multics pioneered the shell as a user program, and that it was an influential system in many other ways. See Fernando Corbato's 1990 Turing Award for a description of its innovations and achievements.
Nonetheless, it appears that Multics does not follow the Perlis-Thompson principle. And it doesn't use language-oriented composition — at least not to the extent that Unix does.
Again, I claim that these principles enable foundational software to scale and evolve for decades, and future posts will elaborate on this. We're still missing the Unix of distributed operating systems.
Here are recent comments I want to elaborate on:
The Bourne shell has the well-known problem of automatically splitting substitutions in unquoted words. I wrote about how we fix this in Oil Doesn't Require Quoting Everywhere.
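For readers who haven't run into it, here is a small sketch of the splitting behavior, assuming a POSIX-style shell with the default `IFS`. The helper `count_args` and the variable `file` are hypothetical names used only for this illustration.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

file='my file.txt'   # a value that contains a space

count_args $file     # unquoted: the value is split into 2 words
count_args "$file"   # quoted: the value stays 1 word
```

This prints 2 and then 1. The split happens after substitution, based on the contents of the variable, which is why it is so easy to miss in testing.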
whetu on reddit recently dug up some interesting history on word splitting, with references to old Multics and Unix manuals.
Though, to be honest, I can't tell from these quotes if word splitting originated from Multics or Unix. It's possible to require quotes around 'with spaces and | operator' without splitting unquoted substitutions.
Leave a comment if you know where "dynamic" word splitting originated!
This comment might be useful to people working on a hairy bash script. I would advise others to ignore it -- I didn't know this trivia before I started working on OSH!
    [ "$a" == "$b" ]   # string equality
    [ "$a" = "$b" ]    # string equality
    [[ $a == $b ]]     # fnmatch
    [[ $a = $b ]]      # fnmatch
    [[ $a =~ $b ]]     # regex match
    [[ $a -eq $b ]]    # numeric equality
    (( a == b ))       # numeric equality
    (( a = b ))        # numeric assignment!
It also gives color on the design of the Oil language. Oil gives you Python-like typed expressions instead of this cacophony of syntax. See The Simplest Explanation of Oil.
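Going back to the bash operators listed above, here is a short sketch that makes a few of them disagree. It assumes bash (the `[[` and `((` constructs are not POSIX), and the values of `a`, `b`, and `x` are hypothetical, chosen only to show the difference.

```shell
# Requires bash: [[ ... ]] and (( ... )) are not POSIX sh.
# Hypothetical values chosen so that the operators give different answers.
a='foo.py'
b='*.py'

# [ ... ] with = compares the two strings literally.
if [ "$a" = "$b" ]; then literal=yes; else literal=no; fi

# [[ ... ]] with an unquoted right-hand side treats it as a glob pattern.
if [[ $a == $b ]]; then glob=yes; else glob=no; fi

# Quoting the right-hand side inside [[ ... ]] forces a literal comparison.
if [[ $a == "$b" ]]; then quoted=yes; else quoted=no; fi

# (( x = 5 )) assigns instead of comparing; it succeeds because 5 is nonzero.
x=1
if (( x = 5 )); then assigned=yes; else assigned=no; fi

echo "literal=$literal glob=$glob quoted=$quoted assigned=$assigned x=$x"
```

This prints `literal=no glob=yes quoted=no assigned=yes x=5`. The same two strings are "equal" or "not equal" depending on which brackets and which quoting you happened to use, and the last line silently overwrites `x`.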
a1) syntax here caught my eye:
I didn't read this whole article, but it appears that shell's case syntax came from some variant of ALGOL.
This isn't surprising, because Steve Bourne, author of the Bourne Shell, invented a "Bournegol dialect" of C.
Related: 2009 Interview with Steve Bourne on Unix shell.