Why Sponsor Oils? | blog | oilshell.org
This post is a continuation of Summer Blog Backlog: Distributed Systems, published 10 days ago. It reviews Oil blog posts on the cloud and #distributed-systems.
These posts sketch arguments, without much detail, but I like to ensure that each post has a central message. The bold claim from the previous post was:
Kubernetes is our generation's Multics. A better design would have fewer concepts and be more composable, following the Perlis-Thompson Principle.
Likewise, this post also has a claim:
A distributed OS can — and should — be made of shell scripts.
If that sounds crazy, read on for details!
Here are excerpts from older posts, which I've now tagged #distributed-systems.
On an experimental project I worked on before Oil:
I came away with the belief that a distributed OS should just be a pile of hypothetical "shell scripts".
I want to again tip my hat to the Heroku-inspired Dokku. It apparently evolved from literal shell scripts into a very capable project! (I think it manages a single node rather than many, but that's a surprisingly big part of the problem.)
Pipelines of MapReduce jobs are not unlike shell scripts. Maybe they can literally be shell scripts.
Again, there have been efforts along these lines, so it's not hypothetical.
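As a minimal single-machine sketch of that resemblance, here is a word count written as a map, shuffle, and reduce pipeline. The function names (`map_words`, `reduce_counts`, `wordcount`) are hypothetical and chosen only for illustration; they are not taken from any of the projects mentioned here. In this sketch `sort` plays the role of the shuffle, which is also where the single-machine limitation shows up.

```sh
#!/bin/sh
# Hypothetical names for illustration: map_words, reduce_counts, wordcount.

# "Map": emit one lowercase word per line.
map_words() {
  tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]'
}

# "Reduce": the input is sorted, so equal keys are adjacent.
# Count each run and print "word<TAB>count"; NF == 2 skips blank keys.
reduce_counts() {
  uniq -c | awk 'NF == 2 { print $2 "\t" $1 }'
}

# "Shuffle" is just sort: it groups identical keys together.
wordcount() {
  map_words | sort | reduce_counts
}

printf 'the cat\nThe dog\n' | wordcount
```

Each stage is an ordinary process reading stdin and writing stdout, so in principle any stage could be swapped for a program in another language without touching the rest.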
There's also very recent work in this direction in addition to PaSH and POSH. I just read the 2020 paper PB&J: Easy Automation of Data Science/Machine Learning Workflows, mentioned in the HotOS Notes last month.
I really like the code comparisons between their distributed shell utilities (P, PU, B, etc.) and Swift/T, Apache Airflow, Beam, Spark, etc. They also mention the limitation of a single machine "reduction" vs. the MapReduce framework.
The conclusion mentions that a "JSON shell" would remove some limitations of the framework. Well that's exactly what Oil is :-)
(Unfortunately, the paper isn't freely available; I e-mailed the authors directly and got a copy. I look forward to the open source release of the code!)
I gave a well-received presentation on Oil, but this material was cut for lack of time.
Slogans to Explain the Project:
Shell should be the language for describing the architecture of distributed systems
A distributed system is a bunch of heterogeneous processes and ports.
I also say that we should solve the Shell-in-{YAML,Docker,systemd,Ruby,Python} problem, which I again mentioned in June's post on the Oil language. I want Oil to solve this problem by adding the missing declarative part to shell.
I commented on my experience porting Oil's continuous build to multiple cloud providers with a shell program called "Toil". I described this programming style as distributed shell scripting with concretions, not abstractions.
I'm not sure if this is the right slogan, since Rich Hickey uses the term concretion to mean something bad: an inflexible typed wrapper for data like the Java HTTP Request class.
In contrast, I'm using concretion to mean something good: data that's not wrapped at all! Instead, it's expressed in a versionless interchange format like JSON or TSV.
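To make "data that's not wrapped at all" a bit more concrete, here is a small sketch under stated assumptions. The service table, its columns (name, host, port), and the helper names `services_tsv` and `ports_on` are all invented for illustration. The point is only that plain TSV can be consumed by unrelated tools without a shared type or library.

```sh
#!/bin/sh
# Hypothetical data for illustration: name <TAB> host <TAB> port
services_tsv() {
  printf 'web\thost1\t8080\n'
  printf 'db\thost2\t5432\n'
}

# One consumer: print the ports of all services on a given host.
ports_on() {
  awk -F '\t' -v h="$1" '$2 == h { print $3 }'
}

# Two unrelated tools read the same unwrapped data.
services_tsv | cut -f 1        # service names
services_tsv | ports_on host2  # ports on host2
```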
I'm drafting these ideas on the Slogans, Fallacies, and Concepts wiki page, discussed in a recent post. More possible concept names: Distributed Shell Scripts and Parasitic Shell Scripts. These names are meant to get at the idea that shell can be an independent control plane, while cloud providers are the dumb data plane.
Excerpt:
CI services are surprisingly general distributed systems. They're very much like operating systems — with storage, computation, and users. They're concerned with performance (scheduling, scale), security (authentication, trust), and running heterogeneous software.
The end of this post linked to an article relevant to containers. It shows how the new cloud platform fly.io "deconstructs" OCI images (standardized Docker images) with shell and some hacked-up Go.
Docker without Docker (fly.io via Hacker News)
689 points, 210 comments - 3 months ago
This idea goes back earlier, to 2015:
Show HN: Bocker – Docker implemented in 100 lines of bash (github.com/p8952 via Hacker News)
662 points, 87 comments - on July 21, 2015
Here are a couple #comments on container tooling.
My comment on Joyent Manta's Solaris Containers (circa 2012?) on
Papers I love: gg (buttondown.email via lobste.rs)
39 points, 9 comments on 2020-12-06
I think they just had a fixed Zone that resembled the host Zone. It was a canned set of packages.
... I just tried NearlyFreeSpeech, which uses FreeBSD jails, and it’s kind of similar.
Solaris supported containers long before Linux, and they were used in cloud products like Joyent's Manta. But they weren't as flexible as Linux containers. As usual, Linux proceeds by evolution rather than design, but you can often "rescue" something good from the mess.
A comment on Docker's shortcomings:
My Comment on
It’s Time to Say Goodbye to Docker (towardsdatascience.com via lobste.rs)
41 points, 59 comments on 2020-10-15
There are several criticisms here. A critical one is that Docker's design doesn't follow the Unix philosophy. It's code-centric, but Unix is data-centric. The Open Container Initiative is a step toward making containers data-centric. I still need to understand and use these alternative tools — there are many of them like crun and bubblewrap.
Further down the thread: Building containers from scratch pre-Docker was one of the primary motivations for Oil.
I used to have the same question as you… but then I tried to build containers from scratch, which was one of the primary motivations for Oil.
... the short answer is "Do Linux From Scratch and see how much work it is".
The posts above state that:
It's not too far from that to another bold claim:
A distributed OS can — and should — be made of shell scripts!
A shell coordinates processes, and a distributed system is literally a bunch of processes running on a bunch of computers (with few exceptions). This is true both at build time and runtime.
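Here is a hedged sketch of what "coordinating processes on a bunch of computers" can look like in plain shell: start one process per host, then wait for all of them. The function `run_on_all` and the `RUNNER` variable are hypothetical names introduced for this example. The default runner is `ssh`; the demo at the bottom swaps in `echo` as a dry run so that nothing touches the network, and the host names are made up.

```sh
#!/bin/sh
# run_on_all CMD HOST...
# Starts one background process per host, then waits for every one.
# Returns 0 only if all of them succeeded.
# RUNNER is a hypothetical hook for this sketch; it defaults to ssh.
run_on_all() {
  cmd=$1
  shift
  fail=0
  pids=
  for host in "$@"; do
    # stdin comes from /dev/null so background jobs don't compete for it
    ${RUNNER:-ssh} "$host" "$cmd" </dev/null &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || fail=1
  done
  return "$fail"
}

# Dry run: echo stands in for ssh, so this prints one line per host
# (in no guaranteed order) instead of connecting anywhere.
RUNNER=echo
run_on_all 'uptime' web1 web2
```

The shell contributes only the coordination: fan out, wait, and collect exit codes. What runs on each host can be written in any language.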
On the other hand, I can see why this won't be an appealing slogan:
Both of these views are simultaneously true. I hope Oil can change the connotation of what a "shell script" is. I'm looking for a style of software that's drastically simpler.
Here's part of an e-mail I sent to Stephen Kell that adds color on this "distributed OS as shell scripts" idea. It's based on experiences at Google and my experimental "PaaS" / OS project:
One metaphor I use is "plants vs. animals". Animals are your big iron written in C++, like the index servers that serve up posting lists, the batch jobs that process images from maps and satellites, etc.
And then "plants" is everything else -- the dev tools like build and test tools, and the production tools like configuration and monitoring.
Based on my experience, the plants have a lot more bearing on the health of distributed systems than the animals. Not only do they sit at the connection points, but their "biomass" is greater.
The animals are using more "known" software engineering techniques (threads, SIMD, etc.), while the plants are sort of neglected and crappy. One reason they're crappy is that they use crappy languages! Google uses a haphazard mix of Python and neglected custom DSLs for that purpose, sort of like the open source ecosystem uses a neglected mix of shell, make, awk, m4/autoconf, Perl, etc. (i.e. terrible textual metaprogramming)
Recall that one of the slogans I mentioned was Old Unix Sludge vs. New Sludge: make/awk/m4 vs. YAML/Go templates. The experience of porting Toil to GitHub Actions was another exercise in "YAML programming".
I've sketched some arguments around the shell and #distributed-systems. Let me know if they made sense!
Now I want to get back to comments on #software-architecture, especially The Perlis-Thompson Principle. It has practical implications for the design of both languages and systems.
The appendix has a few more #comments on the cloud. Tackling these problems is still in the future, but the continuous build work is a natural gateway into it.
My Comment on
Using Github Issues as a Hugo frontend with Github Actions and Netlify (shazow.net via lobste.rs)
24 points, 14 comments on 2020-12-02
Shell scripts can reuse entire cloud services! Shell is the language of ad hoc reuse.
I mentioned this comment in yesterday's section on Fallacies.
My comment on
On the merits of low hanging fruit. (/r/ProgrammingLanguages)
67 points, 49 comments - 02 Jun 2021
The bigger the distributed system, the more heterogeneous the code [is] ...
It's a fallacy / language design mistake to assume that you "own the world". More likely is that the program written in your language is just a small part of a bigger system.
Tweet I referenced:
Why is it that people get so into linting and type-checking within services, while they're okay letting latent dumpster fires burn across services? 😱🔥
— ⚡️ Jean Yang ⚡️ (@jeanqasaur) May 18, 2021
Shell is a "lowest common denominator" language: it combines programs written in languages with incompatible type systems.