
Summer Blog Topics

I went to the Recurse Center so I could talk to people about computers in person, rather than on the Internet. That's why I haven't written here lately.

But I've gotten several ideas for blog posts this summer, which I sketch here. Let me know which ones sound interesting.

Lexing: Theory vs. Practice

Last December, I started a series on #lexing. Before leaving for RC, I wrote a long draft of everything I wanted to say, but didn't publish it.

I got good feedback on this draft from RC alumnus Jacquin, so I'm motivated to polish and publish it.

Phrases to Explain the Oil Project

Not everyone understood the project, but certain phrases seemed to resonate.

"Make is harder to learn than C"

More than one person at RC was relieved / amused by this observation.

Here, "Make" is shorthand for crappy 70's-style, everything-is-a-string glue languages. Shell, autoconf (m4), and CMake are also in this category.

Make is defined in terms of shell, so if you don't know shell, then you don't know Make.

I've seen programmers write "shell scripts" in C++ because they know C++ and not shell. Hopefully Oil can address this inversion.

"Failure by Success"

I'm fascinated by the evolution and adoption of software, and this pithy phrase explains a lot. Examples:

Unix is so popular that we're stuck with shell. We're bound by decisions made 40 years ago. Not only is Unix on essentially every cloud server, it's also on phones and "IoT" devices. Mac OS became Unix in 2001; Windows is currently evolving into Unix.

Python is so good for quickly writing large applications that many billion dollar companies run on it. Dropbox and Instagram are now adding types to their Python code. I think it's important to understand that this doesn't imply they should have started with types.

Excel is so productive that many small businesses run on it. They may not even need "programmers". I had a great conversation about Excel with Tal at RC. I learned that the Excel expression language and Visual Basic are entirely different but overlapping languages which coexist in the same runtime. This reminds me of Shell, Awk, and Make.


I've long thought of shell, PHP, and R as crappy languages that get a lot of work done. I've just added Excel to that mental box :-)

On the other hand, Python is a good language which gets a lot of work done! That's why Oil is written in Python.

Five-Minute Talks

Every Thursday, there are 5-minute talks at Recurse Center. I'd like to turn the talks I gave into blog posts, but they were very demo-based, which doesn't easily translate into prose. However, feel free to ask me for details in the comments.

What is xargs -P and when is it useful?

For productionizing data science and research.

For running tests in parallel.

As a quick-and-dirty way to speed up an arbitrary command.
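
A rough sketch of the idea, not the demo from the talk: xargs -P N runs up to N copies of a command at once, each on a different input item. The Python wrapper below is only for illustration. The helper name run_parallel and the echo command are assumptions, and it relies on an xargs that supports -0 and -P (the GNU and BSD versions do).

```python
import shutil
import subprocess


def run_parallel(items, command, max_procs=4):
    """Run `command ITEM` once per item, up to max_procs at a time.

    Hypothetical helper for illustration; it just shells out to xargs.
    """
    xargs = shutil.which("xargs")
    if xargs is None:
        raise RuntimeError("xargs not found on PATH")

    # NUL-delimit the items so spaces and quotes inside them are safe.
    stdin_data = "".join(item + "\0" for item in items)

    result = subprocess.run(
        # -0: NUL-delimited input; -n 1: one item per invocation;
        # -P N: run up to N invocations in parallel.
        [xargs, "-0", "-n", "1", "-P", str(max_procs)] + list(command),
        input=stdin_data,
        capture_output=True,
        text=True,
        check=True,
    )
    # Parallel jobs finish in arbitrary order, so sort for stable output.
    return sorted(result.stdout.splitlines())


if __name__ == "__main__":
    print(run_parallel(["a.txt", "b.txt", "c.txt"], ["echo", "processing"]))
```

The shell equivalent is roughly: printf '%s\0' a.txt b.txt c.txt | xargs -0 -n 1 -P 4 echo processing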

Three Tips on using Regular Expressions

Write unit tests.

Use ERE syntax: grep -E, sed --regexp-extended, awk without flags.

Use re.VERBOSE in Python and write comments.
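
To make the tips about unit tests and re.VERBOSE concrete, here is a minimal Python sketch. The date pattern and the parse_date helper are made up for illustration; the point is the commented re.VERBOSE layout and the small tests that sit next to it.

```python
import re

# Hypothetical example pattern: an ISO-style date such as 2018-07-04.
# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
DATE_RE = re.compile(
    r"""
    (?P<year>  \d{4} )  # four-digit year
    -
    (?P<month> \d{2} )  # two-digit month
    -
    (?P<day>   \d{2} )  # two-digit day
    """,
    re.VERBOSE,
)


def parse_date(s):
    """Return (year, month, day) as ints, or None if s doesn't match."""
    m = DATE_RE.fullmatch(s)
    if m is None:
        return None
    return int(m.group("year")), int(m.group("month")), int(m.group("day"))


def test_parse_date():
    # Tests pin down what the regex accepts and, just as importantly, rejects.
    assert parse_date("2018-07-04") == (2018, 7, 4)
    assert parse_date("2018-7-4") is None      # month and day too short
    assert parse_date("x2018-07-04") is None   # leading junk


if __name__ == "__main__":
    test_parse_date()
    print("ok")
```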

Theory vs. Practice: Regular Languages vs. "Regexes"

I prepared material for this longer talk, but haven't given it yet. It's partly a distillation of Russ Cox's articles, which I think need "Cliff Notes", but I have additional material on re2c, which I use in Oil.

(Confusingly, re2 and re2c are entirely separate projects. The former is a regex interpreter and the latter is a regex compiler.)

Core Concepts of Unicode Demonstrated

The difference between bytes and code points.

Declaring the encoding in HTML, HTTP, and Python source files.

UTF-8 is the best encoding. Why?
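
A minimal Python 3 sketch of the first point, bytes vs. code points. The sample string and the helper names are arbitrary choices for illustration. The last check shows one property often cited in UTF-8's favor (ASCII text is encoded unchanged); that is an assumption about the answer to "Why?", not a summary of the talk.

```python
# Spells "hello" with an e-acute, then a space, then two CJK characters.
# Written with escapes so this source file stays pure ASCII.
TEXT = "h\u00e9llo \u4e16\u754c"


def count_code_points(text):
    # A Python 3 str is a sequence of code points.
    return len(text)


def count_bytes(text, encoding="utf-8"):
    # Encoding turns code points into bytes; the count depends on the encoding.
    return len(text.encode(encoding))


if __name__ == "__main__":
    print(count_code_points(TEXT))         # 8 code points
    print(count_bytes(TEXT, "utf-8"))      # 13 bytes: 1 + 2 + 3 + 1 + 3 + 3
    print(count_bytes(TEXT, "utf-16-le"))  # 16 bytes: 2 per code point here

    # ASCII text is byte-for-byte identical when encoded as UTF-8.
    print("hello".encode("utf-8") == "hello".encode("ascii"))  # True
```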

Transliterating a Subset of JavaScript to C

See the README in my javascript-vs-c repo.

Mandelbrot performance benchmark: JavaScript is as fast as C in this constrained case.

A tip for learning C: treat it like Python or Ruby. Write shell script wrappers and use ASAN.

Book Reviews

Philosophy of Systems

Thinking in Systems vs. The Systems Bible. This came out of a good conversation with Venkatesh. These books cover roughly the same subject, but the first takes a positive viewpoint and the second a negative one:

Modelling systems to find leverage points, vs.

Systems actively thwart their operators' intentions. They are rarely used for their intended purpose, and they always operate in a semi-degraded mode.

Books on Implementing Programming Languages

The Recurse Center has a small but surprisingly good library. I'm familiar with four out of eight compiler/language books, and I may be able to orient programmers who are starting to learn about this topic.

The Dream Machine

The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal is also in the RC library.

I read it earlier this year, and it was one of the best computer history books I've read. (However, it's too detailed to be a good intro; it's more for people who have already read some computer history.)

I should publish this mini-review I posted on lobste.rs, and perhaps elaborate on it.

Next

Let me know which blog topics sound interesting. I plan to write the posts on lexing and at least one book review, but I'm not sure about the others. This post may have been enough.

The next post will announce OSH 0.5, a release with many new contributors!