Type-safeness in Shell

Ever since writing the post on a hypothetical hull language as an alternative to shell, I have not been able to stop thinking about the shortcomings of shell.

And one thing that comes to mind over and over is type-safeness. Shell treats everything as a string, and that’s the source of both its power and its poor maintainability.

So when I ask whether shell can be improved, the question is actually more subtle: Can it be improved without compromising its versatility? Can we, for example, be more type-safe without having to type Java-like stuff on the command line? Without sacrificing the powerful and dangerous features like string expansion?

I mean, you can write shell-like scripts in Python even today and use type hints to get type-safeness. But in the real world this practice seems to be restricted to more complex programs: programs that require actual in-language processing, complex control flow, use of libraries and so on. Your typical shell script, which just chains together a handful of UNIX utilities? No, I don’t see that happening a lot.
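To make that concrete, here is a minimal sketch of what such a typed, shell-like script might look like in Python. The helper names (run_lines, list_directory) are made up for illustration, and the sketch assumes an "ls" binary is available on the PATH.

```python
import subprocess
from pathlib import Path


def run_lines(argv: list[str]) -> list[str]:
    """Run an external program and return its standard output as a list of lines."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def list_directory(directory: Path) -> list[str]:
    """Hypothetical typed wrapper around `ls`: the argument must be a Path."""
    # Assumption: an `ls` binary exists on the PATH.
    return run_lines(["ls", str(directory)])
```

With this in place, a static checker such as mypy would reject list_directory(42), whereas plain shell would happily pass "42" on to ls. The price is visible too: even this tiny wrapper is far more typing than the two letters it replaces.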

To put it another way, different “scripting languages” have managed to carve out their own problem spaces from what once used to be the domain of shell, but almost none of them has attacked its very core use case, the place where it acts as dumb glue between stand-alone applications.

But when writing shell scripts, I observe that I do have a type system in mind. When I type “ls” I know that an argument of type “path” should follow. Sometimes I am even explicit about it. When I save JSON into a file, I name it “foo.json”. But none of that is formalized in the language.

And to some extent, albeit in a very hacky way, shell is aware of the types. When I type “ls” and press Tab twice, a list of files appears on the screen. When I type “git checkout”, pressing Tab twice results in a list of git branches. So, in a way, shell “knows” what kind of argument is expected.

And the question that’s bugging me is whether the same can be done in a more systematic way.

Maybe it’s possible to have a shell-like language with an actual type system. Maybe it could know that a file with a .json extension is supposed to contain JSON. Or it could know that “jq” expects JSON as input. Maybe it could know that JSON is a kind of text file and that any program accepting a text file (e.g. grep) can therefore accept JSON as well. And it could know that “ls -l” returns a specific “type”, a refinement of “text file” and “file with one item per line”, with items like access rights, ownership, file size and so on.
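The subtyping part of that idea can be sketched in a few lines of Python. This is only an illustration, not a design: the class names, the two lookup tables and the accepts function are all hypothetical, and Python's class hierarchy stands in for whatever type system such a language would really have.

```python
from pathlib import PurePath


class Text:
    """Any text stream or text file: the most general type in this sketch."""


class Lines(Text):
    """Text with one item per line."""


class Json(Text):
    """Text that is expected to contain JSON."""


class LsLong(Lines):
    """Output of `ls -l`: one line per file, with access rights, owner, size..."""


# Hypothetical table: what a file is supposed to contain, judged by its extension.
EXTENSION_TYPES: dict[str, type[Text]] = {".json": Json}

# Hypothetical table: the type each program expects as input.
INPUT_TYPES: dict[str, type[Text]] = {"grep": Text, "jq": Json}


def type_of_file(name: str) -> type[Text]:
    """Guess the type of a file from its extension, falling back to plain text."""
    return EXTENSION_TYPES.get(PurePath(name).suffix, Text)


def accepts(program: str, data_type: type[Text]) -> bool:
    """True if data of the given type may be fed to the program."""
    return issubclass(data_type, INPUT_TYPES[program])
```

Here accepts("grep", type_of_file("foo.json")) is true because JSON is a kind of text, while accepts("jq", LsLong) is false because the output of “ls -l” is text but not JSON.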

But how would one do that?

In addition to a type system in the language itself, it would require some kind of annotation of common UNIX utilities: a formal specification of their arguments and outputs. (Programs not present in the database would default to “any number of arguments of any type and any output”.) Maybe it could be done with simple type-safe wrappers on top of the existing non-type-safe binaries.
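As a rough sketch of what such an annotation database might look like, here is a toy version in Python. Everything in it is an assumption made for illustration: the Spec record, the type names, the entries for “ls” and “cat”, and the simplification that a program takes any number of arguments of one single type (flags are ignored entirely).

```python
from dataclasses import dataclass
from typing import Optional


class Text:
    """Plain text output."""


class PathArg:
    """A file system path passed as an argument."""


class JsonArg:
    """A JSON document passed as an argument."""


@dataclass(frozen=True)
class Spec:
    # Type required of every argument; None means "any number of any type".
    arg_type: Optional[type]
    # Type of the output; `object` means "any output".
    output: type


# Fallback for programs that nobody has annotated.
UNKNOWN = Spec(arg_type=None, output=object)

# Hypothetical, heavily simplified annotations for two common utilities.
DATABASE: dict[str, Spec] = {
    "ls": Spec(arg_type=PathArg, output=Text),
    "cat": Spec(arg_type=PathArg, output=Text),
}


def lookup(program: str) -> Spec:
    """Return the annotation for a program, or the permissive default."""
    return DATABASE.get(program, UNKNOWN)


def check_call(program: str, arg_types: list[type]) -> bool:
    """True if arguments of the given types are acceptable to the program."""
    spec = lookup(program)
    if spec.arg_type is None:
        return True  # unannotated program: anything goes
    return all(issubclass(t, spec.arg_type) for t in arg_types)
```

A type-safe wrapper would then be little more than a call to check_call before the real binary is executed. Note that an unknown program such as a locally built tool still passes every check, so the scheme degrades to today's behaviour rather than getting in the way.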