Toward a New Unix Shell


Journal by Tetsujin on Wednesday April 01, 2009 @03:58PM

One thing that's interested me a lot lately is the idea of improving upon, or reinventing, the Unix command shell.

Now, there's a certain class of Unix users who have stuck with the command shell while GUIs have come and gone, or been overhauled from release to release until new versions barely resemble old ones. One of the problems I face in this kind of plan is that many of these users are fairly attached to the tradition. They use the current command shells because it's the environment they like, and many of them don't want to use a "reinvented" shell. When my design starts giving the shell its own type system to be used over pipelines, or a set of rules that tools would have to follow to "play nice" in this environment, I know that a lot of my target audience just flat out isn't going to like it. This concerns me. But still the idea of this environment appeals to me. Somehow I want to make it work, and make it stick.

Now, the first question one has to ask about something like this is, why? Why should the shell change? There are a few reasons I have in mind:

If the shell could know things like the type of a piece of data, or the kinds of command-line options a tool supports, it could offer context-relevant help.
In present shells, when connecting the output of one tool to the input of another, the user must insert the parser/serializer steps by hand. I feel the shell should be able to do more of that work itself.
I think there's real value in giving the shell a way to deal with "objects" and higher-order programming in general. For instance, an object could be an XML parser or an open network connection or a window into a running application: for the lifetime of that object the shell should be able to issue commands to that object - and when the shell's last reference to that object is destroyed, some action (like closing a connection or killing a process) may be appropriate. There are some command-line tools that implement this sort of behavior themselves, but I think it would be very nice if it were a real feature of the shell.
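To make the object-lifetime idea a bit more concrete, here is a minimal sketch in Python of a table that counts the shell's references to an object and runs a cleanup action when the last one goes away. Everything in it (the `HandleTable` class, its method names, the "conn" handle) is hypothetical and invented for illustration; it is not a design I've settled on.

```python
from typing import Callable, Dict, List


class HandleTable:
    """Tracks shell-held references to long-lived objects (hypothetical design)."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._cleanup: Dict[str, Callable[[], None]] = {}

    def create(self, name: str, cleanup: Callable[[], None]) -> None:
        """Register a new object with one reference and its cleanup action."""
        if name in self._counts:
            raise KeyError(f"handle {name!r} already exists")
        self._counts[name] = 1
        self._cleanup[name] = cleanup

    def acquire(self, name: str) -> None:
        """Add a reference, e.g. when a second variable is bound to the object."""
        self._counts[name] += 1

    def release(self, name: str) -> None:
        """Drop a reference; run the cleanup action once none remain."""
        self._counts[name] -= 1
        if self._counts[name] == 0:
            del self._counts[name]
            self._cleanup.pop(name)()

    def live(self) -> List[str]:
        """Names of objects that still have at least one reference."""
        return sorted(self._counts)


def demo() -> List[str]:
    events: List[str] = []
    table = HandleTable()
    # The cleanup action stands in for "close the connection" or "kill the process".
    table.create("conn", lambda: events.append("closed conn"))
    table.acquire("conn")   # a second variable now refers to the same connection
    table.release("conn")   # one reference is left, so nothing happens yet
    events.append("still open")
    table.release("conn")   # last reference gone, the cleanup action runs
    return events
```

Calling `demo()` shows the cleanup firing only on the final release. A real shell would presumably do the acquire/release bookkeeping itself whenever variables are assigned or go out of scope, rather than asking the user to do it.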

Also, I believe the current model of the shell has fallen behind how people actually use their computers. For instance:

Modern GUIs offer a lot of useful functionality - but the extent to which this functionality is integrated into the command shell is rather limited. For instance, why isn't the volume manager or the wi-fi manager that I use inside the GUI also available outside the GUI? The basic answer is that command-line tools aren't well suited to a usage profile in which they are started as a service and then, while running, receive and respond to outside commands. The framework for such a thing simply isn't in place.
Scripting languages like Perl and Python offer large and useful libraries that provide all kinds of functionality. Why can't shell scripts access these? The basic answer at present is that the programming language provided by the shell lacks the constructs necessary to usefully interface with these libraries. The lack of "object" support (and, specifically, the lack of a good mechanism to start something, keep it running, interact with it, and shut it down when finished), the lack of any sort of namespace support, and the fact that any data going into or coming out of such a library has to be arranged in some ad-hoc format for which the shell provides no specific support - all of this severely hampers both the ability to expose these libraries and the practical benefit of doing so.
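The "start something, keep it running, interact with it, shut it down" pattern can be sketched in Python as a small co-process wrapper. This is only an illustration under stated assumptions: the `Coprocess` class, the one-command-per-line protocol, and the toy volume service are all made up for the example and don't correspond to any existing tool.

```python
import subprocess
import sys

# Stand-in service (an assumption for this sketch): a toy volume control that
# reads one command per line and answers each with the current volume.
SERVICE_SOURCE = r'''
import sys
volume = 50
for line in sys.stdin:
    words = line.split()
    if words[:1] == ["set"] and len(words) == 2:
        volume = int(words[1])
    print(volume, flush=True)
'''


class Coprocess:
    """Keeps a child process running and exchanges one line per request."""

    def __init__(self, argv):
        self.exit_code = None
        self._proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def ask(self, command: str) -> str:
        """Send one command line and return the single-line reply."""
        self._proc.stdin.write(command + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().strip()

    def close(self) -> None:
        """Signal end-of-input, wait for the service to exit, record its status."""
        self._proc.stdin.close()
        self.exit_code = self._proc.wait()
        self._proc.stdout.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


def demo():
    with Coprocess([sys.executable, "-c", SERVICE_SOURCE]) as volume:
        before = volume.ask("get")
        after = volume.ask("set 80")
    return before, after, volume.exit_code
```

The `with` block plays the role of the object's lifetime: the service is started on entry, answers commands while it lives, and is shut down on exit. The missing piece in today's shells is exactly this kind of scoped, conversational handle on a running process.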

Microsoft has already come up with their own solution: PowerShell, a command shell somewhat similar to cmd or a Unix shell, but with support for "cmdlets": commands which are actually .NET classes, dynamically loaded and run as part of the shell's process. These cmdlets exchange .NET objects as their input and output. The shell can then store these .NET objects in shell variables for later recall (and keep track of when an object is no longer referenced, and delete it). This goes a long way toward usefully exposing Windows API functionality within this shell.

My goal is a bit different. Linux has no standard representation for "objects" (and I'm not in a hurry to embrace Mono, let alone encourage others to do the same) so Microsoft's approach isn't suitable for my goals. Furthermore, without the ability to restrict the behavior of a piece of code within a single process, it becomes more important for the stability of the shell to continue to have tools be separate processes. Therefore, whether these outside processes communicate via shared memory or pipes, either way they need to respect a few common conventions about how data is formatted, and (in the case of "objects" - data for which it's important to know when it's time to destroy it) how to manage object lifetime.

One of the typical complaints is that a plan like this requires all shell tools to agree upon and use the same set of rules for how they format their data. This would be a real problem: people would be slow to move to this format, which in turn would make the shell less useful (since it would lack the tools to run in its "enhanced" environment). No one would want to write a tool that runs only in a new, unproven shell, and no one would want to either shoe-horn their problem into an uncomfortable data format, or waste CPU time by translating their optimal data format into the one the shell wants.

So clearly putting everything into a single data format wouldn't work. And in general, the fewer things I "mandate", the better. So my idea is this: tools can go on communicating with whatever encoding makes sense for them, but there should be some shared means of identifying what that encoding is. In order to do this without requiring, you know, every program in the world to be changed for compliance, there has to be a way to provide this information out-of-band - and, presumably, statically. That way, even if the binary itself has no provisions for working nicely in my shell, the end-user can work around that, without changing source code or recompiling, and without needing the authority to install such a workaround system-wide.
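As a rough sketch of what "out-of-band and static" could look like, the Python below looks up a small per-tool descriptor file, checking a user-level directory before a system-wide one, so an end-user can describe or override a tool's output encoding without touching the binary. The JSON file layout, the `stdout` key, the encoding names, and the function names are all assumptions made up for this example.

```python
import json
import os
import tempfile
from typing import List, Optional


def find_descriptor(tool: str, search_dirs: List[str]) -> Optional[dict]:
    """Return the first descriptor found for `tool`, searching dirs in order."""
    for directory in search_dirs:
        path = os.path.join(directory, tool + ".json")
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                return json.load(handle)
    return None


def output_encoding(tool: str, search_dirs: List[str]) -> str:
    """Declared encoding of the tool's stdout, or plain bytes if undescribed."""
    descriptor = find_descriptor(tool, search_dirs)
    if descriptor is None:
        return "bytes"  # undescribed tools keep working as they do today
    return descriptor.get("stdout", "bytes")


def write_descriptor(directory: str, tool: str, descriptor: dict) -> None:
    """Helper for the demo: store a descriptor as <directory>/<tool>.json."""
    path = os.path.join(directory, tool + ".json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(descriptor, handle)


def demo() -> List[str]:
    with tempfile.TemporaryDirectory() as system_dir, \
            tempfile.TemporaryDirectory() as user_dir:
        # Hypothetical encoding names; the system-wide entry is overridden
        # by the user's own description of the same tool.
        write_descriptor(system_dir, "ps", {"stdout": "text/whitespace-columns"})
        write_descriptor(user_dir, "ps", {"stdout": "text/fixed-width-table"})
        dirs = [user_dir, system_dir]  # user-level entries win
        return [output_encoding("ps", dirs), output_encoding("unknown-tool", dirs)]
```

Here `demo()` resolves "ps" to the user's descriptor and falls back to untyped bytes for a tool nobody has described. Putting the user directory first in the search order is what gives the "no system-wide authority needed" property.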

Of course, there are still all kinds of problems with this plan. Among other things, this new shell somehow has to provide a nice new environment, while simultaneously working nicely with things not made for it. For instance, if I provide my own version of "find" - do I have to give it a different name so it won't conflict with the GNU find everybody's used to running? There seems to be a never-ending supply of small problems that need to be solved before this design can really go anywhere. But if all goes well, then maybe someday I'll get it written and you'll give it a try. :)
