Sunday, August 16, 2015

Let's Go P Code!

For those who have read a lot of my writing it won't be a shock that I'm not an Arduino fan.  I have also made it clear that I don't think C, C++, or the Arduino bastardization of both are the right language(s) for the Arduino target user.  When I preach these things the question is always asked, "what is the alternative?"  My response is usually something like "I don't have a good one.  We need to work on one."  Today I had somewhat of a fledgling idea.  It is far from complete, but it may be a path worthy of following.  Let me explain.

If you are familiar with Java, you probably know that it is a compiled/interpreted language.  That sentence alone will fire up the controversy.  But the simple fact is, Java compiles to a "bytecode" rather than native machine language, and "something" has to translate (interpret, usually) that bytecode to native machine code at some point to do anything useful.  What you may not know is that Java was originally designed to be an embedded language that was an improvement on, but somewhat based on, C++.  I'm not fond of Java for several reasons, but it does indeed have some advantages over C and C++.  However, for small embedded systems, like a typical Arduino, it still isn't the right language.  Although it has some improvements, it still carries many of the problems of C and C++.  It also isn't nearly as small as we would want for a chip running at 20 MHz or less with 32K of program memory.

But for this exercise, let's look at some of the positives of Java.  That interpreted bytecode isn't all that inefficient.  It is much better than, say, the BASIC interpreter on a Picaxe.  And the language is light-years beyond Picaxe BASIC.  (Picaxe and BASIC Stamps and the like should result in prison sentences for their dealers!)  A typical slowdown is maybe 2 to 4 times compared with native code.  For that, we get the ability to run (almost) exactly the same program on a number of processors.  All that is needed is a bytecode interpreter for the new platform.  We also get fast, relatively simple, standardized compilers that can be pretty good and helpful at the same time.  And we can use the same one(s) across the board.  The bytecode interpreter is rather simple and easy to write, much simpler than porting the whole compiler to a new processor.  And the way the code runs is standardized whether you are using an 8, 16, 32, or 64 bit processor: an int is always 32 bits.
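To make "rather simple and easy to write" a bit more concrete, here is a minimal sketch of a stack-based bytecode interpreter in Python.  The opcodes (PUSH, ADD, MUL, HALT) and the names wrap32 and run are made up for illustration; this is not Java bytecode or any real instruction set.  It also wraps every result to a signed 32-bit range, which is one way an interpreter can make an int behave the same on any host processor.

```python
# Hypothetical opcodes for illustration only; not Java bytecode or UCSD P Code.
PUSH, ADD, MUL, HALT = range(4)

def wrap32(value):
    """Wrap an integer into the signed 32-bit range, like a fixed-width int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def run(program):
    """Interpret a flat list of opcodes and operands; return the top of stack."""
    stack = []
    pc = 0  # program counter: index of the next item to fetch
    while True:
        op = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(wrap32(program[pc]))  # the operand follows the opcode
            pc += 1
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(wrap32(left + right))
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(wrap32(left * right))
        elif op == HALT:
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode {op}")

# The expression (2 + 3) * 4 written as bytecode.
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = run(program)
```

The whole interpreter is one fetch-and-dispatch loop.  Porting it to a new processor means rewriting that loop, not the compiler that produced the program.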

So Java isn't the right language, but it has some good ideas.  Let's step back in time a bit before Java.  James Gosling, the inventor of Java, has said much of the idea of Java came from the UCSD P system.  Now, unless you have been in the computer field as long as I have or you are a computer history buff, you have probably never heard of the UCSD P system.  To get the story we need to step back even further.  Set your DeLorean dash for 1960 and step on the gas.

Around 1960, high-level languages were pretty new.  Theories used in modern compilers were only starting to be developed.  But some people who were very forward thinking realized that FORTRAN and COBOL, the two biggest languages of the day, were the wrong way to go.  They created a new, structured programming language called Algol 60.  Most modern languages (C, C++, Java, Objective-C, Python, etc.) trace much of their design back to Algol 60.  Although it was never very popular in the USA, it did gain some ground in Europe.  But it was developed early in the game, when new ideas were coming about pretty much daily.  It wasn't long before a lot of computer scientists of the day realized it needed to be improved.  So they started to develop what would become Algol 68, the new and improved version.  One of the computer scientists involved wanted to keep it simple like the original.  Niklaus Wirth pushed for a simple but cleaner version of the original Algol 60.  (Well, there was an earlier Algol 58, but it had and has very little visibility.)  Wirth proposed his Algol W as a successor to Algol 60, but the Algol committee rejected that and went in line with the thinking of the day: more features are better.  What they created as Algol 68 was a much larger and more complex language that many Algol lovers (including Wirth) found distasteful.  Wirth was a professor of computer science at ETH Zurich and wanted a nice, clean, simple language to use to teach programming without the complexities and dirtiness of the "real world."  He, for whatever reasons, decided not to use any of the Algol languages and created a new language, partly in response.  He called it Pascal (after Blaise Pascal, a mathematician).

Pascal is a simple and very clean language, although in its original form it lacks many features needed to write "real" programs.  But it served well for teaching.  The first compiler was written for the CDC 6000 line of computers.  But soon after, a "portable" compiler was created that produced "machine" language for a "virtual" computer that didn't really exist.  Interpreters were written for several different processors that would interpret this "P Code" and run these programs.  Aha!  We have a single compiler that creates the same machine language to run on any machine for which we write a P Code interpreter!  But Pascal is a much smaller and simpler language, with much better facilities for beginning programmers than Java.  It was, after all, designed by a computer science professor specifically to teach new programmers how to write programs the "right" way!  Pascal, and especially the P Code compiler, became quite popular for general use.  Features to make it more appropriate for general use were added, often in something of an ad-hoc manner.  Nevertheless, it remained useful and well liked, especially with beginning programmers or people teaching beginning programmers.
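As a rough illustration of the compiler half of that arrangement, here is a sketch in Python that turns an arithmetic expression into instructions for a made-up stack machine.  The instruction names (PUSH, ADD, SUB, MUL) and the function compile_expr are hypothetical, not the real P Code instruction set, and the expression syntax is Python's rather than Pascal's because the sketch borrows Python's own parser.

```python
import ast

# Hypothetical instruction names for illustration; not the real P Code set.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Compile an integer arithmetic expression to stack-machine instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and type(node.value) is int:
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)    # operands are emitted first (postfix order)
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)],))
        else:
            raise ValueError("unsupported expression")

    # mode="eval" parses a single expression; .body is its root node.
    emit(ast.parse(source, mode="eval").body)
    return code

instructions = compile_expr("(2 + 3) * 4")
# instructions is now:
# [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
```

Nothing in the output mentions a real processor.  The same instruction list could be handed to an interpreter on any machine, which is the whole trick.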

Step now to the University of California at San Diego.  Dr. Kenneth Bowles was running the computer center there and trying to build a good system to teach programming.  He came up with the idea of making a simple system that could run on many different small computers rather than one large computer that everyone shared.  He found the Pascal compiler and its P Code, and started to work.  He and his staff/students created an entire development system and operating system based on P Code and the Pascal compiler.  It would run on many of the early personal computers and was quite popular.  IBM even offered it as an alternative to PC-DOS (MS-DOS) and CP/M-86.  I even found a paper on the web from an NSA journal describing how to get started with it.  Imagine writing one program and compiling it, then being able to run that very same program with no changes on your Commodore PET, your Apple ][, your Altair running CP/M, your IBM PC, or any number of other computers.  Much as Java promises today, but in 64K and 1 or 2 MHz!

Pascal itself has many advantages over C for new programmers.  C and its descendants have many gotchas that trip up even experienced programmers on a regular basis.  Pascal was designed to protect from many of those.  The language Ada, devised for the US Department of Defense and well regarded when it comes to writing "correct" programs, was heavily based on Pascal.  There are downsides to Pascal, but they are relatively minor and could be rather easily overcome.  Turbo Pascal, popular in the early 80s on IBM PCs and CP/M machines (and one of the main killers of the UCSD P System), had many advances that took away most of the problems of Pascal.

So here is the idea.  Write P Code interpreters for some of the small micros available, like Arduino.  Modernize and improve the UCSD system, especially the Pascal compiler.  Create a nice development system that allows Pascal to be written for almost any small micro.  Many of the problems of C, C++, and Arduino go away almost instantly.  The performance won't be as good as native machine language produced by a C compiler, but close enough for most purposes.  Certainly much better than other interpreted language options.  Niklaus Wirth went on to create several successors to Pascal, including the very powerful Modula-2 and Oberon.  Modula-2 was implemented early on with M Code, similar to P Code.  He even built a machine (Lilith) that ran M Code directly.  If Pascal isn't right, perhaps Modula-2 would be.

In any case, I think this is a path worth investigating.  I plan to do some work in the area.  I would be very interested in hearing what you think about the idea.

Oh!  And what about the title of this blog post?  Well, the P Code part should be fairly obvious at this point.  But the rest of it may not be.  My alma mater, Austin Peay State University, has a football fight slogan of "Let's Go Peay!"  I love that slogan, so I thought it would fit well with this idea.

Let's go P Code!

4 comments:

  1. I'm game. I've always wanted to write something in Turbo Pascal; but as I was learning to program I was told, "Kid, don't bother, it's a dead language." Sad stuff.

    1. It's funny. Pascal was all the rage for about ten years, from the mid 70s to the mid 80s. Then C took over. I suspect part of that was because many Pascal compilers produced not-so-fast code. But since then, people have been looking for an alternative to C that avoids many of C's flaws. C++, Java, C# are popular largely because of that. Ada, which was modeled on Pascal, has been increasing in use for that very reason. But Pascal is very similar to C with safety features. It really is a good language to start with, and continue with. Alas, a few modifications to the language would make it even better.

  2. Nice to see the wheels are still turning at full speed, sounds like an interesting idea.
