Scripting Languages as a Step in the Evolution of Very High Level Languages

Dr. Nikolai Bezroukov

Version 0.90

Copyright 1998-2017, Dr. Nikolai Bezroukov. This is a copyrighted unpublished work. All rights reserved.



Rule: "Don't Try To Force People" Programmers are smart people. They are engaged in challenging tasks and need all the help they can get from a programming language as well as from other supporting tools and techniques. Trying to seriously constrain programmers to do "only what is right" is inherently wrongheaded and will fail. Programmers will find a way around rules and restrictions they find unacceptable. The language should support a range of reasonable design and programming styles rather than try to force people into adopting a single notion. -- Stroustrup "The Design and Evolution of C++", page 113

Contrary to popular delusion, the programming world is a lot more diverse than merely Open Systems, @pple, and Micro$oft. -- Perl Power Tools Webpage

This paper argues that compactness of code is a very important metric by which to judge programming languages and that, other things being equal, the language in which the solution to a problem can be expressed most compactly will win in the marketplace. In this sense scripting languages represent a new generation of languages, called very high level languages, a class pioneered by SETL. In the best scripting languages the codebase for a solution to the same problem can be up to ten times smaller than in Java. That increases programmer productivity and reduces the number of bugs in the code. The immense number of fairly complex LAMP-based e-commerce web sites (including Yahoo) and the wide usage of Python by Google suggest that scripting languages, not Java or OO, represent the most promising way of developing complex software applications today and in the foreseeable future. Java isn't going to die any time soon, but we are probably far beyond the hype part of the curve, so it will gradually lose developers to more productive languages.

Scripting languages are the main achievement of the open source movement. Moreover, they are probably the last bastion that can at least partially protect us from the current software "overcomplexity" push and the resulting bloatware, be it Microsoft or Linux style. And bloatware kills the idea of open source more effectively than anything else: when the implementation language becomes "a new assembler", as in many C-based open source projects, the level of openness of the codebase is open to discussion.

Anybody who has participated in large-scale reengineering projects knows the feeling of receiving thousands of pages of badly documented source code... That's not open source, that's an open mess. Modification, refactoring and rewriting costs grow exponentially with the size of the codebase: a fairly trivial change in a large codebase can cost a tremendous amount of money (and effort) to implement, as the infamous Year 2000 problem showed all too well. And we all know that Java is a very verbose language, and that refactoring is expensive in Java due to static types. That suggests that Java is a very expensive language to work with, which opens possibilities for alternatives, first of all scripting languages, to enter the enterprise environment. Progress in hardware favors scripting languages too.

It is still possible to write compact, useful programs in scripting languages (programs whose source really can be modified by a person without spending a month in a closet staring at the pages), and as such they (along with Forth) might represent the last refuge for those who want to adhere to the KISS principle. Scripting languages such as REXX, TCL, Perl, Python, PHP and JavaScript (JavaScript is prototype-based and great fun to work with once you discover that) are definitely a new and very interesting development in software engineering, the first practical examples of "very high level languages" (VHL).

The concept of "very high level languages" was pioneered more than 30 years ago by SETL [Schwartz1970, Bacon2000]. In his doctoral dissertation David Bacon explains the value of SETL in the following way [Bacon2000]:

First of all, SETL strives to put the needs of the programmer ahead of those of the machine, as is reflected in the automatic memory management, in the fact that flexible structures can be employed as easily as size-constrained ones can, and in the presence of an interface to powerful built-in datatypes through a concise and natural syntax. This high-level nature makes SETL a pleasure to use, and has long been appreciated outside the world of distributed data processing. Flexibility does not itself ensure good discipline, but is highly desirable for rapid prototyping. This fills an important need, because experimentation is a crucial early phase in the evolution of most large software systems, especially those featuring novel designs [135, 173, 66, 70, 71]. Second, SETL's strong bias in favor of "value semantics" facilitates the distribution of work and responsibility over multiple processes in the client-server setting. The absence of pointers eliminates a major nuisance in distributed systems design, namely the question of how to copy data structures which contain pointers. SETL realizes Hoare's ideal of programming without pointers [114]. Third, the fact that every SETL object, except for atoms and procedure values, can be converted to a string and back (with some slight loss of precision in the case of floating-point values), and indeed will be so converted when a sender's writea call is matched by a receiver's reada, means that SETL programs are little inconvenienced by process boundaries, while they enjoy the mutual protections attending private memories.
Maps and tuples can represent all kinds of data structures in an immediate if undisciplined way, and the syntactic extension presented in Section 2.15 [Field Selection Syntax for Maps], which allows record-style field selection on suitably domain-restricted maps to be made with a familiar dot notation, further abets the direct use of maps as objects in programs, complementing the ease with which they can be transmitted between programs. A similar freedom of notation exists in JavaScript, where associative arrays are identified with "properties" [152].

Open source is most valuable when you are still able to change the source to better suit your needs and still have time left to work on the solution instead of just maintaining the modified code. The key advantage of scripting languages is common to any VHL: compactness of code, which makes it possible to write the same application in a fraction (sometimes 1/10) of the lines of code needed in traditional compiled (C, Pascal) or semi-compiled (Java) strongly typed languages. I would like to stress again that the mere fact that a typical scripting language shrinks the number of lines of code for a typical application several times over means lower development costs, fewer bugs and potentially better architecture (with additional savings on "software architects", the highly paid positions that proliferated in Java-based application development :-). One example of a project in which reference C code exists and direct comparisons of codebase size can be made is the Perl Power Tools project.
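To make the compactness argument concrete, here is a minimal sketch in Python of a word-frequency counter, a task that needs a hand-written hash table and tokenizer in plain C. It is an illustration only, not a measurement of the 1/10 ratio; the function name top_words and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent lowercase words in text as (word, count) pairs."""
    # The regular expression and the counting dictionary are both built in.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


sample = "the cat and the hat and the bat"
print(top_words(sample, 2))
```

The whole solution is the two lines inside the function; everything else is packaging.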

Please note that with the current level of complexity of applications, C has almost deteriorated to the assembler level. Java is just marginally higher, and the number of lines you need to write for even a trivial program in Java suggests that the designers of the language did not understand the trend toward very high level languages or the difference between programming in the large and programming in the small.

As for the number of lines (or, more correctly, lexical tokens) needed to express a particular algorithm, Java looks like "C++-" (C++ minus): if this is called progress, that is a very strange definition of progress. Java's designers tried to create "C++ done right", but "done right" is not enough. It is essentially C++ with garbage collection and without features like pointers and manual memory allocation. Automatic memory allocation and garbage collection are now standard features of most modern languages, so there is nothing new here. While Java managed to displace Cobol, it did so not without the help of pure luck (timing is everything in the software world) and deep pockets (the huge amount of money spent by Sun and IBM to promote the language and to create the necessary infrastructure).

Still, Java with its class-based OO has a serious problem that becomes noticeable only as system size grows. Programs rarely remain static, and invariably the original class structure becomes less useful. That results in more code being added to new classes, which largely negates the value of OO and leads to "class hell": the number of classes grows to the level where nobody can see the whole picture, and as a result different class libraries reinvent the wheel. Moreover, the number of class libraries often grows to the level where just loading them at startup consumes considerable time, making Java look very slow despite significant progress on the JVM side. It looks like Gosling was fixated on C++ and badly missed the ideas of prototype-based programming, ideas that found their way into JavaScript. In a 2005 blog entry he even mentioned:

Over the years I've used and created a wide variety of scripting languages, and in general, I'm a big fan of them. When the project that Java came out of first started, I was originally planning to do a scripting language. But a number of forces pushed me away from that.

James Gosling, Dec 15, 2005

Of course, there is no free lunch, and sometimes you pay a price for using a VHL instead of a plain vanilla high level language, but with current 2.4-3.5 GHz CPUs and 8 or 16 GB of memory on desktops (and even laptops), it is not an unreasonable price for probably 80-95% of applications. A small critical part can always be rewritten in a lower level language if necessary; this approach, not using a lower level language for everything, is the optimal solution to so-called "scalability" problems. The classic observation is that premature optimization (and choosing a lower level language is a premature optimization) is the source of most problems in large programming projects.

At the same time I think that it's a little bit naive and premature to search for a scripting Eldorado. None of the existing scripting languages is perfect. I also think it's a good idea to be wary of "the one true way" of anything. Let's accept the fact that programmers benefit from the use of multiple paradigms and multiple languages.


And you do not necessarily need a scripting language with the most sophisticated OO system; the key idea of scripting languages is that everything is a string. Therefore, while OO is nice to have as a way to partition the namespace and provide some useful primitives for iterators (as in Ruby), it is not an end in itself. Moreover, if one really wants an OO language, other things being equal it might be beneficial to use a prototype-based OO implementation, which suits scripting languages better than a static class-based one. In this case each object is essentially a hash that contains slots for data and code (methods), and values can be assigned to existing slots or to additional slots dynamically. Among major scripting languages only JavaScript uses a prototype-based OO model.
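The "object as a hash of slots" idea can be sketched in a few lines of Python. This is only an emulation under stated assumptions: the class name Proto and its lookup rules are hypothetical and far simpler than what JavaScript actually does. Each object holds a dictionary of slots plus a link to a parent object, and a slot that is missing locally is looked up along the parent chain.

```python
class Proto:
    """Hypothetical prototype-style object: a dict of slots plus a parent link."""

    def __init__(self, parent=None, **slots):
        # Write through __dict__ directly so __setattr__ below is bypassed.
        self.__dict__["_parent"] = parent
        self.__dict__["_slots"] = dict(slots)

    def __getattr__(self, name):
        # Called only when normal lookup fails: walk the prototype chain.
        obj = self
        while obj is not None:
            slots = obj.__dict__["_slots"]
            if name in slots:
                value = slots[name]
                if callable(value):
                    # Bind the method to the receiver, not to the slot's owner.
                    return lambda *args, **kw: value(self, *args, **kw)
                return value
            obj = obj.__dict__["_parent"]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Assignment always creates or overwrites a slot on this object only.
        self.__dict__["_slots"][name] = value


animal = Proto(name="generic", speak=lambda self: f"{self.name} makes a sound")
dog = Proto(parent=animal, name="Rex")
```

With these definitions dog.speak() finds the method on animal but runs it against dog, and a later assignment such as dog.speak = lambda self: f"{self.name} barks" changes dog alone. No class had to be declared for either object.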

But again, the key is powerful string handling capabilities. As Ronald P. Loui aptly noted, paradoxically even one of the simplest scripting languages in existence (GAWK), which has decent string processing capabilities, can be a powerful tool for complex software development tasks like AI prototyping; among his students, those who used GAWK for the class project turned out the best work:

Most people are surprised when I tell them what language we use in our undergraduate AI programming class. That's understandable. We use GAWK. GAWK, Gnu's version of Aho, Weinberger, and Kernighan's old pattern scanning language isn't even viewed as a programming language by most people. Like PERL and TCL, most prefer to view it as a "scripting language." It has no objects; it is not functional; it does no built-in logic programming. Their surprise turns to puzzlement when I confide that (a) while the students are allowed to use any language they want; (b) with a single exception, the best work consistently results from those working in GAWK. (footnote: The exception was a PASCAL programmer who is now an NSF graduate fellow getting a Ph.D. in mathematics at Harvard.) Programmers in C, C++, and LISP haven't even been close (we have not seen work in PROLOG or JAVA).

Such a paradoxical advantage in productivity might be partially explained by the fact that with a simple scripting language students struggle less with the language and can devote more time to the task itself. Python, Perl, Ruby and other more complex scripting languages have a much steeper learning curve, but are more suitable tools for professionals. Still, it is interesting to note that AWK, while one of the first scripting languages invented, is simultaneously "the last of the Mohicans": a language without feature creep that even on old hardware could be used very efficiently with pipes. Old DOS AWK interpreters were as small as 160K of uncompressed executable (mawk.exe). That's simply incredible in the current world of bloated 100K "Hello world" programs :-)

More compact and cleaner code, achievable with scripting languages, often helps to achieve higher quality in ways that are not immediately obvious. Paraphrasing Greenspun's Tenth Rule of Programming, we can suggest that:

Any sufficiently complicated C or Java program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of TCL.

Of course, scripting languages still need a lot of work. There are several key areas here:

The set of string operations and fine details of implementation. In a sense, just as for Unix everything is a file, for scripting languages everything is a string. The idea of the "string as the ultimate method of unification and expression of complex data structures" proved to be a very powerful and flexible paradigm. At the same time, strings are rather complex to implement efficiently and they require garbage collection to be present in the language; that's why they were left out of C despite the fact that the major prototype language for C (PL/1) has very rich support for strings. And that's why C++ is really deficient in this area: the string class is too little, too late. Implementation details are also non-trivial. For example, mutable strings might be an advantage for certain very frequent operations like chop in Perl. Here you can see that not everything is actually an object; sometimes a string is simply a string :-). The set of string operations needs to be very carefully designed, as shortcuts do matter and an orthogonal approach is rather naive in this area (think about the implicit value of Huffman encoding here: the most frequent operation must have the shortest representation :-). All existing languages are deficient in this area, and the set of string manipulation functions often looks like a student's diploma work, written by a person who never studied previous generations of languages (or was too preoccupied with other things) and/or did it in haste to get rid of the task. In many cases the set of operations lacks real expressive power and flexibility. For example, this set of functions should be an almost perfect superset of the operations on arrays, as every string can be viewed as an array of characters. Here good old Perl is still competitive with newer offerings despite all its warts.
In Perl, strings and arrays have completely distinct sets of operations: for example, substr for strings corresponds to splice for arrays, but the correspondence is rather fuzzy; chop is essentially an alias for substr($string, -1, 1) = '' but lacks the ability to chop chunks larger than one character; in a way chomp is a caricature of trim from REXX; and so on.
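The wish that string operations form a superset of array operations can be illustrated with Python slicing, which works identically on strings and lists. The helper names chop and splice below are borrowed from Perl for readability only; they are hypothetical functions that return new values rather than modifying their argument in place, so this is a sketch of the uniform interface, not of Perl's semantics.

```python
def chop(seq, n=1):
    """Return seq without its last n items; works for str, list and tuple alike."""
    return seq[:-n] if n > 0 else seq


def splice(seq, start, length, replacement):
    """Return a copy of seq with seq[start:start+length] replaced by replacement."""
    return seq[:start] + replacement + seq[start + length:]


# The same two functions serve both "strings" and "arrays".
print(chop("hello\r\n", 2))
print(chop([1, 2, 3]))
print(splice("hello world", 0, 5, "howdy"))
print(splice([1, 2, 3, 4], 1, 2, [9]))
```

Here chop can remove a chunk of any length, which is exactly the generalization the paragraph above asks for.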



Exceptions and coroutines should be transparently integrated into the language. Of the major scripting languages, Ruby does this most visibly, providing a concept of iterators based on coroutines:

    File.open(file, mode) do |f|
      # do something with f (= file handle)
    end # f is closed automatically at this point
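For comparison, a rough Python analogue of this idiom can be built from a generator, which is the closest thing Python has to a coroutine here. This is a sketch, not Ruby's mechanism: the helper names opened and demo_opened are hypothetical, and Python's built-in open already supports the with statement directly.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opened(path, mode="r"):
    """Yield an open file and close it when the with-block exits, even on error."""
    f = open(path, mode)
    try:
        yield f          # control passes to the body of the with-block here
    finally:
        f.close()        # runs on normal exit and when an exception propagates


def demo_opened():
    """Write to a temporary file through opened() and report whether it was closed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with opened(path, "w") as f:
            f.write("hello")
        return f.closed
    finally:
        os.remove(path)
```

The generator is suspended at yield while the caller's block runs, which is how the exception handling and the coroutine-like transfer of control end up in one construct.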

Respect toward predecessors. A language should adhere to the principle of "least surprise" and should not break with previous languages (first of all the C/C++ family, as the dominant family of languages) unless the break is justified by some gain in power or transparency. One counterexample is Perl's redefinition of the loop control statements (last and next instead of C's break and continue). Also, many language designers understood that C-style code blocks ({ }) waste important symbols and are no shorter than Algol-style blocks (do - end, with do optional). But they never fully implemented Algol-style notation either. For example, Ruby does not address the problem of closing several nested blocks with one end statement, as PL/1 did many years ago with labels.



Simplicity and transparency of connecting to a high level language (C or C++). This is important for usage in large projects. Many programmers (for example Richard Stallman) just do not understand the difference between programming in the large (scripting) and programming in the small (creation of the component set) and think that a single language should be used for both activities (see the discussion of TCL vs Guile for more information). This is not true. Components can be produced in another language, one that is lower level and more flexible in operating at the level of detail that some of them require. And the larger the project is, the more components are specialized enough to benefit from implementation in the second language. Here TCL is really good (just look at the implementation of such a masterpiece of software engineering as Expect for inspiration), but Python and Ruby should be considered too. Both provide a clean interface to C++ and C respectively. The advantage of the Microsoft .NET framework is that it permits using the same runtime engine for multiple languages. The same advantage can be achieved by using a scripting language that is compiled for the JVM.



The level of integration with the OS needs to be improved. The quality and availability of "connectors" that permit using the OS API (both built into the language and in external libraries) might account for 80% of the usability of the language in large and complex programming projects. In this (limited) sense libraries are more important than the language itself. And it takes a lot of time (or money, or both) for a language to get quality libraries.



Debugging tools are still insufficiently developed; paradoxically, debugging a scripting language can be more complex than debugging mainstream languages like C/C++, for which the level of the language is lower and the tools are definitely more mature, feature rich and often commercially supported. Debugging tools available even for popular scripting languages such as Perl, PHP and Python are still rather crude. This has a real impact when working on non-trivial programs. For complex application development the quality of the debugger is often as important as the quality of the language implementation; it is actually an important part of that quality. It is not accidental that Donald Knuth, who is probably one of the greatest computer scientists of all time, preferred to work with the language that is best integrated with the OS and has the best debugger. For a viable scripting language the debugger should be part of the language design and a key part of the implementation, not an afterthought. Scripting language designers are still slow to realize this shift of paradigm. In this area significant progress is needed. IDEs like ActiveState Komodo can help too (I can attest that they managed to eliminate the problems that haunted earlier versions, and version 3.1 and later are usable for Perl).



Warts need to be removed. While each of the scripting languages has innovative design features, the strong points that helped wide adoption of the language, in certain areas each of the existing scripting languages has problems that need to be recognized and rectified ASAP. For example, I cannot explain why Perl does not (yet) allow coroutines and multiple (labeled) closures of nested blocks (as in PL/1), or why the problem of scalar variable comparison (as in if ($a == $b) when both are strings) was not treated more seriously, with interpreter warnings or some pragma construct (such as (string)if ($a == $b) for the previous example). Also, there is a mismatch between the built-in functions for strings and for arrays, although many of them have very similar semantics. See my older paper for a discussion of those issues in Perl [Bezroukov2005].
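Why the numeric comparison of two strings is a wart can be shown with a small emulation in Python. This is a simplified sketch and an assumption on my part, not Perl's actual conversion code: the hypothetical helper numify keeps only a leading numeric prefix of the string and treats everything else as 0, which is roughly what Perl's == does to its operands.

```python
import re

# Leading whitespace, optional sign, digits with optional fraction and exponent.
_NUM_PREFIX = re.compile(r"\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def numify(s):
    """Approximate numeric value of a string: its leading number, or 0.0 if none."""
    m = _NUM_PREFIX.match(s)
    return float(m.group(0)) if m else 0.0


def numeric_eq(a, b):
    """Emulates the spirit of Perl's == applied to two strings."""
    return numify(a) == numify(b)


def string_eq(a, b):
    """Emulates Perl's eq: plain character-by-character comparison."""
    return a == b


print(numeric_eq("abc", "xyz"), string_eq("abc", "xyz"))
print(numeric_eq("10", "10.0"), string_eq("10", "10.0"))
```

Two unrelated strings compare equal numerically because both collapse to zero, so choosing the wrong operator silently changes the logic of the program.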



The performance of the virtual machine and garbage collection can be improved; better profiling tools are badly needed. While it is true that a virtual-machine-based scripting language will rarely equal the performance of optimized native code, and the quality of the virtual machine implementation can often be significantly improved, this emphasis on ultimate performance is often very naive and completely misplaced. Only a very small part of the application (less than 20%) has significant influence on the total time consumed in performing a particular function. Detecting and selectively optimizing those critical parts, and if necessary rewriting them in a compiled language, can help to achieve parity with or even beat Java and C++ based applications in raw performance. But here we need adequate profiling tools. This is especially important because all scripting languages have automatic memory management with garbage collection. The latter can have a significant impact on the performance profile of a script if large volumes of data need to be processed.
Much depends on the nature of the program: for real-time applications this might be an additional concern that needs to be thoroughly investigated with a profiler, but for many other programs it is not. Each language's garbage collection implementation differs, so the choice of scripting language may need to take this into account to find a better match with the application's needs. One way to bypass this problem is mimicry: usage of .NET or the JVM opens a really significant tool chest that might be too expensive and/or time consuming to develop for a particular language alone.
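Finding the small critical part is a mechanical task once a profiler is available. Below is a minimal sketch using Python's standard cProfile and pstats modules; the functions slow_part, fast_part and application are hypothetical stand-ins for a real program, and the loop size is arbitrary.

```python
import cProfile
import io
import pstats


def slow_part(n):
    """Deliberately expensive loop: the candidate for rewriting in a compiled language."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_part(n):
    """Cheap code that is not worth optimizing."""
    return n * 2


def application():
    return slow_part(200_000) + fast_part(10)


def profile(func):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)   # top five entries only
    return result, out.getvalue()


result, report = profile(application)
print(report)
```

The report lists slow_part near the top by cumulative time, which tells you where a rewrite would pay off and, just as usefully, where it would not.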



Level of safety needs to be improved. Typically scripting languages are typeless. While this is definitely a more reasonable compromise than the type-safety straitjacket of Java, it can create some additional problems, which can easily be rectified by high quality cross-reference tools, namespace diagrams, pretty printers, etc.
Actually, exactly because of weak typing, high quality cross-reference tools should be considered a part of any decent scripting language implementation, not an add-on. Unfortunately, support for those tools is horrible and urgently needs to be improved. IMHO currently only Perl and JavaScript have more or less adequate pretty printers and cross-reference tools.
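How little is needed for a basic cross-reference tool when the language exposes its own parser can be sketched with Python's standard ast module. The function names cross_reference and seen_once and the sample program are hypothetical, and the sketch ignores scopes, attributes and imports, so it is far from a production tool.

```python
import ast
from collections import defaultdict


def cross_reference(source):
    """Map each identifier in Python source to the sorted lines where it appears."""
    xref = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):                      # variable use or assignment
            xref[node.id].add(node.lineno)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            xref[node.name].add(node.lineno)                # definition site
        elif isinstance(node, ast.arg):                     # function parameter
            xref[node.arg].add(node.lineno)
    return {name: sorted(lines) for name, lines in sorted(xref.items())}


def seen_once(xref):
    """Names that occur on a single line only: typical candidates for typos."""
    return [name for name, lines in xref.items() if len(lines) == 1]


example = """\
total = 0
def add(x):
    return total + x
print(add(totl))
"""

xref = cross_reference(example)
print(xref)
print(seen_once(xref))
```

In the sample, the misspelled name totl shows up as a name seen on one line only, which is the kind of error a compiler for a statically typed language would have caught.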

But scripting languages evolve quickly, and each year they become more and more competitive with Java and C++ for developing mainstream enterprise applications. In its turn, Java tried to adapt by adding regular expressions and coroutine emulation (via threads) to the mix. While I was a critic of Java from the beginning, I have now started to realize that Java can serve as a lower level implementation language for scripting languages and that usage of a common virtual machine environment represents a significant advantage that should not be overlooked.

Due to the tremendous push behind Java and the amount of money spent on creating its infrastructure (supported by huge amounts of money from Sun and IBM), at the current stage of development of scripting languages it might be wise to use scripting languages that utilize the JVM. At least for large and demanding software projects, usage of the JVM (and all the corresponding infrastructure) is a big plus. This way you have room to retreat in case things go wrong and can switch back to Java for the parts where the scripting language proved to be less suitable and a "programming in the small" language is needed. A dozen such languages already exist, with Jython as probably the most prominent example. As there is a JavaScript implementation in Java (Rhino), it should be seriously considered too. Other implementations like BeanShell and Groovy have their advantages too. Groovy is probably the most fashionable of the JVM-based scripting languages.

It's still unclear which scripting language will prevail in the long run, so right now one should probably diversify and experiment with several of them. Moreover, if they all use .NET or the JVM, then different languages can be optimal for different parts of the project. But any large project should still have a "principal" language, the one that you feel best matches the majority of the project's needs. It's just impossible to learn several scripting languages to an equal degree. I currently consider Perl to be my primary scripting language, but there is no JVM-based implementation of Perl, and that affects scalability. I also use Python for tasks that benefit from coroutines. Python also has the distinct advantage of having a JVM-based implementation (Jython). Still, Python imposes more restrictions than Perl and in this sense is a slightly lower level language. Python's innovative "indentation reveals real block structure" solution partly compensates for that, as it produces more vertically compact programs. Moreover, you can choose your style of braces and pretty printing, as it's easy (and probably necessary) to imitate C-style curly braces using comments and a pretty printer. In this sense Python is the most modern language, the language where the editor in the IDE should contain a pretty printer by default.

But the main reason I prefer Perl is that I think it is a slightly higher level language than Python, with better connection to and "physiological compatibility" with the shell and Unix utilities; for me that's an extremely important consideration. It also has an excellent debugger. Really excellent. While not a part of the language, the debugger is a part of the programming environment for this language, and a very important one.

Perl provides advantages when I need maximum power for rapid prototype development and am ready to pay for this power with some inconveniences. Also, unlike Python, Perl has mutable strings, which means that operations like chomp do not create a whole new string just to cut the last byte off the old one. At the same time I would like to stress that everybody who likes Unix needs to know TCL at least to the level necessary to use Expect, a really brilliant, breakthrough application based on TCL.
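The difference between mutable and immutable strings can be seen within Python itself, where str is immutable but bytearray is not. The sketch below uses a hypothetical helper chomp_in_place; it illustrates the trade-off only and says nothing about how Perl implements chomp internally.

```python
def chomp_in_place(buf):
    """Remove one trailing newline from a bytearray without allocating a copy."""
    if buf.endswith(b"\n"):
        del buf[-1:]          # shrinks the existing buffer
    return buf


# Immutable str: stripping the newline has to build a second string object.
line = "some text\n"
stripped = line.rstrip("\n")

# Mutable bytearray: the very same object is shortened.
buffer = bytearray(b"some text\n")
same_buffer = chomp_in_place(buffer)
```

After this runs, line is unchanged and stripped is a separate object, while same_buffer and buffer are one and the same shortened buffer. For a loop over millions of input lines that difference is one allocation per line.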


TCL and REXX are probably the most underappreciated scripting languages in existence. Both have an almost zero learning curve, which gives them the advantage that Basic enjoyed in the past. REXX syntax was by and large borrowed from PL/1 and is more readable, while TCL has a really minimalistic syntax. A little known fact is that on the Commodore Amiga line of computers the functionality of the REXX ADDRESS command was expanded to cover running applications, thus allowing ARexx (the Amiga-specific dialect of the language) to control not only resident but also currently running programs equally well. This is a unique capability that to this day has not found its way into other scripting languages. Also, REXX was the first language that served as both a macro language and a regular scripting language in two important, although abandoned, OSes: Amiga and OS/2. It implements a very innovative approach: any command that is not recognized by the REXX interpreter is considered to be an application command (if REXX is used as a macro language for an application, for example an editor) or a shell command. For those interested in scripting languages I strongly recommend playing with THE editor, which can serve as a classic example of an application that uses REXX as a macro language.
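The "unrecognized command goes to the host" rule is simple enough to sketch in a few lines of Python. This is a toy under stated assumptions, not REXX: the function run_macro, its single keyword set and the editor commands in the example are all hypothetical, and real REXX evaluates a full expression before passing the resulting string to the environment selected by ADDRESS.

```python
def run_macro(lines, host_command):
    """Tiny REXX-like dispatcher.

    Lines starting with the keyword 'set' are handled by the interpreter;
    every other line is expanded and handed to the host application or shell.
    """
    variables = {}
    for line in lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "set" and len(words) >= 3:
            variables[words[1]] = " ".join(words[2:])
        else:
            # Not a language keyword: substitute variables, then pass it on.
            expanded = " ".join(variables.get(w, w) for w in words)
            host_command(expanded)
    return variables


# The host here is just a list that records the commands an editor would receive.
sent_to_editor = []
run_macro(["set target line42", "goto target", "delete"], sent_to_editor.append)
print(sent_to_editor)
```

Because the host is only a callable, the same macro could drive an editor, a shell or a test double, which is the property that made REXX attractive as a universal macro language.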

A TCL interpreter implementation (or, for smaller projects, an AWK interpreter implementation) can actually serve as a set of libraries in any large and complex project that is based on C. To rephrase the famous Greenspun's Tenth Rule, "Any sufficiently complicated C or Fortran program contains an ad hoc, informally specified, bug-ridden, slow implementation of half of Common Lisp", one can say that "Outside of kernel programming, any sufficiently complicated and large C-based programming project usually reimplements 50 to 80% of a TCL interpreter".

In this case TCL can be used for programming in the large and C for programming in the small. Few people understand that TCL+C is an underappreciated and unique development technology. Generally, the combination of any scripting language that has a clean interface to C or C++ (or any other suitable high level language) with C/C++ is a very powerful software development paradigm. This paradigm is different from, and in many real life scenarios probably superior to, overhyped single complex OO languages, be it Java, Ruby, C++ or C with some STL-like library such as glib and other "Swiss army knife" libraries for C. Actually, premature abstraction of a problem into classes is similar in its long term destructive effects to premature optimization.


Actually, the combination of a JVM-based scripting language and Java is also a very promising development that can compensate for several weaknesses of Java as a system implementation language. At least one classic scripting language now has a Java-based implementation (Jython), and there is also active development of new scripting languages explicitly designed to be macro languages for Java.

It is interesting to note that a lot of important open source projects like MC have now gotten into the tar pit of combining too many C libraries and, because of limited manpower, might stagnate because nobody is able to see the whole picture and drive the architectural development. I think that the deterioration of the quality of a C-based codebase, and the termination of development of many promising open source projects because the codebase exceeded a critical size and there was not enough manpower to support a codebase of this complexity, is a more common reason for the stagnation of open source projects than many people would like to accept.


If you look, for example, at MC, then it is clear that mono-language programming automatically runs into serious difficulties after a certain size and experiences periodic crises caused by the change of the guard. For some periods the project can even become "open source abandonware". With a very complex codebase and without the funds and discipline typical of a commercial programming environment, further development becomes a road to nowhere. In such cases it might make sense to switch to a scripting language + compiled language combination just to shrink the codebase to a manageable size.

In dual-language (scripting language + C/C++, or scripting language + Java) implementations each language offers support that is useful in non-trivial ways. For example, C programmers can use libraries that were developed for the scripting language interpreter (and those libraries are usually of higher quality than a typical C library) and even polish them based on their understanding of the project; this work also increases their understanding of the scripting language implementation. It is interesting to note that this crucial advantage of dual-language programming, with "programming in the large" performed in a scripting language and "programming in the small" in a lower level strongly typed language, was never understood by Richard Stallman and probably was one of the reasons behind the stagnation of the GNU project: scripting languages are the essence of open source, but all major scripting languages originated outside the project and none of the major GNU projects uses this approach. BTW none of the major scripting languages adopted the "pure" GPL license (which is a good thing ;-). As John Ousterhout aptly put it:

I think that Stallman's objections to Tcl may stem largely from one aspect of Tcl's design that he either doesn't understand or doesn't agree with. This is the proposition that you should use * two * languages for a large software system: one, such as C or C++, for manipulating the complex internal data structures where performance is key, and another, such as Tcl, for writing small-ish scripts that tie together the C pieces and are used for extensions. For the Tcl scripts, ease of learning, ease of programming and ease of glue-ing are more important than performance or facilities for complex data structures and algorithms. I think these two programming environments are so different that it will be hard for a single language to work well in both. For example, you don't see many people using C (or even Lisp) as a command language, even though both of these languages work well for lower-level programming. Thus I designed Tcl to make it really easy to drop down into C or C++ when you come across tasks that make more sense in a lower-level language. This way Tcl doesn't have to solve all of the world's problems. Stallman appears to prefer an approach where a single language is used for everything, but I don't know of a successful instance of this approach. Even Emacs uses substantial amounts of C internally, no? I didn't design Tcl for building huge programs with 10's or 100's of thousands of lines of Tcl, and I've been pretty surprised that people have used it for huge programs. What's even more surprising to me is that in some cases the resulting applications appear to be manageable. This certainly isn't what I intended the language for, but the results haven't been as bad as I would have guessed.

This might be one reason why Microsoft managed to beat open source development tools with the introduction of the .NET line of languages and development tools. And that happened despite the fact that development tools benefit from open source development more than other projects. It just shows that the right architectural decisions can outweigh the advantages of a particular development methodology, and that chanting the words "Freedom, freedom" is not a replacement for architectural abilities.

Actually all the posts in the infamous Stallman-initiated attack on TCL, called the "TCL wars", deserve to be read just to understand the complex interplay of factors between programming in the large and programming in the small, and the level of misunderstanding of the difference between the two that is typical even for pretty gifted programmers like Stallman and Gosling. BTW the resulting Stallman-inspired alternative to TCL, the GNU scripting language (Guile), proved to be stillborn. So far no major GNU project has adopted it as a macro language and no important Linux application uses it. So by abandoning TCL (and failing to produce a viable alternative) Stallman essentially undermined the long term viability of the GNU project...

Perl and Python can be considered attempts to provide a "compromise" language that is usable for both programming in the large and programming in the small. Paradoxically, for Perl I have seen several successful attempts to use bash as the language for "programming in the large" and Perl for "programming in the small". Those attempts were probably inspired by the widespread usage of AWK scripts in large shell programs. Also, with version 4 bash became a more viable programming language than before.

But here Python has an important advantage: unlike Perl, Python has a more or less usable interface to C++, so it can be used for dual-language programming, although such cases are still infrequent (the Python philosophy is generally the same as Perl's). Despite the difficulties of managing huge and very complex interpreters, both Perl and Python have a very strong following, and nothing succeeds like success. Python generally overtook Perl after, say, the year 2000, and has now become the most common language for writing applications. Several large and important system applications in Red Hat are written in Python. Still, the two differ only in nuances, not in principle, and both represent an approach opposite to TCL: an attempt to replicate the PL/1 approach to language design on a new level. Whether the "Right thing" approach (Python) is better than the "New Jersey" or "worse is better" approach (Perl) is debatable. Perl still holds its own as the second language (after shell) for sysadmins, because it is closer to bash and has an excellent debugger. Neither is a silver bullet that solves all software engineering problems. Python has enjoyed all the hoopla connected with object oriented programming, but in 2017 object oriented programming is old hat.

One test of whether someone is a good programmer is to ask about the shortcomings of the tools he or she uses. Watch whether the answer covers only language constructs; if so, this is probably a mediocre programmer. The "total" language environment (language + IDE + debugger + libraries) is as important as, or more important than, the language itself. Someone who does not understand that the flaws and limitations of a favorite language can be compensated for by the environment (Java is a nice example here -- it is a mediocre language with an excellent environment), and who cannot view the language as a part of a larger development environment, is either unable to think analytically and thus cannot be a good programmer, or is blindly partisan (i.e. a zealot) like many participants of the Perl vs. Python debate; but please note that even the worst participant of the Perl vs. Python debate is usually head and shoulders above participants of the Linux vs. Windows advocacy wars...

Programming, at least as I understand it, is both art and science. The inability to see the larger picture of the environment in which the language is embedded, and to view the implementation of a programming environment as a continuation, or at least an important "feature", of the language design, is a serious intellectual limitation.

An interesting question is why the "worse is better" approach is so successful. Why can complex, non-orthogonal and far from elegant languages make it to the mainstream? I think a partial answer might be pure luck (of which timing is one important dimension and the place where the language was born another). It certainly plays a more important role in a language's success than one might think. Early comers that somehow managed to grab a niche and a critical mass of users have a tremendous advantage: it is one thing to read a language manual and appreciate how good the concepts are, and another to bet your project on a new, unproven language without good debuggers, manuals and, what is even more important, libraries.

I suspect that the quality of the debugger and the level of integration with the underlying OS (libraries) are probably as important as, or more important than, the language itself. Think about the Perl debugger and CPAN (Comprehensive Perl Archive Network) as major factors in ensuring the language's success. BTW, paradoxically, for Java only Microsoft used to have a Java development environment (J++) that was well integrated with the OS and had a good debugger. From this point of view, it's clear that Sun's management in its infinite wisdom killed with its lawsuit a little bit more than just "deviations from the standard language implementation". As a result Microsoft created its own version of Java -- C# -- contributing to the balkanization of programming (not that they care much about it).

From my point of view languages are much like cars. For many people a car is the thing they use to get to work and to the shopping mall, and they are not very interested in such facts as whether the engine is inline or V-type, or whether the transmission uses fuzzy logic. What they care about is safety, reliability, mileage, insurance and the size of the trunk. In this sense the success of the "worse is better" approach should not surprise anybody.

A popular belief that scripting is an "unsafe", "second rate" or "prototype" solution is completely wrong, and due to the success of Python it is by and large history. Still, it is important to understand that if your project has died, then it does not matter what the implementation language was, so for any project that has to succeed on a tough schedule a scripting language (especially in a dual scripting language + strongly typed language combination, for example TCL+C, .NET or JVM-based pairs) might be a better blend than a single bulky OO language like C++ or Java.

The flexibility and higher level that scripting languages provide are a strategic advantage for any complex software project, because experimentation is the crucial stage of the development of any large software project, especially one featuring a novel design. In this sense any large and complex programming project includes a tremendous amount of prototyping. Only experimentation can help to move the project toward adopting a solid architecture, which significantly increases the chances that a complex software project will succeed.

At some point the Groovy programming language got traction on the Java platform, but later this momentum dissipated. As Richard Monson-Haefel put it [Richard Monson-Haefel blog]:

"Groovy represents the beginning of a new era in the Java platform, one in which the Java community embraces language diversification and harnesses the full potential of the Java platform." ... "Groovy is the best choice because it was built from the ground up for the Java Platform and uses syntax that is familiar to Java developers, while leveraging some of best features that Python, Ruby and Smalltalk have to offer."

For programmers in general, and application programmers in particular, the most productive years in which they can make their mark are too short (say 25 to 45, or even 20 to 40) to afford such mistakes as the selection of a wrong, too low level implementation language. Even if you are bound to Java you can save some time on tests [Practically Groovy- Unit test your Java code faster with Groovy], and that's still an important saving.


Moreover, architecturally such an approach helps to separate architectural decisions from implementation details much better than any OO model with a huge number of beautiful-looking UML diagrams (and especially with a huge number of UML diagrams completely detached from reality, which is the most common paradigm of UML usage ;-). In this sense firing people who overemphasize UML usage might not be a bad way of solving a programming project's manpower shortage problems; at least the manager gets room for new people in critical areas without going over budget :-).

Paradoxically, with 3GHz or better CPUs and 1GB or even more of memory on desktops, even tasks that handle a fair amount of computation and data (computationally intensive tasks) have become more viable for such languages as Python and Perl. In some cases (but not often!) such solutions might even be competitive with C++, C# and, especially, Java. The reason is that when you are operating at a higher level, you are often able to find a better algorithm, better data structures, a better problem decomposition scheme, or all of the above. That's the same argument that many years ago was made by adherents of high level languages against assembler, and it is still true on a new level.

Actually, if you know the history of language development, OO languages will remind you of the so-called compiler-compiler approach -- the class of languages that extend themselves by accommodating new constructs. The problem with this approach is that the diagnostics are either available only at run time, or poor, or both. That means that for complex projects the direct construction of a specialized language, with YACC+TCL+C as a poor man's compiler-compiler toolset, can be a better approach that might eliminate a lot of the run-time overhead inherent in OO.

The design of a set of classes can be (and often is) as time consuming (and politically charged) as the compiler-construction approach to software design (each large software project has a specialized language buried in it), and the two actually share a lot of similar challenges. At the same time even a well-designed set of classes is inferior to a specialized compiler/interpreter in several major aspects, first of all in compile-time error checking and efficiency. In the absence of a specialized compiler one can use TCL or Python to glue together low level C modules that implement specific language constructs and so produce a more or less clean and debuggable "pseudo-compiler" solution.

In this respect I expect the general trend toward more expressive, "very high level" solutions, the trend that initially helped to drive Perl into prominence, to continue. It is this trend that launched the LAMP (Linux-Apache-MySQL-Perl/Python/PHP) toolset, which is a tremendous achievement, the crown jewel of the evolution of the open source programming ecosystem. Here neither Linux nor MySQL plays a decisive role: Solaris, FreeBSD or even Windows can be used instead of Linux with the same tool set, and MySQL is just one database out of several possibilities (PostgreSQL, etc.). For this reason LAMP should probably more correctly be called WDS (Web server-database-scripting language).

Note that LAMP became the cornerstone of Web site development despite being completely ignored in the CS curricula of all major universities. That suggests that we should treat Java and OO with the skepticism they deserve, as any proponent needs to explain the paradoxical fact that most commercial WEB sites (which are actually pretty complex software applications) are now driven by LAMP (or more correctly by WDS). Despite the fact that PHP is a badly designed programming language, inferior to Perl, Python or Ruby, the fact that it was the first language to integrate with the Web server and later the database ensured its lasting success. Even Yahoo now uses PHP for the development of its huge franchise, which (although it's a complex mix of informational and e-commerce sites) both in complexity and traffic requirements probably belongs to the top dozen of the world's e-commerce sites. Moreover, the trends in hardware will probably help to preserve and extend the dominance of scripting languages in WEB applications, despite the paradoxical inroads of Java on the server side in large enterprise environments as a new Cobol (and, as was the case with Cobol, not without some help from IBM I think ;-).

Now let's briefly discuss debugging issues. One of the best things about scripting is that it encourages the creation of dramatically more compact programs than compiled languages or, god forbid, OO languages. The length of some trivial Java programs might lead to a suspicion that progress in CS simply stopped. And the length of a program is highly correlated with the number of bugs in the code; we should all remember this from the assembler vs. high level languages debate. Actually John Backus's Fortran still stands out as a real breakthrough. It is interesting to note that Fortran vs. assembler was a ~1:20 improvement in the lines of code (LOC) metric. I think it has never been exceeded since (Perl to C is probably 1:5). And the Fortran team fought and won against good assembler coders on a computer that had less power (and memory) than a Palm I, and much less than the cheapest smartphone (which looks like a supercomputer in comparison with the early vacuum tube based machines). Actually Fortran compilers used to have the most sophisticated optimizations, and even in the first compilers they were far from trivial. Almost all major early optimization papers were devoted to Fortran.

I would like to stress again that in the lines of code metric scripting languages hold an indisputable advantage over Java. For example, Python programs are typically 3-5 times shorter than equivalent Java programs. Like JavaScript (and unlike Java), Python supports a programming style that uses simple functions and variables without engaging in class definitions. Perl is even shorter. Both Perl and Python come "with batteries included": standard modules that connect to sockets, parse HTML, etc. In other words, both have enormous standard libraries, which have lately been enhanced with modules for numeric processing, string processing and database interfaces.
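To make the "simple functions, no class definitions" style concrete, here is a minimal sketch of a word frequency counter that relies only on the Python standard library. The function name top_words and the crude tokenizing pattern are choices made for this illustration, not anything prescribed by the language.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase the text and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(top_words("the cat and the hat and the bat", 2))
```

The whole program is one function and two standard library imports; no class has to be declared just to have a place to put the code.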


Shorter code not only leads to a much smaller number of bugs in the program (the complexity of a program grows at least as the square of the number of lines of code); a more expressive language also prevents reinventing the wheel and thus might save both execution and debugging time. In certain cases, when a small part of the program needs really top efficiency and consumes the largest share of the run time, a scripting language can be used to generate C code for a particular special case, then compile it and execute this specialized module on the fly, instead of writing generic C code and parameterizing it to death. The time to compile and link a small C program on a typical modern server with several 3GHz CPUs is less than a second. It can be made much shorter by using a specialized compiler instead of a general purpose one. In this sense "on the fly" compilation of the computationally intensive parts of an application is a viable optimization strategy on today's hardware.
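A minimal sketch of this "generate, compile, load" strategy in Python follows. It specializes a polynomial evaluator for fixed coefficients instead of passing them as parameters. The function names are invented for the illustration, and the compile step assumes a Unix-like system with a C compiler reachable as cc or gcc and support for the -shared and -fPIC flags; none of this comes from the original text.

```python
import ctypes
import os
import shutil
import subprocess
import tempfile


def generate_poly_source(coeffs, name="poly"):
    """Emit C source for a polynomial with the coefficients baked in.

    coeffs are ordered from the highest power down to the constant term,
    and the expression is built in Horner form.
    """
    expr = repr(float(coeffs[0]))
    for c in coeffs[1:]:
        expr = f"({expr} * x + {float(c)!r})"
    return f"double {name}(double x) {{ return {expr}; }}\n"


def compile_and_load(source, name="poly"):
    """Compile the generated source into a shared library and return the function."""
    compiler = shutil.which("cc") or shutil.which("gcc")
    if compiler is None:
        raise RuntimeError("no C compiler found on PATH")
    workdir = tempfile.mkdtemp()
    src_path = os.path.join(workdir, name + ".c")
    lib_path = os.path.join(workdir, name + ".so")
    with open(src_path, "w") as handle:
        handle.write(source)
    subprocess.run(
        [compiler, "-O2", "-shared", "-fPIC", "-o", lib_path, src_path],
        check=True,
    )
    func = getattr(ctypes.CDLL(lib_path), name)
    func.argtypes = [ctypes.c_double]
    func.restype = ctypes.c_double
    return func


if __name__ == "__main__":
    source = generate_poly_source([2, 0, 1])  # 2*x^2 + 1
    print(source)
    poly = compile_and_load(source)
    print(poly(3.0))
```

The generator is ordinary string manipulation, which is exactly where a scripting language is strongest; the C compiler then gets a small, fully specialized function to optimize.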

As for manageability of the code, the high-level coding that scripting languages promote is easier than writing millions of classes and can even be fun. One can also rather quickly become an expert in any scripting language, at least sooner than in Java with its nightmare of class libraries that can do everything but at the same time can do nothing properly. String-oriented languages such as TCL and Perl also encourage uniform treatment of complex data formats and naturally blend with XHTML and XML.

As for competition inside the scripting languages family, I would like to note that the scripting language landscape mirrors the "winner-takes-all" mentality of the larger culture: in a set of competing languages, the one with the largest user community and the most financially sound sponsors will gain size at the expense of the smaller ones, regardless of all but the most blatant discrepancies in the quality of the technology. That effect, clearly visible in the success of PHP, can be called the "Softpanorama law of language design".


With the maturity of the WEB, the size of the community became the major driving force behind scripting languages. The days of great surprises and surprise winners (as, for example, the PHP victory over Perl in WEB site scripting) are over. Python's success can partially be explained by the fact that Google sponsored the language, while O'Reilly abandoned Perl, leaving its development in limbo because the complexity of the codebase exceeded what a volunteer software project can sustain. Despite being open source efforts, the development of scripting languages has now become a cruel, unforgiving area ruled by the merciless dynamics of the marketplace. Of course, there are other valuable scripting languages like REXX, Lua, Erlang, etc., and they also deserve study and might be used successfully for certain projects. But I think that the "big seven" (bash/ksh93/PowerShell, JavaScript, PHP, Perl, Python, Ruby, and TCL) are here to stay. Please note that each of them has a particular set of strong points that makes it uniquely valuable:

JavaScript's main attraction is a very clean syntax and the ability to be used both as a macro language and as a regular shell (actually JavaScript is underutilized as a shell in the Windows environment despite the fact that it has been available for many years via WSH). It also has a superior object model. JavaScript is the only mainstream language that uses a prototype-based OO implementation (pioneered in Sun's Self language). In typical class-based OO languages like C++ and Java, objects come in two general types. Classes are templates that define the set of variables and methods for the object, and instances are populated classes -- "usable" objects with memory allocated and values of variables filled in. They cannot be extended dynamically at run time. In prototype-based OO the structure of the object is dynamic at run time, and new objects are mainly constructed via cloning, by copying the variables and methods of an existing object (its prototype). The new object can be modified dynamically (extended) without affecting its parent. The reverse is not generally true: in most prototype-based OO implementations the child object maintains an explicit link (via delegation) to its prototype, and changes in the prototype cause corresponding changes in all its clones. But in some prototype-based OO languages there is no such link. This is the foundation of prototyping: defining classes is unnecessary, every object is just a dynamically extended instance of its parent. Procedures are first class data objects that can be assigned to slots of the object dynamically. It's like "programming by example" instead of prematurely mapping the problem at hand onto a rigid set of classes and later regretting the naive decisions made :-). Furthermore, every object is essentially, in Perl terms, a hash whose values can be set to both variables and methods for that object. See the Wikipedia article Prototype-based programming for more details.
While it is not widely used as a shell, the possibility of using JavaScript as both a shell and a macro language reminds me of REXX in OS/2. It is the only secondary scripting language that is supported by the major Microsoft Office applications and Lotus Notes. It is also used in Macromedia Flash and as such has probably become the most important language for computer animation. JavaScript is an embeddable language and as such can be used instead of TCL. It has now become part of AJAX.
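The prototype model described above (slots, cloning, delegation to the parent) can be emulated in a few lines. The sketch below is written in Python rather than JavaScript, so it only imitates the mechanism; the Proto class and its clone method are hypothetical names introduced for this illustration.

```python
class Proto:
    """A minimal prototype-style object: a bag of slots plus a link to its prototype."""

    def __init__(self, prototype=None, **slots):
        # Bypass our own __setattr__ so the bookkeeping lives in the instance dict.
        object.__setattr__(self, "_prototype", prototype)
        object.__setattr__(self, "_slots", dict(slots))

    def __getattr__(self, name):
        # Called only when normal lookup fails: search own slots, then delegate upward.
        obj = self
        while obj is not None:
            slots = object.__getattribute__(obj, "_slots")
            if name in slots:
                value = slots[name]
                if callable(value):
                    # Functions stored in slots act as methods bound to the receiver.
                    return lambda *args, **kwargs: value(self, *args, **kwargs)
                return value
            obj = object.__getattribute__(obj, "_prototype")
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Assignment always extends the object itself, never its prototype.
        self._slots[name] = value

    def clone(self, **overrides):
        """Create a child object that delegates missing slots to this one."""
        return Proto(prototype=self, **overrides)


if __name__ == "__main__":
    animal = Proto(sound="...", speak=lambda self: f"{self.name} says {self.sound}")
    dog = animal.clone(name="Rex", sound="Woof")
    print(dog.speak())
    animal.legs = 4   # a change in the prototype is visible through the clone
    dog.legs = 3      # a change in the clone does not touch the prototype
    print(animal.legs, dog.legs)
```

Note that clone here creates a delegating child rather than a physical copy, which corresponds to the "explicit link to the prototype" variant mentioned in the text.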

Perl is an interesting attempt to create "the next generation shell" for sysadmins, but Perl 5 proved to be a good general purpose scripting language. It is unique in several areas: one is that it offers explicit pointers (references) -- a very useful but very rare construct in scripting languages (OO provides some form of imitation, though). It is probably the easiest language to learn and use for Unix system administrators who are already familiar with the Unix shell programming environment. I clearly see its disadvantages (convoluted semantics, absence of coroutines, inability to use pipes for connecting loops and subroutines as in ksh93). Still, it's a great language with many interesting and non-trivial ideas (for example, a very flexible set of control statements, a powerful open statement in which one can use pipes, etc.). At the same time Larry Wall already has an established reputation among language designers as a man capable of fixing one thing and simultaneously breaking two others ;-). Just note how brilliantly he introduced the possibility of slick, subtle errors into Perl scripts by using two different sets of comparison operators ( ==, >, < vs. eq, gt, lt ): the first set casts both operands into their numeric representation and the second set casts them into strings. This is especially dangerous for developers who use several languages. For example, if today I programmed in C and then switched to Perl, I would automatically write: $a='abba'; $b='baba'; if ($a==$b) { print "Strings are equal\n"; } with quite interesting results: both strings are converted to the number 0, so the test succeeds... That's probably the most impressive progress in this area after the famous if (i=1) ... design solution in C :-). But the C designers tried to eliminate as many symbols as possible in order to cope with the low reliability of typewriters. There is no such justification for Perl. In some sense Ruby can be considered to be Perl 6 and might benefit from mass defections from the mainstream Perl community, but it is still too early to tell.

PHP pioneered a higher level of integration with the web server and database. IMHO its slogan could be "Life is too short and it's silly to spend it on solving configuration problems" ;-). On Unix VBScript is not attractive, and outside of the Alice in Wonderland world of mega-corporations Java proved to be more a part of the problem than a part of the solution. PHP was the first scripting language that successfully addressed this need for a higher level integrated tool set for Web site developers on Unix. It was designed explicitly as an integral part of the Webserver/database/scripting_language troika often called LAMP. This proved to be a very powerful toolset, suitable for solving a wide range of tasks. That's why PHP successfully deposed the already entrenched Perl in this particular area. This seamless integration with the MySQL database and the Apache WEB server really makes PHP an important pioneer in the scripting language world. Although initially it was an open source server-side HTML-embedded scripting language, it is evolving beyond its HTML roots into more advanced services like remote procedure calls (see an XML-RPC client and server).

Python is probably the best of breed in the important class of languages that support coroutines (Ruby and Icon support coroutines too, but both are much less known). It has Google's support, so it is the only scripting language with a multi-million corporation behind it. I am convinced that scripting languages that support coroutines have substantial advantages over those that do not. Python is now compiled into bytecode. It's relatively easy (in comparison with Perl) to mix and match Python code and regular C or C++. In this area only TCL has a more transparent "dual language" programming model. Before its acquisition by Oracle, Sun hired the JRuby developers, but those ideas disappeared after the acquisition. So far Python has the most corporate support. Microsoft had its own implementation (IronPython) before abandoning it in 2010. That shows that Python got some traction outside the traditional scripting community.
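As a minimal sketch of why coroutines matter, the following Python fragment connects loops the way a shell connects processes with pipes, and then shows a coroutine that keeps its state between values sent to it. Python generators are a restricted form of coroutine; the stage names read_lines, grep, numbered and averager are invented for this illustration.

```python
def read_lines(text):
    """Source stage: yield the lines of a text one at a time."""
    for line in text.splitlines():
        yield line


def grep(pattern, lines):
    """Filter stage: pass through only lines containing the pattern."""
    for line in lines:
        if pattern in line:
            yield line


def numbered(lines):
    """Formatting stage: prefix each surviving line with its ordinal number."""
    for n, line in enumerate(lines, start=1):
        yield f"{n}: {line}"


def averager():
    """Coroutine that receives numbers via send() and yields the running average."""
    total, count, average = 0.0, 0, None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


if __name__ == "__main__":
    log = "ok start\nerror disk\nok run\nerror net"
    # The stages run in lockstep, like "cat log | grep error | nl" in a shell.
    for line in numbered(grep("error", read_lines(log))):
        print(line)

    avg = averager()
    next(avg)            # advance to the first yield so send() can be used
    print(avg.send(10))
    print(avg.send(20))
```

Each stage is an ordinary loop that suspends itself at yield, so no intermediate lists are built and no stage needs to know about the others.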



Jython (JPython) has unique advantages for Java shops and is the "politically correct" scripting language for large enterprise development. It allows Java to be used explicitly as the "programming in the small" language. It has the ability to extend existing Java classes, offers optional static compilation (which allows the creation of applets and servlets), supports beans and bean properties, and makes the usage of Java packages much easier than any other scripting language. Still, its development fell behind mainstream Python, and there is no significant financial support to take the project to a new level.

TCL's primary importance is that from the very beginning it was designed with the dual-language paradigm of programming in mind. It has the best integration with C (that was the design goal). It promoted the important view that we need to distinguish between the use of a scripting language as a glue for components (shell style usage) and its use as an application macro language. Along with REXX, TCL shares the achievement of being the first universal macro language. And it gave us such a wonderful tool as Expect. TCL was the first (and may be the last ;-) scripting language with a simple, clean interface to C. I feel that the combination of TCL+C is in many cases a more powerful development platform than C++ or another complex OO language like Java, C#, etc.

Conclusions

There was a lot of "political correctness" about how to program in those days...

All through history, people have taken ideas and misunderstood the limitations of them. Donald E. Knuth[1996]

In several major dimensions, and first of all due to compactness of code, scripting languages are a more modern approach to software development than regular high level languages (C, Modula, etc.) or C++ style OO high level languages, including Java. In this respect we should generally expect a repetition of the battle between high level languages and assembler on a new level, with a predictable result.

As in the previous case, there will be no quick success and the battle can take several decades (it took almost 30 years for high level languages to completely displace assembler in software development). It might even take as long as the total replacement of the current generation of programmers, but still I think it is inevitable. In this sense Java is as doomed as Cobol was in 1960, but remember that it took more than 40 years before this prognosis for Cobol became reality.

The author argues that the compactness of the code is the crucial dimension by which scripting languages can be compared with alternatives and with each other. It correlates highly with programmer productivity and ease of maintenance: two crucial metrics for modern software development. As Vassili Bykov once put it, "User interfaces are not only the things we see on the screen. A programming language is also a user interface. It presents us with an imaginary world of entities, and the ability to act in that world, that we use to build solutions to our problems." [The Weekly Squeak]

People often think that the most important factor in software development is not the tools and techniques used by the programmers, but rather the quality of the programmers themselves. I respectfully disagree. The best programmers need the best tools to fully realize their talents. Otherwise a large part of their productivity will be lost to the struggle with inadequate tools. Right now one of the most important tools for a talented application programmer is a good scripting language, and no effort should be spared in selecting the one that best fits the abilities of the particular programmer and the environment in which he or she works. That does not mean that those who work with Java in an enterprise environment should start looking for scripting jobs ;-) but it might still be wise to supplement Java programming with a suitable Java-friendly scripting language. Recently Sun even decided to support some: Sun's implementation of Java SE 6 includes a JavaScript interpreter based on Rhino. Groovy will be included in the J2SE at some point in the future.

Remember that in 10 years computers will be faster and have more memory and disk space than today, and in 20 years they will be faster still. Just compare a regular desktop or laptop PC of 1987 with a regular desktop or laptop PC of 2007 and you will instantly feel the scope of possible changes. Nobody can claim that future changes will be as dramatic, as there are natural limits in semiconductor technology, but the scope of changes will still be tremendous.

The immense number of pretty complex LAMP-based e-commerce Web sites (including Yahoo) suggests that scripting languages, not the Linux kernel, represent the most promising and the most innovative area of open source development.




Created May 16, 1998; Last modified: January 02, 2020