1. I'm Sadek Drobi, I’m here with Ted Neward at QCon 2008. Ted, why don't you introduce yourself, tell us what you've been up to lately? Hi, my name is Ted Neward and I'm up to about 6 ft 2 inches. That’s probably all there is. That and the fact that I've been talking a lot about different programming languages but people are probably not interested in that aspect.

2. I have a concern about Scala: it has a Java-like syntax. Don't you think that has some implications for developers - a semi-functional programming language with a Java-like syntax? I think to classify Scala as a functional language would be a mistake. To classify it as an object language would be a mistake. In many respects, what Odersky and some of the other folks working on Scala are trying to do is take the best of both worlds. We don't want to abandon everything we know about objects, we don't want to abandon everything we like about objects, but we also want to incorporate a lot of these functional concepts directly into the language as well. As a result, the syntax on the surface of things can look very Java-esque and, in fact, that can sometimes be exactly what you would expect and, in some cases, it can lead to some very interesting subtleties that will surprise the Java programmer. As a matter of fact, in the process of working on the third article for the IBM developerWorks series that I am doing on Scala, I'm talking about the "for" construct in Scala.



It looks on the surface very similar to the traditional for loop in Java. It turns out that there is a tremendous amount of stuff hiding behind that very simple syntax, which is very powerful and potentially very confusing, but in many respects the power is good, because a lot of people have complained about the fact that in Java I have to do so many different things in order to express one basic concept. I think what we are looking for is more power out of our language, and Scala has that. So I don't know that it's bad that Scala has Java-like syntax but, I mean, let's make no mistake about it: it's not Java; I cannot take my .java files and compile them with a Scala compiler. It's different enough that you know you are in a foreign land.

3. My other concern: I feel that we are searching for another language. We got Java and we like some kind of modeling concepts. What are the other modeling concepts we find in F# or Scala or the other languages? Specifically, what we are seeing now in F# and Scala is modeling concepts like you mentioned, in the sense of higher-order functions. I think what we are really reaching for here, in many respects, is to look back to some of the older languages of that era - Lisp and Smalltalk - and say "You know, these languages have some really powerful features, most notably that we can build up what I call a compositional language." We can start from a very core set of primitives and we can build out.



We can build on that core set of primitives to take the language in directions that we had not necessarily intended at the time we created those primitives, but that are still genuinely useful. This notion of the domain specific language - what Neal Ford and Martin Fowler and those guys are talking about, the "little language" from the previous generation of Unix programmers and so forth - all of these are an attempt to create a language that expresses things at the level of abstraction at which the user of the language wants to operate.



And the problem is with the languages that we use commonly today - and when I say that, I'm thinking most notably of C# and Java. It's very difficult for you to layer a syntax on top of those languages that is not somehow intrinsically Java or C#. I really have a hard time putting a different style of syntax on top of Java or C#. Arguably, it's easier to do this in C++ because of the use of the preprocessor; we don't have a preprocessor in either of those languages. Building an internal DSL - where I'm not explicitly creating a new parser and a new code generator and a new language - is actually really hard to do on the Java and C# platforms.



Languages like Scala, languages that are functional in nature, tend to follow this compositional notion very cleanly, because of the notion of currying, of higher-order functions, etc. It makes it easier for us to compose many of these elements together, to create something that looks and feels like language syntax, but is in fact simply the application of core language features that we pulled out of a library, and that's powerful. I remember in C#, when we introduced the using construct - for those who don't know C#, the idea of using is that you put inside a block some resource that needs to be explicitly closed prior to it being garbage collected: database connections, files, etc.



Normally, we have to put a try-finally block around it: we use the resource inside the try block, and in the finally block we make sure that we close it on our way out. using works with an interface in C# called IDisposable, and IDisposable has one method called Dispose that gets called inside that finally block. Adding this construct, which occurred in one of the late betas of C# 1.0, required Microsoft engineers to go through and make the modification to the language, write the necessary unit tests, etc. It was a very big deal because of garbage collection. To do that same construct in Scala, it takes me about 5 minutes and about 5 lines of code.
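To make the pattern concrete, here is a sketch in Java rather than Scala or C# (and using lambda syntax Java has since gained); `Resource`, `Using`, and `using` are hypothetical names, not any standard API:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for C#'s IDisposable: one cleanup method.
interface Resource {
    void dispose();
}

class Using {
    // The whole trick is a higher-order function: run the action, and
    // guarantee dispose() in a finally block -- exactly the try-finally
    // that C#'s `using` statement generates for you.
    static <R extends Resource> void using(R resource, Consumer<R> action) {
        try {
            action.accept(resource);
        } finally {
            resource.dispose();
        }
    }
}

class UsingDemo {
    public static void main(String[] args) {
        AtomicBoolean disposed = new AtomicBoolean(false);
        Resource res = () -> disposed.set(true);
        try {
            Using.using(res, r -> { throw new RuntimeException("boom"); });
        } catch (RuntimeException expected) {
            // the action failed, but dispose() still ran
        }
        System.out.println("disposed = " + disposed.get());  // disposed = true
    }
}
```

The point of the anecdote survives the translation: the construct is a few lines of library code, not a compiler change.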

4. That's because you've got some abstraction tools that you can use, such as higher-order functions. We've got higher-order functions, we've got currying, we've got a whole bunch of things - this core set of primitives that I can build on to create those additional language constructs.

5. I remember the arguments; even the foreach of C# you can re-implement today using higher-order functions. What about C# 3.0? Does it offer a lot of what Scala offers today? What C# 3.0 does is take the existing syntactic elements of C# and extend them in some very interesting directions but, at the end of the day, what's happening here is still Anders and Microsoft making modifications to the language. If you want anything that looks like a language construct, you have to go lobby Anders and the C# team at Microsoft and say "Hey, it would be really cool if we had a construct here". For example, here is a great one from the Java community, where the solution would be the same one that we have in the C# community - you have to go talk to the compiler writers.
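The interviewer's point about re-implementing foreach can be sketched in Java; `forEachOf` is a made-up name, and modern lambda syntax is used for brevity:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// foreach as a plain higher-order method instead of built-in syntax:
// the loop is written once in a library and the body is passed in.
class ForeachDemo {
    static <T> void forEachOf(List<T> items, Consumer<T> body) {
        for (T item : items) {
            body.accept(item);
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        forEachOf(Arrays.asList("a", "b", "c"), out::append);
        System.out.println(out);  // abc
    }
}
```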



We have java.util.concurrent, which has this incredible wealth of very rich, powerful locking schemes and conditions and so forth. If you want to use the Lock construct in Java 5.0, you can't say "synchronized(lock)" and expect it to do the right thing. We have to call lock.lock(), then wrap the work in a try block with lock.unlock() in the finally block, to make sure that whatever we lock gets unlocked regardless of how we leave the block of code. It would be really nice if we could create, could extend, could modify - whatever you want to call it - the Java language to say that synchronized(), if it takes a java.util.concurrent.locks.Lock type, instead of actually emitting Monitor.Enter()/Monitor.Exit() in the try-finally block, calls lock.lock() and lock.unlock(). In a language like Scala, where we have these compositional primitives, I could do that myself. In Java and in C#, even in C# 3.0, I still have to go back to Anders and say "I would like this new linguistic feature. Can you give it to me, please?"
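A sketch of that idiom, and of the helper he is wishing for, in Java; `withLock` is a hypothetical name (modern lambda syntax used for brevity):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockDemo {
    static int counter = 0;

    // Packages the lock()/try/finally/unlock() boilerplate as a
    // higher-order method -- a library-level approximation of letting
    // synchronized() accept a java.util.concurrent.locks.Lock.
    static void withLock(Lock lock, Runnable body) {
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Lock lock = new ReentrantLock();
        withLock(lock, () -> counter++);
        System.out.println(counter);  // 1
    }
}
```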

6. I can go and re-implement using through a function to which I pass a block of code. I can do it today in C#. I can do it. Anything that allows me to take a function primitive as a parameter is going to do it, but it's going to look like an API call. It's going to be something that says Util.Using(resource, ...) - and here is your anonymous delegate, here is your lambda expression, etc. In Java we could arguably do the same thing with an anonymous inner class. All the debate in Java over closures is really over how we can make it easier to write what we have traditionally used in anonymous inner class form, and certainly we can do this, but it's awkward and difficult.
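Here is what that anonymous-inner-class form looks like in pre-closures Java; `Handle`, `Block`, and `Util.using` are hypothetical names:

```java
// Hypothetical single-method interfaces standing in for a resource
// and for the "block of code" being passed around.
interface Handle {
    void close();
}

interface Block<T> {
    void run(T resource);
}

class Util {
    static <T extends Handle> void using(T resource, Block<T> block) {
        try {
            block.run(resource);
        } finally {
            resource.close();
        }
    }
}

class InnerClassDemo {
    public static void main(String[] args) {
        final StringBuilder log = new StringBuilder();
        Handle res = new Handle() {
            public void close() { log.append("closed"); }
        };
        // The "awkward and difficult" part: several lines of anonymous
        // inner class ceremony wrapped around one line of behavior.
        Util.using(res, new Block<Handle>() {
            public void run(Handle r) {
                log.append("used,");
            }
        });
        System.out.println(log);  // used,closed
    }
}
```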



You look at it as a Java programmer and you go: "Why are you doing this? Why wouldn't you just put a try-finally block around this?" If you come from a functional background, you can recognize what the programmer is trying to do and you can say "Oh, yes, of course, this makes total sense!" But something like the foreach construct hanging off of a collection - like we see in Ruby, like we see in Scala, like we see in other functional languages - just does not make sense in a language that does not support these things as first-class constructs.



In some respects, what you are arguing is "We didn't really need C++, because I could do objects in C. All I had to do was create a struct and then pass around a pointer to the struct, and the struct contains a bunch of function pointers that I am going to invoke" - and yes, you could, but just because you can, it doesn't mean you want to.

7. In C# we don't have higher-order functions. Not in the same sense that we think about it, at least from the academic perspective. We have lambdas, no argument, but that doesn't necessarily mean we have higher-order functions.

8. I would like to know how it differs. For example, an important part of this would be currying - what they call partially applied functions. That permits me a degree of flexibility in terms of how and when functions are applied that C# currently does not support. You can implement it as an API, but now you're back to building an API, you are building a library.
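For readers more at home in Java, currying can be sketched with nested `Function`s (the names here are mine). The curried form is what makes partial application a one-liner:

```java
import java.util.function.Function;

class CurryDemo {
    // A two-argument add, written curried: take x now, return a
    // function that waits for y.
    static Function<Integer, Function<Integer, Integer>> add = x -> y -> x + y;

    public static void main(String[] args) {
        // Partial application: fix the first argument, supply the second later.
        Function<Integer, Integer> addFive = add.apply(5);
        System.out.println(addFive.apply(3));  // 8
    }
}
```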

9. What you are talking about is having a multi-paradigm language that would support a lot of paradigms at the same time, but C++ was doing that. Why did it fail? It didn't fail. The notion that C++ failed, the notion that C++ is out of the mainstream, I think is a misconception. I think it is fair to say that C++ is not the media darling that it was 10 years ago but, surprisingly - although not so to the people who've remained in this space - there are a lot of people writing a lot of C++ code and being very happy and productive doing so. Certainly the people responsible for the C++ language have not shirked their responsibilities.



I was just talking to Bjarne Stroustrup and Herb Sutter literally last week in San Jose at the SD West show, and they were talking about C++0x. The Standards Committee has just (at the end of last week) approved a couple of features for C++, one being a full memory model, so that C++ can now talk about creating a standard thread API and a standard set of synchronization primitives. Number two, they've approved lambda support - closures. Full lambdas in the C++ language environment; what took them so long?



C++ is not at a point where it's trying to go out and attract a bunch of new programmers to the language, because they figure that their bigger responsibility right now is to do the right thing for the people who are already using it, not to try to go out and attract a whole bunch of people out of college. There is a tremendous amount of power in the C++ language - you just have to look at the Boost libraries and realize that maybe we did give up on this train too soon.



Certainly, the fact that it's not garbage collected makes it more difficult than, say, Java, but you grab the latest C++ compiler and the Boehm garbage collector, which is available off the web for free, and voila!, you now have basically what Java gave us back in 1997, plus a whole bunch more. If the issue here was just one of automatic memory management, then we abandoned C++ way too soon. If the issue here is that lots of programmers couldn't handle a lot of the syntactic power that C++ offered us, then those people ...

10. We might have this in Scala. We might very well. And to those people who cannot handle that degree of power I say "Look, there is a phrase you need to learn. It's 'Would you like fries with that?'" At the end of the day, if you don't want the power inherent in a powerful language, don't program in one, but then don't complain to me when the language is not powerful. You can't have it both ways. Any time we have tried to create a language that somehow satisfies the desires of both the power users and the newbies, we have ended up with something that satisfies nobody.



A compositional language - which to some degree Ruby is, which to some degree Scala is, which to a large degree Lisp is, and so forth - has the greatest history of success in trying to work at both levels. Languages like C++, Java, C#, Visual Basic are running into the same basic problem: how much do we cater to the entry-level programmer, keeping the rails up so that they don't wander off into the ditch, and how much do we relax all those constraints and let them do whatever they want? That's really hard to do in a fashion that works for everybody at every level.

11. We will switch to another topic. It's about dynamic typing and type inference. I've heard a lot about making a mix of both, and that's what Visual Basic is trying to do, as are a lot of other languages. Do you think that dynamic typing will break the type inference model? Obviously, I can't answer that question in general, because it's going to depend on the language in question. Certainly there are several languages out there that exploit this concept in some detail, and there are a lot of people who believe that dynamic typing and static typing can coexist simultaneously inside the same language, inside the same environment, successfully. There is a language that I was introduced to at this year's Lang.NET Symposium on the Redmond campus, in January. The language is called Cobra and it's basically a Python-like syntax, but the guy who is creating it is specifically looking to be statically typed where possible, dynamically typed otherwise.



This is a favorite phrase of Erik Meijer, by the way, who is well-known in the Haskell community and of course is well-known to a lot of the .NET community, because a lot of the ideas that made their way into C# 3.0 he had been flirting with since Cω - which was called X# and Xen and a couple of other things along the way. There are two basic variations of that quote and I don't know which one Erik is fond of saying. One says "Statically typed where possible, dynamically typed otherwise" and the other says "Dynamically typed where possible, statically typed otherwise", depending on which side of the spectrum you want to come from.



There are a lot of research efforts going on - like around Cobra, around what we are going to see in VBx, VB 10 - that are really trying to find the best of both worlds. And it's really interesting that VB is coming back to this because, for years, VB was this dynamically typed, bytecode-driven model that everybody said you can't write real code in, and right at the time VB abandoned that model, Microsoft turned around and released the .NET Framework, which was all statically typed and so forth.



Meanwhile, over here in the Java space, in the Unix space, people started looking at things like Ruby and Python and all these other dynamically typed, bytecode-driven interpreter models, and suddenly it's exciting - which is what led me, at one conference, to say that Ruby is basically the Visual Basic for Java programmers, a remark that almost got me lynched. That did not go over well! But in a lot of respects it's a fair analogy, because Ruby provides a lot of the same things that the early Visual Basic environments provided, and Ruby is hailed for a lot of the same reasons - the same high productivity.



That comes with a degree of discipline, of responsibility that the VB community arguably never quite understood, but to be fair none of us were doing it back then, so I don't know whether we can really crucify VB for that. At the end of the day there is a lot to be said about the Visual Basic language that was good and useful and that we walked away from far too soon.

12. That was about static and dynamic typing. With type inference, the compiler is inferring along and then suddenly it finds a dynamic object that breaks the model. In Visual Basic today, when it sees a dynamic return object, all the code is dynamic; it's not static anymore. To a degree. It depends on what the compiler can infer. Even if you declared this as returning Object, if the compiler can figure out that, yes, in fact this is returning a String - unless you've explicitly left it off and just said "OK, function, block, return string" - it will just go ahead and say "OK, you want to return a String". It depends basically on what your options are at the time; Option Strict is on or off, basically.



VB is not by any stretch of the imagination the last word on type-inferencing languages. I would be much more willing to look at what some of the Scala guys are doing, or ML, or some of those environments, before I start to say "OK, this is where the model breaks". Realistically speaking, the model is not broken in type inferencing if we hit a space that's dynamically typed. What happens at that point is that additional code has to be emitted to provide the necessary scaffolding to see that the necessary requirements are in place.



Imagine what you would do as a Java or C# programmer with reflection, when this thing is typed as an Object and you want to invoke a method on it - that's basically what the compiler ends up doing for you. But it turns out there is a surprising amount of awareness, of knowledge, that a compiler - a type inferencing system, not just a compiler - can dig out by following all the possible permutations down and realizing that it has to be something from here down in the class hierarchy. Therefore, I am going to make that presumption and we'll go from there.
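The reflection dance he refers to looks roughly like this in Java - exactly the kind of scaffolding a compiler could emit for a dynamically typed spot:

```java
import java.lang.reflect.Method;

class ReflectDemo {
    public static void main(String[] args) throws Exception {
        // Statically, all we know is Object; dynamically, it's a String.
        Object something = "hello";

        // Look the method up by name at runtime and invoke it --
        // what a type inferencer saves you from writing by hand.
        Method m = something.getClass().getMethod("length");
        Object result = m.invoke(something);
        System.out.println(result);  // 5
    }
}
```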

13. Can you give us your take on the differences between the Haskell model and the Erlang model? Haskell is a functional language: you call a function and you get the return. Erlang is a messaging language, so it's quite different. There are certain debates you just don't enter into, like for example: where should the curly braces appear? Should they be at the end of the line or at the beginning of the next line? That's one debate you just don't enter into. Another one is Hungarian notation for local variables. This is another one of those debates where I think lots of very reasonable people can have very reasonable arguments and, strangely, they don't.



We end up with a lot of shouting back and forth. Suffice it to say the Haskell model, the pure functional model, certainly works where you can apply it. Because we are in a purely functional world, where there is no shared state, no mutable state, concurrency becomes just a footnote; it's really not that important, because there is nothing you have to worry about protecting.

14. But it's not only in the functional world. Of course. We could do this at the Java level. If we just made every object in Java intrinsically immutable, we would have a very smooth concurrency experience. Having said that, how realistic is it to declare by fiat that all objects in Java should be immutable? This is a lot easier to say than it is to do for a lot of practical Java programming. For starters, much of your Hibernate code goes away.
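What "intrinsically immutable" means for a single Java class, as a sketch: final fields, no setters, and "mutation" that returns a fresh instance - safe to share across threads with no locking:

```java
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // "Mutation" produces a new object; the original never changes.
    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}

class ImmutableDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.translate(3, 4);
        System.out.println(q.x + "," + q.y);  // 4,6
        System.out.println(p.x + "," + p.y);  // 1,2 -- untouched
    }
}
```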

15. There is a very important notion - it's not only about concurrency, it's also about behavioral coupling. When you've got mutation, you've got behavioral coupling - a lot of components depend on the behavior of others. I am sensing a bias in your question here, as you wear the Haskell shirt. I don't want to get into this debate. Both models have their place. It's in some ways sort of going back to the old EJB debates between stateless and stateful session beans: "Things always scale better if you have stateless session beans" and "Oh, no, stateful session beans scale better than you imagine". I don't care, I really don't care. The actor model, the Erlang model you talked about earlier, scales really well. The Haskell model scales really well.

16. Do you mean it scales in performance or in team members? I don't know what scaling in terms of team members means. Can I replicate members and have more people on my team? That would be great!

17. It means you have a big team and you need to share code, and everyone should understand the code that was written by the others - that is the kind of scaling I am talking about. That particular issue has nothing to do with the language we are using. That has to do with the degree of communication that goes on within your team, and that can work, or not, whether it's Java, C#, Scala, Perl, Python or an assembly language. That's a cultural issue, not a technical issue.

18. What about Ruby? Ruby lets you do a lot of things, it gives you a lot of possibilities, you can do a lot of cheats in the language, but then your language doesn't look like itself. How is one developer supposed to understand what has been done by another? I think the answer here is "Let's hope the developers talk to one another." If I create an API that is going to be used by another developer on my team, and I have to make this API incredibly brain-dead because I have to assume that the developers on my team don't talk to one another, then the problem is with your team and its lack of communication, not with how you design the API. This is purely a communication issue - it has nothing to do with technology.

19. What about domain specific languages? In Java, in Ruby, in any language, I know where to start from. I've got a blank page and then I instantiate an object or I call a static method of a class. How did you figure out which object to create or which method to call? You read something - you read a book, you read the documentation, or you talked to another developer on the team, said "Where do I start?", and he told you. It's the same thing with a DSL, with an internal or external DSL. None of us started with just an empty terminal screen and the compiler waiting at the command line shell, typing characters at random until we found something that worked and building up from there. We all started from some bootstrap mechanism. The same thing is true of a DSL. You are going to operate off of the documentation that the DSL author will create.

20. I'm not talking about textual; I'm talking about this mix: it's not quite the language and it's not quite textual, it's something in between. Don't I lose something, such as coherence, in the language? No. You don't. We can argue this back and forth. I can tell you that no, I believe that you won't, and you cannot believe me, and we're done. I mean, at that point we've just reached an impasse. If your programmers have a hard time getting started with an internal or external DSL, then I submit that the same programmers will have a hard time getting started with an API. I think it's easier to learn the rules around the language - the documentation around the language is better than it is around the API - but other people can choose to disagree with that; I can't really argue it further. At the end of the day this becomes more of an aesthetic thing than a hard and fast rule of technology.

21. You were talking about how some languages have rails - you associated that with VB - and then other languages are as close to the machine as C++. With something like the virtual machine model, do you see a space for multiple languages that work on the same basic platform, where you can step from one to the other during the development process? Certainly the VM offers a tremendous amount of abstraction away from the underlying machine, which makes things easier for us as programmers - we don't have to worry about so many details; those details are deferred until runtime, and the VM is arguably better at making those decisions. Certainly, there is a place here to say "Let's use different languages for different problems", and that's part of the reason why Microsoft designed the CLR the way they did - they wanted to encourage different languages on top of the same platform, and they wanted to allow machine-specific, platform-specific details to bubble up through the VM.



That's why, for example, they've always had a very rich platform interoperability story, like P/Invoke and COM interop and so forth. That's not necessarily running without the rails, in the sense that the VM still does a lot of things to prevent you from doing certain scary tricks. For example, in the CLR right now I cannot just define a block of memory and then immediately jump to it and start executing. I could do that in C++, if I wanted to. Nor, inside the CLR, can I just acquire the address of an object, figure out the offset of a field inside that object, and start operating directly on that memory.



The CLR will prevent that unless you go through one of those escape hatches - the P/Invoke code - which means you now have to be marked with an appropriate switch that says "This is unsafe code", and there is a whole bunch of permissions that go with that. I mean, you really have to want to do scary little tricks like that, which generally means you are either a masochist or you are trying to do something that the CLR doesn't want you to do, because you are resting on assumptions that may not be true in the next version. Some of those kinds of things that I did in the C++ era don't apply much anymore.



I don't know that that's saying we are running with the rails on, with the safety bumpers on, as much as it is saying "Look, we're protecting programmers - we're protecting the 99% of the programmer population from the 1% of crazy whacked-out nut jobs like me who did this and thought it would be cool". So I don't necessarily want to imply that being on top of the VM somehow means running with safety bumpers. I don't think that is true at all. But even so, we are approaching an era where "VM" can be a very loose term. If you are using the LLVM toolkit, you're still compiling to native code, or you can compile to its bitcode format and do a tremendous amount of interesting runtime optimizations.



That's not really a VM in the traditional sense of the term, but you can think of it as one at some level, from a language perspective. This notion of VM is itself losing some coherency in its definition, which is why the CLR guys always used to refer to what they built not as a virtual machine but as an execution engine (EE). I don't know if the difference between the terms is all that semantically rich, but it is interesting to look at the full range of tools that are out there.

22. One of the things you've talked about was a modular toolchain, the idea being to move from code to ... back and forth all along that chain. As a developer, that's a really interesting idea. How do you get there? If I knew that, I'd be applying for research grants right now. In that particular blog entry where I talked about the modular toolchain, I alluded to this idea of some sort of universal AST. I think that's going to take a very long time to get to any sort of meaningful form, so I certainly would not try to do that out of the gate. Having a recognized AST that covers some percentage of what we'd like to do would be a good place to start.



So, if you are in the Java space, looking at the Java toolchain, looking at the way we go through parsing and scanning and lexing and so forth to produce the Java AST - let's start from that; let's define a common Java abstract syntax tree. What sort of additional things would we like to see added to that AST, or what can we reduce it down to? Just because it's the AST for the Java language, it doesn't mean that we might not be able to reduce it down to some core common primitives. Because if we do that - if we can define Java language constructs in terms of this core primitive set - then we have the flexibility to incorporate other sorts of languages as part of this environment and so forth.
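To give the "core primitives" idea a concrete shape, here is a deliberately tiny, hypothetical AST in Java - nothing like a real common AST, but it shows what reducing a language construct to a small primitive set means:

```java
// Two primitive node types are enough to represent (and evaluate)
// any integer-addition expression from a Java-like source language.
abstract class Node {
    abstract int eval();
}

class Lit extends Node {
    final int value;
    Lit(int value) { this.value = value; }
    int eval() { return value; }
}

class Add extends Node {
    final Node left;
    final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    int eval() { return left.eval() + right.eval(); }
}

class AstDemo {
    public static void main(String[] args) {
        // The source expression "1 + (2 + 3)" reduced to core primitives:
        Node tree = new Add(new Lit(1), new Add(new Lit(2), new Lit(3)));
        System.out.println(tree.eval());  // 6
    }
}
```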



Let's be really blunt about this: I'm not a compiler author by trade. Neal Gafter or Joshua Bloch or some of the guys that have been doing this for 20 or so years are likely to have more concrete input to weigh in with on this particular thing, in terms of what those core primitives should look like. I don't know exactly - I have a vague sort of idea in my head of what it would look like - but they have a lot more experience than I do.



If we can get to that notion of a core AST, of an intermediate representation that we could define other languages in terms of, then we can start saying "All right, now we are looking at essentially a two-phase process". Number one: going from whatever your source representation looks like into this AST model, this tree model; then going from the tree model to generated executable code. Now we can talk about taking the generated executable code and going back into the AST model, which then lets us bring it back to a source model.



There is a whole bunch of things that I am leaving out in a traditional compiler chain; there are all sorts of phases that occur once you are in the AST in order to do optimizations and all that stuff.



That's fine, but if we could get to that AST to start with, get some tools to go Java to AST, AST to code and then back again - that's step one. Then do it all for the .NET space: C# to AST, AST to IL, VB to AST - that, in some ways, would be the most interesting challenge. The really frustrating thing is that .NET took about half of the steps there: they have this thing called CodeDom. It has shipped as part of the .NET Framework since 1.0 and it's broken. Fundamentally, it's broken. They have this document object model for code, CodeDom, and we can use it specifically for code generation. We can use it specifically to create C# classes that can then be compiled, but I can't go from compiled code back to CodeDom, and I can't go from CodeDom back to source.



I talked to one of the guys who worked on it and he said "Well, it turns out it was a lot harder than we thought." That's a good reason! Instead of just removing the API that says "Let's create a CodeDom out of compiled code", they just throw an exception, because they might fix it someday. That's just terrible, awful. That would be an interesting place to start but, again, the DOMs there are very directly representative of the languages: the C# CodeDom looks very much like C#, and the VB CodeDom looks very much like VB. I would like to - and I don't know how feasible this is - create a simpler DOM model that the other languages could be expressed as part of. I don't know offhand if that means we have to go up one level of abstraction to do that, or if we have to try to go back to our Lisp-like roots.



I'm really not excited about the idea of relearning CAR and CADR and all those others. I don't think we have to go that far back but certainly, if you look particularly at a lot of the functional languages and the way they are able to compose things with higher-order functions and so forth, I think there is the germ of an idea that we can explore, and I would love to do it. Now all I need is somebody to cover my mortgage for the next 6 months and I'll go off and do it. So if you know of anybody, you let me know.

23. The upgrades and updates of the bigger languages are quite large. At what point should you stop adding massive features to languages, stabilize them, and create a new language with what you learnt from building them up to a certain stage? It's interesting - Neal is doing his presentation now, so I get to tell this. Neal Gafter is giving a presentation here at QCon in the evolving Java track, because Brian Goetz, the track host, said "I want you to talk about how we can add new features to a mature language like Java". Neal went and looked up the definition of "mature": mature means having achieved its full growth and therefore unchanging. So, by definition, it is impossible to add things to a mature language, which makes things funny. But at a certain point, I think it depends on the language and how it's constructed.



If we look at Lisp and Smalltalk and some of the older languages which are intrinsically compositional in nature, I don't think there is really any point where you say we need to stop, because they are built on top of this core set of primitives - layer upon layer upon layer. If we find a particular layer of abstraction that we've built too high, we back down one; we just don't import that library. We go back, if necessary, all the way to the core set of primitives and start building in a different direction if we need to.