Some time ago, the creator of the Lua programming language, Roberto Ierusalimschy, visited our Moscow office. We asked him some questions that we had prepared with the participation of Habr.com users. Now we’d like to share the full-text version of this interview.

— Wow! That’s a difficult question. There’s so much history embedded in the creation and development of the language. It was not one big decision made at once. There are some regrets, several of which I had a chance to correct over the years. People complain about that all the time because of compatibility. We did it several times. I’m only thinking of small things.

— Maybe. But it’s very difficult for dynamic languages. Maybe the solution would be to have no defaults at all, but then it would be hard to use variables.

For instance, you would have to somehow declare all the standard libraries. You want a one-liner, print(sin(x)), and then you’ll have to declare ‘print’ and also declare ‘sin’. So it’s kind of strange to have declarations for that kind of very short script.

Anything larger should have no defaults, I think. Local-by-default is not the solution, it does not exist. It’s only for assignments, not for usage. Something we assign, and then we use and then assign, and there’s some error, completely mystifying.

Maybe global-by-default is not perfect, but for sure local-by-default is not a solution. I think some kind of declaration, maybe optional declaration… We had this proposal a lot of times: some kind of global declaration. But in the end, I think the problem is that people start asking for more and more and we give up.

(sarcastically) Yes, we are going to put some global declaration, add that and that and that, put that out, and in the end we understand the final conclusion will not satisfy most people and we will not put all the options everybody wants, so we don’t put anything.
In the end, strict mode is a reasonable compromise.

There is this problem: more often than not we’re using fields inside modules, for instance, and then you have the same problems again. It’s just one very specific case of the mistakes that a general solution should probably cover. So I think if you really want that, you should use a statically typed language.

— Yes, exactly, for small scripts and so on.

— No, there are always tradeoffs. There’s a tradeoff between small scripts and real programs or something like that.

— Well, it’s not a big change, but still… Our bad debt that became a big change is nils in tables. It’s something I really regret. I did that kind of implementation, a kind of hack… Did you see what I did? I sent a version of Lua about six months or a year ago that had nils in tables.

— Exactly. I think it was called nils in tables, what’s called null. We did some hack in the grammar to make it somewhat compatible.

— I’m really convinced that this is a whole problem of holes… I think that most problems of nils in arrays would disappear if we could have [nils in tables]… Because the exact problem is not nils in arrays. People say we can’t have nils in arrays, so we should have arrays separated from tables. But the real problem is that we can’t have nils in tables! So the problem is with the tables, not the way we represent arrays. If we could have nils in tables, then we would have nils in arrays without anything else. So this is something I really regret, and many people don’t understand how things would change if Lua allowed nils in tables.

— Yeah, I know. That’s exactly my point: this would solve a lot of problems for a lot of people, but there’s a big problem of compatibility. We don’t have the energy to release a version that is so incompatible and then break the community and have different documentation for Lua 5 and Lua 6 etc. But maybe one day we’ll release it. But it’s a really big change.
I think it should have been like that since the beginning; if it was, it would be a trivial change in the language, except for compatibility. It breaks a lot of programs, in very subtle ways.

— Besides compatibility, the downside is that we would need two new operations, two new functions. Like ‘delete key’, because assigning nil would not delete the key, so we would need a kind of primitive operation to delete the key and really remove it from the table. And ‘test’, to check whether a key is present, that is, to distinguish between nil and absent. So we need two primitive functions.

— Yes, we released a version of Lua with that. And as I said, it breaks code in many subtle ways. There are people who do table.insert(f(x)), with a call to a function. And it’s on purpose, it’s by design that when the function doesn’t want to insert anything, it returns nil. So instead of a separate check, «do I want to insert?», they just call table.insert, knowing that if it’s nil, it won’t be inserted. As with everything in every language, a bug becomes a feature, and people use the feature, but if you change it, you break the code.

— Oh no, this is a nightmare. You just postpone the problem: if you put another, then you need another and another and another. That’s not the solution. The main problem (well, not main, but one of the problems) is that nil is already ingrained in a lot of places in the language. For instance, a very typical example. We say: you should avoid nils in arrays, holes. But then we have functions that return nil and something after nil, so we get an error code. So that construction itself assumes what nil represents… For instance, if I want to make a list of returns of that function, just to capture all of these returns.

— Exactly, but you don’t have to use hacks for such a primitive and obvious [issue].
But the way the libraries are built… I once thought of that: maybe the libraries should return false instead of nil. But it’s a half-cooked solution, it solves only a small part of the problem. The real problem, as I said, is that we should have nils in tables. If not, maybe we should not use nils as frequently as we do now. It’s all kind of messy. So if you create a void, these functions would still return a nil, and we’d still have this problem unless we create a new type and the functions return void instead of nil.

— Yes, that’s what I mean. All the functions in the libraries should return void or nil.

— Because we’d still have the problem that you cannot capture some functions.

— No, there won’t be a second key, because the counting will be wrong and you’ll have a hole in the array.

— Yes. My dream is something like that: {f(x)}. You should capture all returns of the function f(x). And then I can do %x or #x, and that will give me the number of returns of the function. That’s what a reasonable language should do. So creating a void will not solve that, unless we had a very strong rule that functions should never return nil, but then why do we need nil? Maybe we should avoid it.

— No, I think a really strong static analysis tool is called… type system! If you want a really strong tool you should use a statically typed language, something like Haskell or even something with dependent types. Then you’ll have really strong analysis tools.

— Exactly, Lua is for…

— Yes, my last slide.

— No, I think you can do some large tasks, but not with static analysis. I strongly believe in tests. By the way, I disagree with you on coverage; your opinion is we should not chase coverage… I mean, I fully agree that coverage does not imply a full test, but non-coverage implies a zero percent test. So I gave a talk about a testing room, you were there in Stockholm. So I started my test with [a few] bugs, and that’s the strangest thing: one of them was famous, the other was completely non-famous.
It’s something completely broken in a header file from Microsoft, C and C++. So I search the web and nobody cares about it or even noticed it.

For instance, there’s a mathematical function, modf(), where you have to pass a pointer to a double because it returns two doubles: it splits the number into its integer part and its fractional part. So this has been a part of the standard library for a long time now. Then came C99, and you need this function for floats. And the header file from Microsoft simply kept this function and declared another one as a macro. So it wrapped this one in type casts. It cast the double to float, ok, and then it cast the pointer to float to a pointer to double!

— This is a header file from Visual C++ and Visual C 2007. I mean, if you called this function once, with any parameters, and checked the results, it would be wrong unless it’s zero. Otherwise, any other value will be wrong. You would never ever use this function. Zero coverage. And then there’s a lot of discussions about testing… I mean, just call a function once, check the results! So it’s there, it’s been there for a long time, for many years nobody cared.

One very famous was in Apple. Something like "if… what… goto… ok", it was something like that. Someone put another statement here. And then everything was going to ‘ok’. And there was a lot of discussion that you should have the rules, the brackets should be mandatory in your style, etc., etc. Nobody mentioned that there are a lot of other ifs here that have never been executed…

— Yes, exactly. Because they were only testing approved cases. They were not testing anything, because everything would be approved. It means there is not a single test case in the security application that checks whether it refuses some connection or whatever it is that it should refuse. So everyone discusses it and says they should have brackets… They should have tests, minimum tests! Because nobody has ever tested that, that’s what I mean by coverage.
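Both bugs described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `broken_modff`, `fixed_modff` and `verify_buggy` are hypothetical names, and the code reconstructs the shape of the problems as described in the interview rather than copying any real header or security library. To stay within defined behavior, the broken variant does not actually write a double through a float pointer; it lets `modf` fill a real double and then copies out only the bytes that a float would occupy. It also assumes the usual 4-byte float and 8-byte double in IEEE 754 format.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Hypothetical reconstruction of the faulty macro's effect: the integer part
   is produced as a double, but the caller only has room for a float, so the
   float ends up holding the first bytes of the double's representation. */
float broken_modff(float x, float *int_part) {
    double wide;
    float frac;
    assert(int_part != NULL);
    frac = (float)modf((double)x, &wide);
    memcpy(int_part, &wide, sizeof *int_part); /* reinterpret, do not convert */
    return frac;
}

/* Correct float version: convert the integer part instead of reinterpreting it. */
float fixed_modff(float x, float *int_part) {
    double wide;
    float frac;
    assert(int_part != NULL);
    frac = (float)modf((double)x, &wide);
    *int_part = (float)wide;
    return frac;
}

/* Shape of the duplicated-goto bug. Each argument stands for the error code
   of one verification step; 0 means the step passed. */
int verify_buggy(int step1, int step2, int step3) {
    int err;
    if ((err = step1) != 0)
        goto fail;
    if ((err = step2) != 0)
        goto fail;
        goto fail;              /* duplicated line: always executed */
    if ((err = step3) != 0)     /* never reached */
        goto fail;
fail:
    return err;                 /* 0 means "accepted" */
}
```

Calling each function once and checking the result is enough to expose both problems: under the assumptions above, `broken_modff(3.75f, &ip)` leaves something other than 3.0 in `ip`, and `verify_buggy(0, 0, 1)` returns 0 even though the third step failed. That is the kind of minimal negative test the answer is asking for.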
It’s unbelievable how people don’t do basic tests. Of course, it’s a nightmare to reach full coverage and execute all the lines, etc. But people neglect even basic tests, so coverage is at least about the minimum. It is a way to call attention to some parts of the program that you forgot about. It is a kind of guide on how to improve your tests a little.

— About 99.6. How many lines of code do you have? A million, hundreds of thousands? These are huge numbers. One percent of a hundred thousand is a thousand lines of code that were never tested. You did not execute it at all. Your users don’t test anything.

— I’m not sure if you want to unstack everything back to where we were… I think one of the problems with dynamic languages (and static languages for that matter) is that people don’t test stuff. Even if you have a static language, unless you have something (not even like Haskell, but Coq), some proof system, you change that for that or that. No static analysis tool can catch these errors, so you do need tests. And if you have the tests, you detect global problems, rename misspellings, etc. All these kinds of errors. You should have these tests anyway; maybe sometimes it’s a little bit more difficult to debug, sometimes it’s not, it depends on the language and the kind of bug. But the problem is that no static analysis tool can allow you to avoid tests. The tests, on the other hand… well, they never prove the absence of errors, but I feel much more secure after all the tests.

— Why is that, sorry?

— If there is functionality that is not reachable by the public interface, it shouldn’t be there, just erase it. Erase that code.

— Yes, sometimes I do that in Lua. There was some code coverage, I couldn’t get there or there or there, so I thought it was impossible and just removed the code. It’s not that common, but it happened more than once.
Those cases could not happen, so you just put an assertion to document why it cannot happen. If you cannot get inside your functions from the public API, it shouldn’t be there. We should call the public API with incorrect input, that’s essential for the tests.

— Yes, extreme programming had this rule. If it’s not in a test, then it doesn’t exist.

— I designed Lua for a very specific purpose, it was not an academic project. That’s why when you ask me if I’d create it again, I say there’s lots of historical stuff in the language. I did not start with ‘Let me create the language I want or want to use or everybody needs’, etc. My problem was ‘This program here needs a configuration language for geologists and engineers, and I need to create some small language they could use with an easy interface.’ That’s why the API was always an integral part of the language, because that makes it easier to integrate. That was the goal. What I had in my background was a lot of different languages at that time… about ten. If you want all of the background…

— I was getting things from many different languages, whatever fitted the problem I had. The single biggest inspiration was the Modula language for syntax, but otherwise, it’s difficult to say because there are so many languages. Some stuff came from AWK, it was another small inspiration. Of course, Scheme and Lisp… I was always fascinated with Lisp since I started programming.

— Yes, there is much difference in syntax. Fortran, I think, was the first language… no, the first language I learned was Assembly, then came Fortran. I studied, but never used CLU. I did a lot of programming with Smalltalk, SNOBOL. I also studied, but never used Icon, it’s also very interesting. A lot came from Pascal and C. At the time I created Lua, C++ was already too complex for me, and that was before the templates, etc.
It was 1991, and in 1993 Lua was started.

— Yes, I think it’s a good reason not to have similar syntax, so you don’t mix them; these are two different languages.

It’s something really funny and it’s connected to the answer you didn’t allow me [at the conference] to give on arrays starting at 1. My answer was too long.

When we started Lua, the world was different, not everything was C-like. Java and JavaScript did not exist, Python was in its infancy, with a version lower than 1.0. So there was not this thing where all the languages are supposed to be C-like. C was just one of many syntaxes around.

And the arrays were exactly the same. It’s very funny that most people don’t realize that. There are good things about zero-based arrays as well as one-based arrays.

The fact is that most popular languages today are zero-based because of C. They were kind of inspired by C. And the funny thing is that C doesn’t have indexing. So you can’t say that C indexes arrays from zero, because there is no indexing operation. C has pointer arithmetic, so zero in C is not an index, it’s an offset. And as an offset, it must be a zero, not because it has better mathematical properties or because it’s more natural, whatever.

And all those languages that copied C, they do have indexes and don’t have pointer arithmetic. Java, JavaScript, etc., etc.: none of them have pointer arithmetic. So they just copied the zero, but it’s a completely different operation. They put zero for no reason at all, it’s like a cargo cult.

— Who uses C every day?

— Exactly. That’s the problem, too many people use C, but should not be allowed to use it. Programmers ought to be certified to use C. Why is software so broken? All those hacks invading the world, all those security problems. At least half of them are because of C. It is really hard to program in C.

— Yes, and that’s how we learned how hard it is to program in C.
You have buffer overflows, you have integer overflows that cause buffer overflows… Just get a single C program where you can be sure that no arithmetic goes wrong if people put any number anywhere, and everything is checked. Then again, real portability issues: maybe sometimes on one CPU it works, but then it gets to the other CPU… It’s crazy.

For instance, very recently we had a problem. How do you know your C program does not do stack overflow? I mean stack depth, not stack overflow because you invaded… How many calls do you have a right to make in a C program?

— Exactly. What does the standard say about that? If you code in C and then you do this function that calls this function that calls this function… how many calls can you do?

— I may be wrong, but I think the standard says nothing about that.

— Of course, it depends on the size of each function. It may be huge, automatic arrays in the function frame… So the standard says nothing and there is no way to check whether a call will be valid. So you may have a program with just three nested calls that crashes and is still a valid C program. Correct according to the standard, though it’s not correct because it crashes. So it’s very hard to program in C, because there are so many… Another good example: what is the result when you subtract two pointers? No one here works with C?

— No, C++ has the same problem.

— Exactly, ptrdiff_t is a signed type. So typically, if you have a standard memory the size of your word and you subtract two pointers in this space, you cannot represent all the sizes in the signed type. So, what does the standard say about that?

When you subtract two pointers, if the answer fits in a ptrdiff_t, then that is the answer. Otherwise, you have undefined behavior. And how do you know if it fits? You don’t.
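To make the range problem concrete, here is a minimal sketch of one defensive pattern. It assumes the program already tracks positions inside a buffer as `size_t` offsets, so it can test the distance before converting it to the signed type, instead of subtracting raw pointers and hoping the result fits. The helper `checked_byte_distance` is a hypothetical name for illustration, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance between two byte offsets into the same
   object, computed only when it is representable in ptrdiff_t.
   Returns 1 and stores the distance on success, 0 otherwise. */
int checked_byte_distance(size_t from, size_t to, ptrdiff_t *out) {
    size_t distance;
    assert(out != NULL);
    if (to < from)
        return 0;                   /* this sketch handles forward distances only */
    distance = to - from;           /* unsigned arithmetic, cannot overflow here */
    if (distance > (size_t)PTRDIFF_MAX)
        return 0;                   /* would not fit in the signed type */
    *out = (ptrdiff_t)distance;
    return 1;
}
```

On mainstream platforms, where `size_t` and `ptrdiff_t` have the same width, `SIZE_MAX` is roughly twice `PTRDIFF_MAX`, so a byte distance close to the full address range is rejected by this check rather than silently invoking undefined behavior.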
So whenever you subtract two pointers, you usually rely on knowledge from outside the standard: if you’re pointing to anything at least 2 bytes wide, then the largest possible difference is half the size of the memory, so everything is ok.

So you’re only having a problem if you’re pointing to bytes or characters. But when you do that, you have a real problem: you can’t do pointer arithmetic without worrying that you have a string larger than half of the memory. And then I can’t just compute the size and store it in a ptrdiff_t because it’s wrong.

That’s what I mean about having a secure C or C++ program that’s really safe.

— When we started, I considered C++, but as I said I gave up using it because of complexity: I cannot learn the whole language. It would be useful to have some stuff from C++ but… even today I don’t see any language that would do.

— Because I have no alternatives. I can only explain why against other languages. I’m not saying C is perfect or even good, but it’s the best. To explain why, I need to compare it with other languages.

— Oh, JVM. Come on, it doesn’t fit in half the hardware… Portability is the main reason, but performance too. In JVM it’s a little better than .NET, but it’s not that different. A lot of things that Lua does we can’t do with JVM. You cannot control the garbage collector, for instance. You have to use the JVM garbage collector because you can’t have a different garbage collector implemented on top of JVM. JVM is also a huge consumer of memory. When any Java program starts to say hello, it’s like 10 MB or so. Portability is an issue not because it wasn’t ported, but because it cannot be ported.

— That’s not JVM, that’s a joke. It’s like a micro edition of Java, not Java.

— Oberon… might be, it depends… Go, again, has a garbage collector and has a runtime too big for Lua. Oberon would be an option, but Oberon has some very strange things, like you almost don’t have constants, if I recall correctly.
Yeah, I think they removed const from Pascal to Oberon. I had a book on Oberon and loved Oberon. Its system was unbelievable, it’s really something.

I remember that in 1994 I saw a demonstration of Oberon and Self. You know Self? It’s a very interesting dynamic language with JIT compilers etc… I saw these demos a week apart, and Self was very smart, they used some techniques from cartoons to disguise the slowness of the operations. Because when you opened something, it was like ‘woop!’: first it shrinks a little, then expands with some effects. It was implemented very well, these techniques they used to simulate movement…

Then a week later we saw a demo of Oberon. It was running on like 1/10 of the hardware used for Self, there was this very old small machine. In Oberon you click and then just boom, everything works immediately, the whole system was so light.

But for me it’s too minimalistic, they removed constants and variant types.

— I don’t know Haskell or how to implement Lua in Haskell.

— I think every one of these has its uses.

R seems to be good for statistics. It’s very domain specific, done by people in the area, so this is a strength.

Python is nice, but I had personal problems with it. I thought I mentioned it in my talk or the interview. That thing about not knowing the whole language or not using it, the subset fallacy.

We use Python in our courses, teaching basic programming, just a small part, loops and integers. Everybody was happy, and then they said it would be nice to have some graphical applications, so we needed some graphical library. And almost all graphical libraries, you get the API… But I don’t know Python enough, this is much more advanced stuff. It gives the illusion that it’s easy and that you have all these libraries for everything, but it’s either easy or you have everything.

So when you start using the language, then you start: oh, I have to learn OOP, inheritance, whatever else. Every single library.
It looks like authors take pride in using more advanced language features in their API to show I don’t know what. Function calls, standard types, etc. You have this object, and then if you want another thing then you have to create another object…

Even the pattern matching: you can do some simple stuff, but usually the standard pattern matching is not something you just do. You do a matching, an object returns a result and then you call methods on that result object to get the real result of the match. Sometimes there is a simpler way to use it but it’s not obvious, it’s not the way most people use it.

Another example: I was teaching a course on pattern matching and wanted to use Perl-like syntax, and I couldn’t use Lua because of its completely different syntax. So I thought Python would be the perfect example. But in Python there are some direct functions for some basic stuff, but for anything more complex you’d have to know objects and methods etc. I just wanted to do something and have the result.

— I used Python and explained it to them. But even Perl is much simpler, you do the match and the results are $1, $2, $3, it’s much easier, but I don’t have the courage to use Perl, so…

— Yes, and when you want to use a library, then you have to learn this stuff and you don’t understand the API etc. Python gives an illusion that it’s easy but it’s quite complex.

…And Julia, I don’t know much about Julia, but it reminded me of LuaJIT in the sense that sometimes it looks like user’s pride. You can have very good results but you have to really understand what’s going on. It’s not like you write code and get good results. No, you write code and sometimes the results are good, sometimes they are horrible. And when the results are horrible, you have a lot of good tools that show you the intermediate language that was generated, you check it and then you go through all this almost assembly code. Then you realize: oh, it’s not optimizing that because of that.
That’s the problem of programmers, they like games and sometimes they like stuff because it’s difficult, not because it’s easy.

I don’t know much about Julia, but I once saw a talk about it. And the guy talking, he was the one to have this point of view: see how nice it is, we wrote this program and it’s perfect. I don’t remember much, something about matrix multiplication I guess. And then the floats are perfect, then the doubles are perfect, and then they put complex [numbers]… and it was a tragedy. Like a hundred times slower.

(sarcastically) ‘See how nice it is, we have this tool, we can see the whole assembly [listing], and then you go and change that and that and that. See how efficient this is’. Yes, I see, I can program in assembly directly.

But that was just one talk. I studied a little R and have some user experience with Python for small stuff.

— Erlang is a funny language. It has some really good uses, fault tolerance is really interesting. But they claim it’s a functional language and the whole idea of a functional language is that you don’t have state.

And Erlang has a huge hidden state in the messages that are sent and not yet received. So each little process is completely functional but the program itself is completely non-functional.

It’s a mess of hidden data that is much worse than global variables, because if it were global variables, you could print them. The messages are the real state of your system. At every single moment, what’s the state of the system? There are all these messages sent here and there. It’s not functional at all.

— Lua lies a bit about being small. It’s still smaller than most other languages, but if you want a really small language then Lua is larger than you want it to be.

— Forth is, I love Forth.

— Maybe, but it’s difficult. I love tables but tables are not very small. If you want to represent small stuff, the whole idea behind tables will not suit you.
It would have the syntax of Lua, we’d call it Lua, but it’s not Lua.

It would be just like Java micro edition. You call it Java but does it have multi-threading? No, it doesn’t. Does it have reflection? No, it doesn’t. So why use it? It has the syntax of Java, the same type system, but it’s not Java at all. It’s a different language that is easier to learn if you know Java, but it’s not Java.

If you want to make a small language that looks like Lua, but Lua without tables is not… Probably you would have to declare tables, something like FFI, to be able to stay small.

— Maybe, I don’t know.

— Of course, you can. It’s not particularly efficient but only Haskell is really efficient for that. If you start using monads and stuff like that, create new functions, compose functions etc… You can do [that] with Lua, it runs quite reasonably, but you need implementation techniques different from normal imperative languages to do something really efficient.

— Yes, it’s reasonable and usable, if you do really need performance; you can do a lot of stuff with it. I love functional stuff and I do it all the time.

— I think a new incarnation of the garbage collector will help a lot, but again…

— Exactly, yes. But as I said, even with the standard garbage collector we don’t have optimal performance but it can be reasonable. More often you don’t even need that performance for most actions unless you are writing servers and having big operations.

— A simple example. For my book, I’m writing in my own format and I have a formatter that transforms that into LaTeX or DocBook. It’s completely functional, it has a big pattern matching… It’s slightly inspired by LaTeX but much more uniform. There’s an @ symbol instead of a backslash, the name of a macro, and one single argument in curly brackets. So I have a gsub that recognizes this kind of stuff and then it calls a function, the function does something and returns something.
It’s all functional, just functions on top of functions on top of functions, and the final function gives a big result.

— Plain LaTeX? First, it’s too tricky for a lot of stuff and so difficult. I have several things that I don’t know how to do in LaTeX. For example, I want to put a piece of inline code inside a text. Then there is a slash verb, standard stuff. But slash verb gives fixed space. And the space between stuff is never right. All real spaces are variable, it depends on how the line is adjusted, so it expands in some spaces and compacts in others depending on a lot of stuff. And those spaces are fixed, so sometimes they look too large, sometimes too small. It also depends on what you put in the code.

— Yes, but with a lot of preprocessing. I write my own verb but then it changes and becomes not a verb but a lot of stuff. For example, when I write 3+1 I write a very small space here. In verb, if I don’t put any space here, it shrinks, and if I do, it’s too large. So I do the preprocessing, inserting a variable space. It’s very small but can be a little larger if it needs to adjust. But if I put ‘and’ after 1 then I put a larger space. This function here does all that. This is a small example but there are other things…

— I do have the source, it is in the git. The program’s called 2html. The current version only generates HTML… Sorry, that’s kind of a mess. I created it for a book but also another one for the manual. The one in the git is for the manual. But the other one is more complicated and not public, I can’t make it public. But the main problem is that TeX is not there. It’s almost impossible to process TeX files without TeX itself.

— Yes, it’s not machine-readable. I mean, it is readable because TeX reads it. It’s so hard to test, so many strange rules etc. So this is much more uniform and, as I said, I generate DocBook format, sometimes I need it.
That started when I had this contract for a book.

— Yes, it generates DocBook directly.

If you have any more questions, you can ask them in the Lua Mailing List. See you soon!