Re: Web application design: the REST of the story

This document is the result of an e-mail conversation between Marco Baringer and Dave Roberts. The conversation started with Dave Roberts' blog posting about REST in the context of web applications.

Marco Baringer's writing is enclosed in boxes like this one, while Dave Roberts' writing is enclosed in boxes like this.

On REST

Marco: In short: the REST camp is right, but our definition of "web application" is too wide. The term "web application," in its common use, applies to two radically different things which both travel over HTTP:

1) Database interfaces (Amazon is the classic example, but sites such as LexisNexis and eBay fit in here as well). These are systems where someone is asking a server for information.
2) Applications which just happen to use the browser for the GUI (what you called interactive applications).

I'd argue that REST is the perfect (and only) tool for the first category, while continuations work great for the second. I'd also argue that most "web sites" are a mix of both. Take Amazon, for example: when I use Amazon there are two independent things I want to do, 1) browse for books and 2) buy books. The browsing part should follow the REST principles; the buying part should not. Purchasing a book is an inherently stateful operation which requires the server to maintain some of the client's state. The order of operations in purchasing a book is complex and, IMNSHO, much easier to express using continuations. At the same time, writing the code to show book details is much easier when you just translate a URL into an SQL query and show the resulting rows.

Dave: I'm not totally convinced that the checkout portion can't be made REST as well. The REST crowd actually thinks so. Clearly, there is state, but they would say that the state should either return with the next request or be stored on the server side in the database, not carried in the web server. That may be bending the rules on REST, though.

Marco: I'm not going to argue that it can't, only that it shouldn't. The application, and its underlying database, have a certain state which is global and shared by everyone. Then there is the state of the GUI. This information (what page we're looking at, what the navigation history is, etc.) may be implemented in the DB, but it's not application state, it's GUI state.
When the REST guys start sending this GUI state in the response objects they run into a number of obstacles (since they need to send all of the GUI state for every response): 1) object marshalling, 2) malicious clients, 3) request/response size limits. Continuation-based apps need only send: 1) a key (generally a random string a few tens of characters long) specifying the session and the point in the session history, and 2) what's changed in the user interface.

Dave: Right. I would agree that it's much easier to just use standard session data like most web servers provide today. The REST guys would just argue that it's less scalable. And they're right, too. The big question is, does it only matter for sites like Amazon or eBay which are massive, or does it come into play at lower levels of scalability, too?

Marco: The REST guys are against sessions? I don't think that's what they meant; how else would you ever implement anything personalized?

Dave: Yep, they're against sessions and consequently generally down on personalization. Their answer is that this isn't REST. I should probably qualify it, though. *In the context of REST*, they are down on those concepts. I think you'd find many people that would say, "Sure, the only way to implement personalization is with sessions, and that's fine when you want to go that way, but realize that you are now stepping out of a REST architecture style and you're going to pay that penalty." This is one reason why I think REST is most applicable to web applications that would otherwise use SOAP (i.e. "web services," not something interactive).

Marco: Maybe this is just another case of distinguishing between web apps as "collections of resources named by URLs" and web apps as "applications which happen to use the browser for a GUI."

Dave: Yes, indeed.
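
The key-based scheme described above (a short random key naming the session and the point in its history, with everything else kept on the server) can be sketched roughly as follows. This is only an illustration in Python, not how UCW or any other framework is implemented: `Session`, `new_frame`, `url_for`, `dispatch` and `cart_page` are hypothetical names, plain closures stand in for continuations, and the key lengths are arbitrary.

```python
import secrets


class Session:
    """Server-side state for one user; the browser only ever sees short keys."""

    def __init__(self):
        self.session_id = secrets.token_hex(20)  # random key naming the session
        self.frames = {}  # frame id -> {action id -> closure}

    def new_frame(self, actions):
        """Store the closures behind one rendered page; return a key pair per link."""
        frame_id = secrets.token_hex(10)  # names one point in the session history
        table = {}
        for action in actions:
            table[secrets.token_hex(5)] = action
        self.frames[frame_id] = table
        return [(frame_id, action_id) for action_id in table]

    def url_for(self, frame_id, action_id):
        """Everything the client sends back: three keys, no marshalled GUI state."""
        return f"/app?s={self.session_id}&f={frame_id}&a={action_id}"

    def dispatch(self, frame_id, action_id):
        """Run the closure behind a link; frames from earlier pages stay valid."""
        return self.frames[frame_id][action_id]()


def cart_page(session, cart):
    """Render a (hypothetical) cart page offering two links."""

    def add_book():
        # Continue on a new page built from a copy, so the old page is untouched.
        return cart_page(session, cart + ["book"])

    def show():
        return list(cart)

    return session.new_frame([add_book, show])


session = Session()
first_page = cart_page(session, [])             # links on the first page
second_page = session.dispatch(*first_page[0])  # user clicks "add book"
print(session.dispatch(*second_page[1]))        # cart as seen from the second page
print(session.dispatch(*first_page[1]))         # back button: first page still works
print(session.url_for(*second_page[1]))         # what a link on the page looks like
```

The second page shows a cart with one book, while the link on the first page, reached again via the back button, still sees the empty cart it was rendered with. The price is that every frame stays in server memory until the session is discarded.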

Marco: This doesn't mean that you can't shoehorn the stateful part of the application into REST, but it does mean that REST is not the best option for this (and vice versa).

Dave: Right. The question really is, what penalty are you paying when you move outside the REST style, and can you afford it? In some cases, you probably can. My big question is, how much is the penalty for continuations? If it's small, then no biggie. If large, then you really have to think about it. I could see continuations being very large, depending on how they were implemented, and that would definitely push you to larger and larger server front-ends with all the consequent headache.

Marco: When you talk about continuations and web apps (interactive applications with a web GUI) there are (mainly) two places where the continuations come into play:

Writing the code - we (some of us at least) would like to write our code sequentially, even though it will be executed asynchronously and certain parts of it may be executed multiple times (think back button and window cloning). There is no penalty here if the continuations are implemented by a compile-time code transformation (as UCW does). [Well, there is a slight penalty since we introduce an obscene number of lambda calls, but it's basically 0 compared to reading/writing the requests/responses (trust me on this).] Even when you use "full" built-in continuations (as Seaside and PLT Scheme do) the overhead is still little more than a couple of function calls and some memory (stack) manipulation.

Dave: The only thing I was worried about was the memory allocation, not the time to render the page. As you say, either style's speed difference is swamped by the cost of the I/O. Again, my main issue is how much memory I'm sucking up with all those various continuations hanging around. And remember that I have multiple continuations per page (every link, right?). That could easily be many KB per continuation.
Depending on how much back-button state I want to keep around, I could literally have MBs per user. If my policy is to let that stuff hang around until a timeout of some sort or an explicit logout/finish-purchase point, I have to worry about how many people abandon their session, etc. You get the idea.

Marco: A UCW session object consists of:

- a 40 character id string
- a last-access time (a fixnum)
- an object table (an eql hash table)
- a list of session-frame objects

Each session-frame object consists of:

- a 20 character id string
- a component object (which may contain other component objects)
- a hash table mapping action-ids (10 character strings) to closures
- a hash table mapping callback-ids (10 character strings) to closures
- a list of (value . ucw:place) pairs, where a value is a Lisp object and a ucw:place is an object with 3 closures (one for getting the current value, one for setting it and one for copying it)

What you (and I) should be worried about are all those closure objects. A closure consists of two things: 1) a code vector and 2) an environment. The code vector is shared across all closures in all sessions, so all we're really worried about is the environment. The environment is basically a (constant-size) vector of objects. Assuming a page has ten links, each with an action (in UCW the actions are generally simple forms which call some function on the server), we'll have a couple of objects in the environment. Each of these objects is, however, shared across all the closures in the page.

In all honesty I don't think the cost of memory is such an issue. I did some quick math and came up with about 400 bytes per session + 700 bytes per request/response iteration (assuming you backtrack about 10 objects per request/response and assuming each "slot" is 4 bytes wide; immediate objects are saved in the slot, non-immediates require some memory for the data as well). Would this be a big issue for sites like Amazon and eBay? Yes. Is this an issue for 99% of the web apps out there?
Considering the cost of 1GB of RAM, no. What is a much bigger issue is farming out sessions. If you don't want to get swamped in serializing/deserializing objects and sending them over the network, you really need to leave a session on the same machine for its entire life. While doable (I should have a client who will fund this work in a few months), it's not immediate, and you still run the risk of having one machine at 100% CPU and another idle.

Dave: This is pretty easy and done all the time. That's why load balancers implement a sticky policy to keep a given client bound to the same front-end web server. The issue then becomes how critical that data is in the event of a server failure, but the scalability model has been conquered.

Marco: Saving the state - when the user clicks on a link we want to make sure that the state of the app as the _user_ saw it on the page corresponds to the state on the server, even when the user has used the back button. In this case you, the developer, have to tell the framework what state should be saved and restored based on the user's request and what state is global and doesn't change even though the user has "undone" some pages. Generally we save the state regarding the GUI (values of forms, what the "current" page is, state of the navigation menu) and don't save the general application state (user data, DB manipulations, etc.). How much you save is up to the developer. I'd like to point out that this penalty (saving and restoring GUI state) is taken even by REST apps (if they want to do things right); it's just that it's done by hand.

Dave: Yes, agreed. State is state. You have to store it somewhere and then restore it when you get another request.

Marco: The difference is who has to deal with that job. In a continuation-based system you usually end up with a framework where the saving and restoring is done by the framework itself; all you (the developer) have to do is use the data. In the REST frameworks I've seen you have to do this work yourself.
Dave: Right, agreed. The explicit model forces you to do more work. The continuation model makes this seamless, but could hand you a huge scalability problem if you aren't conscious of what's happening behind the scenes. Like any high-level paradigm (a language, for instance), it can make some things seem really easy, but that can mask performance problems if you aren't careful.
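
To get a feel for the memory question, the quick estimate quoted above (about 400 bytes per session plus 700 bytes per request/response iteration) can be turned into a small calculation. This is a back-of-the-envelope sketch only: the two constants come straight from that estimate, while the function names, the 1 GiB budget and the iteration counts are assumptions chosen for illustration.

```python
BYTES_PER_SESSION = 400    # fixed cost per session, from the estimate above
BYTES_PER_ITERATION = 700  # cost per request/response iteration, same estimate

GIB = 2**30  # assumed memory budget: 1 GiB set aside for session state


def session_bytes(iterations):
    """Memory held by one session that keeps `iterations` pages of history."""
    return BYTES_PER_SESSION + BYTES_PER_ITERATION * iterations


def sessions_that_fit(ram_bytes, iterations):
    """How many such sessions fit in `ram_bytes`, ignoring all other overhead."""
    return ram_bytes // session_bytes(iterations)


# Assumed history depths: a short visit, a long visit, a session left to pile up.
for iterations in (10, 50, 200):
    print(iterations, session_bytes(iterations), sessions_that_fit(GIB, iterations))
```

With ten iterations a session costs 7,400 bytes, so roughly 145,000 such sessions fit in 1 GiB. If a continuation really weighed many KB instead, the same arithmetic would give a far smaller number, which is why the per-iteration figure and the policy for expiring abandoned sessions matter more than the fixed per-session cost.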

Marco: The problem most people face is that you never hear about applications-which-use-the-browser-for-a-GUI; what you see 99% of the time are regular old web applications whose underlying paradigm hasn't changed since 1992. Those real-world applications (#2 above) are usually deployed on an intranet and neither look nor feel (nor are considered) different from a regular desktop application.

Dave: How do you mean that you never hear about those? Can you give me an example of an application that you know about that works that way?

Marco: I've developed a document management system for a law office and various patient management systems for hospitals, and I've worked on business intelligence and e-learning tools. None of these apps were ever accessible to the public, nor did they ever have to deal with more than 50 simultaneous users, though they did have extremely complex user interactions.

Dave: Ah, got it. I guess that's the sort of app that I think continuation-based methods would work well on. If the user count is low and the scalability doesn't have to go higher than those sorts of numbers, I see no reason not to use almost whatever technique is most expedient (continuation, REST, or something else).