Custom dispatchers

Practical web programming with PLT Scheme

Part two of two. « First part

I wanted this tutorial to reflect a realistic scenario, so let's say we wanted to build a CMS. Since SEO is important for anyone who have a commercial or political interest in publishing anything on the web, we need to consider this in any design decision we make.

The most important SEO aspects we as engineers can control is:

Content formatting

URL

Title

Description

Keywords

If you are not sure how these work, or why they are important, you should really read up on this elsewhere. When it comes to URL:s, there are two things all CMS:es seems to get wrong. Why not try to address these issues?

URL:s are cryptic The URL is one of the most important resources for SEO, so we should always make an effort to present nicely formatted URLs to the search engines. Your human visitors will benefit from this too. A horrible example is Joomla. While there are a number of plugins that reformats the URL:s, the problem shouldn't have been there to begin with.

URL:s can't easily be changed Some would say this is a good thing. I agree that URL:s shouldn't change. But when you must do it, it's better if you can do it cleanly and effortlessly. It's not enough to simply change the URL of the page. All saved bookmarks to that specific page would suddenly be invalid. This includes bookmarks in peoples browsers, and perhaps more importantly saved links in social bookmarking sites like stumbleupon, delicious, digg etc. Probably the most important factor is the search engines. Google gives all indexed pages a "pagerank" that effectively determines how many visitors you will get. If your original URL just disappears, you will loose your hard-earned pagerank, and with it, your visitors. Because of this, the old URL needs to be redirected as well, or you'll find yourself losing traffic. Google is known to transfer it's pageranks to the new URL when it's given a 301 redirect. This is something that is often done as an afterthought, possibly by dumping a list of expired URL:s to a .htaccess file for URL rewriting. I believe that it should be a cornerstone feature of any CMS.

The page struct

Let's define a basic datastructure to hold our pages:

; Our page struct. Pretty basic, except for these: ; [urls] is a list of strings for the valid URL:s of the page. ; [revisions] is a latest-first chronological list of xexpr:s, representing page revisions. (define-struct page (language title description keywords urls revisions template) #:mutable)

The struct contains the language of the content of the page, the SEO relevant metadata, a list of URL:s (current and all former), a list of content revisions and a template. The template will be responsible for rendering the page. This is very useful when the same site might have different types of pages that needs to be rendered in unique ways. It is also defined to be mutable, so we can make changes to the page.

To be able to use this struct at all, we need to define a function for rendering it:

; Render the page with the current template. (define (page-render page) ((page-template page) page (if (not (empty? (page-revisions page))) (car (page-revisions page)) empty)))

The template assigned to the page is a function that takes the page struct itself and the xexpr to be used as the HTML content. Obviously, we need to also define such a template:

; A XHTML template to render the page struct. ; The [content] argument makes it easy to render work-in-progress or old revisions. ; (A more realistic example should probably wrap the content in a couple of divs together ; with a global menu, footer, search-field etc.) (define (test-template page content) (xhtml-response (page-language page) (xhtml-head (page-title page) (page-description page) (page-keywords page) '("behaviour/jquery.js" "behaviour/carousell.js" "behaviour/lightbox.js") '("design/reset.css" "design/layout.css" "design/typography.css") '(("if IE 6" . "design/ie6.css") ("if IE 7" . "design/ie7.css"))) `(body ,@content)))

This code builds on top of the code we created in the first part of this tutorial. Since we have changed the representation of the page, we need to update the dispatcher to use this new page struct:

(define (some-dispatcher page) (make (string-append "/" (car (page-urls page))) (lambda (request-ignored) (page-render page))))

We also need a test page in the new format:

; Let's build an example page. (define test-page (make-page "en" "Everything cute" "Here, you'll find everything that's cute and fuzzy." '("bunnies" "rainbows" "soap bubbles" "candy cotton") '("cutest-bunny-all-week" "bunny") '(((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I found on the Internet.")) ((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I'm blog-spamming about."))) test-template))

Notice that this example page has two URL:s and two revisions of the content. For now, only the first element in both lists will be used at all.

When you put all this together, your definitions window should look like this:

#lang scheme (require "xhtml.ss") ; Our page struct. Pretty basic, except for these: ; [urls] is a list of strings for the valid URL:s of the page. ; [revisions] is a latest-first chronological list of xexpr:s, representing page revisions. (define-struct page (language title description keywords urls revisions template) #:mutable) ; Render the page with the current template. (define (page-render page) ((page-template page) page (if (not (empty? (page-revisions page))) (car (page-revisions page)) empty))) ; A XHTML template to render the page struct. ; The [content] argument makes it easy to render work-in-progress or old revisions. ; (A more realistic example should probably wrap the content in a couple of divs together ; with a global menu, footer, search-field etc.) (define (test-template page content) (xhtml-response (page-language page) (xhtml-head (page-title page) (page-description page) (page-keywords page) '("behaviour/jquery.js" "behaviour/carousell.js" "behaviour/lightbox.js") '("design/reset.css" "design/layout.css" "design/typography.css") '(("if IE 6" . "design/ie6.css") ("if IE 7" . "design/ie7.css"))) `(body ,@content))) ; Let's build an example page. (define test-page (make-page "en" "Everything cute" "Here, you'll find everything that's cute and fuzzy." '("bunies" "rainbows" "soap bubbles" "candy cotton") '("cutest-bunny-all-week" "bunny") '(((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I found on the Internet.")) ((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I'm blog-spamming about."))) test-template)) (require web-server/dispatchers/dispatch-pathprocedure) (define (some-dispatcher page) (make (string-append "/" (car (page-urls page))) (lambda (request-ignored) (page-render page)))) (require web-server/web-server) (define stop-server (serve #:dispatch (some-dispatcher test-page) #:port 8080))

Remember that the new dispatcher respects the URL defined by the page struct, so you will have to visit the new URL, probably http://localhost:8080/cutest-bunny-all-week

The page dispatcher

We still don't have the functionality to change and maintain URL:s. To do this, we should know a little bit more about dispatchers. PLT Scheme comes with a number of standard dispatchers. We have already seen the path-procedure dispatcher. Another very useful one is the sequencer dispatcher. It takes a list of dispatchers and tries them one-by-one until it finds one that can handle the request. As it turns out, this is exactly what we need.

since the functions that create these dispatchers have the same name, we need to rename them when they are imported. The module system in PLT Scheme can handle this without to much trouble. just add these lines in your definitions window:

(require (rename-in (only-in web-server/dispatchers/dispatch-pathprocedure make) (make make-pathprocedure))) (require (rename-in (only-in web-server/dispatchers/dispatch-sequencer make) (make make-sequencer)))

Now we have access to both dispatchers through the names make-pathprocedure and make-sequencer. Using the sequencer dispatcher, we can make a single dispatcher that handles all the URL:s of any page. The first URL in the list it the "real" one. We should simply dispatch that URL to the template function rendering the page. The rest of the URL:s should be dispatched to a function that redirects the browser to the real URL. Pretty simple to say, and just as simple to write:

; To easily handle the multiple URL:s of the page correctly, we can build a custom dispatcher. ; [page] is a page struct. (define (page-dispatcher page) ; Each page have a number of URL:s we must handle. ; The first URL in the list is the current one. (let ((real-url (car (page-urls page))) (redirecting-urls (cdr (page-urls page)))) (apply make-sequencer ; The real URL should render the page. (make-pathprocedure (string-append "/" real-url) (lambda (request-ignored) (page-render page))) ; All the others need to redirect 301. (map (lambda (redircting-url) (make-pathprocedure (string-append "/" redircting-url) (lambda (request) (path-redirect-301-response request real-url)))) redirecting-urls))))

Before this will actually work, we need to define the function path-redirect-301-response. It is used to build a redirecting response, causing the browser (and Google!) to be redirected to the URL of our choice. Since we don't want to bother with details, like what server we are currently serving the page from, and what port we are using, we just copy this information from the URI struct in the request. We also need to require another module to make it work.

(require net/uri-codec) (require net/url) ; Creates a response struct, redirecting the browser to another path on the same server while preserving everything else in the URI. (define (path-redirect-301-response request new-url) (make-response/basic 301 "Moved Permanently" (current-seconds) TEXT/HTML-MIME-TYPE (list (make-header #"Location" (string->bytes/utf-8 (url->string ; We copy the current URL... (struct-copy url (request-uri request) ; ...and just replace the path. (path (list (make-path/param ; The URL should be properly encoded. (%20, etc.) (form-urlencoded-encode new-url) empty))))))))))

There. We can now test this. Just make sure it's all in your definitions window and modify the code starting the server to use this new dispatcher:

; Starting the server with the page-dispatcher. (define stop-server (serve #:dispatch (page-dispatcher test-page) #:port 8080))

Click the Run button in DrScheme and refresh your browser. If all goes well you will see the same page as before. The new, interesting part happens when you enter the "old" URL of the page, probably http://localhost:8080/bunny. It should now redirect you to the newer URL http://localhost:8080/cutest-bunny-all-week.

Since we are building a CMS, we probably want to have more than a single page. Therefore, our page dispatcher should take a list of pages and find the page of the current URL. This turns out to be embarrassingly simple.

; Handles any number of pages. ; [pages] is a list of page structs. (define (page-list-dispatcher pages) (apply make-sequencer (map page-dispatcher pages)))

If you modify the line where the server is started, you can now serve any number of pages:

; Starting the server with the page-list-dispatcher. (define stop-server (serve #:dispatch (page-list-dispatcher (list test-page)) #:port 8080))

Updating the page

; Add the new URL to the top of the list, but first remove it if it's already in there. ; All URLs but the first will redirect 301. This bookkeeping makes sure that the new URL ; will recieve the visitors (and pagerank) of any old URL. (define (set-page-url! page new-url) (set-page-urls! page (cons new-url (filter (lambda (x) (not (equal? x new-url))) (page-urls page)))))

; Add the new revision to the top of the list. (It might be a good idea to limit the list to n revisions.) (define (set-page-revision! page new-revision) (set-page-revisions! page (cons new-revision (page-revisions page))))

Since we can handle the multiple URL:s, we should write a simple interface for this.The code above will simply add the new URL to the list, or move it to the beginning if it's already there. I also added a function to slightly simplify handling revisions.

Let's wrap this up by putting our new functions in their own file and adding a few more example pages. Your definitions window should look something like this:

#lang scheme (require "xhtml.ss") (require "page.ss") ; some example pages. (define test-pages (list (make-page "en" "Everything cute" "Here, you'll find everything that's cute and fuzzy." '("bunies" "rainbows" "soap bubbles" "candy cotton") '("cutest-bunny-all-week" "bunny") '(((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I found on the Internet.")) ((h1 "Cutest bunny you've seen all week") (p "Look at " (a ((href "http://www.cutebunnies.com")) "this adorable bunny") " I'm blog-spamming about."))) test-template) (make-page "en" "Hello World" "Another take at the Hello World page." '("hello world") '("hello-world" "hello world") '(((p "Hello World"))) test-template))) (require web-server/dispatchers/dispatch-pathprocedure) (require web-server/web-server) ; Starting the server with the page-list-dispatcher. (define stop-server (serve #:dispatch (page-list-dispatcher test-pages) #:port 8080))

Try reverting a page to an older revision! Bonus points if you implement a function to move the nth revision to the first place in the list.

; The page struct contains all it's revisions. ; We can revert to an older version with a function like this: (define (page-revert! page) (set-page-revision! page (cadr (page-revisions page))))

Share and enjoy!

I hope you found this useful. If you have any comments or bug fixes, feel free to email me at victor@topmost.se.