Wednesday, August 26, 2009

SHP 0.2, JSMGR 0.1, and HTTP 0.1 Are Now Available

SHP 0.2 is now available through planet, along with JSMGR 0.1 and HTTP 0.1. SHP 0.2 adds a number of enhancements on top of SHP 0.1 toward making SHP simple to use:
  • simpler startup
  • default and configurable chrome - so you can switch the UI chrome depending on the page
  • proxy integration - a built-in http/s proxy so you can use AJAX with ease
  • JSMGR, which allows you to develop javascript and css in a modular fashion yet serve them combined and compressed; it is optionally available in SHP (you'll have to require it)
  • HTTP client abstraction, which includes the https protocol - this is the basis of the proxy
All code is released under the LGPL.

Simple Startup

Now to start a new site - you just need to do:

;; startup.ss or startup.scm
(require (planet bzlib/shp/start))
(start-shp-server! <path> #:htdocs <htdocs-path> ...)


The following parameters are available:
  • path: this is the root directory for your SHP scripts
  • #:port: the port of the server - defaults to 8080
  • #:default: this is the path of the default script for each directory - defaults to "index"
  • #:not-found: the not-found script for the site (within the SHP root directory) - defaults to #f, as it is optional
  • #:required: the required script for the site - defaults to /include/required; the site will still function if the script does not exist within the SHP directory
  • #:topfilter: the topfilter script for the site - defaults to #f. If it is specified it must exist or an error will occur
  • #:chrome: the chrome script for the site - defaults to #f. If specified it must exist.

Default and Configurable Chrome

Instead of using topfilter to factor out the common chrome, you can now specify a default chrome (through the startup) and dynamically change chrome within the scripts (NOTE - there is only one chrome for a given request).

To specify the default chrome, do it when you start up the shp server via the #:chrome parameter. Make sure the chrome script actually exists.

To disable chrome within the script, issue the following:
;; inside your shp script
($chrome #f)

To change chrome to another chrome script, issue the following:

;; inside your shp script
($chrome "/new/chrome/path")

A chrome script takes one parameter, which is the evaluated result of all of the inner scripts (so it differs from the topfilter's parameter, which is a procedure). Remember that chrome is evaluated after all other scripts, so please ensure there aren't evaluation-order dependencies between chrome and other scripts.

Proxy Integration

An HTTP proxy is built into SHP by default to help you integrate with other web services (your own or a third party's). All you need to do is use the following script:

;; /proxy script
(proxy!)

It will automatically convert the pathinfo into the target URL, and pass the rest of the data (query string or POST body) along to that URL.

Headers are currently filtered: only headers with the "bzl-" prefix are passed through (with the "bzl-" prefix stripped) - for example, a request header bzl-Authorization would reach the target as Authorization.

This is designed for use with AJAX, where the same-origin policy is an issue, so you can query web services residing in different domains without trouble. It is not a full web-browser proxy (yet).

Proxy is built on top of bzlib/http, which will be described later.

JSMGR Integration

An optional integration is available with JSMGR, which helps you develop javascript and css scripts in a modular fashion while serving them combined and compressed.

To use JSMGR, add the following to your required script:

(require (planet bzlib/jsmgr/shp))

If you have not previously downloaded JSMGR, the above will do it for you the first time you run the code in an SHP script, but since JSMGR is quite large (it includes the YUI compressor), it's probably best to download it first in the DrScheme or mzscheme REPL.

Then you can serve js and css with the following code:

;; either /js or /css script
(js/css! #:base <your-javascript-or-css-path-here>)


And you can use helpers to generate the URL for you:

;; in your shp scripts
(script* <js1> <js2> ...) ;; generates a <script> xexpr
(css* <css1> <css2> ...) ;; generates a <link> xexpr

They generate the block by utilizing a known url path (i.e., /js for javascript and /css for css) - if you decide to use a different url you'll have to reset the following in your topfilter script:

;; topfilter script
(parameterize ((js-base-path "/new/js/url")
               (css-base-path "/new/css/url"))
  ...)

JSMGR is distributed with the YUI compressor, which Yahoo! licenses under the BSD license. I do not believe there are license-compatibility issues, but let me know if that is not the case.

You can switch to a different compressor of your choice, as long as you implement a compress! procedure with the signature (-> path-string? path-string? any). The first argument is the path to the actual script (which must exist), and the second is the path for the compressed script (which need not exist).
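
As an illustration, here is a minimal compress! that satisfies the contract without actually compressing anything - it just copies the source over (the name identity-compress! is made up for this sketch):

```scheme
;; hypothetical no-op compressor: copies the source to the target path.
;; a real implementation would invoke your compressor of choice here.
(define (identity-compress! path min-path)
  (when (file-exists? min-path)
    (delete-file min-path))
  (copy-file path min-path))
```

Passing something like this wherever a compress! argument is accepted would serve the scripts combined but uncompressed.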

JSMGR also comes with a servlet-util module that allows you to integrate JSMGR into a regular servlet rather than SHP. This is not a well-tested feature, however, so let me know if you run into issues.

;; your servlet
(require (planet bzlib/jsmgr/servlet-util))
(define start (make-start <base-path-to-the-javascript-or-css-directory>))
(provide start)


HTTP Client Abstraction

SHP proxy depends on bzlib/http, which will be automatically installed if you install SHP. You can use it as a replacement for net/url's get-impure-port and post-impure-port. Besides working with the http and file protocols, http-get and http-post also work with the https protocol.

(require (planet bzlib/http/client))
(http-get <url> <list-of-headers>)
(http-post <url> <data> <list-of-headers>)

The list of headers is a list of string pairs so the headers are easy to construct. The data for http-post is a list of bytes, so you'll have to manually construct the values for now.
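
For example (the URL, header values, and body here are hypothetical; the sketch assumes the list-of-bytes post data described above):

```scheme
(require (planet bzlib/http/client))

;; GET with a custom header (headers are pairs of strings)
(http-get "https://www.example.com/feed"
          '(("Accept" . "application/atom+xml")))

;; POST a form-encoded body, constructed manually as bytes
(http-post "https://www.example.com/api"
           (list (string->bytes/utf-8 "q=scheme&limit=10"))
           '(("Content-Type" . "application/x-www-form-urlencoded")))
```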

bzlib/http also depends on bzlib/net, which provides additional utilities for network code. That module is currently unsupported, so I won't document how it can be used.

That's it for SHP 0.2 - happy SHP'ing.

Using JSMGR to Compress a Single Script

JSMGR as currently designed handles the compression and combination of several scripts very well. But what if you just want to access a single script?

Currently in order to access a single script you'll have to use the following URL:

http://<host>/js?s=path/to/js-file.js

It is just a tad bit more inconvenient than

http://<host>/js/path/to/js-file.js

Fortunately it is simple for us to handle pathinfos:

;; the js/css! loader will depend on having a particular setting
(define (js/css! (scripts ($query* "s"))
                 #:base (base (current-directory))
                 #:compress! (compress! yui-compress!))
  (define (scripts-helper)
    (open-js/css-files/base
     (if (not (null? ($pathinfo)))
         (cons (string-join (filter (lambda (segment)
                                      (not (string=? segment "")))
                                    ($pathinfo))
                            "/")
               scripts)
         scripts)
     base
     compress!))

  (raise
   (http-client-response->response
    (make-http-client-response "1.1"
                               200
                               "OK"
                               '(("Content-Type" . "text/javascript"))
                               (scripts-helper))
    (lambda (x) x))))

Now if we want to pass a single script "normally" - we can do that.

This single script support is currently only available in the shp adapter, not the servlet adapter.

Tuesday, August 25, 2009

A Unified Startup Script

Up to now we have kept the startup script in two separate places: one is the servlet.ss so it can be readily used as a servlet, and the other is web.ss that will serve the single servlet. Satisfying our adherence to the Pareto principle, we will support out of the box only the single servlet method, so we'll combine the two files into one.

By combining the two into one we'll also solve an interesting issue: up to now shp-handler does not really have access to the htdocs path, since that value is consumed by the serve/servlet but is not really passed to make-shp-handler in the separated approach. Now we'll remedy this problem:

(define (start-shp-server! path
                           #:htdocs (htdocs #f)
                           #:port (port 8080)
                           #:default (default "index")
                           #:not-found (not-found #f)
                           #:required (required "/include/required")
                           #:topfilter (topfilter #f)
                           #:chrome (chrome #f))
  (define start
    (make-shp-handler path
                      #:default default
                      #:not-found not-found
                      #:required required
                      #:topfilter topfilter
                      #:chrome chrome
                      #:htdocs htdocs))
  (thread (lambda ()
            (serve/servlet start
                           #:port port
                           #:servlet-path "/"
                           #:launch-browser? #f
                           #:listen-ip #f
                           #:servlet-namespace (list (path->string (this-expression-file-name)))
                           #:servlets-root path
                           #:servlet-regexp #px".*"
                           #:server-root-path path
                           #:extra-files-paths (list (if (path? htdocs)
                                                         htdocs
                                                         (string->path htdocs)))))))

This single function now creates the shp-handler and starts the web server at the same time. All that's left is to pass the value of htdocs into shp-handler. We want to be able to access the value in an SHP script as follows:

($htdocs)

To accomplish this, we will first add an additional attribute to the shp-handler struct:

(define-struct shp-handler (... htdocs)
  ...)

Then we will modify the constructor function to take in an htdocs parameter:

(define (*make-shp-handler path
                           ...
                           #:htdocs (htdocs #f))
  (make-shp-handler path
                    ...
                    (if (not htdocs)
                        (build-path path 'up "file")
                        htdocs)))

This defaults to the "file" directory that lives in the same directory as the "shp" directory.
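
In other words, with the default in place a project might be laid out as follows (the top-level directory name is hypothetical):

```
mysite/
  shp/    <- the SHP scripts (the path argument)
  file/   <- static files (the default htdocs)
```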

Finally - we have the accessor function to be used in the SHP scripts:

(define ($htdocs)
  (shp-handler-htdocs ($server)))

Lastly - let's make sure we have error checking to test that both the shp and htdocs paths exist.

(define (*make-shp-handler path
                           ...
                           #:htdocs (htdocs #f))
  (unless (directory-exists? path)
    (error 'make-shp-handler "path ~a does not exist." path))
  (let ((htdocs (let ((htdocs (if (not htdocs)
                                  (build-path path 'up "file")
                                  htdocs)))
                  (if (directory-exists? htdocs)
                      htdocs
                      (error 'make-shp-handler "htdocs path ~a does not exist." htdocs)))))

    ....))
Now we can simply do the following to start the site with any path:

(require (planet bzlib/shp/start))
(start-shp-server! <path> #:htdocs <htdocs-path>)




Make the Not Found Path Optional

Currently you have to supply the not-found path for the app to work; this slows down the creation of a site, so we'll make it optional.

The idea is simple - since the package includes a not-found script, we should use that by default when it is not specified.

(define (*make-shp-handler path
                           #:default (default "index")
                           #:not-found (not-found #f)
                           #:required (required "required")
                           #:topfilter (topfilter #f)
                           #:chrome (chrome #f))
  (make-shp-handler path
                    default
                    (if (not not-found)
                        (make-script (build-path (this-expression-source-directory)
                                                 "example" "shp" "notfound")
                                     0)
                        not-found)
                    (init-script required path)
                    topfilter
                    chrome))

We'll also have to change the segments->path and segments->partial-path functions to make them play well with a script struct as the value of not-found:

(define (not-found-path base)
  (let ((not-found (shp-handler-not-found ($server))))
    (if (script? not-found)
        (script-path not-found)
        (build-path base not-found))))


;; partially matching the path from the beginning of the path
(define (segments->partial-path segments)
  (define (helper rest path default)
    ;; if we did not find any match return not-found
    (cond ((null? rest)
           (not-found-path (shp-handler-path ($server)))) ...)
    ...)

(define (segments->path segments (partial? #t))
  ....
  (else
   (not-found-path script)))))

So now if you decide not to specify the not-found script path, the code will still work.

Monday, August 24, 2009

Manage CSS through SHP and JSMGR

Besides javascript, CSS is another type of text resource that suffers the same problems javascript does:
  • lack of modularity
  • lack of compression
The problem is almost exactly the same, so we should be able to use JSMGR to handle CSS as well!

Luckily - the YUI compressor works with both javascript and CSS, so by using it our work is limited (you can of course implement your own compress! routine). What we need is to add a CSS-specific signature, and we have a couple of approaches:
  1. have css-* functions that match the javascript-* functions
  2. change the signature of the javascript-* functions so they do not have the type of the script in their names, and instead take a #:type parameter
It seems that at the compression level, approach #2 is more logical, especially since the YUI compressor works with either type simply by the file extension. So instead of open-javascript-files and open-javascript-files/base, we should have open-js/css-files and open-js/css-files/base. And javascript! will also be renamed js/css!.

The only trouble we will encounter with this approach is that when we need to implement another compressor we'll have to follow YUI's interface, but since that's out of scope, that's a worry for another day.

At the xexpr-generator level, though, it makes sense to have the two types separated, since by default they generate different signatures:
  • js-url/xexpr generates a <script> block, but we'll call it script* to match it more with xexpr
  • css-url/xexpr generates a <link rel=stylesheet> block, and we'll call this css* to match it more with xexpr
So, we'll adapt the signature accordingly:

(define css-base-path (make-parameter "/css"))

(define (css-url path . paths)
  (let ((url (string->url (css-base-path))))
    (set-url-query! url (map (lambda (path)
                               (cons 's path))
                             (cons path paths)))
    (regexp-replace* #px"(\\&)([^a])" (url->string url) ";\\2")))

(define (css* path . paths)
  `(link ((rel "stylesheet") (type "text/css") (href ,(apply css-url path paths)))))
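
Assuming the default /css base path, a hypothetical call would then produce roughly the following (the exact query encoding depends on net/url):

```scheme
(css* "reset.css" "site.css")
;; => (link ((rel "stylesheet") (type "text/css")
;;           (href "/css?s=reset.css;s=site.css")))
```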


Now we can handle CSS with JSMGR as well!

Handling UI Chromes - Make it Switchable

You might have noticed that since topfilter gets executed with every single request, it is a nice place to specify the UI chrome so we do not need to separately specify it for every page.

The drawback with such an approach is that once you have placed the chrome into topfilter, it will accompany all of the responses (unless you raise an error or a response object to escape the continuation), whether you intend it or not.

In the majority of cases you'll be just fine, since most web pages would want such a chrome. But what if you are generating non-html pages? For example, you may want to dynamically generate a javascript file, or return XML data (i.e., set up a web service). In these situations, a default chrome that runs 100% of the time defeats the purpose. Hence we want a "switchable" chrome that defaults to some value most of the time but can be turned off as necessary.

The first thing we need again is a parameter:

(define $chrome (make-parameter #f))

What should go into the parameter is the path of the chrome file.

Then in order to execute the chrome, we'll wrap around the inner handler with:

(define (make-chrome-based-handler inner)
  (lambda ()
    (let ((result (inner)))
      (if ($chrome)
          (include! ($chrome) result)
          result))))


(define (include! path
                  #:topfilter (topfilter #f)
                  #:partial? (partial? #f)
                  . args)
  (define (make-script path partial?)
    (evaluate-script (if (script? path)
                         (script-path path)
                         (resolve-path path partial?))))
  (define (helper topfilter)
    (let ((proc (make-script path partial?)))
      (let ((chrome-handler (make-chrome-based-handler (lambda () (apply proc args)))))
        (if topfilter
            (topfilter chrome-handler)
            (chrome-handler)))))
  (helper (if topfilter (make-script topfilter #f) topfilter)))

;; adding the chrome parameter
(define-struct shp-handler (path default not-found required topfilter chrome)
  #:property prop:procedure
  (lambda ($struct request)
    (handle-request $struct request)))

;; parameterize the chrome parameter
(define (handle-request server request)
  (parameterize (($server server) ...)
    (parameterize (...
                   ($chrome (shp-handler-chrome server)))
      ...)))

With the above, we can then put chrome into its own file:

;; a default chrome
(:args inner)
`(html (body ,inner)) ;; NOTE inner is not a procedure

You'll notice that chrome looks a lot like the topfilter. And yes, in a way, chrome is a specialized filter, but it differs from the topfilter in that the inner value is not a procedure call. Instead, the inner values are first evaluated and finally passed to chrome. This way we can modify the $chrome parameter to disable it halfway through the code.

So - when you use chrome you will need to remember that it is evaluated last. This generally should not matter, but it does when your code has order dependencies. Make sure your chrome will work as the last thing executed in your code.

Productionize the Chrome Code

The above chrome code has a serious bug - it causes infinite recursion. The reason is that make-chrome-based-handler calls include! and include! calls make-chrome-based-handler, so we cannot use include! inside make-chrome-based-handler:

(define (make-chrome-based-handler inner)
  (lambda ()
    (let ((result (inner)))
      (if ($chrome)
          ;; evaluate the chrome script directly - going through include!
          ;; is what caused the recursion
          ((evaluate-script (resolve-path ($chrome) #f)) result)
          result))))

With the above, your handler will return, but if you use include! in your scripts you'll notice they also have the chrome applied to them - our include! does not know that chrome should be applied only at the outermost layer, so we'll also need to fix that:

(define (include! path
                  #:topfilter (topfilter #f)
                  #:partial? (partial? #f)
                  #:chrome? (chrome? #f) ;; disabled by default
                  . args)
  (define (make-script path partial?)
    (evaluate-script (if (script? path)
                         (script-path path)
                         (resolve-path path partial?))))
  (define (helper topfilter)
    (let ((proc (make-script path partial?)))
      (let ((handler (let ((handler (lambda () (apply proc args))))
                       (if chrome? (make-chrome-based-handler handler) handler))))
        (if topfilter
            (topfilter handler)
            (handler)))))
  (helper (if topfilter (make-script topfilter #f) topfilter)))

And we need to ensure the outermost layer is called with chrome enabled:

(define (handle-request server request)
  ...
  (make-response (include! (request-uri request)
                           #:topfilter (shp-handler-topfilter server)
                           #:chrome? #t
                           #:partial? #t))))))

Now we have a default and switchable chrome, and to turn the chrome off, just call ($chrome #f).

Friday, August 21, 2009

Managing your Javascripts through SHP (4) - Shorthand for Generating the JSMGR querystring

To really simplify the development it would be nice to be able to call something like the following to issue the javascript reference:

;; in any SHP script
`(script ((src ,(js-link "jquery.js" "jquery.ui.js" ...))) "")

and it will translate into the appropriate link. Let's see how it can be done.

Assuming the path is hardcoded (let's say "/js/"), then it's pretty straightforward:

(define js-base-path (make-parameter "/js"))

(define (js-url path . paths)
  (let ((url (string->url (js-base-path))))
    (set-url-query! url (map (lambda (path)
                               (cons 's path))
                             (cons path paths)))
    (url->string url)))
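
A hypothetical call then generates the querystring form of the URL:

```scheme
(js-url "jquery.js" "jquery.ui.js")
;; => "/js?s=jquery.js&s=jquery.ui.js" (roughly - the exact
;; encoding depends on net/url)
```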

Given that the base path is set in a parameter, you can parameterize it to a different value. It would be nice to "automatically" set js-base-path based on where you call it, but that might be more difficult than it's worth, so it's out of scope for now.

Managing your Javascripts through SHP (3) - Integrate with Servlets

If you do not want to use SHP but would like to use jsmgr, it provides a servlet adapter that you can use - just do the following in your servlet:

;; your servlet...
(require (planet bzlib/jsmgr/servlet-util))
(define start
  (make-start "the-path-to-javascript-here..."))

And the servlet will have the exact same functionality. The following is the servlet-util.ss that enables the above:

(require web-server/http/request-structs
         web-server/http/response-structs
         net/url
         mzlib/etc
         scheme/contract
         (planet bzlib/shp:1:1/request)
         (planet bzlib/shp:1:1/proxy)
         (planet bzlib/http/client)
         "loader.ss")

(define (request-helper request)
  (parameterize (($request request))
    ($query* "s")))

;; we want to take the request's query object (which is readily available
;; through our definitions) and then map it to the scripts
(define (make-start base)
  (lambda (request)
    (let ((scripts (request-helper request)))
      (http-client-response->response
       (make-http-client-response "1.1"
                                  200
                                  "OK"
                                  '(("Content-Type" . "text/javascript; charset=utf-8"))
                                  (open-javascript-files/base scripts base))
       (lambda (x) x)))))

(provide/contract
 (make-start (-> path-string? (-> request? response/c))))

Managing your Javascripts through SHP (2) - Integrate with SHP

Since we can now retrieve the combined scripts, the integration basically focuses on mapping the request query values to the scripts and returning the combined, minified scripts as the response. Let's try to do that within SHP.

We'll use s as the query key name. And it can be repeated multiple times. What we want is to write an SHP script that looks like the following:

;; js
;; loading javascripts
;; default to using s key, but can be customized
(javascript!) ;; or (javascript! ($query* "s"))

The definition of javascript! will roughly look like the following:

(define (javascript! (scripts ($query "s")) #:base (base (current-directory)))
  (http-client-response->response
   (make-http-client-response "1.1"
                              200
                              "OK"
                              '(("Content-Type" . "text/javascript; charset=utf-8"))
                              (apply javascript-loader #:base base
                                     scripts)) ;; this is a mismatch
   (lambda (x) x)))

And we encounter our first mismatch - our original loader was a closure over the base path. While we can still create the closure as previously designed, the closure is wasted since we are not keeping it around. It's simpler to have a new interface that takes in the base path - and we'll take the opportunity to refactor the code a bit:

(define (yui-compress! path min-path)
  (system (string-join
           (list "java"
                 "-jar"
                 (path->string (build-path (this-expression-source-directory)
                                           "yuicompressor.jar"))
                 "-o" (path->string min-path) (path->string path))
           " ")))

(define (open-javascript-files paths (compress! yui-compress!))
  (define (min-path-helper path)
    (string->path (string-append (path->string path) ".min")))
  (define (helper path)
    (let ((min-path (min-path-helper path)))
      (when (or (not (file-exists? min-path))
                (> (file-or-directory-modify-seconds path)
                   (file-or-directory-modify-seconds min-path)))
        (compress! path min-path))
      (open-input-file min-path)))
  (apply input-port-append #t (map helper paths)))

(define (open-javascript-files/base base paths (compress! yui-compress!))
  (define (helper path)
    (build-path base path))
  (open-javascript-files (map helper paths) compress!))

Now yui-compress! is extracted from the loader, and two versions of the loader are exposed: open-javascript-files and open-javascript-files/base, the latter taking an additional base argument. They both take an optional compress! argument that you can use to swap out yui-compress! if you want to use another javascript compressor.

Now the javascript! handler would look like:

;; the javascript! loader will depend on having a particular setting
(define (javascript! (scripts ($query* "s")) #:base (base (current-directory)))
  (raise
   (http-client-response->response
    (make-http-client-response "1.1"
                               200
                               "OK"
                               '(("Content-Type" . "text/javascript; charset=utf-8"))
                               (open-javascript-files/base scripts base))
    (lambda (x) x))))

And now we can add the module (we are calling this bzlib/jsmgr/shp) to our SHP instance:

;; required.shp
(require (planet bzlib/jsmgr/shp))

And the script.shp should contain the following:

;; script.shp
(javascript! #:base "some-path-here...")

Testing it out gives us a cryptic error:

Exception

The application raised an exception with the message:

open-input-file: cannot open input file: "../collects/planet/main.ss" (The system cannot find the file specified.; errno=2)

Although the error is cryptic, we know it arises from our SHP handler's require-modules! procedure. It turns out that because the procedure calls flatten, we have flattened (planet bzlib/jsmgr) into two separate collections and hence caused the error. The following fixes the error:

(define (require-modules! terms)
  (define (require! module)
    (namespace-require module))
  (define (helper listof-modules)
    (parameterize ((current-namespace handler-namespace))
      (for-each (lambda (modules)
                  (for-each require! modules))
                listof-modules)))
  (helper (map cdr (filter require-exp? terms))))

Now the Javascript Manager is available through SHP.

Thursday, August 20, 2009

Managing your Javascripts through SHP

If you are doing extensive AJAX development (which many people are these days), you probably are writing quite a bit of OOP Javascript code, along with using libraries such as Prototype, jQuery, Dojo, etc.

The trouble with javascript is that it lacks the modern module-management facilities we have come to expect from programming languages, especially given its importance in the browser world. You either develop all of your code in one large script, or you risk the browser having to download numerous small scripts, which is inefficient. Furthermore, if you plan on using javascript compressors such as the YUI compressor, any update to the source script means you have to recompress the code manually.

It would be nice to have a tool that can help with the above issues:
  • allows you to code in multiple small scripts, but automatically combine the smaller scripts into a single script as you specify
  • automatically helps you recompress the code if you make changes to them, so you just need to worry about the source (which will not be served directly unless specified)
Let's look at what we need to accomplish the goal:
  • ability to load and combine multiple files into a single response
  • ability to detect timestamp of the source and the compressed scripts, and re-compress if the source script's timestamp is newer than the compressed scripts
You might have come across "script builders" such as those of jQuery UI and MooTools - you can use them as a mental model for the goal (for your own scripts!).

Compressor Choice

Currently the best choice of javascript compressor appears to be the YUI compressor (if you are not trying to obfuscate the scripts too much), as it provides the best combination of compression ratio and decompression speed (obfuscators such as Packer lose speed during the unpack phase), so it is our choice of compressor. It is licensed under the BSD license, so we can redistribute the binary without issues. In the future we might add support for other compressors, but for now that's out of scope.

Prerequisite

YUI compressor requires Java (version >= 1.4), which you'll have to install separately and make it available in the system path.

First Attempt - a Single Javascript File

Let's see if we can get a simple version of this up and running - let's first just serve a javascript file sitting somewhere in our system. No compression. It is quite straightforward.

(define (make-loader root-path)
  (lambda (path)
    (open-input-file (build-path root-path path))))

The basic signature is make-loader, which takes a root path and returns a procedure we can then use to load a file. If the file does not exist an appropriate error will be thrown (which we'll have to handle when integrating into servlets). We'll also have to handle setting the content-type during integration.

Second Attempt - Multiple Javascript Files

Now let's see if we can serve multiple javascript files. PLT Scheme makes this easy with input-port-append:

(require scheme/port)
(define (make-loader root-path)
  (lambda (path . paths)
    (define (helper path)
      (open-input-file (build-path root-path path)))
    (apply input-port-append #t (map helper (cons path paths)))))

All of the input ports are now appended together: once we exhaust the data from the first port, we retrieve the data from the second port, and so on.
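
A quick way to see this behavior with string ports instead of files (a standalone sketch, not part of the loader):

```scheme
(require scheme/port)

;; append two string ports; reading drains the first, then the second
(define combined
  (input-port-append #t
                     (open-input-string "var a = 1;\n")
                     (open-input-string "var b = 2;\n")))

(port->string combined) ;; => "var a = 1;\nvar b = 2;\n"
```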

Adding Compression

So far, so good. Let's now try to see if we can add compression in here.

Let's just say that the YUI compressor lives in the same directory as the module, and Java is in the path. And each compressed script's path is the same as the source path, with ".min" appended:

(require scheme/system scheme/string)
(define (make-loader root-path)
  (lambda (path . paths)
    (define (min-path-helper full-path)
      (string->path (string-append (path->string full-path) ".min")))
    (define (path-helper path)
      (build-path root-path path))
    (define (helper path)
      (let* ((path (path-helper path))
             (min-path (min-path-helper path)))
        (when (or (not (file-exists? min-path))
                  (< (file-or-directory-modify-seconds min-path)
                     (file-or-directory-modify-seconds path)))
          (system (string-join (list "java"
                                     "-jar"
                                     (path->string (build-path (this-expression-source-directory)
                                                               "yuicompressor.jar"))
                                     "-o" (path->string min-path) (path->string path))
                               " ")))
        (open-input-file min-path)))
    (apply input-port-append #t (map helper (cons path paths)))))

The only thing we need to make sure to handle is the error that could result from system - it returns #f when the call fails, so we need something a bit more verbose (including error messages) in order to raise an error about the failed compression (it is best to have such error messages show up during development, rather than in production).
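
A sketch of such a wrapper (the name checked-compress! is made up here) - since system returns #f on failure, we can raise a descriptive error at compression time:

```scheme
;; wrap a compress! procedure so a failed system call raises an
;; error instead of being silently ignored
(define (checked-compress! compress!)
  (lambda (path min-path)
    (unless (compress! path min-path)
      (error 'compress! "failed to compress ~a into ~a" path min-path))))
```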

Alright - the next step is to integrate this into SHP. Stay tuned.

Wednesday, August 19, 2009

Continuing of HTTP Call & Proxy Integration (4) - Customize Responses

If you are doing AJAX you know that the XMLHttpRequest object is quite brittle - different browsers offer different behaviors and bugs. One such bug is that IE does not properly handle the more complicated cousins of text/xml, such as application/atom+xml. In such situations it would be nice to have the proxy customize the response before it is passed back to the browser.

The simplest way of accomplishing content-type conversion is to take a keyword parameter:

(define (proxy! (url ($pathinfo)) (headers ($headers))
                #:content-type (content-type (lambda (x) x)))
  (define (helper url headers)
    (raise
     (http-client-response->response
      (case ($method)
        ((post) (http-post url (request-post-data/raw ($request)) headers))
        ((get) (http-get url headers))
        (else (error 'proxy "proxy method ~a not supported" ($method))))
      content-type)))
  (call-with-values
   (lambda ()
     (normalize-url+headers url headers))
   helper))

This means we should also modify http-client-response->response:

(define (http-client-response->response r content-type)
  (define (get-content-type r)
    (define (helper header)
      (string->bytes/utf-8
       (content-type (if (not header)
                         "text/html; charset=utf-8"
                         (cdr header)))))
    (helper (assf (lambda (key)
                    (string-ci=? key "content-type"))
                  (http-client-response-headers r))))
  (define (normalize-headers r)
    (map (lambda (kv)
           (make-header (string->bytes/utf-8 (car kv))
                        (string->bytes/utf-8 (cdr kv))))
         (http-client-response-headers r)))
  (define (make-generator)
    (lambda (output)
      (let loop ((b (read-bytes 4096 r)))
        (cond ((eof-object? b)
               (void))
              (else
               (output b)
               (loop (read-bytes 4096 r)))))))
  (make-response/incremental (http-client-response-code r)
                             (string->bytes/utf-8 (http-client-response-reason r))
                             (current-seconds)
                             (get-content-type r)
                             (normalize-headers r)
                             (make-generator)))

And now our proxy can be customized - for example, to rewrite extended XML content types:

;; proxy.shp
(proxy! #:content-type
        (lambda (ct)
          (if (regexp-match #px"^application/.+xml.*$" ct)
              "text/xml"
              ct)))

Tuesday, August 18, 2009

Continuing of HTTP Call & Proxy Integration (3)

To satisfy the requirements of the proxy, the following serves as the basic skeleton:

(define (proxy! (url ($pathinfo)) (headers ($headers)))
  (case ($method)
    (("post") (http-post (url-helper url) (request-post-data/raw ($request)) headers))
    (("get") (http-get (url-helper url) headers))
    (else (error 'proxy "proxy method ~a not supported" ($method)))))

As we can see, we already have most of the HTTP connection code defined, except we need to ensure all of the custom headers, as well as the user/password credentials are properly retrieved. For that we need to massage the url and the headers.

Filter Out Non-Custom Headers

The first step is to filter out the non-custom headers:

;; a custom header starts with the "bzl-" prefix
(define (custom-header? header)
  (string-ci=? "bzl-" (substring (car header) 0 (min (string-length (car header)) 4))))

(define (convert-header header)
  (cons (regexp-replace #px"^bzl-(.+)$" (car header) "\\1")
        (cdr header)))

(define (headers->custom-headers headers)
  (map convert-header (filter custom-header? headers)))


We also want to keep some regular headers such as Content-Type and Content-Length:

(define (headers->custom-headers headers)
  (append (filter (lambda (header)
                    (or (string-ci=? (car header) "content-type")
                        (string-ci=? (car header) "content-length")))
                  headers)
          (map convert-header (filter custom-header? headers))))

Extract User Credential from URL

In case the user credentials are supplied in user:password form in the path, we want to extract them and push them onto the headers:

(define (url-helper url)
  (cond ((url? url) url)
        ((string? url) (string->url url))
        (else ;; this is based on pathinfo...
         (let ((url (string->url (string-join url "/"))))
           ;; keep the url query
           (set-url-query! url (url-query ($uri)))
           url))))

(define (url->auth-header url)
  ;; helper removes the additional \r\n appended by base64-encode
  (define (remove-extra-crlf auth)
    (substring auth 0 (- (string-length auth) 2)))
  (if (not (url-user url))
      #f
      (cons "Authorization"
            (string-append "Basic "
                           (remove-extra-crlf
                            (bytes->string/utf-8
                             (base64-encode
                              (string->bytes/utf-8 (url-user url)))))))))

And finally we modify proxy! to use the newly generated url & headers:

(define (normalize-url+headers url headers)
  (let ((url (url-helper url))
        (headers (headers->custom-headers headers)))
    (let ((auth (url->auth-header url))) ;; in case the auth info is passed in via url.
      (let ((headers (if (not auth) headers (cons auth headers))))
        (values url headers)))))

(define (proxy! (url ($pathinfo)) (headers ($headers)))
  (define (helper url headers)
    (case ($method)
      (("post") (http-post url (request-post-data/raw ($request)) headers))
      (("get") (http-get url headers))
      (else (error 'proxy "proxy method ~a not supported" ($method)))))
  (call-with-values
   (lambda ()
     (normalize-url+headers url headers))
   helper))

Convert the Responses

Since http-get and http-post return an http-client-response instead of a response/basic, we need an adapter to convert one type to the other; once converted, we can handle it the same way as redirect! and raise the response/basic.

(define (http-client-response->response r)
  (define (get-content-type r)
    (define (helper header)
      (if (not header)
          #"text/html; charset=utf-8"
          (string->bytes/utf-8 (cdr header))))
    (helper (assf (lambda (key)
                    (string-ci=? key "content-type"))
                  (http-client-response-headers r))))
  (define (normalize-headers r)
    (map (lambda (kv)
           (make-header (string->bytes/utf-8 (car kv))
                        (string->bytes/utf-8 (cdr kv))))
         (http-client-response-headers r)))
  (define (make-generator)
    (lambda (output)
      (let loop ((b (read-bytes 4096 r)))
        (cond ((eof-object? b)
               (void))
              (else
               (output b)
               (loop (read-bytes 4096 r)))))))
  (make-response/incremental (http-client-response-code r)
                             (string->bytes/utf-8 (http-client-response-reason r))
                             (current-seconds)
                             (get-content-type r)
                             (normalize-headers r)
                             (make-generator)))

(define (proxy! (url ($pathinfo)) (headers ($headers)))
  (define (helper url headers)
    (raise
     (http-client-response->response
      (case ($method)
        ((post) (http-post url (request-post-data/raw ($request)) headers))
        ((get) (http-get url headers))
        (else (error 'proxy "proxy method ~a not supported" ($method)))))))
  (call-with-values
   (lambda ()
     (normalize-url+headers url headers))
   helper))

Fixing the URL

The code so far almost works, except that empty strings are stripped from pathinfo, so when we reconstruct the pathinfo into a url we get http:/www.google.com instead of http://www.google.com. We fix that by ensuring an additional empty path segment is reintroduced:

(define (join-url segments)
  (define (helper segments)
    (string-join segments "/"))
  (cond ((null? segments) (error 'join-url "invalid segments: ~a" segments))
        ((string-ci=? (car segments) "http:")
         (helper (list* (car segments) "" (cdr segments))))
        ((string-ci=? (car segments) "https:")
         (helper (list* (car segments) "" (cdr segments))))
        (else ;; no scheme supplied - default to http and keep all the segments
         (helper (list* "http:" "" segments)))))
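To see why the reintroduced empty segment matters, here is the same reconstruction sketched in Python (the function name is ours; it assumes pathinfo arrives with its empty strings already stripped):

```python
def join_url(segments):
    """Rebuild a URL string from pathinfo segments, restoring the '//' after the scheme."""
    if not segments:
        raise ValueError("invalid segments")
    if segments[0].lower() in ("http:", "https:"):
        # reintroduce the empty segment so the join yields "http://host", not "http:/host"
        return "/".join([segments[0], ""] + segments[1:])
    return "/".join(["http:", ""] + segments)  # no scheme supplied - default to http
```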

(define (url-helper url)
  (cond ((url? url) url)
        ((string? url) (string->url url))
        (else ;; this is based on pathinfo...
         (let ((url (string->url (join-url url))))
           ;; keep the url query
           (set-url-query! url (url-query ($uri)))
           ;; debug output
           (display (format "~a\n" (url->string url)) (current-error-port))
           url))))


With these changes it is now easy to add proxy capability to SHP:

;; proxy
(proxy!)


Voila - now you can use the proxy to wrap internal or 3rd-party web services!

Continuing HTTP Call & Proxy Integration (2)

In the last post we built a rudimentary working HTTP client. It should not take much work to create a rudimentary proxy out of it.

Basically - we want to pass the request through to the destination web service with as little fuss as possible.

By default, the proxy call should look something like the following:

http://host:port/http://target/path
http://host:port/https://target2/path2

But of course we also want to allow for hardcoded path to possibly map directly to another url, for example:

http://host:port/google => http://www.google.com/

In order for the above to work, we would want to have scripts called http: and https: located at the SHP root path with the following syntax:

;; http: or https:
(proxy! (pathinfo->url))

But unfortunately http: and https: are not valid file names on Windows, so the above technique cannot be used there (it should work on Linux). We'll settle for the following instead:

http://host:port/proxy/http://target/path
http://host:port/proxy/https://target2/path2

Then the code for proxy should be:

;; proxy
(proxy! (pathinfo->url (cdr ($pathinfo))))

Custom Headers

Many web services (such as Twitter, Del.icio.us, and Blogger) do not accept all of the client headers in their APIs, since they do not expect browsers to make direct connections and hence do not need all of the headers a browser issues.

What this means is that we should filter out most of the headers except the needed ones. A simple way to determine what's needed is via custom headers; we'll create our custom headers with the prefix bzl-.

So, any header starting with bzl- shall be passed on to the target, and the others (except mandatory headers such as Content-Type and Content-Length) should be filtered out. This allows us fine-grained control.

Since all of the headers are already available in the $headers parameter, the conversion can be automatic.
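A sketch of that filtering logic in Python for illustration (the bzl- prefix is the one described above; the function name is hypothetical):

```python
KEEP = {"content-type", "content-length"}
PREFIX = "bzl-"

def proxy_headers(headers):
    """Keep mandatory headers; pass bzl-* headers through with the prefix stripped."""
    out = []
    for key, value in headers:
        if key.lower() in KEEP:
            out.append((key, value))
        elif key.lower().startswith(PREFIX):
            out.append((key[len(PREFIX):], value))
        # everything else (User-Agent, Accept, ...) is dropped
    return out
```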

Authentication & Passing User Credentials Securely

A special header is the Authorization header, which contains the user's login and password. HTTP Basic is inherently insecure, and HTTP Digest requires a challenge and a response - for now we'll only handle passing HTTP Basic.

There are a couple of ways of passing user credentials - one is via the above mentioned custom header mechanism (i.e. bzl-Authorization), and the other is via the user field of the URL value, i.e.

http://user:password@host/path...

Then within our proxy it looks like:

http://host:port/proxy/http://user:password@host/path...

Since either way is insecure without SSL, it does not really matter which is preferred. But we do need to make sure not to use the regular Authorization header, since that should be reserved for potentially authenticating against the proxy itself.
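For reference, HTTP Basic just base64-encodes the user:password pair into the Authorization header. A Python sketch (the function name is ours; note that Python's b64encode, unlike some base64 encoders that append a trailing CRLF, returns the bare token):

```python
import base64

def basic_auth_header(userinfo):
    """Turn 'user:password' into an HTTP Basic Authorization header pair."""
    token = base64.b64encode(userinfo.encode("utf-8")).decode("ascii")
    return ("Authorization", "Basic " + token)
```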

Review of Requirements

Let's reiterate the requirements:
  • a pass through of request to the proxied target, and a pass through of responses
  • take the URL from the pathinfo of the request
  • very simple usage within SHP (i.e. no fancy setup of anything)
  • a clean way of passing through headers
  • a clean (& mostly secure) way of passing credentials for the 3rd party web service
Let's see how we can develop such a proxy to match the requirements. To be continued...

Continuing Integration Features - HTTP Call & Proxy

In order to fulfill the role of an integrator, we want to add the ability to make HTTP calls to other services (either within your own network or 3rd-party web services). The reason such capability is needed is simple: unless we plan on developing everything under the sun, we'll need to interface with other applications that provide crucial and non-trivial services to speed up development.

PLT Scheme already provides basic URL fetching capability through the net/url module, which can serve as a basis for our development. Let's get started.

HTTPS Calls

The first thing to note is that net/url does not support SSL connections, so we'll need to add the support ourselves.

(require openssl
         scheme/tcp
         net/url
         mzlib/trace
         scheme/contract)

;; https->impure-port
;; base function to handle the connection over https
(define (https->impure-port method url (headers '()) (data #f))
  (let-values (((s->c c->s) (ssl-connect (url-host url)
                                         (if (url-port url) (url-port url) 443)))
               ((path) (make-url #f #f #f #f
                                 (url-path-absolute? url)
                                 (url-path url)
                                 (url-query url)
                                 (url-fragment url))))
    (define (to-server fmt . args)
      (display (apply format (string-append fmt "\r\n") args) c->s))
    ;; (trace to-server)
    (to-server "~a ~a HTTP/1.0" method (url->string path))
    (to-server "Host: ~a:~a" (url-host url)
               (if (url-port url) (url-port url) 443))
    (when data
      (to-server "Content-Length: ~a" (bytes-length data)))
    (for-each (lambda (header)
                (to-server "~a" header))
              headers)
    (to-server "")
    (when data
      (display data c->s))
    (flush-output c->s)
    (close-output-port c->s)
    s->c))

;; get-impure-port/https
;; a GET version of https call
(define (get-impure-port/https url (headers '()))
  (https->impure-port "GET" url headers))

;; post-impure-port/https
;; a POST version of https call
(define (post-impure-port/https url data (headers '()))
  (https->impure-port "POST" url headers data))

The above provides get-impure-port/https and post-impure-port/https, which mimic net/url's get-impure-port and post-impure-port. We are only interested in the impure ports because we want to be able to manipulate the headers; it is trivial to layer pure ports on top.
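To illustrate the pure/impure distinction: an impure port still carries the status line and headers, while a pure body is everything after the blank separator line. A minimal Python sketch of that split (names are illustrative):

```python
def split_response(raw):
    """Split a raw HTTP response into its header block and its pure body."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("malformed response: no header/body separator")
    return head, body
```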

With the above we can then create an abstraction over both https & http url fetching:

;; http-client-response holds all of the metadata (code, status, headers)
;; as well as the data stream
(define-struct http-client-response (version code reason headers input)
  #:property prop:input-port 4)

;; read-http-status
;; parse the http-status of the response.
(define (read-http-status in)
  (define (helper match)
    (if match
        (list (cadr match) (caddr match) (cadddr match))
        match))
  (define (reader in)
    (read-folded-line in))
  ;; (trace reader)
  (helper (regexp-match #px"^HTTP/(\\d\\.\\d)\\s+(\\d+)\\s+(.+)$" (reader in))))

;; a helper over the make-http-client-response
(define (*make-http-client-response in)
  (define (helper version code reason)
    (make-http-client-response version (string->number code) reason (read-headers in) in))
  (let ((status (read-http-status in)))
    (if (not status)
        (error 'make-http-client-response "invalid http response")
        (apply helper status))))

;; helper over url conversion
(define (url-helper url)
  (if (string? url)
      (string->url url)
      url))

;; converting headers over to headers that can be used by get/post-impure-port
(define (headers-helper headers)
  (map (lambda (kv)
         (format "~a: ~a" (car kv) (cdr kv)))
       headers))

;; http-get
;; abstraction over http GET
(define (http-get url (headers '()))
  (define (helper url)
    (*make-http-client-response
     ((if (string-ci=? (url-scheme url) "https")
          get-impure-port/https
          get-impure-port)
      url (headers-helper headers))))
  (helper (url-helper url)))

;; http-post
;; abstraction over http POST
(define (http-post url data (headers '()))
  (define (helper url)
    (*make-http-client-response
     ((if (string-ci=? (url-scheme url) "https")
          post-impure-port/https
          post-impure-port)
      url data (headers-helper headers))))
  (helper (url-helper url)))

The abstraction more or less follows net/url's approach, except that it wraps around both the http & https procedures, adds convenient header handling, and provides an abstraction over the http response that parses all of the metadata yet still retains the values (unlike get/post-pure-port, which get rid of the status and headers).

Reading and Parsing RFC822 Headers

RFC822-compliant headers require non-trivial treatment. While the concept of headers made up of key/values appears simple, in practice they are not, for many historical reasons that are well captured in the RFCs. Below is a list of things we need to be aware of:
  • RFC822 headers might span multiple lines in the style of a "folded line" (the line continues if the following line starts with non-terminating whitespace, which includes #\space and #\tab), and it might keep going indefinitely
  • The header values might contain comments (enclosed in parentheses), which are nestable; comments should generally be ignored but might not be (for example - many servers generate date fields with a comment to denote the timezone)
  • Because of the traditional SMTP line-width limitations, generating headers might require breaking a line into folded lines along the width limitation (which generally is around 70)
  • Also, due to the traditionally ASCII-oriented nature of network protocols, there are two additional encodings defined (called Q and B) that parsers should be able to handle in order to correctly parse headers
It would be cool to support all of the above capabilities, but given that we are only using headers in HTTP situations (which generally are not subject to the SMTP line & encoding limits), we'll limit ourselves to the following for the immediate purpose:
  • generating a single line per header
  • parse a folded line per header, but do not handle encodings
  • assume the character set to be UTF-8 (otherwise throw errors)
The following handles the folded line:

;; read-folded-line
;; read folded line according to RFC822.
(define (read-folded-line in)
  (define (folding? c)
    (or (equal? c #\space)
        (equal? c #\tab)))
  (define (return lines)
    (apply string-append "" (reverse lines)))
  (define (convert-folding lines)
    (let ((c (peek-char in)))
      (cond ((folding? c)
             (read-char in)
             (convert-folding lines))
            (else
             (helper (cons " " lines))))))
  (define (helper lines)
    (let ((l (read-line in 'return-linefeed)))
      (if (eof-object? l)
          (return lines)
          (let ((c (peek-char in)))
            (if (folding? c) ;; keep going, but first convert the folding whitespace...
                (convert-folding (cons l lines))
                ;; otherwise we are done...
                (return (cons l lines)))))))
  (helper '()))
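The unfolding behavior can be cross-checked with a compact Python sketch that operates on a whole header block at once (the function name is ours):

```python
def unfold_headers(raw):
    """Join RFC822 folded continuation lines (leading space/tab) into logical lines."""
    logical = []
    for line in raw.split("\r\n"):
        if line[:1] in (" ", "\t") and logical:
            # a run of leading folding whitespace collapses to a single space
            logical[-1] += " " + line.lstrip(" \t")
        else:
            logical.append(line)
    return logical
```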

Then reading all of the headers is a matter of reading folded lines until we encounter an empty line (which is either EOF or the separator between the headers and the data).

;; a header is simply a pair of strings...
(define (header? h)
  (and (pair? h)
       (string? (car h))
       (string? (cdr h))))

;; header->string: does not generate terminator
(define (header->string h)
  (format "~a: ~a" (car h) (cdr h)))

;; string->header
;; convert a string into a header?
(define (string->header line)
  (define (helper match)
    (if match
        (cons (cadr match) (caddr match))
        #f))
  (helper (regexp-match #px"^([^:]+)\\s*:\\s*(.+)$" line)))

;; reading headers.
;; RFC822 headers are actually non-trivial for parsing purposes:
;; they first require the handling of "folded line", i.e. a line
;; followed by leading whitespace continues the previous header.
(define (read-headers in)
  (define (return lines)
    (map string->header lines))
  (define (helper lines)
    (let ((l (read-folded-line in)))
      (if (string=? l "") ;; we are done...
          (return lines)
          (helper (cons l lines)))))
  (helper '()))

The above should cover the majority of cases that we'll encounter during interactions with web services, and we'll fix issues as we encounter them.

To be continued...

Monday, August 17, 2009

Introducing SHP 0.1: Scheme Hypertext Processor - A Tutorial

SHP v0.1 has just been released, and this is a tutorial for using SHP.

Introduction

SHP is a PHP/JSP-like framework for PLT Scheme web-server. Instead of developing in servlets, you develop in shp scripts that are dynamically loaded and evaluated. The main benefits for developing in shp scripts instead of servlets are:
  • instant refresh of changes - you do not need to reload servlets manually anymore
  • file-based dispatching and URL mapping - you get URL mapping to the underlying script location, which takes out a big chunk of URL mapping work if you like to have "pretty" URLs
As SHP's underlying platform is PLT Scheme and web-server, you retain all of the benefits of developing in scheme instead of in PHP as well, including:
  • writing in xexpr and use quasiquotes instead of html snippets and PHP SSIs
  • scope safety - no global or superglobal variables roaming around somewhere, and every script is compiled into its own functions, so all variables will either need to be parameters or explicitly passed as arguments.
  • much safer language than PHP - you'll never have a mysterious variable magically appearing due to typos


License

SHP is released under LGPL.

Prerequisites

SHP depends on PLT Scheme version 4.2.1 or later. Make sure you have it installed.

Installation

SHP is easy to install within PLT Scheme, just open either DrScheme or mzscheme and type the following in REPL:

(require (planet bzlib/shp))

PLT Scheme automatically takes care of downloading the latest version. If you want to download a specific version, just do

(require (planet bzlib/shp:<major>:<minor>))

and substitute <major> & <minor> for the appropriate planet version numbers.

Example Site

SHP comes with an example site - you can browse and take a look at the source code to get a sense of how to develop in SHP. In general, you can just think of developing in SHP as developing in PHP (if you have such experience).

The example is located under the example/ subpath of the installed location of SHP package. If you go to the example directory you will find the following:
  • web.ss - for starting a single servlet web server
  • servlet.ss - the servlet wrapper for SHP
  • test-servlet.ss - an example servlet that is being "wrapped" by SHP
  • shp - the example shp script directory; contains the scripts


You can study all of the source code in the example directory for developing your own sites. The rest of the tutorial goes through the process and introduces the currently available features of SHP.

To start the example site, just do the following:

(require (planet bzlib/shp/example/web))

And point your browser to http://<site>:8001/.

Setting Up a Site

If you have been developing web-server servlets then you already know how to get started. Use the following as your start procedure:

(define start
  (make-shp-handler <path>))


make-shp-handler generates your start procedure for you. You can specify the following parameters:

  • path: the main path of your shp script directory
  • #:default: the path to the "default" script, which by default points to "index.shp" (like index.html or index.php).
  • #:not-found: the path to the "not found" script. Unlike #:default, there is currently only one "not found" script per site (not per directory).
  • #:required: an optional script for holding a list of common required modules
  • #:topfilter: an optional script for handling pre & post processing for all of the requests.


We'll discuss the rest of the parameters shortly.

Developing Your First Script

It is easy to develop a hello world example - the following is your first script, call it index.shp:

;; index.shp
;; the first hello world example:
`(p "Hello world - this is an shp script")

Save the above to your SHP script directory, and refresh your browser.

Quasiquotes and Script Evaluations

Since this is a scheme xexpr - you can use quasiquotes to include snippets of scheme code for dynamic execution. Below we add the current timestamp to our hello world example:

;; index.shp
;; the first hello world example:
`(p "Hello world - this is an shp script and the timestamp is: "
,(number->string (current-seconds)))

Each script gets compiled into one scheme procedure.

Requiring External Modules & #:required

The goal of SHP is to simplify the writing of presentational layers in web-server, not to replace modules; PLT Scheme's module system is probably one of the most advanced available. Hence, the best practice is to do most of your logic and business layers in PLT Scheme modules, and then require those modules from within SHP scripts. The following requires scheme/string into the hello world example so we can use string-join.

;; index.shp
;; the first hello world example:
(require scheme/string)
`(p "Hello world - this is an shp script and the timestamp is: "
,(number->string (current-seconds))
"."
(br)
"The pathinfo are: "
,(string-join '("abc" "def" "ghi") "/"))

All of the required modules are available to all of the scripts. But the evaluation order still applies, so if you call a script that expects a module to be available without first calling the script that loads the module, you'll get a compilation error. To solve the issue of evaluation order, you can put all of the require statements into a single script and pass that into make-shp-handler via the #:required parameter.

(make-shp-handler <path> #:required <required-script-path> ...)

The require statement differs from regular PLT Scheme in one regard - it does not handle the extra require options such as only-in, prefix-in, etc. So make sure your require statements use plain module paths only.

Handling Not Found

The #:not-found option for make-shp-handler takes a script that gets executed when the user enters an invalid URL that does not resolve to any particular script. This option is currently required. Below is an example of a "not found" script:

;; this script gets executed when user enters in an invalid URL
`(p "Sorry we did not find what you are looking for. Please check your URL and try again")
Accessing Environment Variables

The following environment (request, response, cookies, etc) are available for this release:
  • ($uri) - returns the url of the request
  • ($pathinfo) - return the extra path attached to the script.
  • ($header <key>) - return the header value in string, or false if the header is not found
  • ($query <key>) - return the first value (either in uri query or form post) in string, or false if not found
  • ($query* <key>) - return the value as a list - null if no value found. This is useful for retrieving multiple values
  • ($status) or ($status <new-status>) - retrieve or sets the status, which must be one of the following values - 'continue, 'switching-protocols, 'ok, 'created, 'accepted, 'non-authortative-information, 'no-content, 'reset-content, 'partial-content, 'multiple-choice, 'moved-permanently, 'found, 'see-other, 'not-modified, 'use-proxy, 'temporary-redirect, 'bad-request, 'unauthorized, 'payment-required, 'forbidden, 'not-found, 'method-not-allowed, 'not-acceptable, 'proxy-authentication-required, 'request-timeout, 'conflict, 'length-required, 'precondition-failed, 'request-entity-too-large, 'request-uri-too-long, 'unsupported-media-type, 'request-range-not-satisfied, 'expectation-failed, 'internal-server-error, 'not-implemented, 'bad-gateway, 'service-unavailable, 'gateway-timeout, 'version-not-supported
  • (header! <key> <value>) - sets a response header with key & value
  • ($content-type) or ($content-type <new-content-type>) - get or sets the content-type of the response, defaults to text/html
  • ($redirect <new-url> <status>) - redirect to the new url. The status should be one of the following - 'found, 'see-other, or 'temporary-redirect (default). The headers are automatically passed along with the redirect.
  • ($headers) or ($headers <new-headers>) - this parameter holds the underlying headers. Since we already have ($header <key>) and (header! <key> <val>), direct use of ($headers) is discouraged, except to set it to empty list.
  • (cookie-ref <key>), (cookie-set! <key> <val>), (cookie-del! <key>) - for manipulating cookies
  • ($cookies) - this is the underlying parameter that holds all of the cookies. Similar to ($headers) its direct use is generally discouraged


Including Other Scripts

An important feature in PHP is the ability to include other scripts for execution, and in SHP we can also do the same. To do so, let's include a secondary script called foo.shp in our hello world example:

;; foo.shp
;; returns the value of a + b
(:args a b)
(number->string (+ a b))

As you can see - foo.shp takes two arguments, a and b, declared via the :args directive, adds the two values, and returns the result. Let's call foo.shp from within index.shp:

;; index.shp
;; the first hello world example:
(require scheme/string)
`(p "Hello world - this is an shp script and the timestamp is: "
,(number->string (current-seconds))
"."
(br)
"The pathinfo are: "
,(string-join '("abc" "def" "ghi") "/")
(br)
;; call /foo.shp
,(include! "/foo.shp" 5 10))

include! handles the magic of calling the script (which takes an absolute path as if mapped from the URL). Since each script gets compiled into its own function and has its own scope, there are no shared variables and variables must be either created as parameters (so they are available to all scripts) or passed as parameters in the include! call.

SHP automatically ensures that any script requiring arguments (like foo.shp) cannot be called directly by the user, and this provides the added bonus of hiding library scripts - something PHP cannot do!

Toplevel Filters for Managing Global Parameters

The best place to hold any global parameterizations is with the #:topfilter option of make-shp-handler, which takes a filter script path.

A filter is a function that takes one parameter, which is a compiled form of the script. The following is a template for a filter:

;; example filter template
(:args inner) ;; inner is the handler
(parameterize ((parameter-a (some-value)) ...)
  (inner))

Thus you can use #:topfilter to initialize and control any parameters (say, a database connection handle) so all of the scripts have access to them. Currently make-shp-handler can only handle one filter, but in the future we can have multiple filters, which will open up more possibilities for using filters!

Interoperating with web-server and servlets

Since the idea of SHP is to be a scripting language for rapidly developing UIs, it's important that it operates well within the context of existing infrastructures. Besides hooking into web-server's servlet system, it currently has a couple of other features for interacting with web-server.

Punting the request back to web-server

If you would like the web-server instead of SHP to handle a particular request - just issue (punt!). This comes in handy if a particular URL maps to a static file, and you want to serve it through web-server.

Example - if we want to punt everything under the /scripts/ path to web-server, we put the following script at /scripts/index.shp:

;; /scripts/index.shp
(punt!)

Now if you have a scripts directory that maps to /scripts/ in your web-server document root, those files will be served instead.

Wrapping Around Servlets

If you already have substantial development done in servlets, you can call them through shp as well, so you do not have to rewrite everything as shp scripts. Doing so is also simple: just have a particular script call (servlet! <module-path>), like what we have in the example site:

;; servlet.shp
(servlet! "test-servlet.ss")

The <module-path> value can be a path that's resolvable by the underlying dynamic-require. And as long as the result value can be wrapped (i.e. an xexpr), you do not need to do any additional modification.


That's it for now - folks! Would love to hear back on any thoughts, etc. Happy SHP'ing.

Packaging SHP and Releasing to Wild

At this particular point there is enough basic functionality in SHP that it's time to release it into the wild.

Releasing a package in PLT Scheme is a very straightforward process, which we will follow:
  1. organize all of the files into a single directory
  2. create an info.ss file to describe the metadata of the package
  3. use planet create to generate the PLT package file
  4. test the package file to ensure it is functional - use planet fileinject to install the package, and planet remove to deinstall the package
  5. submit the package to http://planet.plt-scheme.org
Specifically, SHP is developed on PLT Scheme 4.2.1, and due to differences in the functionality available in web-server compared with previous versions, SHP will not be compatible with older web-server versions without extra work; we'll only support SHP for 4.2.1 onwards.

SHP will be released with the next post, which will include a tutorial for using SHP that summarizes all previous posts so far.

Friday, August 14, 2009

Servlet Support (wrapping servlets)

It would be nice to have the script able to call existing servlets, so they can be integrated easily.

The basic idea would be like:

;; foo.shp
(servlet! "servlet.path")


Let's see how it would work:

(define (servlet! path (request ($request)) #:start (start 'start))
  ((dynamic-require path start) request))

Pretty straightforward!

The procedure allows loading of a servlet via dynamic-require, and it also offers the potential for you to rewrite the request before passing it to the servlet.

Now dynamic-require does not reload the servlet module when it changes, so that would be something we'll have to deal with in the future.

One thing you should make sure is that the servlet does not return an actual response object, as currently the code will not be converting the response object back so it can be wrapped by outer scripts. This is something we can attempt to address in the future. This means that if your servlet calls redirect-to, it will not work (you'll have to call redirect!).

Redirect Support

Perhaps we would want to control redirects from within the scripts - ideally something like below should work:

(redirect! url)

To enable such handling is straightforward:

(define (redirect! url
                   #:status (status 'temporary-redirect)
                   #:headers (headers '()))
  (define (header-helper h)
    (cond ((header? h) h)
          ((pair? h) (make-header (string->bytes/utf-8 (car h))
                                  (string->bytes/utf-8 (cdr h))))
          ((cookie? h)
           (cookie->header h))))
  (redirect-to (if (url? url) (url->string url) url)
               (case status
                 ((found) permanently)
                 ((see-other) see-other)
                 ((temporary-redirect) temporarily))
               #:headers (map header-helper headers)))

The wrapper over redirect-to simplifies type conversion between url and string, pairs of strings to header, and cookie to header, so you can do the following:

(redirect! "/another/path" #:headers `(("test" . "me") ,(cookie-ref! "session")))

As redirect-to only returns a response/full struct, we need to raise the struct so we can escape the calling stack when the script is being included. Let's catch the object at the request-handling level:

(define (handle-request server request)
  (parameterize (($pathinfo ($pathinfo))
                 ($server server)
                 ($request request))
    (parameterize (($cookies (init-cookies!)))
      (eval-script-if-changed! (shp-handler-required server))
      (with-handlers ((response/basic?
                       (lambda (r) r)))
        (make-response (include! (request-uri request)
                                 #:topfilter (shp-handler-topfilter server)
                                 #:partial? #t))))))


Now we can do a redirect from within the scripts! ;)

Uploaded Files Support

In PHP there is a $_FILES superglobal variable, which contains information about uploaded files that have been stored in the temporary folder. In PHP, files and the rest of the form variables are treated separately.

In web-server, uploaded files are treated just the same as the rest of the form variables, i.e. you can access them through $query. The advantage of this approach is that you do not have to think about files separately, but the disadvantage is that web-server holds all of the uploaded files in memory, so the total upload size is limited by the available RAM.

We are not going to solve that particular limit for web-server right now. Instead, we want to decide whether it makes sense to have a $files parameter separate from $query, so the API looks more like PHP's.

I think at this moment there is no benefit in treating files separately as PHP does, so we'll keep them in $query.
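For reference, here is a sketch of what sits underneath $query in raw web-server terms, using the binding accessors from web-server/http (the #"upload" field name and the save-upload! helper are hypothetical, made up for this example):

```scheme
(require web-server/http)

;; a hypothetical helper: pull an uploaded file out of the request
;; bindings and write it to disk
(define (save-upload! req)
  (let ((b (bindings-assq #"upload" (request-bindings/raw req))))
    (when (binding:file? b)
      ;; note: binding:file-content is the whole file body, already in
      ;; memory as bytes - this is the RAM limitation discussed above
      (with-output-to-file (bytes->string/utf-8 (binding:file-filename b))
        (lambda () (write-bytes (binding:file-content b)))))))
```

Whether we expose $files or not, this is the structure the bindings arrive in.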

Enabling Cookie Support

Alright - time to add more environment support, and this time it's the cookies.

web-server already provides cookie support, so it's just a matter of wrapping around it to get started.

Let's first define a parameter to hold cookies - this parameter will be populated at the start of the request, and be used to generate cookie headers during response time.

(define $cookies (make-parameter '()))

We will also use our own structure, since in web-server there are two types of cookie objects (client-cookie and cookie):

(define-struct cookie (name value domain path max-age comment secure?))


Extracting Cookies

web-server puts request-cookie parsing under web-server/http/cookie-parse; let's wrap around it so we have access to the cookies:

;; convert ws:client-cookie (web-server cookie) to cookie
(define (ws:client-cookie->cookie c)
  (make-cookie (ws:client-cookie-name c)
               (ws:client-cookie-value c)
               #f #f #f #f #f))

;; extract the cookies from the request object
(define (init-cookies!)
  (map (lambda (c)
         (let ((c (ws:client-cookie->cookie c)))
           (cons (cookie-name c) c)))
       (ws:request-cookies ($request))))

;; parameterize $cookies with the extraction
(define (handle-request server request)
  (parameterize (($pathinfo ($pathinfo))
                 ($server server)
                 ($request request))
    (parameterize (($cookies (init-cookies!)))
      ...)))

So once we are inside the scripts we have access to the cookies from the request.

Manipulating Cookies


We will want to access cookie values, set cookies, and delete cookies, as follows:


(define (cookie-set! name value
                     #:domain (domain #f)
                     #:path (path #f)
                     #:max-age (max-age #f)
                     #:comment (comment #f)
                     #:secure? (secure? #f))
  ($cookies (cons (cons name (make-cookie name value domain path max-age comment secure?))
                  ($cookies))))

(define (cookie-ref name)
  (let ((kv (assoc name ($cookies))))
    (if kv (cdr kv) #f)))

(define (cookie-del! name)
  (let ((kv (assoc name ($cookies))))
    ($cookies (if kv
                  (filter (lambda (c)
                            (not (equal? kv c)))
                          ($cookies))
                  ($cookies)))))
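A hypothetical script using these helpers (the "visitor" cookie name and values are made up for this example) might look like:

```scheme
;; remember the visitor on the first request, greet them afterward
(let ((visitor (cookie-ref "visitor")))
  (if visitor
      `(p "welcome back, " ,(cookie-value visitor))
      (begin (cookie-set! "visitor" "guest" #:path "/" #:max-age 86400)
             '(p "hello, first-time visitor"))))
```

Note that cookie-ref returns our cookie struct (or #f), so the value comes out via the cookie-value accessor.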

Generating Cookies for Response


Finally - we will send the cookies in $cookies parameter through the response object back to the client:

(define (cookie->header c)
  (ws:cookie->header (ws:make-cookie (cookie-name c)
                                     (cookie-value c)
                                     #:domain (cookie-domain c)
                                     #:path (cookie-path c)
                                     #:max-age (cookie-max-age c)
                                     #:comment (cookie-comment c)
                                     #:secure? (cookie-secure? c))))

(define ($cookies->headers)
  (map (lambda (kv)
         (cookie->header (cdr kv)))
       ($cookies)))


The IE Bug

If you want to set the cookie path to "/", watch out: web-server prints the path as a quoted Path="/", which IE mishandles - it wants an unquoted Path=/. So we have to do a manual conversion to work around this:

(define (cookie->header c)
  (define (helper c)
    (ws:make-header #"Set-Cookie"
                    (string->bytes/utf-8
                     (regexp-replace #px"Path=\\\"/\\\""
                                     (ws:print-cookie c)
                                     "Path=/"))))
  (helper
   (ws:make-cookie (cookie-name c)
                   (cookie-value c)
                   #:domain (cookie-domain c)
                   #:path (cookie-path c)
                   #:max-age (cookie-max-age c)
                   #:comment (cookie-comment c)
                   #:secure? (cookie-secure? c))))

Now we can enjoy cookies in all their glory - the good, the bad, and the ugly - just like other platforms :)

Thursday, August 13, 2009

Intermission - Refactoring Path and Dispatch Related Procedures

At this time we have multiple procedures that overlap in terms of functionality for path and dispatching, and it is probably a good time to take a look at how to refactor them.

The following are the path-related procedures:
  • normalize-path: converts a path to the underlying system path; depends on path->segments and shp-handler-path
  • url->path-segments: converts a url into path segments; overlaps partly with path->segments
  • normalize-partial-path: takes segments and converts them into an underlying path; depends on shp-handler-path
  • segments->path: converts segments back to the underlying path; calls normalize-partial-path
  • url->shp-path: wraps around url->path-segments and segments->path
And the dispatch-related procedures:
  • eval-script-if-changed!: evaluates the script if it has changed; takes in a script struct
  • shp-handler: evaluates the script from the top level; takes in a request
  • envoke-topfilter-if-available: calls the top filter if available; takes in a procedure
  • include!: calls the code; takes in a path
The dispatch procedures overlap even more than the path procedures, so let's start refactoring there.

First - url->path-segments can be refactored to:

;; url->path-segments - converts the url into path segments
;; (overlaps with path->segments)
(define (url->path-segments url (default "index.shp"))
  (filter (lambda (path)
            (not (equal? path "")))
          (map path/param-path (url-path url))))

But now url->path-segments looks a lot like path->segments, so the two can be refactored:

(define (path->segments path)
  (filter (lambda (path)
            (not (equal? path "")))
          (if (url? path)
              (map path/param-path (url-path path))
              (regexp-split #px"\\/" path))))

(define (url->shp-path url)
  (segments->path (path->segments url) #t))
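To make the merge concrete, here is what path->segments returns for both input types (assuming net/url's string->url; the example paths are made up):

```scheme
(path->segments "/articles/2009/intro")
;; => ("articles" "2009" "intro")
(path->segments (string->url "http://localhost:8080/articles/2009/intro"))
;; => ("articles" "2009" "intro")
```

The empty strings produced by leading or trailing slashes are filtered out in both branches, so the two input types normalize to the same segment list.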

We also rename normalize-partial-path to segments->partial-path to state its purpose more accurately.

The difference between include!, shp-handler, eval-script-if-changed!, and envoke-topfilter-if-available is basically their signatures: one takes in a path, another a request, one a script struct, and the last a procedure. It would be great to "normalize" them.

Note that shp-handler and envoke-topfilter-if-available depend on include!, so include! should be the base of our refactoring. Let's see if we can push the redundant features into include!.

Since include! passes its path to path->segments, which now also accepts a url, we can reduce the code in shp-handler if we fold envoke-topfilter-if-available in as well:

(define (include! path #:topfilter (topfilter #f) . args)
  (define (helper topfilter)
    (let ((proc (evaluate-script (segments->path (path->segments path) #f))))
      (if topfilter
          (topfilter (lambda () (apply proc args)))
          (apply proc args))))
  (helper (if topfilter
              (evaluate-script (segments->path (path->segments topfilter) #f))
              topfilter)))

So we can now get rid of envoke-topfilter-if-available, and modify shp-handler as such:

(define (handle-request server request)
  (parameterize (($pathinfo ($pathinfo))
                 ($server server)
                 ($request request))
    (eval-script-if-changed! (shp-handler-required server))
    (make-response (include! (request-uri request)
                             #:topfilter (shp-handler-topfilter server)
                             #:partial? #t))))

handle-request is extracted out of the struct, which now becomes:

(define-struct shp-handler (path default not-found required topfilter)
  #:property prop:procedure
  (lambda ($struct request)
    (handle-request $struct request)))

You'll notice that handle-request calls include! with an extra partial? parameter, which allows a partial path match, so include! now looks like:

(define (include! path
                  #:topfilter (topfilter #f)
                  #:partial? (partial? #f)
                  . args)
  (define (helper topfilter)
    (let ((proc (evaluate-script (segments->path (path->segments path) partial?))))
      (if topfilter
          (topfilter (lambda () (apply proc args)))
          (apply proc args))))
  (helper (if topfilter
              (evaluate-script (segments->path (path->segments topfilter) #f))
              topfilter)))

Notice that partial? is applied only to the script and not to the top filter, which needs a full path match.
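For example (with hypothetical script paths), the flag changes how far the resolver is willing to stretch:

```scheme
;; /articles/index.shp exists on disk:
(include! "/articles")                       ; exact match - works either way
;; the url carries extra segments beyond any existing script:
(include! "/articles/2009/08" #:partial? #t) ; may still resolve to /articles
```

How the leftover segments are consumed is an assumption on my part - presumably they end up available via the $pathinfo parameter seen in handle-request; the exact semantics live in segments->path.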

At this point only eval-script-if-changed! has not been folded in. A simple way to handle it is to allow include! to take a script struct as well:

(define (include! path
                  #:topfilter (topfilter #f)
                  #:partial? (partial? #f)
                  . args)
  (define (helper topfilter)
    (let ((proc (evaluate-script
                 (if (script? path)
                     (script-path path)
                     (segments->path (path->segments path) partial?)))))
      (if topfilter
          (topfilter (lambda () (apply proc args)))
          (apply proc args))))
  (helper (if topfilter
              (evaluate-script (segments->path (path->segments topfilter) #f))
              topfilter)))

(define (eval-script-if-changed! script)
  (when (file-exists? (script-path script))
    (let ((timestamp (file-or-directory-modify-seconds (script-path script))))
      (when (> timestamp (script-timestamp script))
        (set-script-timestamp! script timestamp)
        (include! script)))))

Finally - we can abstract (segments->path (path->segments ...)) into its own procedure:

(define (resolve-path path (partial? #t))
  (segments->path (path->segments path) partial?))

(define (include! path
                  #:topfilter (topfilter #f)
                  #:partial? (partial? #f)
                  . args)
  (define (make-script path partial?)
    (evaluate-script (if (script? path)
                         (script-path path)
                         (resolve-path path partial?))))
  (define (helper topfilter)
    (let ((proc (make-script path partial?)))
      (if topfilter
          (topfilter (lambda () (apply proc args)))
          (apply proc args))))
  (helper (if topfilter (make-script topfilter #f) topfilter)))


Alright! The code looks a lot better now - we can move on to adding more features!