Sunday, November 15, 2009

BZLIB/SESSION.plt - a Session Store via BZLIB/DBI.plt

I originally planned to write a series on the development of a session store, but it turned out that there aren't that many things to write about, so I am just going to release the code via planet. As usual, this is released under LGPL.

Installation

To download the planet package:

(require (planet bzlib/session)) 

The package comes with three separate database installation scripts (one each for sqlite, mysql, and postgresql). You can call them via the following:

;; installing to a sqlite database
(require (planet bzlib/session/setup/jsqlite)) 
(setup-session-store/jsqlite! <path-to-sqlite-db>) 

;; installing to a mysql database 
(require (planet bzlib/session/setup/jazmysql)) 
(setup-session-store/jazmysql! <host> <port> <user>  
                               <password> <schema>)

;; installing to a postgresql database 
(require (planet bzlib/session/setup/spgsql)) 
(setup-session-store/spgsql! <host> <port> <user>
                             <password> <database>) 
Known Issue: the script currently can only be run once and it assumes the table session_t does not exist in the target database. This will be rectified in the future.

Session ID

The session store uses uuid for session IDs. bzlib/base provides API for manipulation of uuids.

To create an uuid, just run (make-uuid). It can optionally takes in a parameter that are either an uuid in string, and then create the corresponding uuid structure (it can also takes in another uuid structure and make an equivalent uuid structure).

> (require (planet bzlib/base))
> (make-uuid)
#<uuid:4ba52eac-a0b4-415a-88f5-57d1fadd1aba>
> (make-uuid "4ba52eac-a0b4-415a-88f5-57d1fadd1aba")
#<uuid:4ba52eac-a0b4-415a-88f5-57d1fadd1aba>
> (make-uuid (make-uuid "4ba52eac-a0b4-415a-88f5-57d1fadd1aba"))
#<uuid:4ba52eac-a0b4-415a-88f5-57d1fadd1aba>

Monday, November 9, 2009

Building a Web Session Store (1)

Given that we have previously determined the need for a web session store even if we are using continuations, we'll go ahead and build it on top of our DBI stack, so the session data can be persisted as long as necessary.

Quick Word About Session Store Performance

One thing to note about session data is that its data usage is both read and write intensive, and such data can put strain on the database. It's write-intensive because with each request we'll extend the expiration time on the session itself, and it's read-intensive because the data is needed for every request, but it changes with every request.

For now we'll assume that our database is capable of handling such needs (and it will until you have a sufficiently large amount of traffic), but it's something to keep in mind. The nice thing of building the session logic on top of DBI is that when we need to deal with the performance issue, we can add logics into the DBI tier easily with developing a customer driver, for example, by integrating memcached as a intermediate store that'll flush out the changes to the database once a while instead of with every request.

Active Record

The active record pattern are not just for OOP fanatics - we schemers know that you can craft your own OOP with FP easily. In DBI today there is a base structure for active record definition:

(define active-record (handle id)) 
Such definition is a lot simpler than the usual OOP representations, which usually try to construct the data model in memory, along with dynamically constructed SQL statements. Although such OOP records provide simplicity for the simple cases, it has proven to be a leaky abstraction due to the object vs relational paradigm mismatch, as well as a significant performance overhead. Our simple definition will do us just fine right now.

What would our session API look like then?

;; expiration a julian-day 
;; store is a hash table 
(define-struct (session active-record) (expiration store) #:mutable) 

;; the session key/value manipulation calls... 
(define (session-ref session key (default #f)) ...) 
(define (session-set! session key val) ...) 
(define (session-del! session key) ...) 
;; the persistence calls 
(define (build-session handle ...) ...) 
(define (save-session! session) ...) 
(define (refresh-session! session) ...) 
(define (destroy-session! session) ...) 

Saturday, November 7, 2009

Web Sessions vs. Continuations

The "session info in web server applications" thread recently in plt-scheme list has an undertone that continuations are equivalent of web sessions as understood in other languages and frameworks. This undertone is highlighted by the lack of a session-like capability within the web-server collection that exists in other web frameworks.

This got me to think: are continuations equivalent of sessions?

The original intent (indicated in Shriram's research paper) of web-server's continuation is to correctly and succinctly model interactive web application's application flow. The paper sites examples of incorrectly implemented web apps that would do something like the following:
  1. user browse a list of goods
  2. user opens new window to get the details of goods A
  3. user goes back to original window
  4. user then opens another new window to get the details of goods B
  5. user then goes to goods A and click "Buy Now"
  6. incorrectly implemented app will cause the user to buy goods B instead of goods A
The traditional solution to the above interaction would be to use sessions, and since continuation models such interactions as well, there is no question that in this case continuations supplant the needs of sessions.

But for other scenarios involving sessions it might be more natural to model the computations by using the traditional session concepts.

For example - identifying the user across visits after significant time lapse (this is generally toggled by a "remember me" checkbox during login). Normally web sites accomplish this by persisting the user's authenticators via cookies or sessions.

This process is awkward to model with continuations, since the user likely come back to the site via a top level link that has no captured continuations, instead of digging up the last continuation url for the site, and the continuations might have expired between the visits if you use stateful servlets.

If you use web-server's stateless servlet language, an approach is probably to serialize the continuation into a cookie so it can model the above scenario, but you'll have to write your code in the stateless language or convert your code over, and it feels like a more complex solution compared to simply having a regular session capability. This is similar to using continuations to model non-interactive web links - it can work, but it does not follow Occam's razor.

Furthermore - if your site uses extensive ajax, your use of continuations will decrease, since Ajax models the interactions as well and supplants the needs for continuations. and in such case you might regain the needs for sessions that was reduced by continuations.

So, as far as I can tell, continuations is not equivalent to web sessions and do not eliminate the needs for session capabilities.

Friday, November 6, 2009

Using DBI to Run Scripts & Load Prepared Statement Scripts

There are a couple of utility functions that I have designed into DBI that was not previously discussed. They are both oriented to work with SQL scripts.

I don't know about you, but I like to write SQL statements in SQL scripts:

-- a sql query inside a sql script
select * 
  from table1;
instead of embedding them as strings in programming languages:

;; a sql query embedded in scheme code 
(exec handle "select * from table1")
- it just looks so much nicer.

If you have a ton of complex prepared statements, you'll find you'll have such statements littered everywhere, which makes them difficult to maintain.

Of course - a possible solution to this problem is to create a DSL so the SQL strings can be generated. We might eventually entertain such solution, but it's clear that any such DSL will be non-trivial.

A simpler but equally as powerful of an idea is to move all such queries into SQL scripts, and have the database handle load the scripts and convert them into prepared statements.

Loading Prepared Statement Scripts

You can supply a #:load parameter to the three RDBMS drivers' connect procedure:

;; example - connect to spgsql 
(define handle 
  (connect 'spgsql <regular-args> ... 
           '#:load <path-string? or (listof path-string?)>))

Thursday, November 5, 2009

Latest DBI.plt Available - Handling Last Inserted ID and Side Effects

The newest version of DBI (and the 3 RDBMS drivers) has now been made available on planet - it addresses the issue of side effects and last inserted id.

As usual - they are released under LGPL.

To download & install - use require:

(require (planet bzlib/dbi:1:3) 
         (planet bzlib/dbd-jazmysql:1:2) 
         (planet bzlib/dbd-jsqlite:1:3) 
         (planet bzlib/dbd-spgsql:1:2))
Unifying the side effects and the last inserted id turns out to be a non-trivial task - I have already voiced the options on the plt-scheme list, repeating here for the sake of completeness:
  • the underlying drivers returns different values for side effects
  • there might be legacy code that are utilizing the underlying side effect results - which can increase effort if porting to another database
  • not all drivers provide ways to retrieve all side effect values
  • last inserted id does not work consistently when dealing with the multi-records-insert-at-once scenario - luckily this is generally not how insertion is used
  • for postgresql - we'll need to derive the underlying table to sequence name mapping in order to determine the last inserted id, and this requires a separate query that you might not want to "pay for" unless you have needs for last inserted id
Based on the above design constraints, I have chosen the following:
  • make available multiple drivers for each database - one for different types of the side effect results
  • default the side effect results to an unified effect structure, which is inspired by jazmysql
  • provide last-inserted-id for the 3 RDBMS drivers (SPGSQL has a special requirement, described below)

The Different Side Effects

There are 3 separate side effect types:
  • past-thru-effect - this is the side effect available as a backward compatibility. Basically whatever the side effect objects are returned by the underlying driver are direct returned; and you can use the underlying driver's code to access the values
  • effect - this is the unified side effect object and the default going forward
  • effect-set - this converts the effect structure into a result set

For example, if you want to make a connection with the first side effect type with bzlib/dbd-spgsql, you can do the following: