Wednesday, September 23, 2009

Develop a Memcached Client (2) - Filling Out the API

This is the continuation of the network programming series - see the previous installments for more details.

Now that we have the ability to store data in memcached, we want to retrieve the stored data.

Generating Requests

Memcached offers two commands for this purpose:
  • get - return the data 
  • gets - return the data along with the cas id 
(yes - it's only one character difference... oh well)


Let's take a look at the details of the API:

get <key>*\r\n
gets <key>*\r\n
The <key>* means there can be multiple keys, separated by spaces. We can generate the request with the following:

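(define (cmd-get out type keys)
  ;; Render the key list as "k1 k2 ...": format prints the list as
  ;; "(k1 k2 ...)" and substring strips the surrounding parentheses.
  (define (keys-helper)
    (let ((s (format "~a" keys)))
      (substring s 1 (sub1 (string-length s)))))
  ;; Only get and gets are valid retrieval commands.
  (display-line out "~a ~a" (case type
                              ((get gets) type)
                              (else (error 'cmd-get "unknown get type: ~a" type)))
                (keys-helper)))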
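To see what actually goes over the wire, here is a minimal sketch that captures the request in a string port instead of a socket. The display-line below is only a stand-in for the helper from the earlier installments (assumed to format its arguments and end the line with \r\n), and request-line is a hypothetical name used just for this demonstration.

```racket
#lang racket/base
;; Assumed stand-in for the display-line helper from the earlier
;; installments: format the arguments and terminate the line with \r\n.
(define (display-line out fmt . args)
  (write-string (apply format fmt args) out)
  (write-string "\r\n" out)
  (flush-output out))

;; cmd-get as defined above.
(define (cmd-get out type keys)
  (define (keys-helper)
    (let ((s (format "~a" keys)))
      (substring s 1 (sub1 (string-length s)))))
  (display-line out "~a ~a" (case type
                              ((get gets) type)
                              (else (error 'cmd-get "unknown get type: ~a" type)))
                (keys-helper)))

;; Hypothetical helper: write the request into a string port and return it.
(define (request-line type keys)
  (let ((out (open-output-string)))
    (cmd-get out type keys)
    (get-output-string out)))

(request-line 'gets '(foo bar baz)) ; "gets foo bar baz\r\n"
```

Any type other than get or gets raises an error before anything is written.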
The next step is to parse the response.

Parsing Responses

The response is a bit harder to handle. It takes the following form:
  • there can be multiple values returned (one for each key found), and the end of the response is marked by END\r\n
  • each value starts with a line of VALUE <key> <flags> <bytes-length> [<cas-unique>]\r\n
    • <cas-unique> is a 64-bit integer that uniquely identifies the object, and it is only returned if you use gets instead of get
  • the header line is followed by the data block, terminated by \r\n. The data block has exactly the length indicated by <bytes-length>
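To make the framing concrete, the sketch below renders a value block the way it would appear on the wire. value-block is a hypothetical helper written only for illustration (it is not part of the client), and the keys, flags, and cas id are made up. The point to notice is that <bytes-length> counts bytes rather than characters, which is why the data block has to be read as bytes.

```racket
#lang racket/base
;; Hypothetical helper: render one VALUE block as it appears on the wire.
;; data must be a byte string; cas is included only when given.
(define (value-block key flags data (cas #f))
  (bytes-append
   (string->bytes/utf-8
    (if cas
        (format "VALUE ~a ~a ~a ~a\r\n" key flags (bytes-length data) cas)
        (format "VALUE ~a ~a ~a\r\n" key flags (bytes-length data))))
   data
   #"\r\n"))

;; "héllo" has 5 characters but takes 6 bytes in UTF-8, so the header says 6.
(value-block 'greeting 0 (string->bytes/utf-8 "héllo"))

;; A complete reply to a gets for one key, with a made-up cas id of 42.
(bytes-append (value-block 'foo 0 #"hello" 42) #"END\r\n")
; #"VALUE foo 0 5 42\r\nhello\r\nEND\r\n"
```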


The following algorithm will handle the above responses:
  • read a line - and see if it is either a value line or an end line (anything else would be an error)
  • if it is an end line, we are done and return all of the data read so far
  • if it is a value line, extract the key, flags, byte length, and possibly cas-unique
  • read in the data block
  • repeat
This translates into the code below:

(define (response-get in)
  (define (end? ln)
    (string-ci=? ln "end"))
  (define (value? ln)
    (regexp-match #px"^(?i:VALUE) ([^\\s]+) (\\d+) (\\d+)\\s?(\\d+)?$" ln))
  (define (helper acc)
    (let ((ln (read-line in 'return-linefeed)))
      (if (end? ln) ;; we are done.
          (reverse acc)
          (if-it (value? ln) 
                 (let ((key (string->symbol (second it)))
                       (flags (string->number (third it)))
                       (len (string->number (fourth it)))
                       (cas (if-it (fifth it)
                                   (string->number it)
                                   #f)))
                   (let ((bytes (read-bytes len in))) 
                     (read-line in 'return-linefeed) ;; remove the last \r\n
                     (helper (cons (list key bytes flags cas) acc))))
                 (error 'response-get "Invalid response: ~a" ln)))))
  (helper '()))
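A quick way to try the parser without a running server is to feed it a canned response through a string port. The sketch below follows the same loop as response-get, but it is a standalone variant under a few assumptions: it is named response-get* to keep it apart from the original, it uses cond's => clause in place of the if-it macro so that it runs on its own, and it adds an explicit end-of-input check that the original does not have. The sample response is made up.

```racket
#lang racket/base
(require racket/list) ; second, third, fourth, fifth

;; Standalone variant of response-get (hypothetical name response-get*).
(define (response-get* in)
  (define (value? ln)
    (regexp-match #px"^(?i:VALUE) ([^\\s]+) (\\d+) (\\d+)\\s?(\\d+)?$" ln))
  (let loop ((acc '()))
    (let ((ln (read-line in 'return-linefeed)))
      (cond ((eof-object? ln) ; added check, not in the original
             (error 'response-get* "unexpected end of input"))
            ((string-ci=? ln "end") (reverse acc)) ; we are done
            ((value? ln)
             => (lambda (m)
                  (let* ((len (string->number (fourth m)))
                         (bytes (read-bytes len in)))
                    (read-line in 'return-linefeed) ; consume the trailing \r\n
                    (loop (cons (list (string->symbol (second m))
                                      bytes
                                      (string->number (third m))
                                      (and (fifth m) (string->number (fifth m))))
                                acc)))))
            (else (error 'response-get* "Invalid response: ~a" ln))))))

;; Made-up reply to "gets foo bar".
(define canned
  (open-input-string
   "VALUE foo 0 5 42\r\nhello\r\nVALUE bar 1 3 43\r\nabc\r\nEND\r\n"))

(response-get* canned) ; '((foo #"hello" 0 42) (bar #"abc" 1 43))
```

Each result is a list of key, data bytes, flags, and cas id, with #f in the last position when the response came from a plain get.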
Then we can tie the two together with the following:

(define (get in out type keys) 
  (cmd-get out type keys) 
  (response-get in)) 
And finally complete the retrieval API:

(define (memcached-get client key . keys)
  (get (client-in client) (client-out client) 'get (cons key keys)))

(define (memcached-gets client key . keys)
  (get (client-in client) (client-out client) 'gets (cons key keys)))

The Rest of the API

Fleshing out the rest of the API basically means repeating the same steps for each command: study the protocol, then write the request generation, the response parsing, and the code that combines the two. Without going through the details again, below is the code for the deletion API; the rest are left as exercises.

(define (cmd-delete! out key (time 0) (noreply? #f))
  (display/noreply out "delete ~a ~a" noreply? key time))

(define (response-delete! in (noreply? #f))
  (if (not noreply?)
      (let ((ln (read-line in 'return-linefeed)))
        (string->symbol (string-downcase ln)))
      'deleted))

(define (delete! in out key (time 0) (noreply? #f))
  (cmd-delete! out key time noreply?)
  (let ((resp (response-delete! in noreply?)))
    (case resp
      ((deleted not_found) resp)
      (else (error 'delete! "invalid response: ~a" resp)))))

(define (memcached-delete! client key (time 0) (noreply? #f))
  (delete! (client-in client) (client-out client) key time noreply?))
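The deletion code can be exercised the same way, with string ports standing in for the connection. In this sketch display/noreply is only an assumed stand-in for the helper from the earlier installments (format the command, append " noreply" when requested, end with \r\n), and try-delete is a hypothetical name used just for the demonstration.

```racket
#lang racket/base
;; Assumed stand-in for display/noreply from the earlier installments.
(define (display/noreply out fmt noreply? . args)
  (write-string (apply format fmt args) out)
  (when noreply? (write-string " noreply" out))
  (write-string "\r\n" out)
  (flush-output out))

;; cmd-delete!, response-delete! and delete! as defined above.
(define (cmd-delete! out key (time 0) (noreply? #f))
  (display/noreply out "delete ~a ~a" noreply? key time))

(define (response-delete! in (noreply? #f))
  (if (not noreply?)
      (let ((ln (read-line in 'return-linefeed)))
        (string->symbol (string-downcase ln)))
      'deleted))

(define (delete! in out key (time 0) (noreply? #f))
  (cmd-delete! out key time noreply?)
  (let ((resp (response-delete! in noreply?)))
    (case resp
      ((deleted not_found) resp)
      (else (error 'delete! "invalid response: ~a" resp)))))

;; Hypothetical helper: run delete! against a canned server reply and
;; return both the result and the request that was written.
(define (try-delete reply key)
  (let ((in (open-input-string reply))
        (out (open-output-string)))
    (let ((result (delete! in out key)))
      (list result (get-output-string out)))))

(try-delete "DELETED\r\n" 'foo)   ; '(deleted "delete foo 0\r\n")
(try-delete "NOT_FOUND\r\n" 'foo) ; '(not_found "delete foo 0\r\n")
```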

Error Handling

Depending on the complexity of the API, you might choose one of the following error handling strategies:
  • let things fail - I personally like this strategy for simple situations: push the error handling upstream. This works especially well for protocols that are largely atomic and do not have to track state
  • resiliency - handle most of the errors and try to correct them, and only pass the buck upstream when there is no way to handle the problem locally. This works well for heavyweight protocols where it is expensive to re-establish the connection.

Memcached falls into the first bucket, of course. But it is possible to create a resilient client by building on top of a simple client, and that brings up the possibility of reusing the error handling code. Erlang is a strong proponent of this philosophy and I think it's a great one to adopt.
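As a rough idea of what layering resiliency on top of a fail-fast client could look like, here is a minimal sketch of a generic retry wrapper. Everything in it is hypothetical: call-with-retry, its recover! callback (which in a real client might reconnect), and the flaky procedure that simulates failures. It is one possible shape under those assumptions, not the design this series settles on.

```racket
#lang racket/base
;; Hypothetical wrapper: run thunk; on failure call recover! and try again,
;; re-raising the exception once max-attempts tries have been used up.
(define (call-with-retry thunk recover! (max-attempts 3))
  (let loop ((attempt 1))
    (with-handlers ((exn:fail?
                     (lambda (e)
                       (if (< attempt max-attempts)
                           (begin (recover! e)
                                  (loop (add1 attempt)))
                           (raise e)))))
      (thunk))))

;; Simulated operation that fails twice and then succeeds.
(define calls 0)
(define (flaky)
  (set! calls (add1 calls))
  (if (< calls 3)
      (error 'flaky "simulated connection failure")
      'ok))

;; Count how often recovery was needed.
(define recoveries 0)
(call-with-retry flaky (lambda (e) (set! recoveries (add1 recoveries)))) ; 'ok
```

The simple client stays unchanged and keeps failing fast; only the wrapper decides whether a failure is worth another attempt.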

Anyhow - we're done talking about the API development, and we now have a workable memcached client. What's left to discuss? Well, memcached can be set up in a distributed fashion, so we should make our client distribution-capable as well. Plus - the "dramatic" conclusion ;) of DBI integration is still to come. Stay tuned.
