Friday, September 18, 2009

Create a Driver for bzlib/dbi (3) - Continuing Filesystem Driver

Previously we discussed the internals of DBI and how to extend it, and crafted the first draft of the filesystem database driver.  We are now on part 3 of the DBI extension series.  We'll continue by adding the ability to save data through the driver.

Save The Data

Saving the data is equivalent to both insert and update in SQL vernacular.  The simplest version looks like this:

(define (file-query handle stmt (args '())) 
  ... 
  (case stmt
    ... 
    ((save!)
     (call-with-output-file (path-helper (assoc/cdr 'path args))
       (lambda (out) 
         (write-bytes (assoc/cdr 'content args) out))
       #:exists 'replace))
    (else 
     (error 'file-query "unknown statement: ~a" stmt))))

With the above we now can save data into a particular file with the following usage:

(query handle 'save! `((path . "/foo/bar/baz.txt") (content . #"this is the content")))

But unfortunately, there are a few hiccups:
  • there is no verification that the path and content key/value pairs are passed in (for 'list and 'open queries the path key/value pair is optional, but save! needs both) 
  • there is no guarantee that the directory of the path exists (and if it does not, the save will result in an error) 
  • the save above is not an atomic operation and can corrupt the data

So we'll address each of these issues to ensure we have a solid implementation for saving data.

Argument Verification 

To verify the arguments, we can use let/assert! from the next version of bzlib/base (which will be released together with this driver) as follows:

(define (file-query handle stmt (args '())) 
  ... 
  (case stmt
    ... 
    ((save!)
     (let/assert! ((path (assoc/cdr 'path args))
                   (content (assoc/cdr 'content args)))
                  (call-with-output-file (path-helper path)
                    (lambda (out) 
                      (write-bytes content out))
                    #:exists 'replace)))
    (else 
     (error 'file-query "unknown statement: ~a" stmt))))

let/assert! checks whether each bound value is false; if so it raises an error, otherwise it binds the variable and evaluates the inner expression.  It also behaves like let* rather than let, in that each subsequent binding can see the previous ones.
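To make that behavior concrete, here is a minimal sketch of a macro with the same shape.  The name my-let/assert! and its error message are made up for illustration; the actual let/assert! in bzlib/base may be implemented differently and report errors differently.

```racket
#lang racket/base

;; Hypothetical stand-in for let/assert! (NOT the bzlib/base source).
;; Bindings are processed one at a time, so later expressions can see
;; earlier variables, just like let*.
(define-syntax my-let/assert!
  (syntax-rules ()
    ;; no bindings left: evaluate the body
    ((_ () body ...)
     (let () body ...))
    ;; bind the first variable, bail out if its value is #f
    ((_ ((id expr) rest ...) body ...)
     (let ((id expr))
       (if id
           (my-let/assert! (rest ...) body ...)
           (error 'my-let/assert! "assertion failed: ~a" 'id))))))

;; b can refer to a, as with let*
(my-let/assert! ((a 1)
                 (b (+ a 1)))
  (+ a b)) ; => 3
```

If any binding evaluates to #f (for example, when assoc/cdr does not find the key), the body is never run and an exn:fail is raised instead.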

Guarantee of Directory Path

To ensure the parent directory of the path exists, we can use make-directory* to create it, but first we need to make sure the parent path is not an existing file (otherwise we cannot create a directory there):


;; from the yet to be released bzlib/file
(define (ensure-parent-path-exists! path)
  (let ((parent (parent-path path)))
    (cond ((file-exists? parent) 
           (error 'path-exists "path ~a is a file instead of a directory" parent))
          (else
           (make-directory* parent)
           path))))

ensure-parent-path-exists! returns the original path if it succeeds, so you can feed the result into whichever function requires the path:

(define (file-query handle stmt (args '())) 
  ...
  (case stmt
    ... 
    ((save!)
     (let/assert! ((path (assoc/cdr 'path args))
                   (content (assoc/cdr 'content args)))
                  (call-with-output-file (ensure-parent-path-exists! (path-helper path))
                    (lambda (out) 
                      (write-bytes content out))
                    #:exists 'replace)))
    ...))

Save File Atomically 

Unfortunately, we cannot ensure the file will be saved atomically on Windows.  The issue is that Windows locks a file for as long as any handle to it is open, so a rename will fail while the file is open.  If you have multiple threads running, and one thread is overwriting the file while another thread holds a handle to it, the rename will fail.  This issue is exacerbated by the fact that programs like antivirus scanners run in the background and can open files at any time, and thus cause the save to fail mysteriously even if you have not opened the file anywhere.

(I know there is a transactional filesystem for Windows Vista, but since it doesn't solve the problem for other versions of Windows, its usefulness is limited and we are not going to support it for now.)

The best we can do for now is to *attempt* the rename a few times, pausing a little longer after each failure to give whoever holds the file a chance to release it.


(define (rename-file from to)
  (define (skip++ skip)
    (+ 0.1 skip))
  (define (helper exn skip count)
    (cond ((> count 3)
           (raise exn))
          (else
           (do-it (skip++ skip) (add1 count)))))
  (define (do-it skip count)
    (sleep skip)
    (with-handlers ((exn:fail:filesystem?
                     (lambda (e)
                       (helper e skip count))))
      (rename-file-or-directory from to #t)))
  (do-it 0 0))

We also have wrapper functions over the save/rename pattern, open-output-atomic-file and call-with-output-atomic-file from bzlib/file, to help us save atomically, so our save handler will now look like the following:

(define (file-query handle stmt (args '())) 
  ...  
  (case stmt
    ... 
    ((save!)
     (let/assert! ((path (assoc/cdr 'path args))
                   (content (assoc/cdr 'content args)))
                  (call-with-output-atomic-file 
                      (ensure-parent-path-exists! (path-helper path))
                    (lambda (out) 
                      (write-bytes content out))
                    #:exists 'replace)))
    ...))

Now we can ensure the file is saved atomically (on Unix; mostly on Windows).
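
For readers who want to see the save/rename pattern itself, here is a rough sketch using only the standard library.  The function name call-with-atomic-output is hypothetical and this is not the bzlib/file implementation; it also does a single plain rename rather than the retrying rename-file shown earlier.

```racket
#lang racket/base
(require racket/file) ; for make-temporary-file

;; Hypothetical sketch of the write-to-temp-then-rename pattern
;; (an illustration only, not the code in bzlib/file).
(define (call-with-atomic-output path proc)
  (let ((full (path->complete-path path)))
    (let-values (((base name must-be-dir?) (split-path full)))
      ;; create the temp file next to the target so the rename
      ;; stays on the same filesystem
      (let ((temp (make-temporary-file "atomic~a.tmp" #f base)))
        (with-handlers ((exn:fail?
                         (lambda (e)
                           ;; clean up the partial temp file, then re-raise
                           (when (file-exists? temp)
                             (delete-file temp))
                           (raise e))))
          ;; write everything to the temp file first ...
          (call-with-output-file temp proc #:exists 'truncate)
          ;; ... then swap it into place, replacing any existing file
          (rename-file-or-directory temp full #t))))))
```

A call such as (call-with-atomic-output "baz.txt" (lambda (out) (write-bytes #"this is the content" out))) either leaves the old file untouched or replaces it with the complete new content; readers should never observe a half-written file on systems where rename is atomic.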

Stay tuned for the rest of the capabilities.
