Saturday, September 19, 2009

Create a Driver for bzlib/dbi (4) - Concluding Filesystem Driver

This is the fourth installment of the extending bzlib/dbi series. For a refresher, see the previous posts:
  1. Overview of the DBI internals 
  2. a draft driver - list directories and open files
  3. enhance the driver - save files atomically
So far we have the equivalents of SQL's select, insert, and update.  The next step is to add the ability to delete files and directories.

The Deletion Capability 

PLT Scheme offers the following for deleting files and directories:
  • (delete-file file-path) - deletes the path if it is a file; otherwise raises an exception
  • (delete-directory dir-path) - deletes the path if it is an empty directory; otherwise raises an exception
  • (delete-directory/files file-or-dir-path) - deletes the path whether it is a file or a directory; if it is a non-empty directory, its files and subdirectories are deleted recursively first
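The differences between the three functions are easiest to see on a scratch directory. The following is a minimal sketch, written for current Racket (under PLT Scheme 4.x the module names would be scheme/base and scheme/file); the helper fails? and the directory layout are made up for this illustration and are not part of the driver.

```racket
#lang racket/base
(require racket/file)  ; delete-directory/files, make-temporary-file

;; Hypothetical helper: #t if (thunk) raises exn:fail:filesystem, #f otherwise
(define (fails? thunk)
  (with-handlers ((exn:fail:filesystem? (lambda (e) #t)))
    (thunk)
    #f))

;; Scratch layout: <dir>/a.txt, where dir is a fresh temporary directory
(define dir (make-temporary-file "dbi-demo-~a" 'directory))
(define file (build-path dir "a.txt"))
(call-with-output-file file (lambda (out) (write-string "hello" out)))

;; delete-file refuses to remove a directory
(define delete-file-failed? (fails? (lambda () (delete-file dir))))
;; delete-directory refuses to remove a non-empty directory
(define delete-directory-failed? (fails? (lambda () (delete-directory dir))))
;; delete-directory/files removes the directory and everything in it
(define rm-rf-failed? (fails? (lambda () (delete-directory/files dir))))
```

The first two calls fail and leave the directory intact; only the third removes it.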
delete-directory/files looks like the most convenient of the three.  However, there are situations where you want to be sure that you are deleting a file and not a directory (or the reverse), or that a directory is removed only if it is empty.  We want the API to reflect those distinctions, so we will have 3 separate calls:

;; delete files with delete!
(query handle 'delete! `((path . path1) ...))
;; delete empty directories only with rmdir!
(query handle 'rmdir! `((path . path1) ...))
;; delete either files or directories (recursively) with rm-rf!
(query handle 'rm-rf! `((path . path1) ...))

Similar to SQL's delete statement, each of the above queries can delete multiple paths at once.  However, since transactions are not supported, the paths are deleted independently and in order: if one deletion fails, the paths already processed stay deleted and the remaining paths are left untouched.

The following addition to file-query accomplishes the above:


(define (file-query handle stmt (args '())) 
  ...  
  (case stmt
    ...  
    ((delete!)
     (for-each delete-file (get-paths)))
    ((rmdir!) 
     (for-each delete-directory (get-paths)))
    ((rm-rf!)
     (for-each delete-directory/files (get-paths)))
    ...))
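Because each clause is a plain for-each, a failure partway through leaves the work half done. The sketch below illustrates that with delete-file alone, outside the driver; the file names and the touch helper are hypothetical.

```racket
#lang racket/base
(require racket/file)  ; make-temporary-file, delete-directory/files

(define dir (make-temporary-file "dbi-demo-~a" 'directory))

;; Hypothetical helper: create a small file in dir and return its path
(define (touch name)
  (let ((p (build-path dir name)))
    (call-with-output-file p (lambda (out) (write-string "x" out)))
    p))

(define first-path (touch "1.txt"))
(define missing-path (build-path dir "missing.txt")) ; never created
(define last-path (touch "3.txt"))

;; for-each stops at the first exception, as the driver's loop does
(define failed?
  (with-handlers ((exn:fail:filesystem? (lambda (e) #t)))
    (for-each delete-file (list first-path missing-path last-path))
    #f))

(define first-gone? (not (file-exists? first-path))) ; deleted before the failure
(define last-kept? (file-exists? last-path))         ; never reached

(delete-directory/files dir) ; clean up the scratch directory
```

The first file is gone, the missing path raises, and the third file is never touched.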

Deleting Files or Directories Atomically


How can deleting files or directories fail? The errors generally fall into the following buckets:
  • the files or directories do not exist
  • lack of sufficient privileges 
  • the files or directories are locked (specifically on Windows)
Knowing whether the file existed beforehand is helpful but not really necessary: what we want is that after the call either the file is gone or an error has been raised, never both.  So when the path no longer exists, we handle the error and treat the deletion as a no-op.

We cannot really distinguish an exception caused by insufficient privileges from one caused by file locking, but we'll assume that if you use the directory as a database that you can manipulate, you have sufficient privileges.

What's left again is the issue w.r.t. temporary locking by Windows. We again can employ the same approach we took with save! - try to delete a few times, with pauses in between if the deletion fails:

;; refactor out the do-it pattern from rename-file
(define (do-it/retry thunk (skips 3) (interval 0.1))
  (define (skip++ skip)
    (+ interval skip))
  (define (helper exn skip count)
    (cond ((> count skips)
           (raise exn))
          (else
           (do-it (skip++ skip) (add1 count)))))
  (define (do-it skip count)
    (sleep skip)
    (with-handlers ((exn:fail:filesystem?
                     (lambda (e)
                       (helper e skip count))))
      (thunk)))
  (do-it 0 0))

;; delete-file! just needs to wrap do-it/retry around delete-file
(define (delete-file! path)
  (do-it/retry (lambda ()
                 (delete-file path))
               3))
;; rename-file can also be refactored to use do-it/retry
(define (rename-file from to)
  (do-it/retry (lambda ()
                 (rename-file-or-directory from to #t))
               3))
;; we'll do the same with delete-directory & delete-directory/files 
(define (delete-directory! path)
  (do-it/retry (lambda ()
                 (delete-directory path))
               3))

(define (delete-directory/files! path)
  (do-it/retry (lambda ()
                 (delete-directory/files path))
               3))
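To see the retry behaviour in isolation, here is a small sketch that runs do-it/retry against a simulated failure instead of a real locked file. The definition is repeated in condensed form so the sketch runs on its own; the flaky thunk, its error message, and the short 0.01 second interval are assumptions made for the demonstration.

```racket
#lang racket/base

;; do-it/retry as defined in this post, condensed
(define (do-it/retry thunk (skips 3) (interval 0.1))
  (define (helper exn skip count)
    (cond ((> count skips) (raise exn))
          (else (do-it (+ interval skip) (add1 count)))))
  (define (do-it skip count)
    (sleep skip)
    (with-handlers ((exn:fail:filesystem?
                     (lambda (e) (helper e skip count))))
      (thunk)))
  (do-it 0 0))

;; Hypothetical flaky operation: fails on the first two calls, then succeeds
(define attempts 0)
(define (flaky)
  (set! attempts (add1 attempts))
  (when (< attempts 3)
    (raise (make-exn:fail:filesystem "simulated lock"
                                     (current-continuation-marks))))
  'done)

;; Short interval so the demonstration finishes quickly
(define result (do-it/retry flaky 3 0.01))
```

The two simulated failures are absorbed by the retries, so the caller only sees the eventual result of the third call.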

With the above, a failed deletion on Windows is retried a few times, with increasing pauses, before the error is finally raised.  We also capture the exception when the file to be deleted no longer exists:

(require (planet bzlib/os))
(define (file-query handle stmt (args '())) 
  ... 
  (case stmt
    ... 
    ((delete!)
     (for-each (lambda (path) 
                 (with-handlers ((exn:fail:filesystem?
                                  (lambda (e)
                                    (when (file-exists? path)
                                      (raise e)))))
                   ((+:windows delete-file! delete-file) path)))
               (get-paths)))
    ((rmdir!) 
     (for-each (lambda (path)
                 (with-handlers ((exn:fail:filesystem?
                                  (lambda (e)
                                    (when (directory-exists? path)
                                      (raise e)))))
                   ((+:windows delete-directory! 
                               delete-directory) path))) 
               (get-paths)))
    ((rm-rf!)
     (for-each (lambda (path) 
                 (with-handlers ((exn:fail:filesystem?
                                  (lambda (e)
                                    (when (or (file-exists? path)
                                              (directory-exists? path))
                                      (raise e)))))
                   ((+:windows delete-directory/files!
                               delete-directory/files) path)))
               (get-paths)))

    ...)) 

Now you have the ability to delete files *atomically* without raising unnecessary errors.
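The three clauses repeat the same handler and differ only in the delete function and the existence check. One possible way to factor that out is sketched below; delete/ignore-missing is a hypothetical name, not part of bzlib/dbi, and the sketch leaves out the +:windows retry selection so that it runs without bzlib/os.

```racket
#lang racket/base
(require racket/file)  ; make-temporary-file, delete-directory/files

;; Hypothetical helper: run (delete path); if that fails, re-raise only
;; when (still-there? path) reports that the target still exists.
(define (delete/ignore-missing delete still-there? path)
  (with-handlers ((exn:fail:filesystem?
                   (lambda (e)
                     (when (still-there? path)
                       (raise e)))))
    (delete path)))

(define dir (make-temporary-file "dbi-demo-~a" 'directory))
(define file-a (build-path dir "a.txt"))
(define file-b (build-path dir "b.txt"))
(for-each (lambda (p)
            (call-with-output-file p (lambda (out) (write-string "x" out))))
          (list file-a file-b))

;; The first call deletes the file; the second finds nothing and is a no-op.
(delete/ignore-missing delete-file file-exists? file-a)
(delete/ignore-missing delete-file file-exists? file-a)
(define file-gone? (not (file-exists? file-a)))

;; A genuine failure still surfaces: dir still contains b.txt.
(define still-raises?
  (with-handlers ((exn:fail:filesystem? (lambda (e) #t)))
    (delete/ignore-missing delete-directory directory-exists? dir)
    #f))

(delete-directory/files dir) ; clean up the scratch directory
```

With such a helper, each clause in file-query would reduce to a single for-each over (get-paths).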

Conclusion 

At this point our filesystem driver has the following capabilities:
  • jails the filesystem path (so the path is independent of the root directory)
  • two separate drivers - one works with the query helper functions such as rows and cell, and the other has a bit more efficiency and might be more natural to use in some circumstances 
  • list - listing the files within a directory path 
  • open - reads and returns the content of the files as a list of bytes
  • save! - saves a single file, specified by a path and its content as bytes, *atomically*
  • delete! - delete files by path *atomically*
  • rmdir! - delete empty directories by path *atomically* 
  • rm-rf! - delete files or directories by path *atomically*
We can of course keep on extending the features as we need to, but we are now complete for the purpose of the tutorial.  As an exercise - try to extend the driver for the following capabilities:
  • move/rename - move or rename a file or directory
  • copy - copy a file or directory to another part of the path
That's it for now - I'll release the driver (and the associated dependent code) in the near future.
