Albion's technical merits

This archive management system was designed to accommodate a wide variety of operating systems and interaction environments. Philip Greenspun [1,2,3] poses the following question about web-style publishing: is this site static, a program, or a database? My goal for this project was to allow the archive to be presented as any of the three. Specifically, the photos and documents need to be searchable, which implies database functionality; this is the approach taken by most of the other photo archiving projects I've run across [4,5]. However, one might also like to copy a portion of the archive to removable media (for example, by burning a CD) for viewing, or even editing, off-line, in which case the archive behaves as a static site (or potentially even a program).

So, I see the following as strengths of the Albion system:

  1. You need only four programs to run the archive system: Perl, ImageMagick, a web server, and a browser.

    Each of these is stable, easy to find, and mature enough that installation should take only a few steps (with the possible exception of the web server configuration). In addition, each program is available for Windows, Linux, Unix, and Mac OS, and possibly even DOS, which ensures cross-platform compatibility.

    Update: code for a 'lightbox' using Tcl/Tk is also incubating (pre-alpha stage) within the distribution, for those who don't want to use a web server.
     

  2. The archive system uses a "database" distributed across a number of flat ASCII files.

    The argument against flat files in [3] is that a SQL database will improve search response times. However, I have found that the processing overhead of a SQL database is so high that this is probably true only for extremely large databases (i.e., hundreds of thousands of records or more). In fact, the author of PhotoShelf recommends optimizing the Perl-Apache-MySQL interface because "[PhotoShelf] will run like a dog otherwise" [5]. For a few thousand images, I find this to be unacceptable.

    A second advantage is that a person can copy a portion of the database to a second machine, say a laptop, take that machine into the woods, to the beach, or into the desert, and still update the database. Any data changes can then be merged with the full database at some later date. Admittedly, this feature does not scale to a large number of users, but for five people or so the ability to work without an Internet connection is a strong feature.

    One thing that SQL database management systems provide automatically is transaction negotiation. This allows multiple people to access the same database, or even the same data record, without colliding with one another. Philip Greenspun provides the following example in [3] to illustrate this:

    "[Suppose two users] ask to be added to a mailing list at exactly the same time. Depending on how you wrote your program, the particular kind of file system that you have, and luck, you could get any of the following behaviors:
      * Both inserts succeed.
      * One of the inserts is lost.
      * Information from the two inserts is mixed together so that both are corrupted."
    
    I have implemented a simple transaction negotiation mechanism to compensate for data collisions. Each data record is given a unique tag that is included with the data update form. When the form is submitted, the form's tag is compared with the data record's current tag. If the data record has not changed, its associated tag will be the same as the one previously issued, and the submission is allowed to proceed. If the form tag and the data record tag differ, the transaction fails and an error message is returned to the user.
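
    The tag check described above can be sketched in a few lines. This is a minimal Python illustration, not Albion's Perl code: it assumes each record is a small text file and that the tag is a SHA-1 digest of the record's contents. The names record_tag, issue_form, submit_form, and StaleRecordError are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Tuple


class StaleRecordError(Exception):
    """Raised when a record changed after the edit form was issued."""


def record_tag(text: str) -> str:
    # Assumption: the tag is a digest of the record's current contents.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def issue_form(path: Path) -> Tuple[str, str]:
    """Return the record text and the tag to embed in the update form."""
    text = path.read_text(encoding="utf-8")
    return text, record_tag(text)


def submit_form(path: Path, form_tag: str, new_text: str) -> None:
    """Write new_text only if the record still matches the form's tag."""
    current = path.read_text(encoding="utf-8")
    if record_tag(current) != form_tag:
        raise StaleRecordError(f"{path.name} changed since the form was issued")
    path.write_text(new_text, encoding="utf-8")
```

    If two forms are issued for the same record, the first submission succeeds and the second raises StaleRecordError, because the record's tag no longer matches. This sketch does not lock the file between the comparison and the write, so it narrows the collision window rather than closing it.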
     
  3. The preview and directory index pages are generated only when the database is updated. They are semi-static.

    The advantage of this is that at any time, you can burn the entire directory structure to a CD and have a browseable layout that can be viewed on "any" machine. You can then give copies of this CD away and the recipients won't have to go through contortions to find a given picture by browsing. See the 'extras' directory for more on CD distribution methods.
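
    As a rough illustration of the semi-static approach, the sketch below regenerates one directory's index page. It is written in Python rather than Albion's Perl, and the function name write_index, the file name index.html, and the set of image suffixes are assumptions. Because the links are relative, the page keeps working when the directory tree is copied to a CD.

```python
import html
from pathlib import Path
from urllib.parse import quote

# Assumed set of file types to list; the real archive may differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def write_index(directory: Path) -> Path:
    """Regenerate index.html for one archive directory using relative links."""
    images = sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Percent-encode the link target and HTML-escape the visible label.
    items = "\n".join(
        f'  <li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in images
    )
    page = (
        "<html><body>\n"
        f"<h1>{html.escape(directory.name)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>\n"
    )
    index = directory / "index.html"
    index.write_text(page, encoding="utf-8")
    return index
```

    Calling write_index only from the code path that updates the database means that viewing a page requires no per-request processing.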
     
  4. There is more than one way to find a photo.

    Really good web sites usually have both a "Site Index" and a "Search" interface. A good photo archive should too. The semi-static, distributed flat file design currently implemented is the easiest way I know to combine the two while still providing the broadest cross-platform and on-line/off-line accessibility.
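
The "Search" half of this pairing can be sketched as a linear scan over the flat files. This Python illustration assumes one plain-text record file per photo, found with a *.txt pattern; the function name search_archive and the record layout are hypothetical, not Albion's actual format.

```python
from pathlib import Path
from typing import Iterator


def search_archive(root: Path, term: str, pattern: str = "*.txt") -> Iterator[Path]:
    """Yield record files under root whose text contains term (case-insensitive)."""
    needle = term.lower()
    # Walk every subdirectory so the distributed record files are all covered.
    for record in sorted(root.rglob(pattern)):
        text = record.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            yield record
```

A scan like this reads every record on each query, which is workable for a few thousand records but is the cost that an indexed database avoids at larger scales.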

BTW: The above is really just an extension of the Perl philosophy: "There's more than one way to do it."


References

  1. Philip Greenspun, "Philip and Alex's Guide to Web Publishing,"
    Chapter 6: Adding Images to Your Site
    http://www.arsdigita.com/books/panda/images [accessed June 2, 2001]
  2. Philip Greenspun, "Philip and Alex's Guide to Web Publishing,"
    Chapter 11: Sites that are really databases
    http://www.arsdigita.com/books/panda/databases-intro [accessed June 2, 2001]
  3. Philip Greenspun, "Philip and Alex's Guide to Web Publishing,"
    Chapter 12: Database Management Systems
    http://www.arsdigita.com/books/panda/databases-choosing [accessed June 2, 2001]
  4. ids, http://ids.sourceforge.net/
  5. PhotoShelf, http://photoshelf.sourceforge.net/
