Simple image server with tiled online viewer

December 01, 2014

In this note, I describe a simple image server and JavaScript web application that allows on-demand display of image tiles in a browser. I implemented this system to display biological media such as X-ray and gene expression images. While I really enjoyed implementing this system, I also feel that it is generic enough to be easily adapted for other purposes. In the following, I describe the internals to complement the code-base and documentation available from GitHub.

Please follow this link to try out the media viewer that this note describes.


Media Viewer web application

Requirements

The following are the main requirements of the image server and online viewer:

  • Fully automated system from data import to delivery over the internet
  • Low maintenance and easy recovery from failure
  • Support various image formats
  • Re-use rather than re-invent
  • Fast and light-weight for efficient viewing on web browsers

All of the above requirements are interrelated. Firstly, the system must allow automated collection and import of media files. In the project mentioned above, we were collecting several thousand images and there was no way we could have done this manually. Not only does the system import existing media, it also has the ability to download media from other resources. Secondly, the media must be processed so that they may be delivered efficiently to the browser. This means transforming image media in various formats to an image format that is both efficient and supported on all of the browsers. This also means tiling large images so that they can be delivered in pieces as on-demand image tiles. Doing all of this processing requires understanding several image formats, and calls for a mature software package designed specifically for this purpose. Hence, we reuse existing software to process the images. Finally, we want a system that is easy to maintain, where, if anything should go wrong, we can fix it easily, manually if needed. This means no complex database schemas and storage architecture.

Based on the above, the following is what we use in this project:

  • The data downloader and task scheduler is written in Python. This module is responsible for carrying out the automated download, issuing the processing tasks and keeping track of the processing phases for each media file that is imported into the system.

  • The image processing unit is written in Bash, which uses existing software to do the processing. To process the images, we use ImageMagick. It allows us to convert images from one format to another. Since the delivery mechanism is over the internet, for display in the browser, we have chosen the JPEG format. Hence, all image media collected as DICOM, TIFF, JPEG, PNG and BMP are converted automatically to JPEG.

  • We also use ImageMagick to generate the image tiles from the original image media file. With advancements in storage technology, the cost-per-byte is reducing. This means that instead of generating the image tiles on the fly, we can now store pre-processed tiles that the web server can serve efficiently as static files.

  • This tile delivery mechanism means that we can use any web server of our choosing, as no specialised features are necessary. This makes maintenance extremely easy. All we need now is to package them together to work seamlessly with one another.

What we need

Based on the above requirements, we need the following:

  • A web server for hosting the JavaScript web application.
  • A large enough file system for hosting the media, images and tiles. The web server must have access to this file system.
  • An operating system with the ability to run Bash, Python and ImageMagick.

For instance, the underlying infrastructure that corresponds to the screenshot above is shown in the accompanying figure.

Keeping track of available media

The system that we are developing expects to download all of the media files to be imported from some other resource. These could be available on a local disk-drive or accessible from a remote file server. Hence, as a first step, the system must be given a list of URLs from which the media files are to be downloaded and imported. We use a database to keep track of this, and the schema is quite simple, as shown below:

drop table if exists mediadb;
create table mediadb (
   id bigint unsigned not null auto_increment,
   url varchar(2048),                    /* the source URL from which the media must be downloaded */
   checksum varchar(40),                 /* SHA1 checksum of media file content */
   extension_id smallint unsigned,       /* media file extension */
   is_image tinyint,                     /* 1 if image media; otherwise, 0 */
   width int unsigned,                   /* image width (if image media) */
   height int unsigned,                  /* image height (if image media) */
   phase_id smallint unsigned not null,  /* download or processing phase */
   status_id smallint unsigned not null, /* status of the phase */
   ... /* other columns that identify a selection context for the media */
   ... /* keys, indices and integrity constraints */
) engine = innodb;

First of all, we need the URL of the media file. This is the URL from which the system will automatically download the media. Once a media file has been downloaded, the system calculates the SHA1 checksum of its content. This checksum is very important, as we shall see in later sections. We also store the file extension of the media file (the filename suffix after the last '.') for convenience; the system uses it to decide how the file must be processed. Then we have another convenience column that separates image media from other media, such as PDF documents. As will become clear later on, this column is also used when delivering media (e.g., deciding whether or not to deliver tiled images). For image media, the width and height columns store the original image width and height respectively. Since a media file could go through several processing phases, we keep track of this information using the phase and status columns. For instance, we start in the <download, pending> state when a media URL is first added to the system. We use three phases: download, checksum and tiling, and four statuses: pending, running, done and failed.

Finally, we add columns to the table that define the selection context for the imported media. For instance, all of the biological media in the above screenshot are grouped and identified by the following columns:

cid int unsigned not null, /* centre identifier */
lid int unsigned not null, /* experiment pipeline identifier */
gid int unsigned not null, /* genotype identifier */
sid int unsigned not null, /* gene background strain identifier */
pid int unsigned not null, /* experimental procedure identifier */
qid int unsigned not null, /* measured parameter identifier */

This context is used to store, filter and retrieve media files, as described below.

Importing and storage

To import a media file, a new record is added to the table described above, and the phase and status are set to download and pending respectively. The media downloader script, when run in the download mode, will download these new media files. The phase and status will be updated accordingly. The following is an overview of the processing steps:
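The download mode can be sketched roughly as follows. This is a minimal illustration rather than the actual module from the code-base: it uses SQLite in place of MySQL, a `fetch()` stand-in for the real HTTP download, and assumed numeric codes for the phases and statuses.

```python
import hashlib
import sqlite3

# Phase and status codes as described in the text (numeric values assumed).
PHASE_DOWNLOAD, PHASE_CHECKSUM, PHASE_TILING = 1, 2, 3
STATUS_PENDING, STATUS_RUNNING, STATUS_DONE, STATUS_FAILED = 1, 2, 3, 4

def fetch(url):
    """Stand-in for the real downloader; returns the media file content."""
    return b"example media bytes"  # would be urllib.request.urlopen(url).read()

def run_download_mode(conn):
    """Process every <download, pending> record: download, checksum, advance."""
    cur = conn.execute(
        "select id, url from mediadb where phase_id = ? and status_id = ?",
        (PHASE_DOWNLOAD, STATUS_PENDING))
    for media_id, url in cur.fetchall():
        conn.execute("update mediadb set status_id = ? where id = ?",
                     (STATUS_RUNNING, media_id))
        try:
            content = fetch(url)
            checksum = hashlib.sha1(content).hexdigest()
            conn.execute(
                "update mediadb set checksum = ?, phase_id = ?, status_id = ? "
                "where id = ?",
                (checksum, PHASE_CHECKSUM, STATUS_DONE, media_id))
        except OSError:
            conn.execute("update mediadb set status_id = ? where id = ?",
                         (STATUS_FAILED, media_id))
    conn.commit()
```

Because failures simply leave a record in the failed status, a later run (or a manual fix) can reset the record to pending and retry.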

Processing phase

All of the downloaded media are stored in an archive directory, which is just a directory on the allocated disk-drive. Since we wish to be able to fix any problems manually, if needed, we want the structure of the archive directory to have meaning. This allows us to go back from a media file as stored in the archive directory to the corresponding record in the table described above. To do this, all of the media are stored using a directory structure that captures the context. Hence, if we download a DICOM media file that corresponds to the record with primary key id = 22992 for the context, say, <cid, lid, gid, sid, pid, qid> = <10, 5, 4, 11, 19, 90>, the media file will be stored as follows:

/archive_directory/10/5/4/11/19/90/22992.dcm

Here, /archive_directory is the directory where all of the original media are stored. From the path above, we can go back to the record and also identify the context which corresponds to the media. This allows the database to be decoupled from the directory where the media are stored. For instance, if we wish to move all of the original media to another disk-drive or file system, we do not need to update the database. We can simply move the files to another location. All we really need to do is to update the archive directory property in the system configuration before the download and processing script is run again. And of course, the web server configuration must be updated so that it now points to the new location.
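The mapping between a database record and its archive path is mechanical in both directions; a minimal sketch (the function names are illustrative, not from the code-base):

```python
def archive_path(archive_dir, context, media_id, extension):
    """Build /archive_dir/cid/lid/gid/sid/pid/qid/id.ext for a record."""
    parts = [archive_dir] + [str(v) for v in context]
    parts.append("%d.%s" % (media_id, extension))
    return "/".join(parts)

def parse_archive_path(path):
    """Recover the context, record id and extension from an archive path."""
    parts = path.strip("/").split("/")
    media_id, extension = parts[-1].split(".")
    context = tuple(int(v) for v in parts[-7:-1])
    return context, int(media_id), extension
```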

Generating and storing tiles

Similar to the storage of original media, the generated tiles are also stored in a form that allows back-referencing. To do this, we use the SHA1 checksum which is calculated after a media has been downloaded. This also allows us to re-use existing tiles. Sometimes, different URLs may point to the same media file. Although we cannot prevent multiple downloads of the same media file, because the URL alone cannot ascertain the uniqueness or similarity of files, we can prevent re-generation of tiles. Since the SHA1 checksum of a media file uniquely identifies its content (except when there is a collision), the same checksum means the same content. Hence, if tiles are stored using the checksum of the original media, we can prevent tiles from being re-generated. This reduces the storage requirement while also reducing the processing overhead.

In our system, we use a tiles directory where all of the generated tiles are stored. Inside this directory, the group of tiles that correspond to a media file are organised as follows:

/tiles_directory/87ca/2fee/bf6c/8539/7dcd/bcac/88ea/422b/e95b/adb0

Here, /tiles_directory is just a normal directory on our allocated disk-drive where all of the tiles are stored. Inside this directory, we have a subdirectory structure for a checksum. For instance, the above directory stores all of the tiles for a media file with checksum 87ca2feebf6c85397dcdbcac88ea422be95badb0. We break the checksum into subdirectories to work around file system limitations, where there is a limit to the number of entries inside a directory.
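Deriving this subdirectory structure from a checksum is straightforward; a small sketch (the function name is illustrative):

```python
def tiles_dir(tiles_root, checksum, chunk=4):
    """Split a 40-character SHA1 hex digest into fixed-size path segments."""
    segments = [checksum[i:i + chunk] for i in range(0, len(checksum), chunk)]
    return tiles_root + "/" + "/".join(segments)
```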

Structure of the tiles directory

Inside the subdirectory for a given checksum, the generated image files are stored as follows:

  • A thumbnail is automatically generated, for use by the browser to implement, say, an image selection panel.

  • Since the tiles are generated for on-demand delivery to a web browser client, we need to know the size of the image tiles. This can be anything; however, we choose tiles that have a maximum of 256x256 pixels. Since we can have multiple tile sizes, all of the tiles for a given tile size are stored in a directory named after the tile size. For instance, tiles with a maximum of 256x256 pixels are stored in the 256 sub-directory.

  • Furthermore, since a media viewer is expected to allow zooming in and out, we store the tiles for each zoom level. In our system, we have chosen the following zoom levels: 10%, 25%, 50%, 75% and 100%.

  • Finally, all of the tiles are generated and stored in a form that allows easy on-demand retrieval and assembly on the browser. To do this, we use the following tile naming scheme:

      <number of columns>_<number of rows>_<row_index>_<column_index>.jpg
    

Hence, the top-left tile in a 32-tile breakup for an 8-column by 4-row grid is stored as 8_4_0_0.jpg. Similarly, the bottom-right tile is stored as 8_4_3_7.jpg.
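The tile name can be computed directly from the scaled image dimensions and the tile size; a minimal sketch (the function name is illustrative):

```python
import math

def tile_name(image_width, image_height, tile_size, row, column):
    """Name a tile as <columns>_<rows>_<row>_<column>.jpg for the grid
    covering an image of image_width x image_height pixels."""
    columns = math.ceil(image_width / tile_size)
    rows = math.ceil(image_height / tile_size)
    return "%d_%d_%d_%d.jpg" % (columns, rows, row, column)
```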

To sum up, the tiles for an image with checksum 87ca2feebf6c85397dcdbcac88ea422be95badb0, say, are stored using the following directory structure:

$ pwd
/tiles_directory/87ca/2fee/bf6c/8539/7dcd/bcac/88ea/422b/e95b/adb0

$ tree
.
|__ 256
|    |__ 100
|    |    |__ 8_4_0_0.jpg
|    |    |__ 8_4_0_1.jpg
|    |    ...
|    |    |__ 8_4_3_7.jpg
|    |__ 75
|    |__ 50
|    |__ 25
|    |__ 10
|__ thumbnail.jpg

Generating the tiles

To generate the tiles, we use ImageMagick. The following is a summary of the process:

  1. Firstly, we convert the original image to JPEG and name this temporary file as original.jpg.

    convert original.${extension} original.jpg;
    

    Here ${extension} is the filename extension of the original media file that was downloaded during the import phase. To get a list of image formats supported by ImageMagick, run identify -list format on the command line. When necessary, we use special settings for special file formats; e.g., normalising a DICOM image:

    convert -define dcm:display-range=reset original.dcm -normalize original.jpg;
    
  2. Secondly, we generate thumbnail.jpg. In our system, we wish them to be 300 pixels wide and must preserve the aspect ratio of the original image.

    convert original.jpg -resize 300x thumbnail.jpg;
    
  3. Finally, for each of the tile sizes specified in the system configuration, we scale original.jpg and generate the tiles using the tile naming schema described above. Hence, for zoom level of 75% with 256x256 pixels tile size, the tiles are generated as follows:

    convert original.jpg -resize 75% scaled.jpg;
    convert scaled.jpg -crop 256x256 -set filename:tile "%[fx:ceil(page.width/256)]_%[fx:ceil(page.height/256)]_%[fx:page.y/256]_%[fx:page.x/256]" +repage +adjoin "%[filename:tile].jpg";
    

    To save space, we delete the scaled image once the tiles have been generated.
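The grid arithmetic implied by this process can be checked without invoking ImageMagick. The sketch below only plans the per-zoom-level scale commands and the expected grid sizes; it assumes -resize rounds scaled dimensions like ceil, which is an approximation:

```python
import math

ZOOM_LEVELS = (100, 75, 50, 25, 10)  # zoom levels from the system configuration

def grid_size(width, height, zoom, tile_size=256):
    """Number of (columns, rows) in the tile grid at a given zoom level."""
    scaled_w = math.ceil(width * zoom / 100)
    scaled_h = math.ceil(height * zoom / 100)
    return math.ceil(scaled_w / tile_size), math.ceil(scaled_h / tile_size)

def tiling_plan(width, height, tile_size=256):
    """For each zoom level, the scale command and the resulting grid size."""
    plan = []
    for zoom in ZOOM_LEVELS:
        cols, rows = grid_size(width, height, zoom, tile_size)
        plan.append((zoom,
                     "convert original.jpg -resize %d%% scaled.jpg" % zoom,
                     cols, rows))
    return plan
```

For the 2816x2048 image from the later JSON example, this gives an 11x8 grid at 100% and a 9x6 grid at 75%.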

Displaying the media

To retrieve the media files, the media viewer client on the browser must first retrieve a list of the available media files that satisfy certain criteria. These criteria are expressed by defining the appropriate selection context using the context-related columns in the media database. The following is an overview of the client-browser interaction with the web server, and internally with the media database:

Request for list of media files available

For instance, the browser client would send a request like:

http://localhost/imageviewer/rest/mediafiles/?cid=10&gid=4&sid=11&pid=19...

to which the server responds with a JSON response that looks like:

{
    "success": true,
    "total": 898,     /* total number of media files available */
    "details": [{
        "id": 22992,  /* primary key, which is also the filename of original media */
        "an": "ADH5-TM1B-IC/12.1a_5242318", /* image label */
        "c": "8fe6920197eb100b46fca3b91a1efe65899cd08b", /* content SHA1 checksum */
        "i": 1,       /* is media an image */
        "e": "dcm",   /* extension of the original media */
        "w": 2816,    /* width of the original image in pixels  */
        "h": 2048,    /* height of the original image in pixels */
        "p": 3,       /* processing phase */
        "s": 3,       /* processing status */
        ...
        },
        ...
    ]
}

Using the list of media files retrieved in the previous stage, the browser can now retrieve the original media, or retrieve image tiles. If the media file is not an image, tiling is disabled and the viewer falls back to an embedded viewer where applicable. For instance, PDF documents are displayed using an embedded PDF viewer. Otherwise, the original media is made available for download. This is shown below:

Get original media file

To make the original media available for download, the media viewer client assembles the URL to the file by using information returned by the web server in the list of media files. For instance, if the context for retrieving the list of available media was, say, <cid, lid, gid, sid, pid, qid> = <10, 5, 4, 11, 19, 90>, and we wish to download the original media for the media file with primary key 22992 and file extension .dcm (both extracted from the JSON response above), the URL is composed as follows:

http://localhost/src/10/5/4/11/19/90/22992.dcm

Here, http://localhost/src/ points to the archive_directory defined in the processing phase above.

Retrieving image media

If the media file is an image, tiling is enabled. Instead of returning a single file, the client browser will request image tiles from the web server. Furthermore, the browser can now display the corresponding thumbnail by using the content checksum. For instance, the following will retrieve the corresponding thumbnail for the image media with checksum 8fe6920197eb100b46fca3b91a1efe65899cd08b. Note that the web server has already been set up so that the URI for the image tiles http://localhost/tiles/ points to the tiles_directory defined in the processing phase above.

http://localhost/tiles/8fe6/9201/97eb/100b/46fc/a3b9/1a1e/fe65/899c/d08b/thumbnail.jpg
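Composing such a URL from a checksum follows the same 4-character split used during the processing phase; a minimal sketch (the function name is illustrative):

```python
def thumbnail_url(tiles_base, checksum):
    """URL of the pre-generated thumbnail for a media file's SHA1 checksum."""
    segments = [checksum[i:i + 4] for i in range(0, 40, 4)]
    return tiles_base + "/" + "/".join(segments) + "/thumbnail.jpg"
```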

When the client browser makes requests for image tiles, it will only request tiles that are currently visible inside the viewport, thus avoiding unnecessary requests and therefore improving the performance of both the server and the browser. This is described in the following section.

Display of image tiles

The following are the basic components of the image viewer:

  • Viewport The part of the viewer that displays the image tiles. It is like a window to the original image, displaying a rectangular region of the image. The size of the viewport depends on the display size available to the browser.

  • Layers To allow image overlay, each viewport is associated with a set of image layers, each displaying one image media file.

  • Image grid Each image layer is composed of an image grid, where the components of the grid are placeholders for the image tiles. The size of the grid is determined by the size of the original image and the size of the image tiles. The sequential indexing of the grid starts at 0, and is such that the top-left corner tile has indices (0, 0).

The following is an overview of how image tiles are retrieved:

Get image tiles

When the web server sends a list of available media files, it also sends the dimensions of all the image media. The image viewer uses this to create an image grid as shown below:

Viewport and tiles grid

To load the image, the browser only sends requests for image tiles that overlap with the viewport. Hence, in the diagram above, the browser will only request 12 image tiles, even though the total size of the original image requires 88 image tiles. Each of these requests to the web server looks like the following:

http://localhost/tiles/8fe6/9201/97eb/100b/46fc/a3b9/1a1e/fe65/899c/d08b/256/100/11_8_2_4.jpg
http://localhost/tiles/8fe6/9201/97eb/100b/46fc/a3b9/1a1e/fe65/899c/d08b/256/100/11_8_2_5.jpg
...
http://localhost/tiles/8fe6/9201/97eb/100b/46fc/a3b9/1a1e/fe65/899c/d08b/256/100/11_8_4_7.jpg

Again, http://localhost/tiles points to the tiles_directory used above during the processing phase. The /256/100/ denotes the tile size and the zoom-level respectively. Hence, if we had generated tiles with a maximum size of 512x512 pixels, we would change this to /512/100/. Similarly, if we wanted the 512x512 tiles at zoom-level 75%, we would change this to /512/75/.

Finally, the remaining suffix identifies the image tile required. To create this suffix, we choose the grid placeholders that overlap with the current viewport. This is then converted to the tile name as follows:

  • number of columns in the grid = ceil(scaled image width / tile size),
  • number of rows in the grid = ceil(scaled image height / tile size),
  • tile row index = row index of the visible grid component, and
  • tile column index = column index of the visible grid component.

where the values correspond to the tile naming scheme:

<number of columns>_<number of rows>_<row_index>_<column_index>.jpg

Given this framework, all we need to do is attach event handlers to the viewport so that every time the viewport is updated relative to the tiles placeholder grid, we retrieve the missing tiles for all of the empty grid placeholders that overlap with the viewport. Thus we get an on-demand tiled image viewer.
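The viewport-to-tiles calculation described above can be sketched as follows, with the viewport given as (x, y, width, height) in scaled-image pixel coordinates (the function name is illustrative):

```python
import math

def visible_tiles(scaled_width, scaled_height, tile_size, viewport):
    """Names of the tiles overlapping the viewport (x, y, width, height)."""
    cols = math.ceil(scaled_width / tile_size)
    rows = math.ceil(scaled_height / tile_size)
    x, y, vw, vh = viewport
    # Grid indices of the first and last tiles intersecting the viewport.
    first_col = max(0, x // tile_size)
    first_row = max(0, y // tile_size)
    last_col = min(cols - 1, (x + vw - 1) // tile_size)
    last_row = min(rows - 1, (y + vh - 1) // tile_size)
    return ["%d_%d_%d_%d.jpg" % (cols, rows, r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

With the 2816x2048 image at 100% zoom and a 1024x768 viewport positioned at (1024, 512), this yields the 12 tiles from 11_8_2_4.jpg to 11_8_4_7.jpg, matching the example requests above.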

Conclusion

The code-base at GitHub describes in detail the implementation of the event handlers, region-of-interest selectors, single image viewer panel and side-by-side comparative image viewer panels. It would be counter-productive to describe them all in this note. Nonetheless, please do leave a comment if anything needs further clarification.
