
Coverage Subsystem Development

Status: alpha (2008/09/15)

What is the raster API? The really short answer is: an interface to load images. But as short as that answer is, it is also very inaccurate. While every image is a raster, not every raster is an image. A raster stores information that covers a region under, on or above the ground. The information is stored in a discrete rectangular grid. Many different types of information can be represented in a raster: the reflected radiation in a specific spectral band (e.g. a color band of an image), temperature, noise emissions, elevation, etc. These types of information require different precisions. Each point in the raster can contain multiple values (e.g. a color image contains radiation values for the red, green and blue spectral bands, temperatures can be measured at different heights, and noise emissions at different times of day). Furthermore, rasters can be very large in terms of data size.

1. Requirements

For good integration of the API in the deegree framework and for the WCS it needs support for:

2. Current Status

Most of these requirements are currently implemented.

2.1. Low-Level API

The package org.deegree.model.coverage.raster.data contains the low-level parts of the API. The current implementation uses Java NIO's ByteBuffer to store the raster data in memory. It supports different data types (byte, short, float) and different interleaving types (pixel, line and band).

The low level API allows access to single samples and pixels and to rectangular pixel subsets.
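To illustrate how the three interleaving types change the position of a sample inside a ByteBuffer, here is a minimal sketch. The class InterleavingSketch, its enum and its method names are hypothetical and are not taken from the deegree sources; it assumes float samples and a raster stored row by row.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch (not the deegree classes): float samples in a ByteBuffer. */
final class InterleavingSketch {

    enum Interleaving { PIXEL, LINE, BAND }

    static final int SAMPLE_SIZE = Float.BYTES; // 4 bytes per float sample

    /** Byte offset of the sample at column x, row y, band index band. */
    static int offset(Interleaving type, int width, int height, int bands,
                      int x, int y, int band) {
        switch (type) {
            case PIXEL: // all bands of one pixel are adjacent
                return ((y * width + x) * bands + band) * SAMPLE_SIZE;
            case LINE:  // each row stores one complete line per band
                return ((y * bands + band) * width + x) * SAMPLE_SIZE;
            case BAND:  // each band is stored as a complete plane
                return ((band * height + y) * width + x) * SAMPLE_SIZE;
            default:
                throw new IllegalArgumentException("unknown interleaving: " + type);
        }
    }

    static void setSample(ByteBuffer buf, Interleaving type, int width, int height,
                          int bands, int x, int y, int band, float value) {
        // absolute put: does not touch the buffer position
        buf.putFloat(offset(type, width, height, bands, x, y, band), value);
    }

    static float getSample(ByteBuffer buf, Interleaving type, int width, int height,
                           int bands, int x, int y, int band) {
        return buf.getFloat(offset(type, width, height, bands, x, y, band));
    }
}
```

Reading a rectangular pixel subset then amounts to looping over the requested columns and rows and calling getSample for each band.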

2.2. High-Level API

The low-level API operates on pixel coordinates, while the high-level API operates on world coordinates (i.e. non-discrete values). Conversion between the two systems is done with the RasterEnvelope class. The public API is defined by the AbstractRaster class. It offers methods to get and set parts of the raster (getSubset and setSubset).
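The conversion between world and pixel coordinates can be sketched as follows. This is not the RasterEnvelope implementation: the class WorldToPixelSketch and its methods are hypothetical, and the sketch assumes a north-up raster whose origin is the outer upper-left corner and whose pixel sizes are positive.

```java
/** Hypothetical sketch of a world/pixel mapping for a north-up raster. */
final class WorldToPixelSketch {

    private final double minX; // world x of the outer left edge
    private final double maxY; // world y of the outer upper edge
    private final double resX; // world units per pixel in x (positive)
    private final double resY; // world units per pixel in y (positive)

    WorldToPixelSketch(double minX, double maxY, double resX, double resY) {
        this.minX = minX;
        this.maxY = maxY;
        this.resX = resX;
        this.resY = resY;
    }

    /** Column of the pixel that contains the world x coordinate. */
    int toPixelColumn(double worldX) {
        return (int) Math.floor((worldX - minX) / resX);
    }

    /** Row of the pixel that contains the world y coordinate (rows grow downwards). */
    int toPixelRow(double worldY) {
        return (int) Math.floor((maxY - worldY) / resY);
    }

    /** World x coordinate of the center of the given pixel column. */
    double toWorldX(int column) {
        return minX + (column + 0.5) * resX;
    }

    /** World y coordinate of the center of the given pixel row. */
    double toWorldY(int row) {
        return maxY - (row + 0.5) * resY;
    }
}
```

Whether a world coordinate refers to the pixel center or the pixel corner is a convention that has to be fixed once; the sketch returns pixel centers.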

Rasters that are based on a single file are implemented by SimpleRaster. Rasters with multiple ranges are implemented by MultiRangedRaster, and rasters with multiple tiles by TiledRaster. MultiRangedRaster and TiledRaster each aggregate multiple AbstractRaster instances.

2.2.1. SimpleRaster

A SimpleRaster wraps a low-level raster. Other AbstractRaster implementations will normally be based on or aggregate SimpleRaster instances. To allow different loading and caching policies for the raster data, access to the data is abstracted by the RasterDataContainer interface. The raster model comes with an in-memory implementation that loads the raster immediately and keeps it in memory (MemoryRasterDataContainer) and an implementation that loads the raster on first access (LazyRasterDataContainer). The deegree dataaccess module comes with an ehcache-based implementation.
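The difference between the immediate and the lazy loading policy can be sketched like this. The interface and class names are hypothetical stand-ins, not the deegree RasterDataContainer types, and the loader is modelled as a plain Supplier.

```java
import java.nio.ByteBuffer;
import java.util.function.Supplier;

/** Hypothetical sketch of two loading policies behind one interface. */
final class LazyContainerSketch {

    interface DataContainer {
        ByteBuffer getData();
    }

    /** Loads the data in the constructor and keeps it in memory. */
    static final class Eager implements DataContainer {
        private final ByteBuffer data;

        Eager(Supplier<ByteBuffer> loader) {
            this.data = loader.get();
        }

        @Override
        public ByteBuffer getData() {
            return data;
        }
    }

    /** Defers loading until the data is requested for the first time. */
    static final class Lazy implements DataContainer {
        private final Supplier<ByteBuffer> loader;
        private ByteBuffer data;

        Lazy(Supplier<ByteBuffer> loader) {
            this.loader = loader;
        }

        @Override
        public synchronized ByteBuffer getData() {
            if (data == null) {
                data = loader.get(); // the loader runs at most once
            }
            return data;
        }
    }
}
```

A caching policy would follow the same pattern, with the difference that the container may drop the buffer again and reload it on demand.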

The RasterDataContainerFactory instantiates the requested RasterDataContainer-type. This factory uses the deegree3/ServiceLoader.

2.2.2. TiledRaster

The TiledRaster extends AbstractRaster and thus offers the same getSubset/setSubset API. On each call the class checks which tiles are affected. The handling of the tiles (the spatial information of each tile) is done by implementations of the TileContainer interface. Currently the API comes with a simple MemoryTileContainer that stores all tiles in a list and tests every tile for intersection on getTiles(Envelope) calls. Additional implementations should take advantage of spatial indices (indexed shapefiles, a PostGIS database, etc.).
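The list-based lookup described above can be sketched as a linear scan over tile bounding boxes. The classes TileListSketch and Box are hypothetical; they stand in for MemoryTileContainer and Envelope and only carry the bounding box, not the raster data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a list-based tile lookup with a linear intersection test. */
final class TileListSketch {

    /** Axis-aligned bounding box in world coordinates. */
    static final class Box {
        final double minX, minY, maxX, maxY;

        Box(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /** True if the boxes overlap; boxes that only touch count as intersecting. */
        boolean intersects(Box other) {
            return minX <= other.maxX && other.minX <= maxX
                && minY <= other.maxY && other.minY <= maxY;
        }
    }

    private final List<Box> tiles = new ArrayList<>();

    void addTile(Box tile) {
        tiles.add(tile);
    }

    /** Tests every tile against the request, so the cost grows linearly with the tile count. */
    List<Box> getTiles(Box request) {
        List<Box> hits = new ArrayList<>();
        for (Box tile : tiles) {
            if (tile.intersects(request)) {
                hits.add(tile);
            }
        }
        return hits;
    }
}
```

The linear scan is fine for a few hundred tiles; a spatial index replaces the loop with a lookup that only visits candidate tiles.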

2.2.3. MultiRangedRaster

A MultiRangedRaster also extends AbstractRaster. It aggregates multiple rasters. Each raster may represent a single (spectral) band, a different time, a different height (e.g. temperatures at 0m, 500m and 1000m), etc. The MultiRangedRaster doesn't store any metadata that describes what type of raster it aggregates. This is a missing feature, because it is required for the WCS, where a query can request specific time ranges, etc.

2.2.4. MultiResolutionRaster

A raster can cover large areas with high resolutions. For most operations like calculations/simulations or visualization only a small part or a lower resolution is needed. Smaller subsets of the data can efficiently be accessed with a pre-tiling of the data (TiledRaster). For a low resolution representation of the whole coverage, all data needs to be processed (subsampled). For this task it is common to pre-generate multiple resolutions of the original data. The result is often called a raster pyramid or a raster tree. The MultiResolutionRaster class can store multiple resolutions of the same raster.
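How a suitable level could be picked from such a pyramid is sketched below. The selection rule is an assumption for illustration, not the documented behaviour of MultiResolutionRaster: the sketch returns the coarsest stored resolution that is still at least as fine as the requested one and falls back to the finest level otherwise. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: choosing a level from a raster pyramid. */
final class PyramidSketch {

    /**
     * Resolutions are given in world units per pixel, so smaller values are finer.
     * Expects at least one level.
     */
    static double selectResolution(double[] levels, double requested) {
        double[] sorted = levels.clone();
        Arrays.sort(sorted);     // finest level first
        double best = sorted[0]; // fall back to the finest level
        for (double resolution : sorted) {
            if (resolution <= requested) {
                best = resolution; // coarser, but still fine enough
            }
        }
        return best;
    }
}
```

With levels of 1, 2, 4 and 8 units per pixel, a request for 5 units per pixel is served from the 4-unit level, so at most a small downsampling step remains.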

2.3. ToDo

The raster API is not complete and some important features are still marked as a TODO. There are two major features that are needed, but not implemented because some required dependencies are not implemented in deegree3 at the moment.

2.3.1. ShapeFile/Database TileContainer

First, the implementation lacks TileContainer implementations that can efficiently handle thousands to millions of tiles. These can be implemented when the (shapefile/database) datastore API of deegree3 is ready for use.

2.3.2. Metadata/Filter

Second, the raster model needs to integrate metadata of the raster. A raster coverage can consist of multiple spectral, temporal and spatial ranges. The Web Coverage Service allows fine control over the result ranges and therefore needs this information. The raster model should use the deegree feature model for the metadata handling and should try to utilize the filter API. Neither is ready/implemented yet.

2.3.3. Other ToDos

2.4. Limitations

2.4.1. Low-Level Tiling

Early in the development process the requirement was set that tiling should only be handled in the high-level API. As a consequence, each raster/tile must be loaded entirely, which makes it inefficient to load large raster files. A solution is to use the RasterTreeBuilder to build smaller tiles that can be handled by the TiledRaster. Together with a TileContainer that has a spatial index and a caching RasterDataContainer, large rasters can be handled efficiently. Problems can occur when the raster is too large to be loaded with the RasterTreeBuilder.

Most raster APIs (like the Java ones) offer access to tiles on the pixel level. This access depends on the low-level format of the raster file. Some files will result in tiles that span a single pixel row (width x 1), small blocks (8 x 8 or 64 x 64) or even larger blocks (>500 x 500). The latter is often used to store large TIFFs (>10000 x 10000) in one file and still allow efficient access to subsets.

The deegree raster API could benefit from support for low-level tiling in different ways. First and foremost, it would make it possible to read large raster files. While pre-generating tiles with the RasterTreeBuilder may still be the preferred way to serve large rasters, only this change would allow the RasterTreeBuilder to process such huge rasters at all.

A second benefit could be better support for multi-threaded operations on raster data. The ByteBuffer that all rasters are based on is not thread-safe. This could be addressed with locking (synchronized), but tests have shown that locking at the pixel level produces a large overhead that would defeat the benefits of multiple threads. With pixel-level tiling, locking could be implemented per tile to reduce the overhead. This is just an assumption though; tests should be made to verify it.

2.4.2. Low-Level Storage (Thread-safety)

The API doesn't offer a class for each data type (like ByteRaster, FloatRaster, ...), but one class that offers access to the specific data type (RasterData.getByteSample, ...). Internally all data is stored in a Java NIO ByteBuffer. The ByteBuffer converts the data types into their byte representation, e.g. ByteBuffer.putFloat converts a float into four bytes. The user of the API must ensure that the getter method matches the data type of the raster; otherwise the user will get wrong values. Most operations, like getSubset, setSubset and NearestNeighborInterpolation, use the generic getSample and getPixel methods. These are independent of the type and return the low-level byte[] representation.
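The effect of a getter that does not match the stored data type can be reproduced with a plain ByteBuffer. The class TypeMismatchSketch and its helper methods are hypothetical; only the java.nio.ByteBuffer calls are real. The sketch assumes the default big-endian byte order.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: reading float samples with matching and non-matching getters. */
final class TypeMismatchSketch {

    /** Stores the samples as four-byte floats, one after another. */
    static ByteBuffer floatRaster(float... samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * Float.BYTES);
        for (int i = 0; i < samples.length; i++) {
            buf.putFloat(i * Float.BYTES, samples[i]);
        }
        return buf;
    }

    /** Matching getter: returns the value that was stored. */
    static float readAsFloat(ByteBuffer buf, int sampleIndex) {
        return buf.getFloat(sampleIndex * Float.BYTES);
    }

    /** Wrong getter: interprets the first two bytes of the buffer layout as a short. */
    static short readAsShort(ByteBuffer buf, int sampleIndex) {
        return buf.getShort(sampleIndex * Short.BYTES);
    }

    /** Type-independent access: the raw bytes of one float sample. */
    static byte[] rawSample(ByteBuffer buf, int sampleIndex) {
        byte[] raw = new byte[Float.BYTES];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = buf.get(sampleIndex * Float.BYTES + i);
        }
        return raw;
    }
}
```

For a raster that stores 1.0f (bit pattern 0x3F800000), readAsFloat returns 1.0 while readAsShort returns 16256 (0x3F80). No exception is thrown, so the mistake is easy to miss.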

Also note that ByteBuffers are not thread-safe. They use a position()-get() combination that may be interrupted by another position() call and thus return the wrong samples. This becomes a problem when a raster is accessed from two different threads (as with two overlapping WCS requests). A ByteBuffer can, however, share its data with another ByteBuffer; each buffer keeps its own position, so both can operate on the data independently (thread-safe).

SimpleRaster provides the getReadOnlyRasterData() method, which returns a read-only representation of the same underlying RasterData. ByteBufferRasterData#asReadOnly() returns a RasterData instance that shares the same data and allows thread-safe access to it. (Each thread needs its own instance.) The getSubset() and setSubset() methods of SimpleRaster create a read-only view before the raster data is accessed. Since the mapping between the high-level and the low-level raster is implemented in SimpleRaster, all operations should be thread-safe.
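The underlying ByteBuffer mechanism can be sketched with the standard asReadOnlyBuffer() call. The class SharedBufferSketch is hypothetical and does not show how ByteBufferRasterData#asReadOnly() is actually implemented; it only demonstrates that views share content but keep separate positions.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: per-thread read-only views on shared raster bytes. */
final class SharedBufferSketch {

    /**
     * Returns a view that shares the content of the source buffer but has its
     * own position, limit and mark. The byte order is copied explicitly because
     * a new view always starts out as big-endian.
     */
    static ByteBuffer readOnlyView(ByteBuffer source) {
        return source.asReadOnlyBuffer().order(source.order());
    }
}
```

A reader thread would call readOnlyView once and then use position() and get() on its own view only. Writes through the original buffer remain visible in all views, so concurrent writers still need coordination.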

2.4.3. Other Limitations

Some minor limitations, or more precisely, some design decisions that may have some drawbacks:

All getSubset methods create a new raster, i.e. the data is copied. A first implementation of the API returned views on the original data, so getSubset calls had no processing overhead. The downside of that implementation was that a setSubset changed all rasters that shared the same data, so this feature was dropped later.
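The copy semantics can be sketched for the simplest case, a single-band byte raster stored row by row. The class SubsetCopySketch is hypothetical and much simpler than the real getSubset, which also has to handle bands, data types and envelopes.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: copying a rectangular subset instead of returning a view. */
final class SubsetCopySketch {

    /**
     * Copies the rectangle starting at (x0, y0) with the given width and height
     * out of a single-band byte raster that is srcWidth pixels wide.
     */
    static ByteBuffer copySubset(ByteBuffer src, int srcWidth,
                                 int x0, int y0, int width, int height) {
        ByteBuffer dst = ByteBuffer.allocate(width * height);
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                byte sample = src.get((y0 + row) * srcWidth + (x0 + col));
                dst.put(row * width + col, sample); // absolute put into the copy
            }
        }
        return dst;
    }
}
```

Because the subset owns its bytes, writing into it leaves the source raster untouched, at the cost of one copy per getSubset call.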

3. Example Usage of the API

3.1. Loading

The RasterFactory offers methods to create a raster from a file. The raster loading process is abstracted by the RasterReader and RasterWriter interfaces. The implementation is selected with a deegree3/ServiceLoader (RasterIOProvider). The deegree commons module doesn't contain an implementation, but the dataaccess module offers a reader and a writer based on JAI and ImageIO.

3.2. Simple Raster Example

    // requires java.io.File and the deegree raster classes

    // load a raster
    AbstractRaster raster = RasterFactory.loadRasterFromFile( new File( "test.tiff" ) );

    // depending on the raster loader and the raster file, the raster may contain a CRS...
    CoordinateSystem crs = raster.getCoordinateSystem();
    // and an envelope
    Envelope env = raster.getEnvelope();

    // get a subset and save it (requestEnvelope is an Envelope supplied by the caller)
    AbstractRaster subset = raster.getSubset( requestEnvelope );
    RasterFactory.saveRasterToFile( subset, new File( "output.png" ) );

Optionally you can supply a RasterIOOptions instance to the read and write methods of the RasterFactory. This object can contain arbitrary key-value pairs that may change the behavior of the RasterReader or RasterWriter. It may be used to set metadata or to influence the selection of the actual reader/writer implementation. At the moment only the "FORMAT" key (RasterIOOptions.OPT_FORMAT) is used by the raster API, to select the format of the raster reader/writer. This is needed when writing to streams in a chosen format (where there is no file extension from which to guess the right format).

3.3. Transforming a Raster

This example loads tiles into a raster and extracts a subset of the raster. The target envelope and raster size are given. The subset is requested in another coordinate system, so the output is transformed by the RasterTransformer. This is pretty much everything you need for a simple WCS.

    // create a tile container...
    MemoryTileContainer tileContainer = new MemoryTileContainer();
    for ( String tile : tileFileNames ) {
        // and add all tiles
        tileContainer.addTile( RasterFactory.loadRasterFromFile( new File( tile ) ) );
    }
    // create a TiledRaster and set the native CRS
    AbstractRaster srcRaster = new TiledRaster( tileContainer );
    srcRaster.setCoordinateSystem( CRSFactory.create( srcCRS ) );

    // transform the requested area into the target CRS
    RasterTransformer transf = new RasterTransformer( dstCRS );
    SimpleRaster result = transf.transform( srcRaster, targetEnvelope, targetWidth, targetHeight, InterpolationType.BILINEAR );

    // save the result
    RasterFactory.saveRasterToFile( result, new File( "result.tiff" ) );


2018-04-20 12:04