
Processing framework review

The processing framework is a central subsystem of deegree 3. It is urgent to get this component into a stable state, because it is required for the WPS, which has to be delivered (as a prototype) at the end of September 2008. External developers will start to develop WPS processes then -- any later changes to the processing API will require changes on the side of the external code when they update to a later deegree 3 version.

This page reviews the current status of the processing framework and points out issues that need to be resolved/considered before the subsystem can be promoted to beta status.

1. Requirements

The WPS specification requires the following processing functionality:

Another requirement -- not strictly one of the WPS specification, but an external one -- is the ability to persist the list of processes (and the processes themselves). In the WPS implementation, this makes it possible to guarantee that a started asynchronous process gets executed even when the system is interrupted (e.g. Tomcat crashes). Execution of the process will restart when the WPS is up again.

The processing API should also provide the functionality of the deegree 2 concurrent API, i.e.:

1.1. Shortcomings/missing features of the deegree 2 concurrent framework

Compared with the deegree 2 concurrent API, the processing API must meet the following additional requirements:

2. Review of the current design

2.1. CommandProcessor

The CommandProcessor is responsible for the execution of commands (BTW, is there a difference between a process and a command? We should use a clear naming convention to avoid confusion!).

command_processor_hierarchy.png

2.2. Command

The Command interface must be implemented to define a method that can be executed by a CommandProcessor instance.

command_hierarchy.png

2.3. General design notes/questions

2.4. Bugs found during evaluation

3. Writing processes for the WPS

To write your own process, you'll have to implement the Process interface. This currently involves a createCommand method to create the Command to execute from the WPS execute request, and a getCommandProcessListener method to retrieve a listener to be added to the command.

You'll also need a custom Command that actually executes the process, since the Process interface is designed to deal with WPS request specifics, data input etc. Two types are involved here: the WPSCommand interface, which deals with WPS-specific result generation (getXMLResult and writeResultToFile), and the Command interface (or the AbstractCommand class) from the processing package.

The AbstractCommand class forces you to implement methods such as cancel, pause, resume, getResult. For WPS purposes, these methods may not always be necessary (for example, WPS does not have a method of cancelling requests). The getResult method forces you to implement the CommandResult interface, just to transport the result of your operation.

In conclusion, a better separation between the actual process and the WPS-specific input/output would be desirable. For the WPS-specific input/output requirements, a toolbox should be created that lets you quickly generate result documents etc. For a lean implementation of a WPS process, the Command interface seems to have far too many capabilities. Defined as a Callable (the standard Java abstraction for a "command" with a result), an example process that adds two integers may look like this:

import java.util.concurrent.Callable;

// Adds two integers; the result is returned directly from call().
public class AdditionCallable implements Callable<Integer> {
    private final int a, b;
    public AdditionCallable( int a, int b ) {
        this.a = a;
        this.b = b;
    }
    @Override
    public Integer call() {
        return a + b;
    }
}

The same code using the Command interface looks like this:

public class AdditionCommand implements Command<Integer> {
    private int a, b;
    int c;
    public AdditionCommand( int a, int b ) {
        this.a = a;
        this.b = b;
    }
    @Override
    public void addCommandProcessorListener( CommandProcessorListener listener ) {
        // what to do here?
    }
    @Override
    public void cancel() {
        // or here?
    }
    @Override
    public void execute() {
        c = a + b;
    }
    @Override
    public Identifier getIdentifier() {
        // or here?
        return null;
    }
    @Override
    public User getOwner() {
        // or here?
        return null;
    }
    @Override
    public CommandResult<Integer> getResult() {
        return new CommandResult<Integer>() {
            @Override
            public CommandState getState() {
                // or here?
                return null;
            }
            @Override
            public Integer getValue() {
                return c;
            }
        };
    }
    @Override
    public void pause() {
        // or here?
    }
    @Override
    public void resume() {
        // or here?
    }
    @Override
    public void setPriority( int priority ) {
        // or here?
    }
    @Override
    public void setProcessMonitor( ProcessMonitor processMonitor ) {
        // or here?
    }
}
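
For comparison, the Callable variant needs no further scaffolding to be executed: it can be handed to a standard java.util.concurrent executor. The following is a minimal sketch and not part of the review's test code; the class name AdditionCallableDemo and the helper runAddition are made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: the class and method names are hypothetical.
public class AdditionCallableDemo {

    // Same logic as the AdditionCallable above, nested here to keep the example self-contained.
    static class AdditionCallable implements Callable<Integer> {
        private final int a, b;

        AdditionCallable( int a, int b ) {
            this.a = a;
            this.b = b;
        }

        @Override
        public Integer call() {
            return a + b;
        }
    }

    // Submits the callable to a single-threaded executor and blocks until the result is available.
    static int runAddition( int a, int b ) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = executor.submit( new AdditionCallable( a, b ) );
            return future.get();
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException( e );
        } catch ( ExecutionException e ) {
            throw new IllegalStateException( e.getCause() );
        } finally {
            executor.shutdown();
        }
    }

    public static void main( String[] args ) {
        System.out.println( runAddition( 2, 3 ) );
    }
}
```

Identifier, owner, priority, state and listener handling are all absent here; whether the processing API needs them for every command is exactly the question raised above.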

4. Practical tests

We set up a small test case that computes the MD5 hash of a randomly filled byte array. Besides an implementation for the processing framework, we also implemented a Callable and executed it through the Java concurrency framework (upon which the deegree 2 framework is built). The execution times were measured (note that the initialization of the Quartz framework distorts the timing results). The output of one typical run looks like this:

4.1. Normal run

#### Callable Test #####
e804c4a71f976aebba917f7d84426fe4
Callable: 5327ms

#### Current processing framework (Quartz) ####
[09:52:53]  INFO: [QuartzScheduler] Quartz Scheduler v.1.6.0 created.
[09:52:53]  INFO: [RAMJobStore] RAMJobStore initialized.
[09:52:53]  INFO: [StdSchedulerFactory] Quartz scheduler 'Sched1' initialized from default resource file in Quartz package: 'quartz.properties'
[09:52:53]  INFO: [StdSchedulerFactory] Quartz scheduler version: 1.6.0
[09:52:53]  INFO: [QuartzScheduler] Scheduler Sched1_$_1 started.
name:0
name:0
name:0
e804c4a71f976aebba917f7d84426fe4
Command: 6516ms

This means that the callable runs in about 5.3 seconds, while the command processing framework (using synchronous execution) runs in about 6.5 seconds. This includes starting up Quartz and the processing framework, which probably explains the difference.

The code I used can be found here: attachment:TestCallable.java, attachment:TestCommand.java, attachment:ProcessingTester.java
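
The attached sources are not reproduced on this page. As a rough idea of what the Callable variant of such a test could look like, here is a minimal sketch. It is an assumption, not the attached TestCallable.java: the class name Md5CallableSketch, the helper md5Hex and the array size are made up for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: not the attached test code.
public class Md5CallableSketch {

    // Returns the MD5 digest of the input as a lowercase hex string.
    static String md5Hex( byte[] data ) {
        try {
            MessageDigest digest = MessageDigest.getInstance( "MD5" );
            StringBuilder hex = new StringBuilder();
            for ( byte b : digest.digest( data ) ) {
                hex.append( String.format( "%02x", b & 0xff ) );
            }
            return hex.toString();
        } catch ( NoSuchAlgorithmException e ) {
            // MD5 is one of the digests every Java platform must provide
            throw new IllegalStateException( e );
        }
    }

    public static void main( String[] args ) throws Exception {
        final byte[] data = new byte[16 * 1024 * 1024]; // size chosen arbitrarily
        new Random().nextBytes( data );

        Callable<String> task = new Callable<String>() {
            @Override
            public String call() {
                return md5Hex( data );
            }
        };

        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Measure only submission and execution, not executor creation.
            long start = System.nanoTime();
            String hash = executor.submit( task ).get();
            long elapsedMs = ( System.nanoTime() - start ) / 1000000L;
            System.out.println( hash );
            System.out.println( "Callable: " + elapsedMs + "ms" );
        } finally {
            executor.shutdown();
        }
    }
}
```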

4.2. Failing run

Next we tested a similar, but slightly modified version that throws an exception during execution.

#### Callable Test #####
java.security.NoSuchAlgorithmException: md6 MessageDigest not available
Callable: 144ms

#### Current processing framework (Quartz) ####
[10:05:27]  INFO: [QuartzScheduler] Quartz Scheduler v.1.6.0 created.
[10:05:27]  INFO: [RAMJobStore] RAMJobStore initialized.
[10:05:27]  INFO: [StdSchedulerFactory] Quartz scheduler 'Sched1' initialized from default resource file in Quartz package: 'quartz.properties'
[10:05:27]  INFO: [StdSchedulerFactory] Quartz scheduler version: 1.6.0
[10:05:27]  INFO: [QuartzScheduler] Scheduler Sched1_$_1 started.
name:0
name:0
name:0
java.security.NoSuchAlgorithmException: md6 MessageDigest not available
Command: 535ms

As expected, the processing framework is again slower, probably again due to the startup of Quartz and the framework. More interesting is how much one has to code oneself when using the processing framework. That includes providing, by hand, a way for the calling code to see the actual exception! The Java concurrency framework simply throws an ExecutionException from Future.get(), from which the original cause can be extracted easily. The deegree 2 concurrent framework rethrows the original exception upon value retrieval, which is most convenient.
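
To illustrate the point about the Java concurrency framework: Future.get() wraps any exception thrown by the task in an ExecutionException, and getCause() returns the original. The following is a minimal sketch, assuming a task that requests the unavailable "md6" digest as in the test above; the class name FailingCallableSketch and the helper failureCause are hypothetical and not taken from the attachments.

```java
import java.security.MessageDigest;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: not the attached test code.
public class FailingCallableSketch {

    // Runs a failing task and returns the exception the task originally threw, or null if it succeeded.
    static Throwable failureCause() {
        Callable<byte[]> task = new Callable<byte[]>() {
            @Override
            public byte[] call() throws Exception {
                // "md6" is not a standard algorithm name, so getInstance is expected to fail
                return MessageDigest.getInstance( "md6" ).digest( new byte[] { 1, 2, 3 } );
            }
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit( task ).get();
            return null;
        } catch ( ExecutionException e ) {
            // The wrapper carries the task's own exception as its cause.
            return e.getCause();
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
            return e;
        } finally {
            executor.shutdown();
        }
    }

    public static void main( String[] args ) {
        System.out.println( failureCause() );
    }
}
```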

Here's the code for the second test case: attachment:TestCallable2.java, attachment:TestCommand2.java, attachment:ProcessingTester2.java

5. Conclusion

It doesn't seem feasible to promote the processing subsystem to beta state just now.

5.1. Issues

The most important issues that must be reconsidered or taken care of:

5.2. Thoughts

The mentioned issues need to be resolved quickly, because the processing API needs to be (mostly) stable for the rollout of the WPS prototype at the end of September 2008.


CategoryDeegree3


2018-04-20 12:05