CEDA OGC Web Services Framework

Introduction

A brief overview of COWS WPS

COWS WPS Overview

The COWS Web Processing Service (WPS) is a “generic” web service and offline processing tool developed within the Centre for Environmental Data Archival (CEDA). The CEDA OGC Web Services (COWS) framework is a set of Python libraries that allows rapid development and deployment of geospatial web applications and services built around the standards managed by the Open Geospatial Consortium*** (OGC).

COWS WPS grew out of a need to provide a flexible framework for deploying ad-hoc web services that did not fit into the scope of the commonly deployed Web Mapping Service*** (WMS) and Web Coverage Service*** (WCS). Whilst the aim of the COWS WPS is conformance with the WPS version 1.0.0 standard***, it has been used within CEDA (including the UK Climate Projections User Interface***) to support operational systems that require asynchronous processing, job scheduling and management of user jobs.

Rather than developing separate Web Services (and interfaces) for each project and requirement, the COWS WPS provides the opportunity to deploy “processes” inside a single framework. The framework itself contains a set of utilities that the administrator can call upon, such as zipping up output files, automatically generating an HTML submission form, polling offline job progress and e-mailing users when a job has completed.

The documentation presented here is intended primarily to suit the needs of administrators and developers. It should also be useful to the user who is accessing the COWS WPS via the web, or using the WPS user interface.

The CEDA OGC Web Services libraries (COWS)

CEDA OGC Web Services*** (COWS) is a Python software framework for implementing Open Geospatial Consortium web service standards. COWS emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons***, a mature web application framework for the Python language. This approach provides developers with a flexible web service development environment without compromising access to the full range of web application tools and patterns: Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration and authentication/authorisation. COWS contains pre-configured implementations of WMS, WCS and WFS services, a web client and WPS.

Web Processing Service (WPS)

The OGC Web Processing Service*** version 1.0.0 specification provides a means of wrapping up arbitrary processes (such as GetAPlot or RunAModel) in a common framework with the following features:

  • web service interface, using POST or GET.
  • asynchronous reporting and control of jobs.
  • a defined XML interface for responses, including exceptions.
  • a common format for passing arguments to the server.
  • job status interrogation.

The COWS WPS extends this functionality to include:

  • Inform users when a job has completed (or failed).
  • Provide a configuration system that allows new processes to be added via a simple configuration file and a single Python interface module. This feature is only available to the WPS administrator in the current implementation and new processes are only identified after the service is restarted. However, the “plug-in” methodology could be extended to allow authorised users to add processes at runtime.
  • Connection to a scheduling tool that communicates with external offline processing nodes. The SOA layer and the Offline Processing layer share a file system which allows either layer to read/write status information to a single location.
  • A multiple-processor server application that can run “quick” tasks within a worker pool of connected processes.
  • Make a “costonly” (or “dry-run”) request before submitting a job that results in an estimate of the response size and duration without actually executing the job. This is implemented as an additional argument to the top-level WPS Execute Request so that it is available for use with all processes.
  • Report a job history for the entire system or the individual user.
  • Zip up groups of output files and report details of their contents in the XML response.

This functionality enables selecting and viewing of thumbnail plots within the client tools.
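The zipping behaviour described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the COWS WPS implementation: the function name `zip_outputs` and the returned structure are assumptions made for illustration, and the real service reports the contents in its XML response rather than as a Python list.

```python
import os
import zipfile


def zip_outputs(paths, zip_path):
    """Zip the given files and return (name, size in bytes) for each member.

    Hypothetical helper: the name and return format are illustrative only.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store each file under its base name so the archive is flat.
            zf.write(path, arcname=os.path.basename(path))

    # Re-open the archive and list what it actually contains.
    with zipfile.ZipFile(zip_path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]
```

A client could use a listing like this to decide which members (for example thumbnail plots) to fetch and display without unpacking the whole archive.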

A specific feature of the system that required a great deal of integration between the PHP/JavaScript UI and the Python WPS was the tracking of user jobs via the UI Jobs page. The WPS specification provided a mechanism for presenting a persistent URL that returns the status of a given job. The UI employed AJAX technology to poll this “status URL”, parsing the XML and displaying a table of outputs that can be interrogated by the user. For accessing information about previous jobs, an additional process was deployed under the COWS-WPS that returns an XML-encoding of the metadata for previous jobs issued to the WPS by a given user. This extension of the WPS capability is likely to be an essential requirement of any implementation that includes asynchronous job management.
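As a rough illustration of what a client does with the “status URL”, the sketch below parses a WPS 1.0.0 ExecuteResponse document and extracts the job state. The sample XML is a hand-written fragment (the URL and job identifier are invented), and the function name `parse_status` is an assumption; a real client would first fetch the document over HTTP and repeat the call until the job succeeds or fails.

```python
import xml.etree.ElementTree as ET

# Namespace used by WPS 1.0.0 response documents.
WPS_NS = "http://www.opengis.net/wps/1.0.0"

# Illustrative fragment only; a real ExecuteResponse carries more content.
SAMPLE_RESPONSE = """<wps:ExecuteResponse
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    statusLocation="http://example.org/wps/status/job_123.xml">
  <wps:Status creationTime="2010-01-01T12:00:00Z">
    <wps:ProcessStarted percentCompleted="40">Running</wps:ProcessStarted>
  </wps:Status>
</wps:ExecuteResponse>"""


def parse_status(xml_text):
    """Return the status URL, state name, progress and message of a job."""
    root = ET.fromstring(xml_text)
    status = root.find("{%s}Status" % WPS_NS)
    if status is None or len(status) == 0:
        raise ValueError("no wps:Status element found")
    # The single child names the state, e.g. ProcessAccepted,
    # ProcessStarted, ProcessSucceeded or ProcessFailed.
    state = status[0]
    return {
        "status_url": root.get("statusLocation"),
        "state": state.tag.split("}")[1],
        "percent": state.get("percentCompleted"),
        "message": (state.text or "").strip(),
    }
```

Polling then amounts to calling `parse_status` on the fetched document at intervals and stopping once the state is `ProcessSucceeded` or `ProcessFailed`.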

The COWS WPS provides the standard Web Processing Service methods as web-accessible end-points:

GetCapabilities
an XML document listing the names of the processes deployed within the WPS
DescribeProcess
details of the inputs/outputs of a given process
Execute
a request to execute a job, providing the process ID and input parameters
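The three methods can be called with simple key-value-pair (GET) requests. The sketch below builds such URLs following the WPS 1.0.0 KVP encoding. The end-point `http://example.org/wps` and the helper names are assumptions for illustration, and a given deployment may require additional parameters.

```python
from urllib.parse import urlencode

# Hypothetical end-point; substitute the URL of a real deployment.
WPS_ENDPOINT = "http://example.org/wps"


def get_capabilities_url(endpoint=WPS_ENDPOINT):
    """URL asking the service which processes it has deployed."""
    params = {"service": "WPS", "request": "GetCapabilities"}
    return endpoint + "?" + urlencode(params)


def describe_process_url(identifier, endpoint=WPS_ENDPOINT):
    """URL asking for the inputs/outputs of one process."""
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "DescribeProcess",
        "Identifier": identifier,
    }
    return endpoint + "?" + urlencode(params)


def execute_url(identifier, inputs, endpoint=WPS_ENDPOINT):
    """URL asking the service to run a process with the given inputs."""
    # The KVP encoding packs inputs as "name=value" pairs joined by ";".
    data_inputs = ";".join("%s=%s" % (k, v) for k, v in inputs.items())
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "Identifier": identifier,
        "DataInputs": data_inputs,
    }
    # urlencode percent-encodes the "=" and ";" inside DataInputs.
    return endpoint + "?" + urlencode(params)
```

For example, `execute_url("GetAPlot", {"year": "2050"})` produces a request for the (illustrative) GetAPlot process with a single input.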

Each “process” can be thought of as a Web Service that is deployed within the WPS container. Rather than discovering and calling the API of a bespoke Web Service, the client calls the API of the WPS with a set of arguments relevant to the “process”. The advantages of this approach are that a single framework can manage multiple Web Services and work-flows, with all common functionality being housed outside the process. Such functionality includes:

  • scheduling of off-line jobs and asynchronous polling
  • zipping of output files and reporting to the user on completion/failure of a job
  • auto-generation of HTML forms to submit jobs
  • application of a common security layer

One view on WPS “processes” is to think of them as the equivalent of command-line scripts with input arguments. Hence it is easy to wrap any existing function within a WPS process, as long as the configuration file provides appropriate information about the input/output arguments and their data types.
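The “command-line script” view can be sketched as follows. This is a conceptual illustration only and does not reproduce the COWS WPS configuration format or interface: `PROCESS_CONFIG`, `add_numbers` and `run_process` are all hypothetical names. The point is that a declared list of inputs and data types is enough to convert string arguments from a request and pass them to an ordinary function.

```python
# Hypothetical description of a process's inputs and their data types.
# The real COWS WPS configuration file format may differ.
PROCESS_CONFIG = {
    "identifier": "AddNumbers",
    "inputs": {"a": float, "b": float},
}


def add_numbers(a, b):
    """An ordinary function that knows nothing about WPS."""
    return a + b


def run_process(config, func, raw_args):
    """Convert string arguments to their declared types and call func."""
    typed = {}
    for name, caster in config["inputs"].items():
        if name not in raw_args:
            raise ValueError("missing input: %s" % name)
        # Web requests deliver every argument as a string.
        typed[name] = caster(raw_args[name])
    return func(**typed)
```

With this in place, `run_process(PROCESS_CONFIG, add_numbers, {"a": "1.5", "b": "2"})` calls `add_numbers(1.5, 2.0)`.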

History of the COWS WPS

Catering for jobs that would run for different durations and produce a range of different outputs required a solution that managed both in-process execution of processes and asynchronous job management. When selecting a tool or framework to deliver this flexibility the OGC Web Processing Service (WPS) specification was considered a suitable candidate. Whilst still in its infancy, the WPS interface provides a means of wrapping up arbitrary processes (such as GetAPlot or RunAModel) in a common framework with the following advantages:

  • Web Service interface, using POST or GET.
  • asynchronous reporting and control of jobs.
  • a defined XML interface for responses, including exceptions.
  • a common format for passing arguments to the server.
  • job status interrogation.

COWS WPS and standards-compliance

WPS 1.0.0 Specification

The standard...

Application Profiles

***WPS 1.0 introduces the concept of Application Profiles as a means of providing and publishing domain-specific processes to aid interoperability in terms of building clients and utilising the publish/find/bind paradigm. In the case of this project the outputs are highly tailored, so in most cases the processes are unlikely to be generally applicable to other climate-related datasets. However, the development of Application Profiles for more general climate-plotting processes would be beneficial as there are common requirements for plotting tools outside the 2-dimensional view covered by WMS.

Web Service Description Language and Publish/Find/Bind

Compliance issues

The COWS WPS is aiming for standards compliance with WPS 1.0.0. However,***

Web Server Gateway Interface (WSGI)

Getting Started

Installation

Overview

System requirements

Python 2.5+

Pylons

Sqlalchemy

A database

Sun Grid Engine

Supported platforms

This is not really relevant - OpenSuSe 10.3 - but all about portability of the underlying toolkits

Downloading the sources

Building a virtual environment for housing the WPS

Installing the Sun Grid Engine scheduler

Overview of SGE

Obtaining the source

Installation on each server

WPS servers
Offline processing servers

Deployment

Deployment configurations

There are many possible configurations in which the COWS WPS can be deployed. The simplest is a single-server deployment but the system is designed to be scalable to any size. Figure *** shows the basic architecture which is made up of:

  • WPS server(s)
  • Offline processing server(s)
  • State (database) server(s)

Virtual or physical servers?

The COWS WPS will work whether installed on physical servers or on virtual machines (VMs). The advantage of the VM approach is that the WPS VMs can be replicated via “copy-and-paste” rather than having to manually install software on each system.

The VM approach is also compatible with cloud-technologies that are likely to become common-place in the future.

Single-server deployment

The simplest method of deploying the COWS WPS is on a single server. This limits the service to only providing “synchronous” responses so it does not exploit the full power of WPS. However, this approach is perfect for proof-of-concept and demonstrator activities. The single-server deployment requires you to build a single instance of the WPS (see the installation section). The key point to note is that all process configuration files must have the “process_type”*** set to “sync” to ensure that the WPS runs them directly without attempting to contact the scheduler.
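The check implied by the “process_type” setting can be sketched as below. Only the option name `process_type` and the value `sync` come from the text above. The INI-style layout, the section name `wps_interface`, the `async` default and the function name are assumptions for illustration, so consult a real process configuration file for the actual structure.

```python
import configparser

# Hypothetical process configuration; the section name is an assumption.
SAMPLE_CONFIG = """
[wps_interface]
process_type = sync
"""


def is_synchronous(config_text, section="wps_interface"):
    """Return True if the process should run directly inside the WPS."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumed default: anything not marked "sync" goes to the scheduler.
    value = parser.get(section, "process_type", fallback="async")
    return value.strip().lower() == "sync"
```

On a single-server deployment, every process configuration would need to pass a check of this kind, since no scheduler is available.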

The 3 server types

WPS servers

Offline processing servers

State (database) servers

Main features

Built on Python web technologies

Common framework for Web Service deployment

Asynchronous processing and scalability

Dry-run execution for resource estimation

WPS User Interface

Security

Service-chaining

Useful processes - and add your own

WPS processes

Thinking in terms of “processes”

Unlike other OGC Web Services...[copy from ExArch and published paper...WPS is a framework for deploying processes...

How does the WPS know about its “processes”?

A simple interface has been developed to manage the interactions between the main WPS application and the processes that are deployed within it.

The process-WPS interface

The process configuration file

The process python interface module (and class)

Supported processes

Processes supported by the current COWS WPS

WCS tools - schedule requests to a WCS

CDMS tools - interrogate and extract data described in Climate Data Markup Language (CDML)

Plotting tools - plot data from a NetCDF file

Dap?

Internal processes

...what they are, and where they are stored

Local processes

What are local processes?

Adding a new process

Running the create_process.sh script

Defining the inputs

Returning the outputs

XML outputs
File outputs
OpenDap end-point outputs

Re-starting the web app (touch or restart apache)

Application profiles in WPS 1.0.0

The version 1.0.0 specification introduced the idea of “Application profiles” for WPS. ...

Service deployment

Deploying under Apache/mod_wsgi or Python Paste

Running with Python Paste

Running within Apache with mod_wsgi

Parallel deployment and load-balancing

Configuring your application

The top-level configuration file

Key elements of the top-level configuration file...

Securing your application and user-management

...it takes place outside of the app...ish

Configuring the roles for each secured process

Securing outputs

Managing outputs

The ExecuteResponse document

Zipping output files

Maintaining output files for client access

Thumbnails

Caching

Archival and removal of old jobs

The COWS WPS User Interface (CWUI)

Pronounced “kwee”

Why build a User Interface?

Generic requirements for Web Service deployment

I.e. you don't need to build a customised client for every bespoke service you want to deploy.

As a test client

Exploring service-chaining capabilities

Interoperability with other WPS services

Overview of the CWUI

Auto-generated forms for job submission

Multiple job submission

Submission via URLs

Access to previous jobs

Viewing the status of asynchronous jobs

Accessing other WPS services

Current status

User management in the CWUI

Interoperability and service-chaining

The WPS methods

GetCapabilities DescribeProcess Execute

Handling “dynamic” input parameters

What are dynamic parameters and why are they useful?

In the DescribeProcess: TBC

Details for developers

Pylons core and Model-View-Controller approach

Code overview

Database and Object-relational model

Request and Job objects

Synchronising code across servers

Testing processes on offline processing servers

When an offline request is submitted the inputs are serialised into a Python pickle object. The execution environment is provided by a wrapper script used to run all processes. It is often useful to test that a process runs correctly on an offline processing server. To do this, you need to:

# Log in to the offline processing server
$ ssh my-proc1

# Change directory to the job directory (which was saved when the "dry-run" Execute call was made)
$ cd <proc_outputs_dir>/<date_dir>/<process_identifier>/<process_id>/

# Then run the process using the wps_runproc wrapper script
$ <wps_home_dir>/bin/wps_runproc process_modules.<proc_module>#<ProcClass> $PWD
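The serialisation step mentioned above can be illustrated with the standard `pickle` module. This is a sketch rather than the COWS WPS code: the file name `inputs.pickle` and both function names are assumptions, and the real wrapper script may store additional state alongside the inputs.

```python
import os
import pickle


def save_inputs(job_dir, inputs, filename="inputs.pickle"):
    """Serialise the request inputs into the job directory.

    The file name is hypothetical; check a real job directory for the
    name used by the deployed WPS.
    """
    path = os.path.join(job_dir, filename)
    with open(path, "wb") as fh:
        pickle.dump(inputs, fh)
    return path


def load_inputs(path):
    """Read the inputs back, as an offline node would before running."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

When debugging on an offline processing server, loading the pickle by hand in an interactive Python session is a quick way to confirm that the process is receiving the inputs you expect. Only unpickle files written by your own WPS, since unpickling untrusted data can execute arbitrary code.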

CWUI for developers

CWUI Architecture

Home

Capabilities

View

Submit

Jobs

Docs

Submitter

JobViewer

CWUI Style

Point to CSS

CWUI JavaScript and AJAX usage

jQuery

OpenLayers

Dynamic parameters and process configuration files

Future development

Contribute code to the open source COWS WPS

See the COWS WPS wish list page

Licensing

The COWS WPS is provided under a *** license

References

Publications and poster sessions

In press

IJoDE Other Stephen references
