CSW Development
Concepts and development notes for the deegree3 catalogue service (CSW)
1. Terminology
resource: something that has an identity, e.g. a document, an image or a service.
property: element to describe a resource.
record: structured metadata about a resource searched by properties and their associated values.
brief, summary, full: views of the core metadata properties.
common queryable properties == core queryable properties == common/core queryable elements (the three terms are used synonymously)
1.1. Profiles in focus
Information resources for the DC and ISO profiles across the CSW specification versions:
| | CSW 2.0.0/2.0.1 | CSW 2.0.2 |
| DC | csw:Record | csw:Record |
| ISO | Dataset | gmd:MD_Metadata |
2. Implementation status
| Feature | MiniSpec | Status | Description and Report |
| GetCapabilities-operation | displays the capabilities of the CSW | implemented - (100%) | - |
| DescribeRecord-operation | displays the XSD of the requested record type | implemented - (90%) | the XSD files are fetched remotely from the respective sites |
| GetDomain-operation | - | not implemented - (0%) | - |
| GetRecords-operation | displays all records that match the filter expression | implemented - (80%) | some minor attributes are still ignored because they are not mandatory in the spec |
| GetRecordById-operation | displays the record(s) that match the identifier property | implemented - (90%) | uses its own method and does not reuse the SELECT statement generation of the GetRecords-operation |
| Transaction-operation DC | insert-action | implemented - (100%) | Updating a complete record works correctly, and so does updating single-valued properties such as the language. Updating multi-valued properties triggers a known bug that has to be fixed soon. |
| Transaction-operation ISO | insert-action | implemented - (100%) | same problem as above |
| Harvest-operation | - | not implemented - (0%) | - |
| INSPIRE #1 | queryable properties | implemented - (NAN) | correctly implemented for unofficial compliance tests |
| INSPIRE #2 | INSPIRE constraints on the ISO spec (pages 10-11 of the INSPIRE spec) | implemented - (5%) | This switch should affect the properties mentioned on pages 10-11 of the INSPIRE spec, e.g. language -> mandatory, hierarchyLevel -> mandatory |
3. Handling of metadata formats (profiles)
3.1. TypeName / outputSchema
The typeName specifies the requested record profile.
The outputSchema specifies the requested response profile in which the record should be displayed.
| | typeName | outputSchema | response |
| 1 | DC | DC | DC |
| 2 | ISO | ISO | ISO |
| 3 | DC | ISO | ISO |
| 4 | ISO | DC | DC |
- Rows 1 and 2 are already implemented.
- Rows 3 and 4 still have to be implemented. This should be a small fix because the required structure is already available.
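The table above can be read as a simple rule: the response profile follows outputSchema, and a transformation is only needed when typeName and outputSchema differ (rows 3 and 4). The following is a minimal sketch of that rule, not the deegree implementation; the function name and the plain "DC"/"ISO" identifiers are assumptions made for illustration.

```python
# Profile identifiers as used in the table above (illustrative, not real schema URIs).
SUPPORTED_PROFILES = {"DC", "ISO"}


def resolve_response(type_name, output_schema):
    """Return (response_profile, needs_transformation) for one table row."""
    for value in (type_name, output_schema):
        if value not in SUPPORTED_PROFILES:
            raise ValueError(f"unsupported profile: {value!r}")
    # The record is always presented in the profile named by outputSchema.
    needs_transformation = type_name != output_schema
    return output_schema, needs_transformation


for row in [("DC", "DC"), ("ISO", "ISO"), ("DC", "ISO"), ("ISO", "DC")]:
    print(row, "->", resolve_response(*row))
```

Rows 1 and 2 come out with no transformation required, which matches the parts that are already implemented.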
4. Reading relevant Catalogue Specs
4.1. Information model
Provides a formal structure for describing the information resources a catalogue manages to fulfil an application profile.
- ISO 19115: defines a set of metadata elements; specifies a general purpose model for metadata description
- ISO 19119: facilitates the management of service metadata; service metadata elements should be consistent with this standard
- ISO 19139: defines the formal encoding and structure for data exchange
Mandatory operations
- OGC-Service: getCapabilities
- Discovery: getRecords, describeRecord, getRecordById (in 07-045 ISO Metadata App Profile 1.0)
Optional operations
- Discovery: getDomain, getRecordById (in 07-006r1 Catalogue Spec 2.0.2)
- Publication: transaction, harvest
4.2. application profile
A set of international metadata-standards like ISO 19115 or Dublin Core. Application profiles may modify and redefine the realization of the core queryable properties and how their values are encoded.
4.3. information resources
Entities that are managed by a service. They are defined by the information model.
4.4. core queryable properties
Properties on which filter expressions against a catalogue can be formulated. The goals are:
- query interoperability: between different catalogues that implement the same protocol binding
- query compatibility: between different catalogues that implement different protocol bindings
- cross-profile discovery: executing the same queries against any catalogue without modification and without knowledge of the specific catalogue's information model
4.5. INSPIRE
The INSPIRE document for CSW is a specification layered on top of the ISO specifications 19115 and 19119. The latest version of this specification is used, titled "INSPIRE Metadata Implementing Rules: Technical Guidelines based on EN ISO 19115 and EN ISO 19119". The document can be found on the INSPIRE homepage.
The relevant information is on pages 10 and 11, which constrain the ISO metadata set. A switch is already implemented to toggle between an INSPIRE-conformant and a non-conformant CSW. This implementation has to be improved so that it conforms to the points specified on pages 10 and 11 of the INSPIRE spec.
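As an illustration of what such a switch could check, the sketch below treats language and hierarchyLevel (the two properties these notes name as becoming mandatory) as required only when INSPIRE mode is on. The dict-based record and the function name are assumptions for this example; the complete list of constraints has to be taken from the INSPIRE document itself.

```python
# Properties named in these notes as mandatory under INSPIRE.
# The full set of constraints is defined in the INSPIRE spec, not here.
INSPIRE_MANDATORY = ("language", "hierarchyLevel")


def missing_inspire_properties(record, inspire_mode):
    """Return the mandatory properties that are absent or empty.

    `record` is a hypothetical dict of property name -> value.
    """
    if not inspire_mode:
        # Non-INSPIRE mode: the additional constraints do not apply.
        return []
    return [name for name in INSPIRE_MANDATORY if not record.get(name)]


record = {"language": "ger", "title": "Rivers"}
print(missing_inspire_properties(record, inspire_mode=True))
print(missing_inspire_properties(record, inspire_mode=False))
```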
5. General CSW interface model
5.1. discovery class
It provides 4 operations for client-discovery:
| CSW-Spec_general | Http | description | Spec-page |
| query | | produces a result set according to the metadata/records in the catalogue | 33 |
| present | | returns selected metadata for some or all of the resources from a specific result set | 38 |
| describeRecordType | | retrieves the type definition used by metadata of one or more registered resource types | 41 |
| getDomain | | retrieves information about the valid values of one or more named metadata properties | 43 |
5.2. session class
It provides 4 operations for interactive sessions between client and server:
- [Spec page 28] and [Spec page 45]
5.3. manager class
It provides 2 operations: one for interacting with the catalogue store (push) and one for retrieving metadata from other services (pull).
| CSW-Spec_general | Http | description | Spec-page |
| transaction | | insert, update and delete actions are possible in the request by a client | 52 |
| harvestResource | | retrieves resources from specified locations | 54 |
5.4. brokered access class
It provides the order operation.
6. Architecture
7. BACKEND
2009-11: This is the first draft of the storage architecture.
The CSWController delegates a request either to a DC or to an ISO record store. If a describeRecord operation has to be handled, there should be a mapping from the typeName to the relevant store. Open question: how to map it, only to DC if the typeName is DC, or to ISO as well?
2010-04: There is now one integrated record store for DC and ISO in combination. There is a mapping for DC, ISO and INSPIRE queryable properties. Each representation (brief, summary and full) has its own table in the backend, so presenting the found records causes no overhead.
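The idea of one table per representation can be sketched as follows. This is an illustration using an in-memory SQLite database, not the actual deegree schema: the table names, the column names and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical table names, one per representation (brief, summary, full).
VIEW_TABLES = {
    "brief": "record_brief",
    "summary": "record_summary",
    "full": "record_full",
}


def create_schema(conn):
    for table in VIEW_TABLES.values():
        conn.execute(
            f"CREATE TABLE {table} (record_id TEXT PRIMARY KEY, xml TEXT NOT NULL)"
        )


def store_record(conn, record_id, views):
    # Each representation is serialized once at insert time.
    for view, table in VIEW_TABLES.items():
        conn.execute(
            f"INSERT INTO {table} (record_id, xml) VALUES (?, ?)",
            (record_id, views[view]),
        )


def fetch_view(conn, record_id, element_set_name):
    # Presenting a record is a single lookup in the matching table.
    table = VIEW_TABLES[element_set_name]
    row = conn.execute(
        f"SELECT xml FROM {table} WHERE record_id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
create_schema(conn)
store_record(conn, "id-1", {
    "brief": "<BriefRecord/>",
    "summary": "<SummaryRecord/>",
    "full": "<Record/>",
})
print(fetch_view(conn, "id-1", "brief"))
```

The trade-off in this design is that every record is stored three times in exchange for a cheap read path.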
7.1. DummyDatasets (closed)
- 2009-11: 6 Dublin Core datasets are inserted in the backend to test the functionality of the operations.
- 2010-04: Because the insert and delete actions of the transaction operation are now implemented correctly, dummy datasets are no longer needed.
7.2. BoundingBox
- 2009-11: The CRS is not interpreted. The envelope is interpreted as (lon lat, lon lat) for all datasets, so some datasets have been transformed.
- 2010-04:
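The 2009-11 convention above, a fixed (lon lat, lon lat) envelope regardless of CRS, implies that datasets delivered in lat/lon order have to be swapped before storage. A minimal sketch of such a transformation follows; the function name and the axis_order flag are assumptions for illustration only.

```python
def normalize_envelope(coords, axis_order="lonlat"):
    """Return the envelope as (lon, lat, lon, lat).

    `coords` holds the two corner points in the order given by the dataset;
    `axis_order` is a hypothetical flag stating how the dataset was encoded.
    """
    a1, b1, a2, b2 = coords
    if axis_order == "latlon":
        # Swap each corner from (lat, lon) to (lon, lat).
        a1, b1, a2, b2 = b1, a1, b2, a2
    elif axis_order != "lonlat":
        raise ValueError(f"unknown axis order: {axis_order!r}")
    for lon, lat in ((a1, b1), (a2, b2)):
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"corner out of range: ({lon}, {lat})")
    return (a1, b1, a2, b2)


print(normalize_envelope((50.0, 7.0, 51.0, 8.0), axis_order="latlon"))
```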
8. Multilingualism
The INSPIRE specification defines the multilingualism requirements for the CSW.
- 2010-04: A first, easily implementable solution has been found. No extra tables are added to the backend, so the database schema does not grow. Every queryable property that needs to be multilingual handles the additional languages itself. Queryable properties that already support multiple languages: keyword, title, abstract, alternateTitle, geographicDescriptionCode and (for INSPIRE) specificationTitle. Translation files are not used for the moment.
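One way to read "handles the additional languages itself" is that a multilingual property simply stores one value per language side by side, and a query matches if any of them matches. The sketch below illustrates that reading; the dict-based record and both helper functions are assumptions, not the deegree data model.

```python
# Queryable properties listed above as already multilingual.
MULTILINGUAL = {
    "keyword", "title", "abstract", "alternateTitle",
    "geographicDescriptionCode", "specificationTitle",
}


def add_value(record, prop, value):
    """Store a value; multilingual properties accumulate their translations."""
    if prop in MULTILINGUAL:
        record.setdefault(prop, []).append(value)
    else:
        record[prop] = [value]


def matches(record, prop, term):
    """True if any stored value of the property contains the search term."""
    return any(term.lower() in value.lower() for value in record.get(prop, []))


record = {}
add_value(record, "title", "Gewässernetz")
add_value(record, "title", "River network")
print(matches(record, "title", "river"))
```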
9. Implementation
10. GetCapabilities
10.1. getRecords
The difference between a CodeList and an Enumeration is not entirely clear, so an Enumeration has been chosen. Example: for resultType the spec says to use a CodeList, but in the code it is implemented as an Enum. The same is true for CONSTRAINTLANGUAGE.
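The practical consequence of choosing an Enumeration is that the set of values is closed and unknown values are rejected at parse time. A small sketch, using the resultType values (hits, results, validate) and the constraint languages (CQL_TEXT, FILTER) from CSW 2.0.2; the class and function names are assumptions, not the deegree identifiers.

```python
from enum import Enum


class ResultType(Enum):
    HITS = "hits"
    RESULTS = "results"
    VALIDATE = "validate"


class ConstraintLanguage(Enum):
    CQL_TEXT = "CQL_TEXT"
    FILTER = "FILTER"


def parse_result_type(raw):
    """Map the request value to the enum; unknown values raise ValueError."""
    return ResultType(raw)


print(parse_result_type("results"))
print(ConstraintLanguage("FILTER"))
```

A CodeList, by contrast, would be open to extension, which is the property that is given up here.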
10.1.1. Parsing
10.1.1.1. startPosition
Is there any use case for this parameter? (Noted for information only.)
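One use that comes to mind is paging through a large result set together with maxRecords, where the response's nextRecord attribute tells the client where to continue. The sketch below assumes the CSW 2.0.2 defaults (startPosition 1, maxRecords 10) and the convention that nextRecord is 0 once the last record has been returned; the function itself is hypothetical.

```python
def page(records, start_position=1, max_records=10):
    """Return one page of a result set plus the CSW paging attributes."""
    if start_position < 1:
        raise ValueError("startPosition is 1-based")
    begin = start_position - 1
    returned = records[begin:begin + max_records]
    end = begin + len(returned)
    # nextRecord is 0 when nothing is left, otherwise the next 1-based position.
    next_record = end + 1 if end < len(records) else 0
    return {
        "numberOfRecordsMatched": len(records),
        "numberOfRecordsReturned": len(returned),
        "nextRecord": next_record,
        "records": returned,
    }


matched = [f"record-{i}" for i in range(1, 26)]
first = page(matched)
print(first["nextRecord"])
print(page(matched, start_position=first["nextRecord"])["nextRecord"])
```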
10.1.1.2. outputFormat
- controls the format of the output
- the value is a MIME type
- the default value is "application/xml"
- implies that there is a schemaLocation defined
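Resolving the parameter from a KVP request could look like the sketch below. It assumes that only "application/xml" is supported and that parameter names are matched case-insensitively; the function name and the use of ValueError in place of a proper service exception are simplifications for this example.

```python
DEFAULT_OUTPUT_FORMAT = "application/xml"
# Assumption for this sketch: no other MIME type is supported.
SUPPORTED_OUTPUT_FORMATS = {"application/xml"}


def resolve_output_format(params):
    """Pick the outputFormat from KVP parameters, falling back to the default."""
    lookup = {key.lower(): value for key, value in params.items()}
    value = lookup.get("outputformat", DEFAULT_OUTPUT_FORMAT).strip()
    if value not in SUPPORTED_OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {value!r}")
    return value


print(resolve_output_format({"REQUEST": "GetRecords"}))
print(resolve_output_format({"outputFormat": "application/xml"}))
```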
10.1.1.3. ElementName/ElementSetName
ElementName:
- an XPath expression with QNames
- is based on the information model of the catalogue
- if typeNames contains only one value, the catalogue should infer the XPath, so the path can be omitted
- if the specified elements are insufficient to generate a valid XML response document, the CSW shall augment the list of elements presented to the client so that a valid XML document can be generated (nice to have, or mandatory in this context?)
- corresponds with the outputSchema parameter
The following examples are taken from page 149 of the spec and repeated here as a summary:
<csw:Query typeNames="rim:ExtrinsicObject rim:Association">
<csw:ElementName>/rim:ExtrinsicObject/@status</csw:ElementName>
<csw:ElementName>/rim:ExtrinsicObject/@home</csw:ElementName>
...
</csw:Query>
For the DC profile, the query can be abbreviated to:
<csw:Query typeNames="csw:Record">
<csw:ElementName>dc:identifier</csw:ElementName>
<csw:ElementName>dct:modified</csw:ElementName>
...
</csw:Query>
ElementSetName:
- declares which kind of response shall be presented (brief, summary, full)
- additionally carries a typeName attribute that refers to the typeName attribute of the Query element if more than one element set is requested. Apparently the value of typeName(ElementSetName) should be a value declared in typeName(Query).
=> ElementName and ElementSetName are mutually exclusive! Either an ElementSetName OR 1..* ElementName shall be specified in a query [there seems to be an error on page 151 of the spec]
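The mutual exclusion can be enforced while parsing the Query element. The sketch below uses Python's xml.etree.ElementTree with the CSW 2.0.2 namespace; the function name, the return shape and the use of ValueError are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def element_selection(query_xml):
    """Return ("ElementSetName", value) or ("ElementName", [values])."""
    query = ET.fromstring(query_xml)
    names = [(e.text or "").strip()
             for e in query.findall(f"{{{CSW_NS}}}ElementName")]
    set_names = [(e.text or "").strip()
                 for e in query.findall(f"{{{CSW_NS}}}ElementSetName")]
    if names and set_names:
        raise ValueError("ElementName and ElementSetName are mutually exclusive")
    if set_names:
        return ("ElementSetName", set_names[0])
    if names:
        return ("ElementName", names)
    raise ValueError("either ElementSetName or ElementName must be given")


query = f"""
<csw:Query xmlns:csw="{CSW_NS}" typeNames="csw:Record">
  <csw:ElementName>dc:identifier</csw:ElementName>
  <csw:ElementName>dct:modified</csw:ElementName>
</csw:Query>
"""
print(element_selection(query))
```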
10.1.2. DB-Access/Output
10.1.2.1. Structure
To handle the parsed request, the request (KVP/XML) has to be transformed into an SQL statement. At the moment the data is kept in a PostgreSQL database. The transformation therefore has to sit behind an abstraction layer, because the data may later be managed by other database providers. The core implementation of the RecordAPI contains a package for transforming the request into PostgreSQL statements.
To handle the various geometries, a WKTReader/WKBReader and the corresponding writers, derived from JTS, are implemented. Because of the complexity of the deegree geometry model, this currently covers the simple features only and has to be extended to the other deegree geometries. Complete encapsulation from JTS is being considered.
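The request-to-SQL abstraction layer described in this section could be structured roughly as below: a database-neutral base class translates filter predicates, and a PostgreSQL subclass supplies the dialect-specific parts. This is a sketch under stated assumptions, not the RecordAPI package: the class names, the column mapping and the two predicate types are hypothetical, and the %s placeholder follows the parameter style of common PostgreSQL drivers for Python.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class PropertyIsEqualTo:
    prop: str
    literal: str


@dataclass
class PropertyIsLike:
    prop: str
    pattern: str  # '*' is the wildcard in this sketch


class SqlTranslator(ABC):
    # Hypothetical mapping from queryable property to backend column.
    COLUMNS = {"title": "title", "language": "language"}

    def translate(self, predicate):
        """Return (sql_fragment, parameters) for one predicate."""
        column = self.COLUMNS[predicate.prop]
        if isinstance(predicate, PropertyIsEqualTo):
            return f"{column} = {self.placeholder()}", [predicate.literal]
        if isinstance(predicate, PropertyIsLike):
            return self.like(column), [predicate.pattern.replace("*", "%")]
        raise TypeError(f"unsupported predicate: {type(predicate).__name__}")

    @abstractmethod
    def placeholder(self):
        """Parameter marker of the target database driver."""

    @abstractmethod
    def like(self, column):
        """Case-insensitive pattern match in the target dialect."""


class PostgresTranslator(SqlTranslator):
    def placeholder(self):
        return "%s"

    def like(self, column):
        # ILIKE is PostgreSQL's case-insensitive LIKE.
        return f"{column} ILIKE {self.placeholder()}"


sql, params = PostgresTranslator().translate(PropertyIsLike("title", "*water*"))
print(sql, params)
```

Values are passed as parameters and never concatenated into the statement, so supporting another database provider only means adding another subclass.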