GeoSciML through a wrapper


1). A GeoSciML WFS through a Cocoon wrapper
Eric Boisvert, Laboratoire de cartographie numérique et de photogrammétrie, Geological Survey of Canada, Earth Sciences
Sector, Natural Resources Canada. 490 de la Couronne, Québec City, Qc, Canada. G1K 9A9 eboisver@nrcan.gc.ca
Alex Smirnoff, Laboratoire de cartographie numérique et de photogrammétrie, Geological Survey of Canada, Earth Sciences
Sector, Natural Resources Canada. 490 de la Couronne, Québec City, Qc, Canada. alsmirno@nrcan.gc.ca
Boyan Brodaric, Geological Survey of Canada – GSC-Central. Earth Sciences Sector, Natural Resources Canada, 615
Booth Street, Ottawa, Canada, K1A 0E9, brodaric@nrcan.gc.ca
We assume the reader has a working knowledge of WMS and WFS, has some knowledge of
how a web server works, has some experience in Web development (e.g., the HTTP GET and
POST methods) and has already dealt with some web-based development framework (e.g., PHP,
ASP, ASP.NET, JSP or the like). Knowledge of XSLT is definitely an asset. This section is
likely to be read “when everything else failed”.
Why should I read this section?
The technique described in this section addresses the problem of dynamically translating
data from one schema to another. You are likely to end up reading this when you have exhausted all
the options of the available software. The technique described here is not meant to replace
other solutions, but to supplement them. The more work that can be done in the custom
software, the better. The most likely situation would be “I've installed GeoServer, mapped my
database structure into GeoServer, but I can't find a way to address such and such issue”. A
wrapper is often a good way to fix glitches and unresolved issues in an existing system.
The other situation is when you already have a working system that does not deliver
GeoSciML and you want to leverage it, but you just can't change anything in the legacy
system for various reasons: because it's a mission-critical system, because it's not yours,
because the data resides in several remote databases, etc. The easiest solution is to copy all
this data into a single system you can control, but sometimes you just can't, and this is where the
wrapper comes in.
Introduction
Serving GeoSciML out of a database usually requires some translation from the schema used
in the physical implementation (in Oracle, PostgreSQL, or simply flat SHP files), as most, if not
all, implementations differ substantially from the GeoSciML model. Some WFS
implementations (such as GeoServer and Deegree) provide a “mapping” mechanism, where
fields of the database can be matched to a community schema, allowing the software to
dynamically convert from one schema to the other. Those mapping techniques do have their
limits, and sometimes it is impossible to achieve a proper translation with them
alone. Complex mappings cannot always be expressed in the declarative syntaxes
that are typically used in those systems. In those cases, we might have to turn to custom
programming to handle the translation. The solution that we propose in this section uses a
programming technique called a “wrapper” (see footnote 1).
A wrapper is a special kind of software that allows two systems to speak to each other when
they normally could not because they have incompatible interfaces. In our specific case, we
refer to a GeoSciML-compatible client and a (non-GeoSciML) system. This section of the
cookbook presents two different ways to re-use existing services, or create a generic service
with an existing map service package and build a GeoSciML capacity on top of it. The goal is
to provide enough functionality to pass the GeoSciML service conformance tests.
The technique consists in writing a piece of software, in an arbitrary language, that can be
executed by your web server (Apache, Microsoft IIS, etc.). This application will pretend to
be a WMS and/or WFS system, but in reality it will just intercept messages between the client
and the service and modify (translate) the incoming request and the outgoing result. The real
WMS/WFS service (or any non-OGC service for that matter) will be hidden behind this piece
of software. The original service does not need to be completely hidden, of course, but as far
as the client can tell, they will look like two different services.
[client] <-> [Wrapper] <-> [server]
Figure 1.
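The arrangement in Figure 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the Cocoon framework described later: the names make_wrapper and legacy_service, and the string replacements standing in for the two translations, are all hypothetical.

```python
from typing import Callable

# A "service" here is anything that takes a request string and returns a response string.
Service = Callable[[str], str]


def make_wrapper(backend: Service,
                 translate_request: Callable[[str], str],
                 translate_response: Callable[[str], str]) -> Service:
    """Return a service that looks like `backend` but translates in both directions."""
    def wrapped(request: str) -> str:
        local_request = translate_request(request)    # client vocabulary -> local vocabulary
        local_response = backend(local_request)       # the hidden, real service
        return translate_response(local_response)     # local vocabulary -> client vocabulary
    return wrapped


def legacy_service(request: str) -> str:
    """Hypothetical stand-in for the real (non-GeoSciML) service."""
    return f"<local:result for='{request}'/>"


wfs = make_wrapper(
    legacy_service,
    translate_request=lambda r: r.replace("gsml:Borehole", "wellBorehole"),
    translate_response=lambda r: r.replace("local:result", "gsml:result"),
)

print(wfs("typeName=gsml:Borehole"))
```

The client only ever talks to wfs; the legacy service never sees a GeoSciML name and the client never sees a local one.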
In practice, a wrapper hides all the details of the component being wrapped, and this
principle can be pushed to the extreme. This section will show some examples that can be
adapted to your specific needs.
We propose here to solve this problem using an XML processing framework (Apache
Cocoon).
Requirements
- A web server. It can be a different web server than the one where your current W*S
  is running. It can also be on a different network (i.e., in a different city), as long as the web
  server can “see” the other one.
- Some sort of software development platform that your web server supports. It is not
  restricted in any way; it could be PHP, ASP.NET, Java servlets. Some environments are
  better than others for manipulating XML, but there is no real limitation since most
  development languages can handle XML.
- Access to the web server to install custom software, or a development system that is
  reasonably similar to the target system (if you don't have authorisation to deploy
  yourself and must go through IT staff).
- Some programming skills.
Footnote 1: Altering the WFS package itself is an option, especially with open source software, but we won't discuss this
option here and refer the user to the documentation of their WFS system.
Using an XML framework
Only two operations in WMS/WFS do not require XML responses (although
they may return XML): WMS.GetMap and WMS.GetFeatureInfo. Therefore, the bulk of what
the wrapper will manipulate is XML documents. Using an XML manipulation framework
instead of writing code from scratch is an interesting option.
An XML framework is software that provides the core functionality to manipulate XML
documents. There are many alternatives on the market, but one particular product from the
Open Source community seems to be somewhat of a reference implementation. Apache
Cocoon (http://cocoon.apache.org) is essentially a Java servlet that processes XML
documents in various ways. Cocoon itself does not provide any application out of the box,
just a massive toolset to create XML documents from various source types, manipulate the
documents (often with XSLT, but there are other options) and turn the resulting document into
some other kind of document (HTML, images, zip archives, etc.).
Installing Cocoon
Apache Cocoon is open source software that can be freely downloaded at
http://cocoon.apache.org/mirror.cgi. It is not very difficult to install but, as is often the case
in the Open Source world, software borrows a lot from other developments and one often has
to scour the web to fetch the components the software needs.
You need:
1. A JVM (Java virtual machine). Operating systems often come with a JVM already installed. You need
Java 1.5.
2. A servlet container, such as Tomcat or JBoss. Tomcat is the reference
implementation and is free. It runs on all popular Unix, Linux, Mac or Windows systems.
Therefore, if you don't already have one, we suggest you use Tomcat.
3. Cocoon itself.
If you have GeoServer or Deegree installed on your server, you already have the JVM and
Tomcat parts dealt with (provided the versions are compatible, of course).
Tomcat download and installation : http://tomcat.apache.org/
Cocoon download and installation http://cocoon.apache.org/2.1/
Things you should consider
- It is possible on a Win32 platform to install Tomcat as a service. You should do this;
  otherwise, Tomcat will shut down when the user who started the application logs out.
- You should set the memory settings of Tomcat/Cocoon to higher values than the
  defaults (http://tomcat.apache.org/faq/memory.html#adjust); doubling the values is a
  good start.
- It is possible to “hide” Tomcat behind the web server application (either IIS or
  Apache). Tomcat listens on its own port (generally 8080) while the web server listens on
  another one (generally 80). Some organisations close all ports except 80 to reduce
  intrusion attacks, and you might not be able to expose Tomcat. There are ways to
  “tunnel” Tomcat-destined requests through port 80 (or whatever port your web server
  uses).
Cocoon architecture
Cocoon is a web development framework built on the concept of a component pipeline. The
original intent of this design was to keep a clear “separation of concerns”, where each
component can work in isolation from the others. Information flows through the pipeline in
the form of a stream of XML events. This last point is an important concept to grasp in order to
understand how a pipeline works.
Pipeline architecture
Generator -> Transformer -> Transformer -> Transformer -> […] -> Serializer
Figure 2
The metaphor of a pipeline is quite useful to understand how Cocoon works. A typical
Cocoon application is made of a series of pipelines; each pipeline is typically made of 3 main
component types.
1. The generator: This component's task is to turn any outside source (a file, database
content, remote service, etc.) into XML and feed the pipeline.
2. A series of transformers: These components intercept the flow of XML, transform
it, and send the transformation results down the pipeline to the next transformer (if
any).
3. A serializer: This last component is the other end of the pipeline. It receives the
flow of XML, turns it into a file or a stream of bytes, and sends it to the caller (for
example, a browser).
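The three component types can be imitated with Python generator functions, which also pass items along one at a time rather than waiting for the whole document. This is a rough analogy only, not Cocoon code: the event tuples, the function names and the sample words are invented for the illustration, and text escaping is ignored.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[str, str]  # (kind, value), e.g. ("start", "unit") or ("text", "granite")


def generator(words: Iterable[str]) -> Iterator[Event]:
    """Generator: turn an outside source (here a list of words) into XML-like events."""
    yield ("start", "units")
    for word in words:
        yield ("start", "unit")
        yield ("text", word)
        yield ("end", "unit")
    yield ("end", "units")


def upper_case(events: Iterable[Event]) -> Iterator[Event]:
    """Transformer: alter text events, pass everything else through untouched."""
    for kind, value in events:
        yield (kind, value.upper()) if kind == "text" else (kind, value)


def rename(events: Iterable[Event], old: str, new: str) -> Iterator[Event]:
    """Transformer: rename one element, both its start and end events."""
    for kind, value in events:
        if kind in ("start", "end") and value == old:
            yield (kind, new)
        else:
            yield (kind, value)


def serializer(events: Iterable[Event]) -> str:
    """Serializer: turn the event stream back into characters for the caller."""
    parts = []
    for kind, value in events:
        if kind == "start":
            parts.append(f"<{value}>")
        elif kind == "end":
            parts.append(f"</{value}>")
        else:
            parts.append(value)
    return "".join(parts)


# Generator -> Transformer -> Transformer -> Serializer
result = serializer(rename(upper_case(generator(["granite", "basalt"])), "unit", "lithology"))
print(result)
```

Each event reaches the serializer as soon as the generator produces it; no stage holds a complete copy of the document.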
Some examples of pipelines
Example 1:
- Read in a file (mydocument.xml)
- Transform into HTML using XSLT
- Serialize as an HTML document
Example 2:
- Read in a remote web service (WFS service)
- Filter the geometries with STX (see footnote 2)
- Use the geometry id to query a database to extract colour
- Turn geometries into SVG shapes with XSLT
- Serialise the SVG into a PNG
The pipelines are described in a special file called a sitemap. The sitemap can be seen as a
configuration file telling Cocoon which pipeline to invoke and when. It's an XML file
(surprise, surprise!).
Footnote 2: STX is another transformation technology, similar to XSLT.
…
<map:pipeline>
  <map:match pattern="dictionary/*.html">
    <map:generate type="file" src="content/dictionaries/dict-{1}.xml"/>
    <map:transform type="xslt" src="stylesheets/dict2html.xslt"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>
…
What this example pipeline does:
- When someone requests any .html file in the dictionary directory
  (http://myhost.com/cocoon/my_application/dictionary/lithologies.html),
  this pipeline is “matched” and executed. (Cocoon will execute the first pipeline that
  matches if more than one satisfies the filter pattern.)
- The generator starting the pipeline is a simple file generator. It will read a file
  located on the server at content/dictionaries/dict-[the name of the
  requested dictionary].xml. In our example, it will read dict-lithologies.xml.
  You can see now that the file the caller is requesting does not
  really exist (there is no lithologies.html), but Cocoon just pretends it does.
- The single transformer applies a stylesheet that turns the XML into an HTML document.
- The last component, the serializer, streams the HTML to the caller (most probably a
  browser). This step involves setting up the details of what a browser expects from an
  HTML document (e.g., setting the MIME type, dealing with the HTML version, filling the
  HTTP header, etc.).
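How the requested address turns into a file name can be mimicked in Python. The sketch below assumes the usual wildcard convention in which * matches any run of characters except a slash and ** also matches slashes; the functions compile_pattern and resolve are hypothetical helpers written for this illustration, not Cocoon APIs.

```python
import re
from typing import Optional


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate a sitemap-style wildcard pattern into a regular expression.

    '*' captures any run of characters except '/', '**' captures across '/'.
    """
    regex = ""
    for token in re.split(r"(\*\*|\*)", pattern):
        if token == "**":
            regex += "(.*)"
        elif token == "*":
            regex += "([^/]*)"
        else:
            regex += re.escape(token)   # literal text between wildcards
    return re.compile(regex)


def resolve(uri: str, pattern: str, src_template: str) -> Optional[str]:
    """Return the source path for `uri`, or None when the pattern does not match."""
    match = compile_pattern(pattern).fullmatch(uri)
    if match is None:
        return None
    src = src_template
    # {1}, {2}, ... are replaced by what each wildcard captured, in order.
    for index, value in enumerate(match.groups(), start=1):
        src = src.replace("{%d}" % index, value)
    return src


print(resolve("dictionary/lithologies.html",
              "dictionary/*.html",
              "content/dictionaries/dict-{1}.xml"))
```

A request that does not fit the pattern, such as dictionary/a/b.html here, returns None, which corresponds to the sitemap moving on to the next matcher.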
Note:
The way Cocoon works, as soon as one XML event (a tag) is read by the generator, it is sent down to the next component
(generally a transformer). The document is not read completely and then sent to the next component. This is an important
thing to understand. It is the reason why you can't decide which transformer to use dynamically from the content of the
incoming XML document while it's being read. It's “too late” for the pipeline: the first tags have been processed already and
might even have been sent to the serializer and then to the calling application. This will impact the way we design our pipeline
because of the way OGC WFS XML queries are made. The query parameters are presented as a small XML document and we might
want to use a different stylesheet according to the value of the TYPENAME. You can't have a conditional switch in the
pipeline that says “if TYPENAME = 'LithodemicUnit' then use this stylesheet”. It's too late: the stylesheet specified in the
sitemap is loaded even before the generator has started reading the document.
Note for Tim: One drawback of a pure wrapper approach is that the application must behave like
a WFS, at least enough to support the translation functions. For instance, it must be able to respond to
GetCapabilities or at least mangle the capabilities document of the original WFS, handle both
POST and GET, etc. This complicates the application because we must take care of all those
details. As I mentioned over the phone, Alex Smirnoff (the lab scientific programmer) is
looking into plugging a WFS component right into Cocoon that will handle some of those
details. But this component is not ready yet; this document might change when the
component is published.
OGCCocoon components
OGCCocoon components is a suite of Cocoon components developed by the Laboratoire de
Cartographie Numérique et de Photogrammétrie (LCNP) of the Geological Survey of Canada
in Québec City. These components were not always designed from scratch; often they are existing
components from other projects, such as deegree (http://deegree.sourceforge.net), wrapped into
Cocoon components. What these components allow you to do is bring the processing of OGC
requests and responses into a Cocoon pipeline and literally “take control” of the flow of
information. Once the information goes back and forth in the pipeline, you are free to
squeeze transformers in.
Normal flow of events
deegreeRequest -> deegree query -> deegree result ->
Cocoon flow of events
deegreeRequest -> cocoon -> deegree query -> cocoon -> deegree result -> cocoon ->
Getting our hands dirty
The preset Cocoon application is available at this URL: http://ngwd-bdnes.cits.rncan.gc.ca/service/ngwd/ogc_framework.html
(note: this page does not exist yet).
This framework contains an existing Cocoon sitemap and everything needed to start building
your wrapper.
Framework installation
Just unzip the archive in the webapp/cocoon subdirectory (or its equivalent). The structure of
the directories should look like:
webapp/
  cocoon/
    ogc/
      document/
      style/
      lib/
      sitemap.xmap
      …
In the lib directory, there is a series of jars (Java archives) that are needed by some
components. You must copy all the files from this lib directory to webapp/cocoon/WEB-INF/lib.
When this is done, just restart Tomcat (or whatever servlet container you use).
This framework does not fix all the problems by itself (sadly) but provides all the necessary
process flows that are common to a wrapping system. The part you have to provide is
the translation.
Pipeline structure
The main pipeline structure is organised into 4 sub-pipelines (you can see them as
subroutines). The main pipeline is in fact an aggregation of those 4 blocks. Each block is
completely executed before the next one is initiated, which, if you remember the description
of how a pipeline works, gives us a chance to peek inside the XML stream and alter the
configuration of the following pipeline.
Block                      Description
The initialisation block   This block reads the query and prepares all the pieces required to
                           process the query. It turns the GeoSciML query into a local query.
The query block            This block gathers the information from various sources.
The process block          This block turns the result into GeoSciML.
The clean up block         This block cleans up any temporary files, etc.
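Why running the blocks one after the other helps can be shown with a small Python sketch. Because the initialisation block finishes reading the request before anything else starts, the later blocks can be configured from its content. Everything here is illustrative: the function names, the session dictionary and the two stylesheet paths are assumptions made for the example and are not files shipped with the framework.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"

# Hypothetical lookup: which response stylesheet goes with which feature type.
STYLESHEETS = {
    "Borehole": "style/borehole2gsml.xslt",
    "LithodemicUnit": "style/lithodemicunit2gsml.xslt",
}


def init_block(request_xml: str) -> dict:
    """Read the whole request and record what the later blocks will need."""
    root = ET.fromstring(request_xml)
    type_name = root.find(f"{{{WFS_NS}}}Query").get("typeName")
    return {"typeName": type_name, "stylesheet": STYLESHEETS[type_name]}


def query_block(session: dict) -> str:
    """Stand-in for the call to the local data source."""
    return f"<local:{session['typeName']}/>"


def process_block(local_result: str, session: dict) -> str:
    """Stand-in for the translation step; a real block would run the stylesheet."""
    return f"{local_result} transformed with {session['stylesheet']}"


def cleanup_block(session: dict) -> None:
    """Discard whatever the request left behind."""
    session.clear()


def main_pipeline(request_xml: str) -> str:
    session = init_block(request_xml)      # complete before the next block starts
    local_result = query_block(session)
    response = process_block(local_result, session)
    cleanup_block(session)
    return response


REQUEST = (
    '<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0">'
    '<wfs:Query typeName="Borehole"/>'
    '</wfs:GetFeature>'
)
print(main_pipeline(REQUEST))
```

The choice of stylesheet is made in init_block, before the process block exists, which is exactly what a single streaming pipeline cannot do.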
Setting up your framework
There are a couple of files that must be altered to reflect your current system. First and
foremost is the GetCapabilities document. This document is located at document/wfs-cap.xml.
What must be changed in this document is the Service portion at the top of the
file.
<Service>
<Name>the name of your service</Name>
<Title>the title of the service</Title>
<Abstract>A description of what the service contains</Abstract>
<Keywords>WFS,OGC,GeoSciML,Geology,IUGS,OneGeology</Keywords>
<OnlineResource>This will be filled in by cocoon</OnlineResource>
<Fees>none</Fees>
<AccessConstraints>Public</AccessConstraints>
</Service>
The remaining sections should stay the same, as they provide information about the
standard features a GeoSciML service should provide.
Wrapping a database
(I'm going ahead of the Service Architecture working group as I expect this is what will come
out of this meeting)
The standard GeoSciML service works through profiles. The GeoSciML schema is very
complex, and attempting to express a query using the complete schema would force the server
to deal with all possible permutations. This is also a problem for users, who have to create
a query that can be understood by all GeoSciML servers. For this reason, GeoSciML services
are required to honor a more constrained set of query templates that can be used to create
simpler queries. Therefore, the schema the requests are expressed in is different from the one the
response uses. Existing WFS software uses identical request and response schemas, and
this has worked only because most WFS serve quite trivial schemas (a.k.a. GML Level 0), which
consist of a series of simple literal (string and number) property values without any nesting
structure. The goal of profiles is to keep the simple query structure but allow the full
complex schema to be returned. Therefore, the wrapper structure lets us deal with a simple
query and then inflate the (simple) result into full-blown GeoSciML.
Configuring the datastore.
The Cocoon OGC component is based on deegree (http://www.deegree.org/), which is an open
source W*S implementation written in Java. The real strength of Cocoon is exploited here, as we
wrapped some key components of deegree into Cocoon components. The deegree service
thus becomes a Cocoon transformer that takes incoming requests and transforms them into a stream
of XML. We then harness the full power of deegree inside the Cocoon framework, which
allows us to intervene at any point in the flow of information.
To configure a datastore (a source of data), you need to configure the files located in
ogc/deegree. How to configure the datastore is explained here:
http://www.deegree.org/docs/wfs/deegree_wfs_configuration_2006-10-26.html
(specifically chapters 3 and 4). One important section to read is how to map your database
structure to an arbitrary schema. The closer you get to the standard GeoSciML views,
the less work is needed in the wrapper.
At this point, you have a pipeline that sends unchanged requests and returns unchanged
responses. If you try your new service
(http://you.server.xx:8080/cocoon/ogc/wfs?REQUEST=GetCapabilities), you'll get
the GeoSciML capabilities document. And if you send a GetFeature request, you'll get
whatever your deegree server responds.
Now the fun part starts. You can interfere with the flow of information at 2 key locations.
1- If your mapped features fit the GeoSciML feature profiles, you don't need to change the
incoming requests, as they will be handled gracefully. But if you do have some tweaking to
do, you must alter the wfs-init block:
<map:match pattern="wfs-init">
<map:select type="request-method">
<map:when test="GET">
<!-- turn the GET parameters into a conformant WFS query document
-->
<map:generate type="WfsXmlGenerator"/>
</map:when>
<map:when test="POST">
<map:generate type="stream"/>
</map:when>
</map:select>
<map:transform src="style/analyseRequest.xslt" type="saxon"/>
<map:transform type="session"/>
<!-- your code here-->
<!-- ====================================== -->
<!-- insert transformer here -->
<!-- <map:transform type="saxon" src="style/gsml2local.xslt"/> -->
<!-- ====================================== -->
<map:serialize type="xml"/>
</map:match>
By adding a transformer here, you can alter the incoming query. How you do this is really up
to you; you can use any series of Cocoon transformers you care to. Here is an example using a
simple XSLT transformer.
Case scenario: your boreholes are mapped as 2 different tables in your database because you
have mine exploration wells and water wells (you could have done a SQL UNION, but for the
sake of demonstration, let's pretend we have 2 tables). We have wellBorehole and
mineBorehole. A standard GeoSciML query will ask for Borehole, so we must substitute
gsml:Borehole with wellBorehole,mineBorehole.
Incoming request
<?xml version="1.0" encoding="ISO-8859-1"?>
<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS"
version="1.1.0" resultType="result" outputFormat="text/xml;
subtype=gml/3.1.1">
<wfs:Query typeName="Borehole">
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
<ogc:PropertyIsGreaterThan>
<ogc:PropertyName>depth</ogc:PropertyName>
<ogc:Literal>40</ogc:Literal>
</ogc:PropertyIsGreaterThan>
</ogc:Filter>
</wfs:Query>
</wfs:GetFeature>
XSLT (unchecked)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" version="1.0" encoding="UTF-8"
indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!-- replace the requested feature type with the local table names;
     attribute names are case sensitive, so typeName must be matched exactly -->
<xsl:template match="@typeName">
<xsl:attribute name="typeName">
<xsl:choose>
<xsl:when
test=".='Borehole'">wellBorehole,mineBorehole</xsl:when>
<xsl:otherwise><xsl:value-of select="."/></xsl:otherwise>
</xsl:choose></xsl:attribute>
</xsl:template>
<!-- identity template: copy everything else, attributes included -->
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
</xsl:stylesheet>
Transformed request
<?xml version="1.0" encoding="ISO-8859-1"?>
<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS"
version="1.1.0" resultType="result" outputFormat="text/xml;
subtype=gml/3.1.1">
<wfs:Query typeName="wellBorehole,mineBorehole">
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
<ogc:PropertyIsGreaterThan>
<ogc:PropertyName>depth</ogc:PropertyName>
<ogc:Literal>40</ogc:Literal>
</ogc:PropertyIsGreaterThan>
</ogc:Filter>
</wfs:Query>
</wfs:GetFeature>
To insert this transformation, just insert it in the flow.
<map:match pattern="wfs-init">
<map:select type="request-method">
<map:when test="GET">
<!-- turn the GET parameters into a conformant WFS query document
-->
<map:generate type="WfsXmlGenerator"/>
</map:when>
<map:when test="POST">
<map:generate type="stream"/>
</map:when>
</map:select>
<map:transform src="style/analyseRequest.xslt" type="saxon"/>
<map:transform type="session"/>
<!-- your code here-->
<!-- ====================================== -->
<!-- insert transformer here -->
<map:transform type="saxon" src="style/gsml2local.xslt"/>
<!-- ====================================== -->
<map:serialize type="xml"/>
</map:match>
Obviously, you can make several changes.
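Before wiring a stylesheet into the sitemap, it can be handy to check the intended rewrite outside Cocoon. The following Python sketch performs the same typeName substitution with the standard library; it is a checking aid under the assumptions of the borehole scenario, not a replacement for the transformer, and the SUBSTITUTIONS table and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
OGC_NS = "http://www.opengis.net/ogc"

# Keep readable prefixes when the document is written back out.
ET.register_namespace("wfs", WFS_NS)
ET.register_namespace("ogc", OGC_NS)

# Hypothetical mapping from the public feature type to the local table names.
SUBSTITUTIONS = {"Borehole": "wellBorehole,mineBorehole"}


def rewrite_type_names(request_xml: str) -> str:
    """Replace public feature type names with local ones, leave the rest untouched."""
    root = ET.fromstring(request_xml)
    for query in root.iter(f"{{{WFS_NS}}}Query"):
        type_name = query.get("typeName")
        if type_name in SUBSTITUTIONS:
            query.set("typeName", SUBSTITUTIONS[type_name])
    return ET.tostring(root, encoding="unicode")


REQUEST = (
    '<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0">'
    '<wfs:Query typeName="Borehole">'
    '<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">'
    '<ogc:PropertyIsGreaterThan>'
    '<ogc:PropertyName>depth</ogc:PropertyName>'
    '<ogc:Literal>40</ogc:Literal>'
    '</ogc:PropertyIsGreaterThan>'
    '</ogc:Filter>'
    '</wfs:Query>'
    '</wfs:GetFeature>'
)

print(rewrite_type_names(REQUEST))
```

Unlike a streaming transformer, this version loads the whole request in memory, which is acceptable for a query document of this size.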
The second place where you must write a transformation is in handling the outgoing response from
the WFS service. Remember that the WFS will serve a simpler schema, while what you need to
deliver is full-blown GeoSciML. This is done in basically the same way as you transformed the
incoming request, but by adding a transformer to the processing block.
[TODO]
Wrapping an existing WFS
Wrapping a WFS is more complex. How to address a subsumed WFS depends on what is
available from it. The subsumed WFS may not have all the
information required to produce a meaningful GeoSciML document: either the data is not in
the database, or the WFS is not configured to deliver it and you can't alter the
service. In this situation, you'll have to bring in information from another source to complete
the dataset.
[TODO]
Optimisation and testing
A chain of XSLT transformers works like a slinky (it's a weird metaphor, but think about it).
Because XSLT can access any part of the document, the full document must be
loaded in memory before the XSLT can be evaluated. Each XSLT transformer
clogs the pipeline until it has a full copy of the incoming document and only then starts to send
XML events to the next one. Hence the picture of a slinky recoiling completely on one step
before getting to the next one. As far as Cocoon is concerned, it still sees the process as a
continuous flow of events; it simply does not “know” that these components are holding up the
content.
This design can cause performance problems when you are dealing with large documents, and
since GeoSciML is GML, you are likely to have geometry descriptions, which are big. Big
documents obviously take longer to process, and if the server must deal with several
simultaneous requests involving large files, it can be brought to its knees.
There are several strategies to consider, but of course, the first thing to do is to test the
performance of your system.
- Make an extra effort to send the most constrained query possible to the datasource.
- Filter early. In a chain of transformers, try to trim the size of the dataset in the first
  transformer.
- Consider alternatives to XSLT, such as STX, or write your own transformer,
  which does not involve buffering the whole document.
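The difference between buffering and streaming can be tried out in Python with the standard library's incremental parser. This is an analogy for a custom streaming transformer, not Cocoon or STX code; the element names borehole and depth and the sample data are made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_deep_boreholes(stream, min_depth: float) -> int:
    """Count features deeper than `min_depth` without keeping their content in memory."""
    count = 0
    # iterparse reports each element as soon as its closing tag has been read.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "borehole":
            if float(elem.findtext("depth")) > min_depth:
                count += 1
            elem.clear()   # drop the children and text of the feature just handled
    return count


SAMPLE = (
    b"<boreholes>"
    b"<borehole><depth>12</depth></borehole>"
    b"<borehole><depth>55</depth></borehole>"
    b"<borehole><depth>80</depth></borehole>"
    b"</boreholes>"
)

print(count_deep_boreholes(io.BytesIO(SAMPLE), 40))
```

Memory use stays roughly proportional to one feature rather than to the whole response, which is the property to look for when replacing an XSLT step.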
External References
Wrapping using MapScript and Python, PHP or Java :
http://mapserver.gis.umn.edu/docs/howto/wxs_mapscript
Query Rewriting literature pointers
http://www.isi.edu/info-agents/courses/csci548/Papers/essid.pdf
http://www.gisdevelopment.net/magazine/years/2005/may/data2.htm
http://www.geoinfo.info/geoinfo2004/papers/6366.pdf
http://dipa.spatial.maine.edu/NG2I03/CD_Contents/EA/Zaslavsky_Ilya_01.pdf
--end of document--
Old chapters to be recycled or deleted
Complexity
The truly complex part will be to handle the translation programmatically, and this is true
whatever technical solution you choose. The more control you have over each sub-component
(for example, if you can alter the database structure, or create a view to deal
partially with the translations), the more likely you are to succeed. GeoSciML is a complex
schema and it allows a lot of room to express the same information in various ways.
Note to myself and Tim. I just can't see how anyone can provide a complete wrapping of any
database to GeoSciML. Even if the database is customised to support GeoSciML, the schema
allows just too many degrees of freedom (CGI_Value is intricate enough; if I had to
constrain anything, this is where I would start). We really have to consider either a WFS
profile of GeoSciML or, better yet (otherwise, why the trouble of designing this beast), adopt
the Filter profile proposed in the draft WFS specification.
XML <-> XML
OGC specifications make extensive use of XML to exchange information. The translation
problems are likely to turn into transforming one XML document into another. There are several
ways to do this, and there are a lot of tools and libraries
suitable for your programming platform to manipulate XML.
You might choose to:
- read the XML as a regular string and manipulate it directly.
- use a SAX (see footnote 3) library to monitor an incoming stream of XML.
- use a DOM (see footnote 4) library to load the XML document in memory and navigate its content in
  the host language.
- use a transformation library where the transformation is expressed in XSLT (the host
  application merely
There is one outstanding technique called XSLT (Extensible Stylesheet Language
Transformations). XSLT is a W3C standard to define a set of transformation rules
Things you must do in the wrapper
- redirect all url to your wrapper
- redirect all relevant xlink:href to your wrapper
- transform the incoming request to match your local schema
- transform the outgoing result to match GeoSciML
Footnote 3: Simple API for XML. This API models the XML document as a stream of “events”, as each tag, attribute and
string triggers a processing event that can be trapped by the host application.
Footnote 4: Document Object Model. This API models the XML document as a tree of nodes. Any node of the document
can be reached at any moment, at the cost of loading the entire document in memory.
2). A GeoSciML WFS through Geoserver
Specific Deployment Requirements
Geoserver application
This component requires deployment of Geoserver against a database managed by
MRT.
The deployment will include a configuration of Geoserver that maps MRT’s database
schema to a standardised “Landslides” data model.
This mapping involves the use of lookup tables created for this project, and creation
of database views to encapsulate table relationships and extract the data required to
deliver the information products.
The deployment will also include configuring Geoserver’s demo facility with relevant
demo queries to support documentation and testing of the installed service.
Version:
The version of geoserver to be deployed is a snapshot of the geoserver “trunk” –
currently equivalent to Geoserver 1.6 – using the community-schemas/web-c module.
This module bundles a modified wfs service handler and cloned administration and
demo UI.
This version is downloadable from the SEEgrid subversion repository at:
https://www.seegrid.csiro.au/subversion/xmml/Projects/tools/geoserver.war
This version will ultimately be distributed from the geoserver sourceforge site as a
separate WAR application, pending either the bundling of the capability within the
core geoserver build, or the ability to download it as a separate plug-in extension to
the core geoserver releases.
Improvements over previous version
The target implementation contains substantial improvements over the previous
version of geoserver-cf used in the SeeGrid Geochemistry demonstrator.
Notable improvements are:
Streamlined configuration file for community schema mappings:
o Schema mapping does not need to manually include all relevant parent
schemas;
o Separate Id mapping sections replaced by simple flag in attribute
mappings;
o GroupBy tags need only to identify a representative
XML Schema files need less “hacking” (still under investigation, aiming for
zero hacking)
Aligned with Geotools trunk, and geoserver trunk,
o common environment and installation instructions to be maintained
o test cases included in build, minimising threat of loss of functionality in
future versions
Availability of updates
It is expected that updates to the geoserver implementation will be available as
geoserver core matures, further testing and debugging of the geoserver community-
schema support modules occurs, etc.
The Geotools community has committed to bringing the underlying “ISO General
Feature Model” into the core of all Geotools modules in the next major release (2.5).
LDIP Configuration
The LDIP configuration consists of a standard Geoserver configuration directory.
This will be under version control and contain 3 versions of catalog.xml
Copy catalog_tas.xml to catalog.xml to use the relevant configuration.
Within this folder, the featureTypes folder contains the mappings for implementation of each featureType (data
product) supported.
Demo/ contains sample queries. These should be edited to reflect data available from the MRT landslides
inventory.
Database
The database must be reachable via JDBC from the host machine. The only access
required is to select data from tables.
Deployment environment
Current Status
Deployment environment owner to complete
Component                    Recommendation/Supported                              Actual
Hardware Platform            Sun, PC
Operating System Version     Solaris 8, Linux (specify distribution), Windows XP
Apache Server Version        Apache 2.x
Tomcat Application Version   Tomcat 5.5.x
Apache/Tomcat Connector      Reverse proxy
Java Version                 JDK 1.5 latest release
Java extensions              JAI 1.1.3 and JAI imageIO 1.1
Database                     Oracle, PostGIS, ArcSDE
Graphics                     Solaris/Linux: XVNC
Components to be installed   Geoserver build (as specified)
Configuration requirements
Tomcat
The application parses large XML schemas. Consequently, the JVM must be configured
to have at least 512 MB of memory available.
A thread stack size of 1024 KB seems to be satisfactory.
Benchmarking memory requirements under different load conditions is beyond the scope of
this project.
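As an illustration of the settings above, the following minimal sketch builds a JAVA_OPTS value for Tomcat. It assumes that "memory available" maps to the maximum heap (-Xmx) and that the thread stack size is set with -Xss; the function name tomcat_java_opts is hypothetical and not part of geoserver or Tomcat.

```python
def tomcat_java_opts(heap_mb: int = 512, stack_kb: int = 1024) -> str:
    """Return a JAVA_OPTS string with maximum heap and thread stack size.

    Assumption: the 512 MB requirement is applied as the maximum heap (-Xmx).
    """
    if heap_mb < 512:
        # The text above asks for at least 512 MB.
        raise ValueError("heap_mb should be at least 512")
    if stack_kb <= 0:
        raise ValueError("stack_kb must be positive")
    return f"-Xmx{heap_mb}m -Xss{stack_kb}k"


if __name__ == "__main__":
    # Print a line that could be pasted into the Tomcat start-up environment.
    print(f'JAVA_OPTS="{tomcat_java_opts()}"')
```

The resulting value would typically be exported in the environment from which Tomcat is started; the exact place depends on the platform and on how Tomcat was installed.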
Other requirements
Database availability
The geoserver application must be restarted if the database is restarted, or if the
connections cached by geoserver are invalidated in some other way.
Risks and mitigation strategies
Tomcat
Tomcat 5.5 is recommended, as this is the platform that testing is mainly based on.
Tomcat 4.1.31 can probably be used.
Tomcat 5.0.x should not be used.
Deployment of multiple geoserver versions/ configurations
Since version 1.5, Geoserver allows multiple deployments within one Tomcat install, each
under its own context path, e.g. /geoserver1 and /geoserver2.
It is recommended that, at this stage, only one geoserver instance per Tomcat install is used,
to assist with scheduling version upgrades and isolating problems.
Each geoserver instance can use a single configuration directory. It is regarded as difficult to
manage and test multiple overlapping configurations within the same directory, in the absence
of a specific set of benchmarked processes and policies to support this overhead. It is assumed
therefore that each project will identify either an upgrade to a specific configuration, or a new
configuration, as appropriate.
Responsibility for testing co-existence of multiple geoserver versions within a single Tomcat
lies outside this project. Advice on reporting and interpreting support requests and responses
to the geoserver development community will be provided if required.
XSL/Java
Geoserver trunk is compiled under JDK 1.5, and hence requires 1.5 or 1.6 to run.
NB. Significant performance benefits over JDK 1.4 are reported.
NB. Some applications (e.g. ArcIMS) may install legacy XML parser and XSL
libraries inside Tomcat. It is recommended to dedicate a Tomcat instance to each
software component to avoid the need for extensive compatibility testing.
Deployment and Development plan
Environment Setup
System Architecture
The underlying platform is a standard J2SE application that accesses web services following
the standard OGC Web Services/ASDI notional architecture.
All components are assumed to be properly decoupled using interfaces that would allow any
component to be replaced by an alternative technology, version or custodian.
Configuration
No special configuration of the base environment is now necessary to run geoserver under
Tomcat 5.5.
It may be necessary to ensure that the Tomcat “context” is configured so that its
hostname is the name under which the service is visible from the public network.
That is, when geoserver advertises its URL in the capabilities document, or in the demo pages as
the URL to which the demo request is sent, this URL must not map to an internal hostname.
Geoserver gets this hostname from the standard Servlet context.
For more information see http://tomcat.apache.org/tomcat-5.5-doc/config/host.html.
Quick check: deploy geoserver and access the capabilities document, eg
http://myhost.org/geoserver-c/ows?service=wfs&version=1.1.0&request=GetCapabilities
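The quick check can be partly automated. The sketch below, which is only an illustration, parses a capabilities document and reports every hostname advertised in an xlink:href attribute that differs from the expected public name. The sample document, the hostname internal-box and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, heavily trimmed WFS 1.1.0 capabilities document.
SAMPLE = """<wfs:WFS_Capabilities xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:ows="http://www.opengis.net/ows"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1.0">
  <ows:OperationsMetadata>
    <ows:Operation name="GetCapabilities">
      <ows:DCP><ows:HTTP>
        <ows:Get xlink:href="http://myhost.org/geoserver-c/wfs"/>
        <ows:Post xlink:href="http://internal-box:8080/geoserver-c/wfs"/>
      </ows:HTTP></ows:DCP>
    </ows:Operation>
  </ows:OperationsMetadata>
</wfs:WFS_Capabilities>"""


def advertised_hosts(capabilities_xml):
    """Collect the hostnames of all xlink:href URLs in the document."""
    root = ET.fromstring(capabilities_xml)
    hosts = set()
    for element in root.iter():
        href = element.get(XLINK_HREF)
        if href:
            host = urlparse(href).hostname  # lower-cased, without port
            if host:
                hosts.add(host)
    return hosts


def unexpected_hosts(capabilities_xml, public_host):
    """Return advertised hostnames other than the expected public one."""
    return advertised_hosts(capabilities_xml) - {public_host.lower()}


if __name__ == "__main__":
    print(unexpected_hosts(SAMPLE, "myhost.org"))
```

Against a live server, the XML returned by the GetCapabilities URL above would be passed in place of SAMPLE. A non-empty result suggests that the Tomcat host configuration still exposes an internal name.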
Access arrangements
The deployment team will need access and user privileges that allow them to:
1. Stop and start Tomcat.
2. Install WAR files.
3. Clean up Tomcat work directories (there seems to be an occasional need to do
this when upgrading).
4. Edit text files to configure applications.
Installation responsibility
Social Change Online will assist personnel designated by the deploying agency.
Social Change Online contact: Rob Atkinson (rob@socialchange.net.au)
Deploying agency (MRT) contact: Rohan SedgeWicke
Analysis of dependencies
The project covers many of the issues that will be faced in creating interoperable data servers,
even though resolution of each is beyond scope.
The dependency chain is complex and realistic:
Client accesses data service
Data service accessible to client (external visibility)
Data service deployed on host server
Data service connected to target database
Target database deployed (replication, etc) and accessible to data service host
Target database replication design finalised
Geoserver configuration tested against target database
Target database created against source database
Lookups and cross-references between standard vocabularies and source data
Standard vocabularies published
Common (interoperable) Data model in near-final draft
The key issue is to provide the service configuration step with data that can be matched to the
conceptual model, in order to finalise the target database design.
Data Management
All data that needs to be migrated, extracted, installed, etc. in order to get
a pilot implemented in a minimal timeframe should be noted here.
The goal is, however, to implement a complete set of interoperable Web services
using OGC standards, although some services may be simulated using static XML
files.
Responsibilities
MRT will undertake data management according to requirements noted here.
Database views
The very complex interrelationships between multiple partially-populated data tables will
require creation of database views.
MRT will examine the template database views supplied and work out the most effective
approach to using these to support replication to the hosting database.
Database replication
Replicate the data to the target environment. This need not be dynamically updated at this
stage, and creation of materialised views would be appropriate.
Updates
The update cycle for this data is not yet defined. For the pilot phase there is no
requirement for regular updating; however, considering the process early would help
the next phase, which will need formalisation of data supply expectations.
Documentation requirements
Sample data has been provided as an Oracle schema in Oracle dump format.
SQL statements that retrieve the data contained in the required data products are
desired as concise documentation of table relationships. However, it has proved
necessary to undertake some analysis, and MRT will need to check the translation
from MS-Access query and table structure assumptions into the equivalent Oracle
views.
Implementation Plan
Delivery
Deployable build to be delivered as:
Configuration package containing a standard Geoserver configuration
directory structure. This will be accessible from:
o https://www.seegrid.csiro.au/subversion/xmml/Projects/LandslidesPilot/geoserver_conf/
Geoserver binary available as a WAR from
https://www.seegrid.csiro.au/subversion/xmml/Projects/tools/geoserver.war
o NB This build will contain a copy of the Oracle JDBC driver ojdbc14.jar
Installation
Deploy war file to Tomcat
Deploy configuration directory to the same server as the tomcat installation
o Copy catalog_tas.xml to catalog.xml
o Edit WEB-INF/web.xml to reference config directory
o Edit configuration featureTypes/*/*_tas.xml to insert correct user name, password
and jdbc connection parameters.
Restart tomcat (or the geoserver application only)
Test
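The file-handling part of the installation steps above can be sketched as follows. This is only an illustration: the helper names activate_catalog and mapping_files_to_edit are hypothetical, and the script neither edits web.xml nor inserts credentials; it copies the catalog and lists the mapping files that still need manual editing.

```python
import shutil
from pathlib import Path


def activate_catalog(config_dir, variant="tas"):
    """Copy catalog_<variant>.xml to catalog.xml inside the config directory."""
    config = Path(config_dir)
    source = config / f"catalog_{variant}.xml"
    if not source.is_file():
        raise FileNotFoundError(source)
    target = config / "catalog.xml"
    shutil.copyfile(source, target)
    return target


def mapping_files_to_edit(config_dir, variant="tas"):
    """List featureTypes/*/*_<variant>.xml files that hold connection settings."""
    return sorted(Path(config_dir).glob(f"featureTypes/*/*_{variant}.xml"))


if __name__ == "__main__":
    import sys

    # Usage: python install_config.py /path/to/geoserver_conf
    directory = sys.argv[1]
    print("Activated:", activate_catalog(directory))
    for path in mapping_files_to_edit(directory):
        print("Edit user name, password and JDBC parameters in:", path)
```

Keeping the credential edit as a manual step is deliberate here, since the layout of the mapping files is not described in this document.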
Testing
A range of possible tests can be used to identify correct behaviour.
The first test, recommended before any further deployment, is to copy the SQL
statements from within featureTypes/*/*_tas.xml and run these directly in the
Oracle environment, ideally via the same JDBC driver and connection
parameters.
If the geoserver context fails to start at all, this is usually because
GEOSERVER_DATA_DIR is incorrect and not pointing to a valid config.
If geoserver starts, the test queries can be run via
http://the.host/geoserver/demoRequest.do
Run the GetCapabilities sample to check that all 3 feature types are present.
Report any failures for further diagnosis.
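The feature type check can also be scripted. The following minimal sketch assumes a WFS 1.1.0 capabilities document, in which feature types appear as FeatureType elements with a Name child in the http://www.opengis.net/wfs namespace. The feature type names in the sample are hypothetical placeholders, not the names used in the LDIP configuration.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"

# Hypothetical, trimmed capabilities document with three feature types.
SAMPLE_CAPABILITIES = """<wfs:WFS_Capabilities
    xmlns:wfs="http://www.opengis.net/wfs" version="1.1.0">
  <wfs:FeatureTypeList>
    <wfs:FeatureType><wfs:Name>ex:TypeA</wfs:Name></wfs:FeatureType>
    <wfs:FeatureType><wfs:Name>ex:TypeB</wfs:Name></wfs:FeatureType>
    <wfs:FeatureType><wfs:Name>ex:TypeC</wfs:Name></wfs:FeatureType>
  </wfs:FeatureTypeList>
</wfs:WFS_Capabilities>"""


def feature_type_names(capabilities_xml):
    """Return the names of all feature types listed in the document."""
    root = ET.fromstring(capabilities_xml)
    names = []
    for feature_type in root.iter(f"{{{WFS_NS}}}FeatureType"):
        name = feature_type.findtext(f"{{{WFS_NS}}}Name")
        if name:
            names.append(name.strip())
    return names


def missing_feature_types(capabilities_xml, expected):
    """Return the expected names that the server does not advertise."""
    return set(expected) - set(feature_type_names(capabilities_xml))


if __name__ == "__main__":
    names = feature_type_names(SAMPLE_CAPABILITIES)
    print(len(names), "feature types:", names)
```

In practice the response of the GetCapabilities sample would replace SAMPLE_CAPABILITIES, and any names returned by missing_feature_types would be reported for further diagnosis.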