					GISIN background document—summary of online e-discussions
Edited and reorganized... in an attempt to group relevant points that have been covered.



        LIZ SELLERS: In an article published in the journal BioScience (2000) 50(3): 239-244, Ricciardi et al. outlined a list of key information (fields) that should be included in a standardized IAS database. In 2002, the Conference of the Parties to the Convention on Biological Diversity discussed "possible formats, protocols and standards for improved exchange of biodiversity-related data." In order to collect quality global IAS data, provide global data access to professionals, and enable effective database linkage and collaboration, it is widely agreed that a minimum group of standard data fields must be identified. These fields must then be populated with data in a form that is species- and database-independent, and that does not exclude the participation of contributors or users in the global IAS information system.
        What basic database fields should be included in an IAS database standard set?
        Ricciardi et al. provided "key information" covering Diagnostics, Distribution, Basic Biology, Dispersal, Impacts, Biotic Associations, Modes of Dispersal, Control Methods, Bibliographies and Expert Contact Information.
        Geospatial data has risen to the forefront of data collection as predictive and modeling technologies gain popularity and discover new applications in the natural world--and especially in relation to IAS population and invasion ecology. Should we therefore include geographic coordinates in the standard set of IAS database fields?

        MICHAEL BROWNE: Yes, at the meeting we should focus on minimum data required for the different
data types. The document "GISD Database Elements" describes minimum data required for lists of species,
species fact sheets, distribution data, eradication projects, and pathways and dispersal data. Please bear in mind
that the GISD's goals and data requirements differ somewhat from those of collection databases.

        CHARLOTTE CAUSTON: It would be particularly useful if the IAS database fields are compatible
with questions and criteria used in predictive methodologies/risk assessments that determine the potential of a
species to be introduced into a country, the potential invasiveness of a species once it has established and the
feasibility of control/containment. Weed and pest RAs [risk assessments] from Australia, NZ and Galapagos are
useful sources for this type of information as they also consider the impact of the species on the environment.

[See Table, “Desirable Database Elements” fields comparison]
         BRIAN STEVES: As a next step, assuming we can agree about the use of XML as an exchange
protocol, I propose we consider Bob Morris' earlier comment that "Indeed the problem is reduced to writing an
XML Schema" and try to work towards that. In this case however, we might need to consider a wide variety of
schemas for various IAS topics (species summaries, regional species lists, species observations, management
efforts, etc..).
         In the case of species occurrence data for IAS, we should consider whether we can use the DiGIR
protocol and an extension/subset of the Darwin Core. The addition of a few key fields to indicate whether a
species at a particular location is considered native or not, a potential pathway, and whether or not this species
observation represents a viable population would make a DiGIR system a very useful tool for most of us.
DiGIR is also flexible enough that we could develop our own federated schemas for some of the other topics beyond occurrence data.
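Brian's three proposed additions to a Darwin Core-style occurrence record can be sketched as follows. This is a minimal illustration only: the element names (NativeStatus, Pathway, ViablePopulation) are assumptions for the sake of the example, not part of any ratified schema.

```python
# Sketch of a species-occurrence record using a few Darwin Core-style
# fields plus the three IAS extensions discussed above (native/alien
# status, potential pathway, viable population). Element names are
# illustrative, not an official standard.
import xml.etree.ElementTree as ET

def build_occurrence(scientific_name, locality, native_status,
                     pathway, viable_population):
    rec = ET.Element("Occurrence")
    # Core Darwin Core-like fields
    ET.SubElement(rec, "ScientificName").text = scientific_name
    ET.SubElement(rec, "Locality").text = locality
    # Hypothetical IAS extension fields
    ET.SubElement(rec, "NativeStatus").text = native_status
    ET.SubElement(rec, "Pathway").text = pathway
    ET.SubElement(rec, "ViablePopulation").text = viable_population
    return rec

record = build_occurrence("Dreissena polymorpha", "Lake Erie",
                          "Alien", "Ballast water", "Yes")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

With just these three extra elements, a DiGIR-style provider could answer IAS-specific queries while remaining readable by plain Darwin Core consumers.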

        MICHAEL BROWNE: As Brian points out, a variety of different types of data are created depending on
the intentions, topics and resources of the agency or person doing the collecting. The most basic IAS data
collected are the names of invasive species in a country, region or location. I believe that a record should
contain the following elements as a minimum: species name, location, biostatus (see below) and documentation.
Should status (native/alien) be a core element? Additional requirements should be handled at 'lower' levels
because there is so much variation. If we can agree on a minimum record, the next steps are to propose good
ways of recording these elements and, as Charlotte Causton says, to agree on definitions in order to avoid
confusion. Then on to XML if that is the preferred option.
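Michael's minimum record (species name, location, biostatus, documentation) can be expressed as a simple completeness check. This is a sketch; the field names and the shape of the biostatus value are illustrative assumptions.

```python
# Minimal-record sketch following the proposal above: a record must
# carry at least species name, location, biostatus, and documentation.
# Field names are illustrative.
MINIMUM_ELEMENTS = {"species_name", "location", "biostatus", "documentation"}

def is_valid_minimum_record(record: dict) -> bool:
    """A record qualifies only if every minimum element is present and non-empty."""
    return all(record.get(field) for field in MINIMUM_ELEMENTS)

record = {
    "species_name": "Lates niloticus",
    "location": "Lake Victoria",
    "biostatus": {"origin": "Alien", "occurrence": "Established",
                  "invasiveness": "Invasive"},
    "documentation": "Ricciardi et al. (2000), BioScience 50(3): 239-244",
}
print(is_valid_minimum_record(record))  # True
```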

       Occurrence:
       1.0 Absent
       1.1 Recorded in error
       1.2 Extinct
       1.3 Eradicated
       1.4 Border intercept
       2.0 Reported
       2.1 Established
       2.2 Established and expanding
       2.3 Established and stable
       2.4 In captivity/cultivated
       2.5 Sometimes present
       2.6 Present/controlled
       3.0 Uncertain

       Origin:
       1.0 Alien
       2.0 Native
       2.1 Native - Endemic
       2.2 Native - Non-endemic
       3.0 Not specified
       4.0 Biostatus uncertain

       Invasiveness:
       1.0 Invasive
       2.0 Not invasive
       3.0 Not specified
       4.0 Uncertain
        BRIAN STEVES: I like the apparent hierarchy of the vocabularies Michael has presented here. It will allow us to compare those who report at a coarse level (e.g. Occurrence = "Absent") with those who report at a fine level (e.g. Occurrence = "Eradicated") by knowing the relationship between the two terms.
        Where would these standardized vocabularies reside? Is there a centralized location where they can be posted? We can also incorporate these lists into our XML schemas as they are created to promote their use.
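The decimal codes in the vocabularies above already encode the coarse/fine relationship Brian describes. A minimal sketch, assuming each "x.0" code is the coarse parent of the "x.n" codes beneath it:

```python
# Sketch of the hierarchical occurrence vocabulary, using the decimal
# codes to recover parent/child relationships so a coarse report
# ("Absent") can be compared with a fine one ("Eradicated").
OCCURRENCE = {
    "1.0": "Absent", "1.1": "Recorded in error", "1.2": "Extinct",
    "1.3": "Eradicated", "1.4": "Border intercept",
    "2.0": "Reported", "2.1": "Established",
    "2.2": "Established and expanding", "2.3": "Established and stable",
    "2.4": "In captivity/cultivated", "2.5": "Sometimes present",
    "2.6": "Present/controlled", "3.0": "Uncertain",
}

def parent_code(code: str) -> str:
    """'1.3' generalizes to '1.0'; top-level codes are their own parent."""
    return code.split(".")[0] + ".0"

def compatible(code_a: str, code_b: str) -> bool:
    """Two reports agree at the coarse level if they share a parent."""
    return parent_code(code_a) == parent_code(code_b)

print(compatible("1.0", "1.3"))  # Absent vs Eradicated -> True
print(compatible("1.3", "2.1"))  # Eradicated vs Established -> False
```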

         ANGELA SUAREZ-MAYORGA: I understand we agree on 'the minimum', i.e. who is where (and maybe when). But I doubt whether we can define minimal contents for a global database in the way we are doing it (maybe that way is too weak to support data from multiple sources, as Bob says).
         Regarding other core elements, I too think that the way to define contents (whether they are the minimum or not) is much like the one Michael used. For example: our idea in the information system I represent is to obtain the terms from authority files (controlled vocabularies) or a thesaurus. The thesaurus serves at the same time as a reference data set for the system, in which the relationships between terms can be found by following their hierarchical structure. This makes it possible to describe every field of interest in as much detail as required. It is also very useful to link the thesaurus with a glossary, to be sure we all understand the same meaning of each term.
         By the way, for plants we defined a level called 'naturalized' that identifies well-established invasive species that have viable wild populations (level 2.3 of Occurrence in Michael's schema). However, I think that level (like 2.1 and 2.2) describes status, not occurrence.

         BRIAN STEVES: If we keep the minimal required set of elements short enough and the controlled vocabularies general enough, we should be able to do better than just defining "what, where, and when". If this is all we can agree upon, why don't we just adopt an unmodified Darwin Core as our standard?
What we really want is to develop something that's a bit more specific to IAS. A nice thing about Michael's
authority files is that he gives an easy solution for people who don't have the answers to the extra IAS specific
fields... they can simply put down "Uncertain".
         I would however agree that combined terms like "Established and Stable" can be better represented by
separate fields. In the case of the datasets I work with, occurrence terms like "Established" and "Absent" are in
one field, while range status terms like "Stable", "Expanding", and "Declining" are in another. One concern with this sort of splitting is that it introduces a new potential error into our system: records with incompatible values for separate fields.
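The incompatible-values risk Brian raises can be caught with a simple validation rule at data entry. A sketch, with illustrative vocabulary values:

```python
# Once "Established and stable" is split into an occurrence field and
# a range-status field, some combinations become nonsensical and
# should be rejected at entry. Vocabulary values are illustrative.
OCCURRENCE_TERMS = {"Absent", "Established", "Reported", "Uncertain"}
RANGE_STATUS_TERMS = {"Stable", "Expanding", "Declining", "Not applicable"}

def check_record(occurrence: str, range_status: str) -> bool:
    if occurrence not in OCCURRENCE_TERMS or range_status not in RANGE_STATUS_TERMS:
        return False
    # An absent species cannot have a stable, expanding, or declining range.
    if occurrence == "Absent" and range_status != "Not applicable":
        return False
    return True

print(check_record("Established", "Expanding"))  # True
print(check_record("Absent", "Expanding"))       # False: incompatible values
```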

        PANKAJ OUDHIA: If possible, a 'visitor comments' section could be added at the end of the standard set. I think it could bring in new information automatically. When visiting many databases, I generally find no such section where I could add additional information. Visitor comments could be approved by the moderators.
        Under the general impact section, a subheading 'Allelopathic impact' could be added, because impact can be described more scientifically through allelopathy.
        To make the database more useful, emphasis must be given to local names, as the language or the name of a plant changes every mile.


       LIZ SELLERS: The newly developed "NISBase" is an online distributed database system, or portal, developed by the Smithsonian Environmental Research Center (SERC) to provide "simultaneous search access to multiple invasive species databases". There are currently 5 databases included in the system. In this forum, NISBase will be examined as an example of a distributed database system for IAS information databases.
        ANNIE SIMPSON: If participants choose to accept the NISbase model and make their database
compliant with that system, these are the requirements (according to a draft document posted by Brian Steves,
accessible in the Data Standards & Formats / Documents / NISbase Documentation):
        NIS information
            1. A web server with the ability to dynamically create html and xml through some scripting
    language (php, asp, perl, jsp, coldfusion, etc..)
            2. A database that the dynamic webpages can draw data from (ms-sql, access, mysql, oracle,
    postgress, etc..)
            3. Ability to query database with the current NISbase search criteria
                    a. Taxonomic Group
                    b. Genus
                    c. Species
                    d. Common Name
            4. Ability to limit the returned result set based on the record limit parameter.
            5. Ability to return query results in XML following the NISbase format.
            6. A static IP address on the server
            7. Creation of metadata for your database provider
            8. Acceptance by the NISbase charter members
                    a. Verification that the other provider requirements have been successfully met
                    b. Acceptable server response time
                    c. Acceptable content
        This is one model that has been suggested as an integrated database solution, where you have your own
database that you continue to manage and populate, but it would be cross searchable through the integrative
interface of NISbase.
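Requirements 4 and 5 above (honor the record limit, return query results as XML) can be sketched as follows. The element names follow the spirit, not the letter, of the NISbase format, and are assumptions for illustration:

```python
# Sketch of a provider script: take search results from a local
# database and emit a simple XML result set of species names with
# links to existing fact sheets. Element names are illustrative.
import xml.etree.ElementTree as ET

def to_result_xml(rows, limit=100):
    """rows: (genus, species, common_name, factsheet_url) tuples."""
    root = ET.Element("ResultSet")
    for genus, species, common, url in rows[:limit]:  # honor the record limit
        rec = ET.SubElement(root, "Record")
        ET.SubElement(rec, "Genus").text = genus
        ET.SubElement(rec, "Species").text = species
        ET.SubElement(rec, "CommonName").text = common
        ET.SubElement(rec, "FactsheetURL").text = url
    return ET.tostring(root, encoding="unicode")

rows = [("Carcinus", "maenas", "European green crab",
         "https://example.org/factsheets/carcinus-maenas")]
print(to_result_xml(rows))
```

In practice the same transformation could be written in php, asp, perl, or any of the other scripting languages listed in requirement 1.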

         STEVE CITRON-POUSTY: It seems these fields are aimed at a general description of invasives, not at field or herbarium specimens. Is there a proposed standard for a minimum record here? IPANE could provide some of this data, but it's too general for most of our data, which is about occurrences.
         Other minimum fields for field records might be:
                certainty of identification
                certifying authority
                data set name (which references a data set record)
         We are currently working on this minimum set for our DB so that we can accept and import records
from other DBs containing invasives records in New England. I will report back as we get closer on our
minimum set. It might help discussion and we would love the feedback.
         Just had a thought that perhaps there could be different "levels" of this species occurrence data. We have different standards for herbarium specimens and for field forms. Perhaps we might want a field or a different profile of XML for each.

          BOB MORRIS: NISbase may be too weak to support integration of data from multiple sources. For example, it would be impossible for an application to determine whether two "Factsheet"s represent the same or different concepts. With reference to the Steves document and the "basic" results described by its return schema:
          Beyond its simple taxonomic data, NISbase appears content to return URLs to other web sites, rather than actual XML, or protocols and parameters by which those sites can be induced to return XML. To this extent it is dedicated to human-centric applications such as the Nemesis application. With the data returned per SpQueryResults.dtd, one cannot expect that applications can integrate the returns from several NISbase providers, since there is not(?) any provision for discovering the schema for the results of the embedded URLs (if they even return XML at all, which is probably not guaranteed).
        Is there some documentation of the Advanced Provider Implementation mentioned in the document?
        The Steves document suggests that NISbase may adopt DiGIR. May we know the status of that consideration, and---in technical terms (perhaps best discussed in a different thread)---what the proposed integration schema is?

         BRIAN STEVES: While I'll admit that NISbase is fairly simple in its current design, I prefer to dwell on
the strengths of what it can do for us now, rather than on what it doesn't do for us yet. For one it is here today, it
is running, and it does work. It allows us to select and search on multiple databases located around the world
from a single portal and return a single species list of results with links to further information.
         After attending many of the previous workshops held on IAS databases and hearing about the promise of
XML, I'm happy to report here is a system that is actually using it. Its current simplicity can also be considered
one of its strengths in that it allows for rapid integration of new data providers. Like myself, most of the IAS
web/data managers I know are biologists first and programmers second. With this in mind, I tried to design
NISbase so that it could be quickly implemented using existing technical skills and data sets.
         Many of my colleagues with databases on the web are capable of scripting pages (using asp, jsp, php,
perl, cold fusion, etc) that can search their database and return a table of results. These same skills can easily be
used to modify their scripts to output a simple but standardized XML result set. For NISbase, this result set is
the list of species from their database that matches the search parameters and includes URLs to the species
summaries (fact sheets) and collections records that they've already placed on their websites either dynamically or statically.
         As for the advanced NISbase implementation I mentioned in my documentation, that's still under
development at this time. I will tell you that I've been working on creating an XML schema for a standardized
species summary page, adding more searchable parameters to the current system, and attempting to modify and
use DiGIR for IAS information.
         At the moment, DiGIR seems to be my most promising path to NISbase improvement. Currently I've
managed to generate drafts of two DiGIR conceptual schemas for IAS information. The first draft schema is an
extension of Darwin Core that adds IAS specific fields (pathway, status, and occurrence). The second schema is
designed to handle information similar to the current implementation of NISbase (returning species list with
links to further existing information). I have even managed to create a few DiGIR test providers and set up a
DiGIR portal to use both of these schemas. Right now this system is running locally on my desktop computer,
but it seems to work fairly well.

       LIZ SELLERS: What are the technology (hardware/software) and connectivity issues to be considered
when planning and maintaining a distributed IAS database system with the goal of providing online-accessible
information about IAS? Can we build on the NISBase model?

        V. PANOV: In general, an online distributed database system is a good concept, and the NISBase model can be considered one of the approaches for developing regional and subregional information hubs. I believe this approach will be most effective at the subregional level (in a region such as Europe these could be the Nordic/Baltic, Ponto-Caspian, Mediterranean and other subregions). In the Baltic Sea area we are working on developing this approach in the framework of the relevant HELCOM project. I do not believe in the effectiveness of one global distributed database system: it should be a network of such systems, or maybe even a network of networks. I think the developing European information system on IAS will be a network of subregional distributed database systems.

       BRIAN STEVES: While I agree that we shouldn't have a single global distributed database system (DDS), a hierarchy of networks might pose a few problems.
       One such problem is how to give proper acknowledgement to each network as it passes the data up the
system. If we use a piece of information for a species observation from such a system, do we then have to
acknowledge the original observer for the species, the data provider (database that compiled the information),
the subregional DDS, the regional DDS, as well as the global DDS? This seems a bit excessive to me.
        Another concern is whether such a system of multiple DDS would slow down the system. A distributed
search on distributed searches of distributed searches could potentially be quite slow.
        I think a better system would be one in which regional and subregional portals exist alongside other
thematic/regional portals to information from a wide array of data providers. Ideally these regional/thematic
portals would assist the development of new providers and submit pertinent metadata information to a global
registry of such providers. In turn, regional/thematic portals could search for new providers from this registry that
they might wish to add to their own portal. Proper acknowledgement to portals would be limited to any value-
added products they develop (Maps, reports, modeling outputs, etc) but not for passing data from one DDS to
the next.


         MICHAEL BROWNE: Database Integration: It seems to me that we can make rapid progress if we use NISbase as a model for the GISIN (see Brian Steves' document). Are there other models that we should consider?
         Capacity building: The I3N Cataloguer model can be used to develop a data capture tool with XML
output for those at the early stages of database development. What other capacity building products will make
the GISIN more useful and where are the models?
         Synthesis & Outreach: I think that it is reasonable that the GISD (Global Invasive Species Database) be
examined as a model to facilitate broad access to IAS information, and the place where one may go to download
fact sheets, view images, get assistance with identifications of potential IAS, etc.
         The GISD compiles and presents on the Internet (and via a CD-ROM proposed for 2004) information that is currently widely dispersed and difficult to access, presenting a global picture. It addresses gaps (e.g. providing data about IAS that impact regions which currently have little expertise and information available).

        LIZ SELLERS: IABIN's I3N Project offers a cataloguing tool and associated documentation as an assistive technology/resource for those with IAS databases that they wish to serve online, or for those who have IAS information that they wish to migrate to a database format to eventually be served online. For reference, the I3N Project can be found online.

        BADRUL AMIN BHUIYA: In Bangladesh, BRGB is preparing a species checklist of all living organisms recorded so far from this region, although we have very little expertise and so only very scanty information is available. Information collected by BRGB will be presented on the internet through our own site, but I suggest that information about IAS of Bangladesh could be digitized by the GISD to form part of the GISIN.


        BOB MORRIS: GBIF has deployed its prototype portal (see their brief explanation).
        The current protocols are the DiGIR protocols of TDWG, the Taxonomic Databases Working Group, with the Darwin Core metadata standard for collection records, and later with TDWG's ABCD (Access to Biological Collection Data).
        For descriptive data, TDWG has just released a draft XML Schema for the Structure of Descriptive Data (SDD).
        In general, TDWG has a long history of data exchange standards making, recently with attention to distributed databases.

       HANNU SAARENMAA: A few words about GBIF's possible role.
        GBIF is trying to provide an information infrastructure for biodiversity data. This infrastructure has
components such as portal, providers, and registry.
        The providers currently only provide “primary biodiversity data” in the Darwin Core format. Darwin
Core is good for expressing things like “a species has been found in a certain place at certain time”. In future
other formats and types of data will be included.
        All these types of data and protocols are defined in the registry. The providers advertise their data and
services there, with entries like “I provide type A data with protocol B”. The registry is open for anybody to
register their provider and any portal and search engine to discover the right providers.
        Now how does that fit with GISIN? If we look at the data types required (Diagnostics, Distribution, Basic Biology, Dispersal, Impacts, Biotic Associations, Modes of Dispersal, Control Methods, Bibliographies and Expert Contact Information), we note that only Distribution can today be implemented with the existing Darwin Core. Diagnostics can soon be covered with SDD. For the others we need to select/write a data exchange format and protocol, and a data provider application.
        Technically speaking, GBIF could include all these information/provider types in its registry. This would be a kind of global phonebook of available IAS data and information. However, as we are talking here not only about biodiversity data but pest control etc., it might be more appropriate that another registry, similar to GBIF's, is established for IAS. The registries, if made using compatible technology approaches like UDDI, can share their information where needed, such as data on distribution and diagnostics.
        So, I hesitate to include all these information types in GBIF, but I think GBIF provides a model for an
infrastructure that works and the linkages/data flows between GBIF and GISIN should be strong.
        The GBIF architecture is described online.

       BOB MORRIS: Some ideas for a technical discussion of exchange protocols, metadata schemas and
content schemas. This is meant to center not on what should be represented, but rather how.
       The discussion is likely to bore anyone who is more inclined to google owl+species than owl+ontology.
       Some sample topics: DiGIR vs SOAP?
       Does Z39.50 matter?
       Should the ADL Digital Gazetteer protocol be adopted?
       Is it a technical or a social issue to suggest that GISIN should just be a component of GBIF?


        STEVE CITRON-POUSTY: I am not sure if those are high requirements or not. It seems to me that this
DB interoperability is a perfect scenario for Web services. Perhaps we should try to derive a web service profile
for the exchange of invasives data. In this way most people would not have to write the code to write the XML
by hand. Just a thought…

          BOB MORRIS: If you mean Web Services in the sense of WSDL, certainly you are correct, and this has the biggest chance of cross-platform success. However, more generally, nobody should ever have to write code to generate XML. That is what databinding frameworks like Castor do. [Castor's site states: "Castor is an open source data binding framework for Java[tm]. It's basically the shortest path between Java objects, XML documents and SQL tables. Castor provides Java to XML binding, Java to SQL persistence, and then some more." Castor is also available for download there.]
         That said, Web Services frameworks like the open source Apache Axis, and the proprietary Microsoft
.NET leave even less work to do to exchange data with SOAP (if you make your NET play nice, which is not
actually its default). Indeed the problem is reduced to writing an XML Schema, which is where most
communities trying to address this issue nowadays focus their attention. For non-trivial domains, that is a non-
trivial task, but one based on standards.
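The databinding idea Bob describes (Castor does this for Java) can be illustrated in Python: define the data model once and generate the XML mechanically, so nobody writes angle brackets by hand. The class and field names here are assumptions for illustration:

```python
# A minimal databinding sketch: any dataclass instance is bound to a
# flat XML element automatically, analogous in spirit to what Castor
# does for Java objects. Field names are illustrative.
import dataclasses
import xml.etree.ElementTree as ET

@dataclasses.dataclass
class SpeciesRecord:
    scientific_name: str
    locality: str
    biostatus: str

def to_xml(obj) -> str:
    """Serialize a dataclass instance without hand-writing any XML."""
    root = ET.Element(type(obj).__name__)
    for field in dataclasses.fields(obj):
        ET.SubElement(root, field.name).text = str(getattr(obj, field.name))
    return ET.tostring(root, encoding="unicode")

rec = SpeciesRecord("Eichhornia crassipes", "Lake Naivasha", "Alien")
print(to_xml(rec))
```

The point is the one Bob makes: once the model (the schema) is right, the serialization is mechanical.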
        STEVE CITRON-POUSTY: SOAP messaging is exactly what I was talking about. I mean this is even a
case where UDDI actually makes sense. =)
        So perhaps at the meeting we should focus on schemas for minimum data required for the different data
types (field records, data set records, species records, expert records…?). In this way we don't have to worry about what people implement on the backend, as we can all send data to each other and do with it what we will.

          BOB MORRIS: Once a server can emit XML and has a published XML Schema, open source tools make it easy to deploy Web Services based on SOAP. For example, the Apache Axis framework will manage all the infrastructure in ways that allow applications to simply call APIs to make connections and fetch data. It's particularly easy to write wrappers around any such service. A set of demonstrations I did for a course I teach is described in its READ.ME.txt.
          It tells both how to invoke the demos and what their architecture is. The code invoking the Axis APIs is also in that directory.
         This all means that once again, the big deal is getting the Schema right. Plus, the really interesting thing
is: what do you do with the XML once you have it?

         BRIAN STEVES: When it comes to using Web Services for sharing our IAS information, I'm a little concerned that we might not be able to get many (if any) IAS data managers to develop, deploy, and consume them at this time. However, I agree this is probably where this group should be headed in the future, so it's good that we seem to have a few experts on the topic attending GISIN (maybe we'll discover that Web Services will be easier to implement than I think). Until then, I still consider NISbase a positive step forward in achieving our goals of a distributed network of IAS information.


       LIZ SELLERS: Formats recommended by the Conference of the Parties to the Convention on Biological Diversity at its 6th meeting, held in April 2002, included: the Dublin Core Metadata Initiative; the Federal Geographic Data Committee (FGDC) – ISO 19115; the FGDC Biological Data Profile; BIB-1; XML as a description language; and HTML 3.1 as a presentation language. How will adherence to these formats affect decisions on collecting, providing and maintaining IAS information in online databases and a Global Invasive Species Information Network?
       Reference documents loaded in this project: “Scientific and Technical Cooperation and the Clearing-
House Mechanism (cop-06-inf-18-en).pdf” and “COP6 Recommended Formats.pdf”

          BOB MORRIS: The document at the Species Analyst site is pretty easy to read, but is about a year out of date. It is a good place for biologists to start and end, and a good place for informaticists to start. See also the corresponding material at MaNIS.
          To stay current on Darwin Core and other DiGIR issues, it is better to track the DiGIR project site. The current DC2 is 1.24. The changes are mostly technical, and you have to look at the DiGIR developer's mailing list archive to understand why there is now a Darwin Core and a Darwin Mantle. It doesn't matter much except to people trying to implement DiGIR, and the Species Analyst concept document is a good introduction.
         To me, using DC2 or ABCD as the specimen record schema seems a no-brainer, but it is probably
missing a lot of what is important about invasive species, as distinct from just species.
          One important thing is that in its present form, DC2 is rather bound to the DiGIR protocol schema (which describes how DiGIR queries are made). I find that regrettable, but probably not fatal. When GBIF gets around to adding a SOAP interface---a stated aim of Donald Hobern's GBIF Data Access architecture program---I guess there will need to be a cleaner separation. Also, since the BioCase ABCD gang have a small technical problem with this tie, maybe the separation will come sooner.

       DAVE VIEGLAS: A SOAP interface to DiGIR / Darwin Core is in development as part of the SEEK project and currently exists in prototype form.

      BOB MORRIS: Dublin Core was designed for the kinds of things that concern library collection
management. Bad choice for metadata about organisms and the things that describe them.
      Googling “FGDC” provokes 234,000 responses;
      googling “XML” provokes 32,500,000 responses
      Perhaps more importantly:
      FGDC+middleware gives 643 googles;
      XML+middleware gives 291,000
      Is there something more to say?

        ANNIE SIMPSON: Vishwas Chavan gave a presentation at a Regional Meeting in Asia concerning the
link between Taxonomy and Invasives, as a representative of the GISP Informatics Working Group. I have
uploaded both audio and non-audio versions of his ppt presentation into the general documents section of the
“Data Standards & Formats” project.
        On his slide 13, XML (as a descriptive language), Dublin Core, FGDC, and even BIB-1 are all
mentioned as recommended formats (and HTML 3.1 as a presentation language). These recommendations came out of a global meeting held in Montreal by GISP and the CBD in February of 2002, which several of our group attended.
        Though Dublin Core was created by “library types” and FGDC by “geography types,” both have been
modified/expanded for biological use on the Web. Is using XML to describe these formats an inclusive and
satisfactory solution?

        BOB MORRIS: This depends on the modifications and expansions. For Dublin Core, the major ones I
know of are the Darwin Core and the Biocase/TDWG Access to Biological Collection Data (ABCD), both of
which focus on specimen data.
        An effort at the Cornell Lab of Ornithology is contemplating extending ABCD to deal with
        Are there references to other extensions of Dublin Core that we can look at to form an opinion?
         Ditto for FGDC? Most of the content on the web site appears to have been last updated in 2000. Is there more recent activity somewhere that we should look at? At a glance, it appears that the FGDC Biological Data Profile extensions comprise taxonomy, methodology, and geologic-age material. If I'm correct, I would have thought it of minimal help to GISIN. It would be most excellent if this work continued somewhere, since among its contributors are people in the SEEK project and other more modern efforts at doing distributed data on the web.


       LIZ SELLERS: So you've got data, and you want to put it into a database. Or you've got a database already. What software package should you use to enter and manage your data/database? Will the software you choose make it difficult for you to share your data with others? What are the key components and functions we should look for in this type of software package? (e.g. export/import functionality that allows data to be shared with others in a delimited or spreadsheet format?)
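The export/import requirement mentioned above can be made concrete with a short round-trip sketch: whatever package manages the data should be able to move records through a delimited format such as CSV without loss. Column names here are illustrative assumptions.

```python
# Sketch of delimited-format export/import: records go out as CSV and
# come back unchanged, so another database can consume them.
# Column names are illustrative.
import csv
import io

FIELDS = ["species_name", "location", "biostatus"]

def export_csv(records):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def import_csv(text):
    return list(csv.DictReader(io.StringIO(text)))

records = [{"species_name": "Miconia calvescens",
            "location": "Tahiti", "biostatus": "Alien"}]
assert import_csv(export_csv(records)) == records  # lossless round trip
```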

       ANDREA GROSSE: Take a look at the I3N Cataloguer, which is available for download. This tool is used in several countries in the Western Hemisphere.

        MAC-Caribbean Yahoo Group listserv: The main focus of these discussions is indeed the development
of Invasive Alien Species (IAS) databases and distributed database systems. However, type specimen
collections, herbaria and baseline/taxonomic collections are a valuable resource often referenced in support of
IAS databases.
        Museums in the Caribbean and the UK are reported to be using “MODES” (Museum Object Data Entry System) to manage and exchange specimen/collection data
        which is purported to be a cheaper alternative to U.S. software packages. A similar application called “PAST PERFECT” was also mentioned. To support successful exchange of data between museum databases, any software application must be able to export and import data in Excel, Access, or CSV format. However, XML is quickly becoming the most popular markup language for data exchange. A good example of the use of XML is the I3N Cataloguer.
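[Editor's sketch: the hop from a spreadsheet-style export to an XML exchange document is small. The field names and records below are invented illustrations, not the I3N Cataloguer's actual format.]

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical species records as they might be exported from Access or Excel.
CSV_DATA = """scientific_name,country,status
Axis axis,India,invasive
Miconia calvescens,French Polynesia,invasive
"""

def csv_to_xml(csv_text):
    """Convert delimited export rows into a simple XML exchange document."""
    root = ET.Element("records")
    for row in csv.DictReader(io.StringIO(csv_text)):
        rec = ET.SubElement(root, "record")
        for field, value in row.items():
            # Each CSV column becomes an element named after the column header.
            ET.SubElement(rec, field).text = value
    return ET.tostring(root, encoding="unicode")

xml_doc = csv_to_xml(CSV_DATA)
```

Any database that can produce a delimited export can participate in XML exchange through a thin conversion layer like this.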
        Another issue, highlighted by Bruce Potter (listserv member), is the difficulty of reaching agreement [e.g. between database developers/owners] on the meaning and use of individual data fields. For example, does the “Address” field refer to postal address, street address, or both?
        [Submitted with permission: Bruce Potter, MAC-Caribbean Yahoo Group, Island Resources Foundation,]

        ANNIE SIMPSON: I'm a splitter, not a lumper, so I tend to think “address” should be “address1”, “address2”, etc.
        Agreeing to disagree can also work, if programmers can create code to make translations between all the
differing fields of the interoperating databases. At least that is the way I think it should be able to work.

        BOB MORRIS: In this context, normally this is the reason to have an agreed upon “integration schema”.
It then becomes the task of the data provider to map their internal field names and data organization into those
in the integration schema and respond to queries in a way that is described by that schema. When “schema”
means XML schema, this is usually a straightforward task aided by the dbms itself nowadays. The main
problems come if the db has stuff that cannot be expressed by the schema or vice-versa. Then some information
will be lost when the query is answered.
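[Editor's sketch of the field-mapping idea: a provider translates its internal field names into an agreed integration schema, and anything the schema cannot express is lost, exactly the failure mode described above. The schema and field names here are invented illustrations, not an actual GISIN schema.]

```python
# An agreed integration schema (hypothetical) and one provider's mapping onto it.
INTEGRATION_SCHEMA = ["ScientificName", "CountryCode", "InvasiveStatus"]

PROVIDER_MAPPING = {
    "sp_name": "ScientificName",
    "ctry": "CountryCode",
    "inv_flag": "InvasiveStatus",
}

def to_integration_schema(internal_record, mapping, schema):
    """Translate one internal record; unmappable fields are dropped (lost)."""
    out = {}
    for internal_field, value in internal_record.items():
        target = mapping.get(internal_field)
        if target in schema:
            out[target] = value
    return out

record = {"sp_name": "Axis axis", "ctry": "IN", "inv_flag": "yes",
          "local_notes": "culling restricted"}  # no schema slot for local_notes
mapped = to_integration_schema(record, PROVIDER_MAPPING, INTEGRATION_SCHEMA)
```

Note that "local_notes" does not survive the translation; that is the information loss Bob describes when a database holds things the schema cannot express.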

         STEVE CITRON-POUSTY: Just a note to say I agree with Bob. Send standard fields across the wire
and let people decide where they go on either end. So if the address data is split on sending and receiving,
it's trivial to lump it together to put in your db, while the inverse is not always true unless there are consistent
delimiters in the address information.
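[Editor's sketch: a minimal illustration of why lumping split fields is always trivial while splitting a lumped field is only safe with a consistent delimiter.]

```python
def lump_address(parts):
    """Joining split fields into one address string is always trivial."""
    return ", ".join(p for p in parts if p)

def split_address(address):
    """Splitting a lumped address is only reliable with a consistent delimiter."""
    return [p.strip() for p in address.split(",")]

parts = ["PO Box 123", "Kingston", "Jamaica"]
lumped = lump_address(parts)
roundtrip = split_address(lumped)  # works only because we control the delimiter
```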

        ROB EMERY: The group may be interested in the approach used in Australia to bring a dozen or so
disparate collection databases together to form the Australian Plant Pest Database at
        The query form will load but can only be submitted by collaborators; still, I'm sure you will get the idea of how it works.
        All of the collaborating collections are working taxonomic collections so only records of actual
specimen data labels are returned.
        The first page returned lists the collection by name and the number of specimens held at each collection.
There are links to an Australian distribution map as well as a specimen details link which returns: Family,
Genus, Species, Common name, host (scientific and common names), location, lat/long, Collector, ID method
and Stage.
        There are quite a few different databases used at the collections; CSIRO's Biolink is used at several, as
well as (I think) Texpress. My department developed an Access database in-house about 10 years ago and this
has been used by a couple of collaborators as well. Our database holds about 130,000 specimen records. I
think there is even an Excel spreadsheet out there that is part of the APPD.
        CSIRO Maths and Information developed the “Internet Marketplaces” software, which involved Apache,
broker, and gateway software being installed on each collaborator's webserver, along with a schema which maps
the different field names. I'm sorry if my description does not do this extensive project justice.

       STEVE CITRON-POUSTY: I talked about this over in the other discussion but maybe rather than going
down a proprietary route (and by this I mean someone's custom format that uses only their custom software) we
should think about using web services to exchange data. In this way all we have to agree on is the contents of
the envelope, not how we make the letter or how we send and receive it. This would also allow us to be
software, platform, and language agnostic.
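[Editor's sketch of the "agree only on the envelope" idea: any software that can read and write XML could participate, regardless of platform or language. The element names below are invented illustrations, not a ratified exchange format.]

```python
import xml.etree.ElementTree as ET

def make_envelope(provider, records):
    """Build an exchange envelope; only its contents need to be agreed upon."""
    env = ET.Element("IASExchange", version="0.1")
    ET.SubElement(env, "Provider").text = provider
    body = ET.SubElement(env, "Records")
    for r in records:
        rec = ET.SubElement(body, "Record")
        ET.SubElement(rec, "ScientificName").text = r["name"]
        ET.SubElement(rec, "Country").text = r["country"]
    return ET.tostring(env, encoding="unicode")

def read_envelope(xml_text):
    """Any receiver, in any language, parses the same agreed structure."""
    env = ET.fromstring(xml_text)
    return [{"name": r.findtext("ScientificName"),
             "country": r.findtext("Country")}
            for r in env.find("Records")]

payload = make_envelope("IPANE", [{"name": "Lythrum salicaria", "country": "US"}])
received = read_envelope(payload)
```

How the envelope travels (HTTP, SOAP, even email) and what produced it are irrelevant to the receiver, which is the point of being software-, platform-, and language-agnostic.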

       BOB MORRIS: Works for me. I prefer my web services with capital W and capital S, but not all the
good ones use WSDL. Sigh.

       STEVE CITRON-POUSTY: Yup, I mean Web Services. And as far as playing nice, I think that's the
responsibility of the providers. There must be a set of technology tests (put out by OASIS or somebody) that you
would have to comply with to publish.
       The other big win for WS is that they work much better in low bandwidth situations and we could also
use them for both push and pull of data.

        BOB MORRIS: I started a new Discussion named “Exchange Protocols” since this current Discussion
seems mostly intended to discuss end-user data management issues, not software architecture.
        Showing Web Services access to IPANE would be great and I think we have time to pull it off. Let's
take the discussion off line. Send me email as
        You presumably know that we have recently submitted an NSF ITR proposal to build a toolkit for
generating image-aware, spatially referenced observation systems like Our partners are the Cornell
Lab for Ornithology, IPANE, and the MIT SeaGrant, with invasive species monitoring as the main proof of
concept. It's an ambitious project with payoff several years out, and there were 1500 ITR proposals submitted,
so we will look at other funding opportunities too.


        ANGELA SUAREZ-MAYORGA: The majority of standards for biological records are too general, even
ours, probably because biodiversity is bio-complexity: data models cannot descend to specific items if they seek
to record any biological unit. However, as standardization is a must, minimal fields should be
established. For me it is difficult to conceive a biological record (at the individual level, species level or
community level) without considering a spatial and temporal reference. Consequently, geographic coordinates
are very useful to set the spatial reference, but sometimes [it] may be better to use a different way to do so. If
we deal with taxa of non-restricted geographic ranges (i.e. widely distributed species), probably a descriptive
way is more helpful than recording many pairs of coordinates.
        If you want to take a look, the biological standard (in Spanish) of the Biodiversity Information System
of Colombia is available at
[one copy of this document will be available at the meeting for reference purposes]

       BUDDHISRILAL MARAMBE: Inclusion of geographic coordinates would no doubt be of significant
importance with respect to future reference, monitoring the invasive behaviour of species, etc.

        STEVE CITRON-POUSTY: I think pairs of coordinates in Lat Long WGS84 would be great. Or at least
pairs of coordinates with a definition of projection and datum. This geographic information will certainly be
vital for field records. I think the benefit of coordinates is that we then don't have to worry about maintaining a
“global” gazetteer that we all match our records against. In our database alone (which is based on a subset of
the GNIS and covers only 6 states in the U.S.), there are 32.5k records. Using names would also force a level of
conformity for entering field data that is rarely achieved. Perhaps names could work at the data set and expert levels.
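[Editor's sketch of what a minimal, datum-aware occurrence record might look like; the field names are illustrative assumptions, not an agreed standard, and WGS84 is used as the suggested default datum.]

```python
def make_occurrence(species, lat, lon, datum="WGS84"):
    """Build an occurrence record that always carries its datum along."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    return {"species": species, "lat": lat, "lon": lon, "datum": datum}

# Roughly the Andaman Islands, as an example location.
occ = make_occurrence("Axis axis", 11.62, 92.73)
```

Storing the datum with every coordinate pair is what lets records from differently-projected sources be reconciled later without a shared gazetteer.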


         CHARLOTTE CAUSTON: It is also necessary that the fields distinguish between those species that are
human and agricultural pests and species that are invasive according to the IUCN definition (i.e. environmental
pests). In many cases the definition of the term invasive is confused.


        RAUF ALI: The subject of legal issues while dealing with IAS appears to be important. I worked in the
Andaman Islands, which are an Indian territory where India's laws apply, including the wildlife laws. The
greatest invasive on the islands is the spotted deer or chital (Axis axis); I believe this is a problem in Hawaii as
well. I've been documenting the vegetation changes caused by it, and have come to conclude (insofar as one can
conclude anything) that major degradation of vegetation is taking place due to this species.
        The issue is, these are protected in mainland India, being endemic there. The law therefore prevents any
culling on the islands! Ministry of Environment and Forest officials are reluctant to set any precedent, because
the issue of invasives has never come up before. However, it seems that we have a legal obligation under
Article 8(h) of the CBD to eradicate these.
        Some nations probably lack any regulatory framework to eradicate invasives, and in some, as in the case
given above, existing legislation actually hampers efforts to control invasives. It would appear that a database of
case studies dealing with problems in the regulatory framework may help decision makers in considering
alternatives or changes to laws. I'm not sure how one would set up network-driven solutions to this.

         BOB IKIN: I am not sure how a database of case studies would assist countries in drafting legislation
that gives effective control of organisms that have become a threat to the environment.
         A database of information would be very complex and difficult to use. The example you quote is an
interesting one as the issue seems to fall between a number of pieces of legislation. This includes the
Environment Protection Act (1986), the Wildlife Act (2003), the Biodiversity Act (2002) and even the Draft
legislation on quarantine. All are so specific that they do not identify this particular scenario.
         What is needed is a guideline for a framework that identifies all the activities that are required to cover
the management of organisms that can have a deleterious effect on agriculture, the environment and possibly
         I have been involved in the development of such a framework that builds upon the already existing
legislative powers and capability of quarantine services, and develops linkages with the national authorities
(such as environmental, wildlife, marine and others) so that all issues of introduction,
management/contingency planning, and eradication can be dealt with. The need is for the recognition of the
various powers to control organisms at all stages, and as a consequence the requirement for consultation at
inter-departmental level for this to be achieved. As an example that illustrates this problem, consider the case of a
plant that has been brought into a country as an ornamental and because of a particular set of circumstances
becomes a weed in an aquatic environment. Departments of agriculture, environment, wildlife, fisheries,
irrigation, etc. would all have a stake in its control or eradication.
         In the particular case you outline the legislation would have to be able to identify the special status of
the islands (which is already possible under some legislation) and the capacity to eradicate the animal (after
identifying it as a pest/invasive) by killing it (wildlife legislation in India only permits control by relocation).

       LIZ SELLERS: Perhaps a source of information about examples of other national regulatory
frameworks that have successfully defined and addressed the IAS problem would be useful to
researchers/professionals that are consulting with governments/policy makers that are just beginning to develop
or build on their IAS regulatory framework – where existing IAS regulations are only partially applicable (with
respect to IAS), or where they do not exist at all?
        Of course, the first step to regulating anything is defining it… and defining any species as an IAS (and
one that requires control or close attention) can be a complex task, and one that must be addressed by each
nation according to its needs and priorities… e.g. conserving/protecting the national economy can mean
regulating agriculture-related species, biodiversity-related species (e.g. in a tourism-based economy), or both, or
conserving other economic sectors affected by IAS.
        One of several organizations addressing this issue, the IUCN has produced several references addressing
the issue of law-development and application with respect to IAS. I must admit, that I have not read these
documents, but I think that perhaps a collection including these and similar reference material along with case
studies of national attempts to implement a framework of IAS regulation may be a useful component of an IAS
Legal Toolkit – available online or on CD?
        These references [are] located online at:
        * Shine C., N. Williams and L. Gündling (2000). A Guide to Designing Legal and Institutional Frameworks
on Alien Invasive Species. IUCN, Gland, Switzerland, Cambridge and Bonn. xvi + 138pp (English version
        * Legal and Institutional Dimensions of Alien Invasive Species Introduction and Control. Proceedings of
the Workshop on the Legal and Institutional Dimensions of Alien Invasive Species Introduction and Control.
Held at the IUCN Environmental Law Centre, Godesberger Allee 108-112, Bonn, Germany 10-11 December

        BOB IKIN: In addition to the publications listed by Liz the following –
        APPLICABLE TO INVASIVE ALIEN SPECIES. Technical Review No 2. Secretariat of the CBD,
Montreal, 2001.
        Not only lists the scope of the many legal instruments, but identifies areas of convergence in their
        Note para 124: “Capacity to address environmental, economic and social challenges posed by invasive
alien species is not remotely sufficient. From the legal and institutional perspective, this paper has highlighted
the complexity of existing regimes as well as strengths, gaps and inconsistencies.” And para 126: “The task
facing policy-makers is how to strengthen capacity to protect native biodiversity against invasion impacts
without adding extra complexity or duplicating what already exists.” …
        I am currently looking at working with the FAO Legal Office on an update of global phytosanitary
guidelines, so would it be useful to explore the possibility of addressing the identified inconsistencies in this work?


        BOB IKIN: Stakeholders (users) of this biological information database that enables assessments to be
made on the capacity of species to enter through pathways to become invasive, have varied responsibilities and
backgrounds. Regulatory authorities who use this type of information include environmental scientists and
administrators as well as quarantine authorities (National Plant Protection Organisations) who maintain the
point of entry barriers at which regulatory action is taken. Those involved in making risk assessments also
include research scientists and universities. The aim of any database is to serve its clients, to be inclusive and
promote cooperation between those concerned with conservation and environmental impact and those with the
impact of incursions on agriculture and related fields. In most developing countries the distinction between
these two areas is almost non-existent.

       BOB MORRIS: Traditional database design principles assert that before anything else, you should find
out how your data will be used.
        That tradition is a death trap. YOU CAN NOT KNOW HOW YOUR DATA WILL BE USED and
should design without assumptions.
        Nowadays, that often means being prepared to use ontology mechanisms to map whatever concepts you
built into your database to those that meet the needs of some other audience than the builders. See the NSF
SPIRE and SEEK projects, and respectively. Jim Quinn is
one of the PIs on SPIRE and hopefully there will be mention of its work at the meeting.

        MICHAEL BROWNE: This topic is about the role of the GISIN – what difference will the GISIN make
and how will it do so?
        As Bob says, it will only make a difference in most developing countries if it provides access to data
about species with both agricultural and biodiversity impacts. If we stick to this inclusive principle, it will serve
the ‘data rich’ just as well as ‘data poor’ regions (e.g. how do we get data on species that are only invasive in
central Africa), and it will meet research requirements as well as those of land managers and quarantine
agencies. It will also serve those who have minimal access to the Internet or none at all. IAS are a global
problem. Is the GISIN to be inclusive or exclusive?

        JAMALIENS [SUZANNE DAVIS?]: Consider the question “What are the objectives of a GISIN?” The
answers to that question will greatly determine how inclusive or exclusive the GISIN is.
        Possible answers are:
        1. To provide direct and easy access to global online IAS resources. [I see metadata being instrumental
here]. Dealing with metadata would result in a high level of inclusivity.
        2. To facilitate making existing databases interoperable and accessible via the Internet. The more
technical persons could address this in more detail, but clearly the levels of technology used e.g. Internet
accessibility and communication, types of database software and their compatibilities, etc. would probably limit
the participation of some contributors to the GISIN.


       MINGGUANG LI: Low bandwidth and lack of technical maintenance in some countries might hinder
the accessibility of data from these areas. A mirror set up on a host providing high bandwidth is thus recommended.

         BOB IKIN: To which can be added the internet access costs in developing countries. Having worked for
the last 4 years in Asia, the Pacific, Africa and the Caribbean, I believe it is essential that users also have
access to data in CD-ROM format. CD readers are now widely available, CD production is now simple and
cheap, and by distributing data, say annually, access by internet for updates would not be too difficult or costly.

        LIZ SELLERS: So perhaps we should refocus our initial efforts on encouraging the development of CD-
ROM versions first, followed by online versions, or a “preferred accessibility requirement” of developing a
companion CD-ROM version for any/all online IAS databases. This would fall more in line with a typical Decision
Support System approach, where IAS database CDs can be incorporated into the existing suite of software and
data tools already being used by clients.
        Which leads me to question whether a short list of “preferred accessibility requirements” for IAS
databases might be a good product to develop during the GISIN pre-meeting discussions/experts meeting?
        Another option to add to the list, as Mingguang Li states, is a requirement for providing low- and high-
bandwidth viewing options for online tools – such as those provided for viewing this online community (see
‘Portal Settings’ at the top of your GISIN community page).

       MICHAEL BROWNE: Mingguang Li's recommendation to provide a high-bandwidth mirror will
resolve problems related to his server bandwidth and GISIN users accessing his data.
        Liz said we should provide low and high bandwidth viewing options at GISIN – should contributing
databases be asked to offer low bandwidth access options (minimal graphics, short pages, text only) for users
with low access bandwidth?
        I agree with Bob that annual CD-ROMs (“a companion-CD-ROM version for any/all online IAS
databases”) would be a good way to deliver information where there is no Internet or poor/expensive access.
The question is what content to include in the CD-ROM and how to present it effectively. ISSG is planning to
produce a CD-ROM containing 300 invasive species profiles in late 2004, so we will soon be dealing with these issues.
        Liz has suggested we focus on some GISIN guidelines for accessibility, so let's do that. We can use
Brian Steves' Provider Requirements as a basis for discussion (see NISbase Information for Developers.pdf
under ‘Documents’).

        BADRUL AMIN BHUIYA: Since the discussion started I have had the same problem of low bandwidth,
and until today I could not access the discussions. At home I use a dial-up connection with a 56k modem. This
situation is similar in most places in Bangladesh. As a result, low bandwidth will be a limiting factor for
maintaining participation in the GISIN.
        Now at Chittagong University we are using 128kbps bandwidth through VSAT, and in the office we do
not have this problem.
        So, both low- and high-bandwidth facilities are recommended for the GISIN site.

         BOB MORRIS: Most such users would be well served by optical media PLUS a subscription based
notification system that would tell them when a particular record has been updated or added, along with a
mechanism for updating records of interest.
         CDs are rapidly becoming obsolete in favor of DVDs, which, however, are not fully standardized yet as
to their encoding.
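[Editor's sketch of the subscription-based notification idea: a user whose data lives on optical media subscribes to taxa of interest and fetches only the records changed since their disc was pressed. The update log and dates below are invented for illustration.]

```python
import datetime

# A hypothetical server-side log of record changes.
UPDATES = [
    {"id": "rec-1", "taxon": "Axis axis",
     "modified": datetime.date(2004, 5, 1)},
    {"id": "rec-2", "taxon": "Miconia calvescens",
     "modified": datetime.date(2004, 2, 10)},
]

def changed_since(disc_date, subscribed_taxa, updates=UPDATES):
    """Return record ids a subscriber should fetch to refresh a CD/DVD copy."""
    return [u["id"] for u in updates
            if u["taxon"] in subscribed_taxa and u["modified"] > disc_date]

# A user whose disc was pressed 1 March 2004, subscribed to one taxon.
to_fetch = changed_since(datetime.date(2004, 3, 1), {"Axis axis"})
```

Only the small list of changed-record identifiers crosses the wire, which suits low-bandwidth or intermittent connections.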

        MICHAEL BROWNE: What proportion of potential users of the GISIN, and what proportion of
potential participating databases face problems associated with low bandwidth (or no Internet access at all)? I
found a July 1999 map of global Internet access at Has anyone got
more recent data?

        BOB MORRIS: This map originally comes from Matrix Map Quarterly, which stopped publishing them
in 2001. It shows the number of internet users, and doesn't offer much insight into the bandwidth of enduser computers.
        For Africa, see
        and other stuff at the same site.
        The problems with looking for this stuff on the web are several:
         (1) Most data is about transborder bandwidth, not enduser bandwidth.
        (2) Most enduser bandwidth data is gathered as market research for retail e-commerce, so it is about households.
        (3) Most GISIN clientele are probably investing in bandwidth much faster than households.
        (4) 4G wireless has data bandwidth up to 20Mbps, and many developing countries are leapfrogging,
installing advanced wireless telecommunications faster (per capita) than many developed countries. Even 3G
wireless can go to 2Mbps and is rapidly being deployed at data rates faster than

        MICHAEL BROWNE: Thanks for sharing the African report, Bob. It states that 5 million out of 800
million Africans use the internet and the rate of growth is slowing due to cost. It complements the 1999 Global
Internet Access map at which shows that more than half the planet is in
a similar situation. This map shows the geographic locations of the Internet hardware (networked computers,
known as hosts). The number of hosts is aggregated for major cities and countries and then represented on the
map by the coloured circles. This is not a map of the number of internet users.
        If we knew that most developing countries will soon have advanced wireless telecommunications, and
that providers and users of data in those countries would have reasonably unrestricted and low-cost access to the
Internet, we could spend less time ensuring that the GISIN also caters for the needs of the Internet „have nots‟.
How do we find out if this is the case?
        In simple terms, GISIN clientele are made up of potential providers and users of data. It would be
helpful to understand who these providers and users are and what their chances of participating in, and
benefiting from, the GISIN in the near future are. For example, the telecommunications infrastructure in most of
the Pacific region is such that people working there do not expect significant improvements in the next 5 years.
Some contributors to these discussions on the portal have already described their less-than-optimal access to the
Internet. Perhaps they could offer a prognosis for their regions over the next 5 years.

        BOB IKIN: Having visited most of the Pacific countries in the last four years on assignments to
undertake training in risk analysis I would support the observation by Michael that little is likely to change with
regard to internet access by the likely users of GISIN. Although national data might paint a picture of
availability within specific parameters, those in whom we have interest are likely to be less well served.
        Issues include the effect of local infrastructure on local line delivery (I am not aware of any special
treatment of technical government persons above those of the general public); the excessive cost of access
which is seen as a government revenue earner, and limited access to the service within even official authorities
(limited to senior staff, due to perceived capacity to surf non-work related sites).
        As a mechanism of exchanging technical information, the Internet is not a stable environment for these
countries at the moment, and I have had to rely on other mechanisms.
        Hence my comments earlier on the need to continue to support exchange of information on CDs, with
updates to be provided through Internet access.
        I have had similar but limited experiences with access in countries in Africa, where cost and local
service reliability are limiting factors, particularly in locations remote from the capital and main towns.

        PHILIP THOMAS: My contention is (and always has been) that resources whose audience includes the
underserved population without internet access should be developed in such a way that they ARE accessible in
“internet-free” zones (e.g. via CD). If the product is developed to work well on CD, it seems superfluous and a
waste of valuable resources to spend much (if any) time developing other interfaces that serve the same data.
That time would likely be better spent improving the CD-version interface (which, as has been mentioned
earlier, should simultaneously be available on the web).
        The Pacific Island Ecosystems at Risk project (PIER; (Jim Space, US Forest
Service) seems to be the perfect example of how things “should be done.” His CD and website are identical, and
he has given thought to how to best include appropriate information on the CD. This information is, of course,
also available online. Virtually no modifications are made between the CD and web-based versions. The
information is (to some extent, and soon to a much greater extent) automatically produced based on information
in a database, but since the product is completely HTML (and PDF)-based, no special server side
software/maintenance is required (therefore it works nicely as a CD product). (The raw PIER database will soon
be available online as XML and/or SQL server, so other interfaces can be created based on the data. In fact,
some of the data is already being used via
        However, my point is that such auxiliary interfaces for those with higher-end capabilities should be
viewed as PURELY OPTIONAL, and that the primary interfaces should be designed to work from CDs. If
well-designed, products from this type of approach will (by definition) be as useful as other interfaces which
exclude important audience segments.
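[Editor's sketch of the PIER pattern: render every database record to plain HTML so the identical files serve both the CD and the website, with no server-side software needed. The record fields and the pressing-date notice are illustrative assumptions, not PIER's actual layout.]

```python
# Hypothetical species records pulled from a database.
RECORDS = [
    {"species": "Miconia calvescens",
     "summary": "Invasive tree in the Pacific."},
]

def render_page(record, pressed="2004-07"):
    """Render one record as a static HTML page, stamped with its press date."""
    return (
        "<html><body>"
        f"<h1>{record['species']}</h1>"
        f"<p>{record['summary']}</p>"
        f"<p><em>Snapshot pressed {pressed}; "
        "check the website for updates.</em></p>"
        "</body></html>"
    )

# One file per record; the same files can be burned to CD or served on the web.
pages = {r["species"] + ".html": render_page(r) for r in RECORDS}
```

The embedded pressing date is one cheap way to warn readers of an old disc that their copy may be stale.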

       ROB EMERY: We released our entomology website on CD for farmers without Internet access and it
was very well received; in fact, it was self-propagating, as people burned copies (with our permission) for
neighbours and so on. The cost of CD production was minimal for large quantities.
        One of the problems we had was that many of our webpages which would be of interest to farmers (e.g.
information data sheets) were database-driven and therefore needed to be individually saved as html pages.
Also, I wish we had put some sort of “expiry header” so that people working with old CDs would know that the
information is out of date.
        We also purchased a licence for a CD search tool, the name of which I can't remember, which improved
CD navigation.
        Our plan was to maintain the website as the primary information source and press CDs at regular intervals.

        LIZ SELLERS: The Integrated Taxonomic Information System (ITIS) provides “authoritative
taxonomic information on plants, animals, fungi, and microbes of North America and the world.” Are there
other taxonomic systems out there that should be considered as a chosen taxonomic authority for use in IAS
information systems?
        For reference: In the Davis Declaration, that resulted from the 2001 “Workshop on Development of
Regional Invasive Alien Species Information Hubs, Including Requisite Taxonomic Services, In North America
and Southern Africa”, participants called upon the ITIS, the Global Biodiversity Information Facility (GBIF),
BIONet International and the Global Taxonomy Initiative (GTI) to “make IAS a priority, establish global
standards for IAS taxonomic classification, and improve the availability of accurate IAS taxonomic
information.” Reference the ‘Davis Declaration’ included in this project (Davis Declaration (February 2001).pdf)

        GORDON RODDA: For reptiles outside of North America, I don't find ITIS very complete. However,
the EMBL site ( is not only complete and up to date, but
also well linked to useful sites. It has a large number of corporate and NGO sponsors, including HL and SSAR,
the leading North American scientific societies of relevance. It is unfortunate that there is no single site of
preference for all taxa, but I prefer authority (EMBL is the only definitive site for global reptiles) over convenience.

        BOB MEESE: For vertebrates, the situation is pretty well in hand. As noted, the EMBL database is best
for reptiles.
        For fishes, it would be Eschmeyer's Catalog of Fishes
( or FishBase (nomenclature
supplied by Eschmeyer),
        for amphibians it's AMNH (,
        and for mammals probably SI (
         For birds one could use Zoonomen (, but this lists current
names only, while the others have synonyms. The sites for fishes, amphibians, and reptiles have complete
listings including authors and references, but birds and mammals lack such completeness.
        For invertebrates, the situation is much more difficult, as there are primarily regional lists with narrow
taxonomic and geographic focus. And for plants, there really is nothing approaching a global standard, but IPNI
( is a useful name-checking resource which links several large databases.

        BOB MORRIS: The problem with most such resources is less their coverage---that‟s a matter of time
and slogging---it is whether they are accessible by software. ITIS has a pretty good XML story, which in
particular means it is possible to write applications against it, which does not presently seem to be the case for
most of the other databases mentioned.
        Whenever I see the words “upload” or “download” uttered by end users of data, I know they are getting
poor service, pretty much guaranteed to be out of date as soon as they have finished their *load.
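[Editor's sketch of the difference software accessibility makes: an application consumes a service's XML response directly, with no manual download step to go stale. The response document below is invented for illustration and is not actual ITIS output.]

```python
import xml.etree.ElementTree as ET

# A made-up XML response such as a machine-readable taxonomic service might
# return; the TSN and element names are placeholders, not real ITIS data.
RESPONSE = """<taxonResponse>
  <taxon>
    <tsn>12345</tsn>
    <scientificName>Axis axis</scientificName>
    <kingdom>Animalia</kingdom>
  </taxon>
</taxonResponse>"""

def parse_taxon(xml_text):
    """Extract the fields an application needs from the service response."""
    t = ET.fromstring(xml_text).find("taxon")
    return {"tsn": t.findtext("tsn"), "name": t.findtext("scientificName")}

taxon = parse_taxon(RESPONSE)
```

Because the application queries the live service each time, its answers are never older than the service itself, unlike a one-off *load.
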

        ANGELA SUAREZ-MAYORGA: I agree with the list and with the problems; invertebrates are
difficult. At the Biodiversity Information System in Colombia, we recommend to our users the RBG database
for vascular plants ( and W3MOST for non-vascular plants
(. Because in Colombia we have to deal with many species,
we started to build our own authority files. Soon (four months from now) we will have online taxonomic
authority files for Carabidae and Cicindelidae (Colombian species) plus Formicidae (to the genus level for the
Neotropical region).
        Anyway, maybe the point here is not the completeness of the database but the quality of the information
that the database gives for our purposes. Probably we don't need many names, but THE names (I mean, verified names).
       MICHAEL BROWNE: I too prefer authority over convenience. It is worth stating that we were able to
use ITIS as the taxonomic authority for 87% of the 300+ IAS that will be in the Global Invasive Species
Database by July 2004. Perhaps out of this discussion and our meeting we can encourage the various sources
of taxonomic information to cooperate and give IAS a high priority. We should be able to indicate to them
where the gaps are.

        BOB IKIN: In terms of practicality many countries are already using the datasheets and taxonomic
information that is contained in the Commonwealth Agricultural Bureau International Crop Protection
Compendium (CABI CPC) to make decisions on the invasiveness of a wide range of organisms. Initially
designed for decisions in crop protection, the system has evolved into a dataset that can be used for invasiveness
decisions on plants, pathogens and many vertebrates and invertebrates. It is truly global. Each datasheet,
produced by a technical expert on the organism, contains information on the taxonomy, distribution and biology
of the organism that is fully referenced. Decisions can therefore be made on the likelihood of entry,
establishment and spread for each.
        In making the case for harmonisation of systems I would like to emphasise the need for agreement on
terminology. In the plant protection area, the Glossary of Phytosanitary Terms (produced in five languages by
FAO) has done much to assist with the common understanding of invasive concepts, but this is not the case with
the AIS agreements (which define phrases, not words). As an example, the process of movement of an organism
to a new area is considered to be covered by entry, establishment and spread (introduction is entry plus
establishment). I understand that recently a meeting was held between phytosanitary experts from FAO and
representatives of the CBD to come to a common understanding of terminology, which is essential since the
phytosanitary/biosecurity services of countries are often the only regulatory authority at a point of entry who are
able/permitted to make decisions on the import and export of AIS (pests).

        LIZ SELLERS: For reference: the Food and Agriculture Organization's Glossary of Phytosanitary
Terms may be reviewed online.
        I have also loaded two PDF versions of the FAO's Glossary of Phytosanitary Terms into the Food and
Agriculture Organization (FAO) Folder.

        BOB IKIN: I have uploaded the 2002 version of the Glossary of Phytosanitary Terms to the FAO
documents folder. The glossary is revised every year as new terms and words are incorporated into the
international phytosanitary standards.

        SOETIKNO SASTAROUTOMO: Additional information on the CABI CPC. The 2004 revised version
(available in July) will also include information on "invasive pests of economic and environmental
importance, focusing on alien species affecting agricultural and plantation crops and rangelands". They are
currently busy updating the content for this edition, which will add at least another 150 new full data sheets
on invasive plants, plus around 50 on other new invasive crop pests and some new data for selected existing data
sheets.

       LIZ SELLERS: Many online databases are presented on the Internet using the English language. How
can we provide quality IAS information to customers that speak other languages?

       CHRISTINE CASAL: FishBase is able to provide language translation for some of the fields in the
SpeciesSummary page.
       The language translation was achieved by:
       1. Creating a table of all labels, headers, and notes in the 3 pages with corresponding translation to
          different languages, which is then accessed from a database and displayed on user browsers.
       2. Utilizing the web service offered by Systran. Systran is the engine behind the translation routines in
          Google, AOL, Alta Vista and others. FishBase used Systran to translate the 3 fields: Diagnosis,
          Distribution, and Biology in the SpeciesSummary page.
       For Systran to provide better-quality translations, FishBase is currently doing the following:
       1. Bernd U. (University of Kiel) is building up a dictionary that will later be sent to Systran. The
          Systran engine should then detect if the request comes from FishBase, and should translate it
          according to our dictionary. For example, the word “Order” should not mean “Command or
          Instruction” but rather “Fish Order”.
       2. Rainer Froese (also of the Univ. of Kiel) is now directing the FishBase encoders to write simple,
          complete sentences when entering data for Diagnosis, Distribution and Biology. When achieved,
          Systran will give better translations of these fields.
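The label-translation table described in step 1 of the first list above can be sketched as a simple lookup keyed by (label, language), normally stored in the database and consulted when a page is rendered. A minimal illustration follows; the labels and translations here are made-up examples, not FishBase's actual table.

```python
# Sketch of a label-translation table: one row per (label, language) pair.
# In FishBase this lives in the database; a dict stands in for it here.
# All entries below are illustrative, not FishBase's real data.
LABEL_TRANSLATIONS = {
    ("Distribution", "fr"): "Répartition",
    ("Distribution", "es"): "Distribución",
    ("Biology", "fr"): "Biologie",
}

def translate_label(label, lang, default_lang="en"):
    """Return the stored translation, falling back to the English label."""
    if lang == default_lang:
        return label
    return LABEL_TRANSLATIONS.get((label, lang), label)

print(translate_label("Distribution", "fr"))  # → Répartition
print(translate_label("Impacts", "fr"))       # no entry: falls back to Impacts
```

Falling back to the source-language label when a translation is missing keeps pages usable even for partially translated languages, which matters if GISIN were to "lightly require" multilingual support as discussed below.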

        LIZ SELLERS: You raise an interesting point. I think there are two issues here: 1) context-sensitive
translation of English into other languages (as in your example of the word 'Order') and 2) selection of a basic set
of languages to support.
        How many languages does Systran support with respect to context-sensitive translation (as in 'Order')?
        Did you choose a basic set of languages to translate that way?
        My online research shows that the top 5 spoken languages in the world are (1) Mandarin, (2) English,
(3) Hindustani, (4) Spanish and (5) Portuguese.
        However, a chart of web content by language indicates a ranking of (1) English, (2) Japanese,
(3) German, (4) Chinese and (5) French.
        Should participation in the GISIN 'lightly' require (perhaps the word is 'request') support of a basic set of
spoken languages (assuming support is also provided to help participants meet the requirement)? If yes - then
which ones should we choose?
