Implementing Search Spelling Suggestions using the Google Web Services API

Dave Costakos
Software Developer, Systems Engineering Division
May 2nd, 2002

Agenda
• Overview of the Google Web Service API
• Overview of SOAP and related technologies
• Description of Google’s Web Service API
• Description of how Google’s Spelling Suggestion service was integrated
• Explanation of how it all works
• Example integration walkthrough

Google’s Web Service API
• A beta program that enables developers to access Google services via SOAP.
• Allows clients to connect to Google and use its search, cached page and spelling suggestion software from inside the client application.
• Available for non-commercial use from: http://www.google.com/apis/
• Can be used commercially with written permission from Google.
• Limited to 1,000 accesses per day. More accesses may be allowed at a later date for a commercial fee.
• Provides a client API in Java and .NET, but any language can be used to access the services.

Web Services Overview
• A Web Service is a piece of business logic, located somewhere on the Internet, that is accessible through a standardized XML messaging system.
• Because Web Services use XML, they are not tied to any specific platform or operating system.
• Very similar to how the library community standardized and shares information via the Z39.50 protocol.
• The main differences between Z39.50 and Web Services are cross-industry support and the use of XML.

SOAP Overview
• Simple Object Access Protocol (SOAP)
• Provides a standard packaging structure for transporting XML messages over standard protocols such as HTTP, SMTP or FTP.
• This standard transport mechanism allows heterogeneous clients and servers to interoperate.
• The cornerstone protocol of Web Services.

SOAP Overview: Example Envelope

<?xml version='1.0' encoding='UTF-8'?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:doSpellingSuggestion
        xmlns:ns1="urn:GoogleSearch"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <key xsi:type="xsd:string">00000000000000000000000000000000</key>
      <phrase xsi:type="xsd:string">britney speers</phrase>
    </ns1:doSpellingSuggestion>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
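
The envelope above can also be assembled in code. The following is a minimal sketch, not the code shipped with the enhancement: the class name SpellingEnvelope and its methods are made up for illustration, and only the element names, namespaces and sample values come from the slide. It escapes the user's phrase before embedding it, then parses the result back to check that the envelope is well-formed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Sketch (hypothetical class): builds the doSpellingSuggestion envelope as a string. */
class SpellingEnvelope {

    /** Escapes characters that may not appear verbatim in XML element text. */
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Returns the request envelope for the given license key and search phrase. */
    static String build(String key, String phrase) {
        return "<?xml version='1.0' encoding='UTF-8'?>\n"
            + "<SOAP-ENV:Envelope"
            + " xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\""
            + " xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">\n"
            + "  <SOAP-ENV:Body>\n"
            + "    <ns1:doSpellingSuggestion xmlns:ns1=\"urn:GoogleSearch\""
            + " SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">\n"
            + "      <key xsi:type=\"xsd:string\">" + escapeXml(key) + "</key>\n"
            + "      <phrase xsi:type=\"xsd:string\">" + escapeXml(phrase) + "</phrase>\n"
            + "    </ns1:doSpellingSuggestion>\n"
            + "  </SOAP-ENV:Body>\n"
            + "</SOAP-ENV:Envelope>\n";
    }

    /** Parses an envelope and returns the text of its phrase element. */
    static String phraseOf(String envelope) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(envelope.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("phrase").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("envelope could not be read", e);
        }
    }

    public static void main(String[] args) {
        String envelope = build("00000000000000000000000000000000", "britney speers");
        System.out.println(envelope);
        System.out.println("phrase element: " + phraseOf(envelope));
    }
}
```

In practice a SOAP toolkit generates this XML for you; building it by hand is mainly useful for seeing what travels over the wire.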

Web Services Description Language
• WSDL (Web Services Description Language) – describes the interface to a web service in a standardized way.
• Tools are available that take a WSDL file and turn it into a Web Service client, or take Web Service code and turn it into a WSDL file.
• Google’s WSDL: http://api.google.com/GoogleSearch.wsdl
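
To give a feel for what such tools read, here is a small sketch that lists the operations declared in a WSDL document using the XML parser in the Java standard library. The class name WsdlOperations is hypothetical, and the embedded WSDL is a hand-written, abbreviated fragment for illustration only, not the contents of Google's file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch (hypothetical class): lists the operation names in a WSDL 1.1 document. */
class WsdlOperations {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    // Abbreviated, illustrative fragment; a real WSDL file also has
    // types, messages, bindings and a service section.
    static final String SAMPLE =
        "<definitions name=\"GoogleSearch\" xmlns=\"" + WSDL_NS + "\">"
        + "<portType name=\"GoogleSearchPort\">"
        + "<operation name=\"doGetCachedPage\"/>"
        + "<operation name=\"doSpellingSuggestion\"/>"
        + "<operation name=\"doGoogleSearch\"/>"
        + "</portType>"
        + "</definitions>";

    /** Returns the names of all operations declared under portType elements. */
    static List<String> operationNames(String wsdl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(wsdl.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            // Only look inside portType: binding sections repeat the operations.
            NodeList portTypes = doc.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                Element portType = (Element) portTypes.item(i);
                NodeList ops = portType.getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < ops.getLength(); j++) {
                    names.add(((Element) ops.item(j)).getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable WSDL document", e);
        }
    }

    public static void main(String[] args) {
        for (String name : operationNames(SAMPLE)) {
            System.out.println(name);
        }
    }
}
```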

Universal Description, Discovery, Integration: UDDI
• UDDI (Universal Description, Discovery and Integration) – provides a registry in which web services can be advertised, discovered and integrated.
• WSDL files can be published in UDDI registries, making them available to the general public for discovery: a shopping mall for computers.
• ebXML is on the horizon.

Web Service Tools
• Most programming languages now provide tools to make using and programming Web Services simple and easy.
• Some SOAP tools:
  • IBM SOAP4J / Web Services Toolkit
  • Sun JWSDP (Java Web Services Developer Pack)
  • Apache SOAP / Axis
  • SOAP::Lite (Perl)
  • Microsoft SOAP Toolkit (.NET)

Web Service Work Flow
[Diagram: the Web Service Server publishes its WSDL to a UDDI Registry; the Web Service Client looks up the service in the registry, then sends a SOAP Request to the server and receives a SOAP Response.]

Integrating the Google API with SiteSearch
Requirements
• A license key obtained from Google
• Conformance to the Google license agreement
  • Limited to 1,000 accesses per day
• The googleapi.jar file
• Java 2 (1.2.2 or later)
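
Because the license allows only 1,000 accesses per day, a client may want to count its own calls before contacting Google. The sketch below shows one possible guard. The class DailyQuota is hypothetical and is not part of the enhancement, and it assumes the allowance resets at midnight on the supplied clock, which is a guess about how the limit is applied.

```java
import java.time.Clock;
import java.time.LocalDate;

/** Sketch (hypothetical class): counts accesses against a per-day limit. */
class DailyQuota {
    private final int limit;
    private final Clock clock;
    private LocalDate day;
    private int used;

    DailyQuota(int limit, Clock clock) {
        this.limit = limit;
        this.clock = clock;
        this.day = LocalDate.now(clock);
    }

    /** Starts a fresh count when the calendar day has changed (assumed reset rule). */
    private void rollOverIfNewDay() {
        LocalDate today = LocalDate.now(clock);
        if (!today.equals(day)) {
            day = today;
            used = 0;
        }
    }

    /** Counts one access and returns true, or returns false if today's limit is used up. */
    synchronized boolean tryAcquire() {
        rollOverIfNewDay();
        if (used >= limit) {
            return false;
        }
        used++;
        return true;
    }

    /** Number of accesses still available today. */
    synchronized int remaining() {
        rollOverIfNewDay();
        return limit - used;
    }

    public static void main(String[] args) {
        DailyQuota quota = new DailyQuota(1000, Clock.systemDefaultZone());
        if (quota.tryAcquire()) {
            System.out.println("call allowed, remaining today: " + quota.remaining());
        }
    }
}
```

Passing the clock in, rather than reading the system time directly, keeps the day rollover easy to test.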

Integrating the Google API with SiteSearch
Files Touched
• Java files: GoogleSpellingSuggestion.java, QUERY.java, ZServer.java
• Configuration: ini/servers/ZBase.ini (ZBase_rb.ini if you have record builder)
• HTML: resultsnav.html, nfsort.html, nfbrief.html, nffull.html, nfrefine.html
• Scripts: ssmgr.HOSTNAME file (added the googleapi.jar file to CLASSPATH)
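
The license key is read from the server configuration file. The layout of ZBase.ini is not shown in these slides, so the sketch below assumes a plain name = value line and a made-up entry name, GoogleLicenseKey. It uses java.util.Properties from the standard library and is not the ZServer code itself.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch (hypothetical class): reads a license key from name = value configuration text. */
class LicenseKeyConfig {
    // Hypothetical entry name; the real ZBase.ini entry is not given in the slides.
    static final String ENTRY = "GoogleLicenseKey";

    /** Returns the trimmed key, or throws if the entry is missing or empty. */
    static String loadKey(Reader config) {
        Properties props = new Properties();
        try {
            props.load(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = props.getProperty(ENTRY);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("missing configuration entry: " + ENTRY);
        }
        return key.trim();
    }

    public static void main(String[] args) {
        // Placeholder key, same dummy value as in the example envelope.
        String sample = ENTRY + " = 00000000000000000000000000000000\n";
        System.out.println("key length: " + loadKey(new StringReader(sample)).length());
    }
}
```

Failing at startup when the key is absent is a design choice in this sketch: it surfaces a configuration mistake before the first search rather than on the first SOAP call.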

Integrating the Google API with SiteSearch
How it Works
• GoogleSpellingSuggestion: handles the details of obtaining the spelling suggestions from Google and caching results (we have a limited number of accesses available)
• QUERY: takes the client query, extracts the entered term, obtains the suggestion and loads it into the user data for display
• ZServer: loads the key into memory from the configuration file
• HTML: displays the search suggestion stored in user data
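
The caching idea described above can be sketched as follows. This is not the GoogleSpellingSuggestion source: the class CachedSuggestions and the SuggestionSource interface are hypothetical, with the interface standing in for the SOAP call made through googleapi.jar. The sketch also assumes that suggestions do not depend on letter case or surrounding whitespace, and it uses a small least-recently-used cache so memory stays bounded.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch (hypothetical class): remembers suggestions so repeated phrases cost no access. */
class CachedSuggestions {

    /** Hypothetical stand-in for the remote call to Google. */
    interface SuggestionSource {
        String suggest(String phrase);
    }

    private final SuggestionSource source;
    private final Map<String, String> cache;

    CachedSuggestions(SuggestionSource source, final int maxEntries) {
        this.source = source;
        // Access-ordered map that evicts the least recently used entry when full.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached suggestion, asking the source only on a cache miss. */
    synchronized String suggest(String phrase) {
        String trimmed = phrase.trim();
        // Assumption: case and surrounding whitespace do not change the suggestion.
        String cacheKey = trimmed.toLowerCase(Locale.ROOT);
        if (cache.containsKey(cacheKey)) {
            return cache.get(cacheKey);
        }
        String suggestion = source.suggest(trimmed);
        // A null result ("no suggestion") is cached too, so it is not asked for again.
        cache.put(cacheKey, suggestion);
        return suggestion;
    }

    public static void main(String[] args) {
        // Fake source for demonstration; a real one would call the web service.
        CachedSuggestions cached = new CachedSuggestions(phrase -> {
            System.out.println("remote call for: " + phrase);
            return "britney spears";
        }, 100);
        System.out.println(cached.suggest("britney speers"));
        System.out.println(cached.suggest("Britney Speers"));
    }
}
```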

Integrating the Google API with SiteSearch
[Diagram, built up over several slides. Components: User, WebZ, JaSSI, ZBase and Google. Labeled arrows: HTML, Search, Z39.50 Request, Spelling Suggestion, SOAP Request and SOAP Response.]

Example Screen Implementation

Where do I get this Enhancement?
The SiteSearch Open Source Server: http://www.sitesearch.oclc.org/projects/spelling/

Possible Improvements?
• Caching: The GoogleSpellingSuggestion object maintains an internal cache of suggestions. However, the cache was never performance tested and it is unclear how much good it will do in a production environment.
• API Usage: The GoogleSpellingSuggestion object uses the standard Google classes provided to make SOAP calls. Users may be able to improve upon these APIs in terms of performance.
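
One inexpensive way to find out how much the cache helps in production is to count hits and misses. The sketch below is an illustration only: the class CacheStats is hypothetical and nothing like it is described as part of GoogleSpellingSuggestion. Each hit corresponds to one access to Google that was avoided.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch (hypothetical class): thread-safe hit and miss counters for a cache. */
class CacheStats {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    /** Call when a suggestion was served from the cache. */
    void recordHit() {
        hits.incrementAndGet();
    }

    /** Call when the suggestion had to be fetched remotely. */
    void recordMiss() {
        misses.incrementAndGet();
    }

    /** Number of remote accesses avoided so far. */
    long accessesSaved() {
        return hits.get();
    }

    /** Fraction of lookups answered from the cache, or 0.0 before any lookup. */
    double hitRate() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }

    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        stats.recordMiss();
        stats.recordHit();
        System.out.println("hit rate so far: " + stats.hitRate());
    }
}
```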

Where to Get Software
Apache SOAP: http://xml.apache.org/soap/
Apache Axis: http://xml.apache.org/axis/
JWSDP (Java Web Services Developer Pack): http://java.sun.com/webservices/webservicespack.html
SOAP::Lite for Perl: http://www.soaplite.com/
IBM WSTK (Web Services Toolkit): http://www.alphaworks.ibm.com/tech/webservicestoolkit
PHP Soap Toolkit: http://sourceforge.net/projects/phpxmlp/
MS SOAP: Buy it from Microsoft

Questions?


				