Implementing Search Spelling Suggestions using the Google Web Services API

Dave Costakos
Software Developer, Systems Engineering Division
May 2nd, 2002

Agenda
• Overview of the Google Web Service API
• Overview of SOAP and related technologies
• Description of Google’s Web Service API
• Description of how Google’s Spelling Suggestion service was integrated
• Explanation of how it all works
• Example integration walkthrough

Google’s Web Service API
• A BETA web program that enables developers to access Google services via SOAP.
• Allows clients to connect to Google and use its search, cached page and spelling suggestion services inside the client application.
• Available for non-commercial use from: http://www.google.com/apis/
• It can be used commercially with written permission from Google.
• Limited to 1,000 accesses per day. Google may allow more accesses at a later date for a commercial fee.
• Provides a client API in Java and .NET, but any language can be used to access the services.

Web Services Overview
• A Web Service is a piece of business logic located somewhere on the internet and accessible through a standardized XML messaging system.
• Because Web Services use XML, they are not tied to any specific platform or operating system.
• Very similar to how the library community standardized on, and shares information via, the Z39.50 protocol.
• The main differences between Z39.50 and Web Services are cross-industry support and the use of XML.

SOAP Overview
• Simple Object Access Protocol (SOAP)
• Provides a standard packaging structure for transporting XML messages over many standard protocols such as HTTP, SMTP or FTP.
• This standard transport mechanism allows heterogeneous clients and servers to interoperate.
• The cornerstone protocol of Web Services.

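To make the packaging idea concrete, here is a minimal sketch that builds a SOAP 1.1 envelope as a string and confirms it is well-formed XML, using only the Java standard library. The class `EnvelopeSketch` and its methods are hypothetical names made up for illustration; the operation `doSpellingSuggestion`, the `urn:GoogleSearch` namespace and the `key` and `phrase` parameters are taken from the example envelope in this presentation. The sketch does not contact Google, and it is not how the Google client library builds its messages.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: builds and re-reads a spelling suggestion envelope.
class EnvelopeSketch {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Escape the characters that are special inside XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap the two parameters in Envelope > Body > doSpellingSuggestion.
    static String buildEnvelope(String key, String phrase) {
        return "<?xml version='1.0' encoding='UTF-8'?>"
            + "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\""
            + " xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\""
            + " xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">"
            + "<SOAP-ENV:Body>"
            + "<ns1:doSpellingSuggestion xmlns:ns1=\"urn:GoogleSearch\""
            + " SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">"
            + "<key xsi:type=\"xsd:string\">" + escape(key) + "</key>"
            + "<phrase xsi:type=\"xsd:string\">" + escape(phrase) + "</phrase>"
            + "</ns1:doSpellingSuggestion>"
            + "</SOAP-ENV:Body>"
            + "</SOAP-ENV:Envelope>";
    }

    // Parse the envelope (this fails if it is not well-formed) and return the phrase.
    static String readPhrase(String envelope) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(envelope)));
            NodeList nodes = doc.getElementsByTagName("phrase");
            return nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Envelope is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String envelope = buildEnvelope("00000000000000000000000000000000", "britney speers");
        System.out.println(readPhrase(envelope)); // prints "britney speers"
    }
}
```

In practice a SOAP toolkit generates this envelope for you; building it by hand is mainly useful for seeing what travels over the wire.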
SOAP Overview: Example Envelope

<?xml version='1.0' encoding='UTF-8'?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
                   xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
                   xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:doSpellingSuggestion xmlns:ns1="urn:GoogleSearch"
                              SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <key xsi:type="xsd:string">00000000000000000000000000000000</key>
      <phrase xsi:type="xsd:string">britney speers</phrase>
    </ns1:doSpellingSuggestion>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Web Services Description Language
• WSDL (Web Services Description Language) describes the interface to a web service in a standardized way.
• Tools are available that take a WSDL file and turn it into a Web Service client, or take Web Service code and turn it into a WSDL file.
• Google’s WSDL: http://api.google.com/GoogleSearch.wsdl

Universal Description, Discovery, Integration: UDDI
• UDDI (Universal Description, Discovery and Integration) provides a registry where web services can be advertised, discovered and integrated.
• WSDL files can be published in UDDI registries, making them available to the general public for discovery.
• A shopping mall for computers
• ebXML is on the horizon

Web Service Tools
• Most programming languages now provide tools that make using and programming Web Services simple.
• Some SOAP tools:
  • IBM SOAP4J / Web Services Toolkit
  • Sun JWSDP (Java Web Services Developer Pack)
  • Apache SOAP / Axis
  • SOAP::Lite (Perl)
  • Microsoft SOAP Toolkit (.NET)

Web Service Work Flow
[Diagram. Components: Web Service Client, Web Service Server, UDDI Registry. Arrows: Publish WSDL, Lookup Service, SOAP Request, SOAP Response.]

Integrating the Google API with SiteSearch: Requirements
• A license key obtained from Google
• Conformance to the Google license agreement
• Limited to 1,000 accesses per day
• The googleapi.jar file
• Java 2 (version 1.2.2 or later)

Integrating the Google API with SiteSearch: Files Touched
• Java files: GoogleSpellingSuggestion.java, QUERY.java, ZServer.java
• ini/servers/ZBase.ini (ZBase_rb.ini if you have record builder)
• HTML: resultsnav.html, nfsort.html, nfbrief.html, nffull.html, nfrefine.html
• Scripts: ssmgr.HOSTNAME file (added the googleapi.jar file to CLASSPATH)

Integrating the Google API with SiteSearch: How it Works
• GoogleSpellingSuggestion handles the details of obtaining spelling suggestions from Google and caches the results, because only a limited number of accesses are available.
• QUERY takes the client query, extracts the entered term, obtains the suggestion and loads it into the user data for display.
• ZServer loads the license key into memory from the configuration file.
• HTML displays the search suggestion stored in the user data.

[Diagram, built up over several slides. Components: User, WebZ, JaSSI, ZBase, Google. Arrows: Search, Z39.50 Request, Search, SOAP Request, SOAP Response, Spelling Suggestion, HTML.]

Example Screen

Implementation: Where do I get this Enhancement?
• The SiteSearch Open Source Server: http://www.sitesearch.oclc.org/projects/spelling/

Possible Improvements?
• Caching: The GoogleSpellingSuggestion object maintains an internal cache of suggestions. However, the cache was never performance tested, and it is unclear how much it will help in a production environment.
• API Usage: The GoogleSpellingSuggestion object uses the standard Google classes provided to make SOAP calls. Users may be able to improve on the performance of these APIs.

Where to Get Software
• Apache SOAP: http://xml.apache.org/soap/
• Apache Axis: http://xml.apache.org/axis/
• JWSDP (Java Web Services Developer Pack): http://java.sun.com/webservices/webservicespack.html
• SOAP::Lite for Perl: http://www.soaplite.com/
• IBM WSTK (Web Services Toolkit): http://www.alphaworks.ibm.com/tech/webservicestoolkit
• PHP Soap Toolkit: http://sourceforge.net/projects/phpxmlp/
• MS SOAP: Buy it from Microsoft

Questions?
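As a follow-up to the caching point under Possible Improvements, here is a minimal sketch of one way a bounded suggestion cache could work, so that repeated queries do not use up the 1,000 daily accesses. This is an assumption made for illustration, not the design of the actual GoogleSpellingSuggestion class, whose internals are not shown in these slides. The class `SuggestionCacheSketch`, its methods and the `lookup` function (a stand-in for the real SOAP call) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical least-recently-used cache placed in front of a remote lookup.
class SuggestionCacheSketch {
    private final Function<String, String> lookup; // stand-in for the SOAP call
    private final Map<String, String> cache;
    private int remoteCalls = 0;

    SuggestionCacheSketch(final int maxEntries, Function<String, String> lookup) {
        this.lookup = lookup;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries; // evict once the cache is over capacity
            }
        };
    }

    // Return the cached suggestion, or ask the remote service once and remember it.
    synchronized String suggest(String phrase) {
        // Normalize so "Britney Speers" and "britney speers " share one entry.
        String key = phrase.trim().toLowerCase(Locale.ROOT);
        if (cache.containsKey(key)) {
            return cache.get(key);
        }
        remoteCalls++;
        String suggestion = lookup.apply(key); // null means "no suggestion"
        cache.put(key, suggestion);            // null results are cached as well
        return suggestion;
    }

    synchronized int remoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) {
        SuggestionCacheSketch cache = new SuggestionCacheSketch(100,
            q -> q.equals("britney speers") ? "britney spears" : null);
        System.out.println(cache.suggest("Britney Speers")); // remote lookup
        System.out.println(cache.suggest("britney speers")); // served from cache
        System.out.println(cache.remoteCalls());             // prints 1
    }
}
```

Caching "no suggestion" results matters as much as caching hits, since correctly spelled queries are probably the common case and would otherwise use an access every time. Whether a cache like this helps in production would still need the performance testing noted above.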