Fall 2008 Connections Conference

HSE304: Improving Relevance and the Search Experience Using the API
Erik Mau
Inetium
emau@inetium.com
Objectives
• Understand how to “impact” relevance in MOSS

• Understand how to leverage the API to enhance the
  search user interface
IMPROVING RELEVANCE
Improving Relevance Topics

•   How does MOSS determine relevance?
•   How can farm admins impact relevance?
•   How can site admins impact relevance?
•   MOSS Relevance-related API
•   Demo
How is Relevance Determined in MOSS?
• Anchor Text
  ●   <a href="Benefits.aspx">Benefits Information</a>
• Metadata and Document Properties
• Metadata Extraction
• URL Length and URL Matching
  ●   http://portal/
  ●   http://portal/sites/MarketingTeam/Shared%20Docs/Doc.docx
• File Type Biasing
  ●   Web Pages
  ●   PPT / PPTX
  ●   DOC / DOCX
  ●   XML
  ●   XLS / XLSX
  ●   …
• Click Distance
  ●   Based on Authoritative Pages
How can Admins Impact Relevance OOTB?
• Administrators can…
  ●   Include / Exclude Crawled Properties
  ●   Define Managed Properties
  ●   Define and Configure Scopes
  ●   Control Authoritative Sites and Pages
  ●   Define Keywords / Best Bets
  ●   Manage the Thesaurus
       • Expansions
       • Replacements
Impacting Relevance – The API

• Developers can…
  ●   Adjust Managed Property Weightings
       • Impact the relevance of content with metadata that applies to
         your organization.
       • Set the weight of a ManagedProperty to a float > 0

 // Requires: using Microsoft.Office.Server;
 //           using Microsoft.Office.Server.Search.Administration;
 Schema ssp = new Schema(SearchContext.GetContext(ServerContext.Default));
 ManagedPropertyCollection properties = ssp.AllManagedProperties;
 ManagedProperty prop = properties["MyManagedProperty"];
 prop.Weight = weight; // float value from 0 to float.MaxValue
 prop.Update();        // persist the new weight
Impacting Relevance – The API

• Developers can…
    ●   Adjust Managed Property Weightings
    ●   Adjust Managed Property Length Normalization
         • Impact relevance based on the amount of text contained
           in the property (e.g. a chapter vs. a whole book)
         • Set the LengthNormalization property on a ManagedProperty
           to a value between 0 and 1.
// Requires: using Microsoft.Office.Server;
//           using Microsoft.Office.Server.Search.Administration;
Schema ssp = new Schema(SearchContext.GetContext(ServerContext.Default));
ManagedPropertyCollection properties = ssp.AllManagedProperties;
ManagedProperty prop = properties["MyManagedProperty"];
prop.LengthNormalization = len; // float value from 0 to 1
prop.Update();                  // persist the new setting
Impacting Relevance – The API

• Developers can…
   ●   Adjust Managed Property Weightings
   ●   Adjust Managed Property Length Normalization
   ●   Modify Ranking Parameters
        • Provides developers with the ability to impact file type
          ranking, term frequency, click distance, etc.



 Ranking rank = new
     Ranking(SearchContext.GetContext(ServerContext.Default));
 RankingParameter rp = rank.RankingParameters[rankingParamName];
 rp.Value = value; // float value
 rank.StartRankingUpdate(RankingUpdateType.FullUpdate);
Ranking Parameters
Parameter                Default    Description
k1                       16.404     Saturation constant for term frequency.
kqir                     2.12766    Saturation constant for click distance.
wqir                     36.032     Weight of click distance for calculating relevance.
kud                      9.174312   Saturation constant for URL depth.
wud                      31.468     Weight of URL depth for calculating relevance.
languageprior            0          Weight for ranking applied to content in a language that does not match the language of the user.
filetypepriorhtml        166.983    Weight of HTML content type for calculating relevance.
filetypepriordoc         163.109    Weight of Microsoft Office Word content type for calculating relevance.
filetypepriorppt         163.367    Weight of Microsoft Office PowerPoint content type for calculating relevance.
filetypepriorxls         153.097    Weight of Microsoft Office Excel content type for calculating relevance.
filetypepriorxml         158.943    Weight of XML content type for calculating relevance.
filetypepriortxt         153.051    Weight of plain text content type for calculating relevance.
filetypepriorlistitems   0          Weight of list item content type for calculating relevance.
filetypepriormessage     160.76     Weight of Microsoft Outlook e-mail message content type for calculating relevance.
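
The table pairs each "weight" with a "saturation constant" but does not say how the two combine. As a rough intuition only, the sketch below assumes a BM25-style saturating transform, weight * k / (k + x). This formula, the class name SaturationSketch, and the method Score are assumptions for illustration; the slides do not confirm that MOSS uses this exact form.

```csharp
using System;

// Illustrative only: an ASSUMED saturating transform, not MOSS's documented formula.
public static class SaturationSketch
{
    // Returns weight * k / (k + x): equals weight at x = 0,
    // half the weight at x = k, and flattens out as x grows.
    public static double Score(double weight, double k, double x)
    {
        if (x < 0) throw new ArgumentOutOfRangeException("x");
        if (k <= 0) throw new ArgumentOutOfRangeException("k");
        return weight * k / (k + x);
    }

    // Prints the assumed click-distance contribution for 0..4 clicks,
    // using the click-distance defaults listed in the table.
    public static void PrintClickDistanceCurve()
    {
        const double clickWeight = 36.032;       // default weight of click distance
        const double clickSaturation = 2.12766;  // default saturation constant
        for (int clicks = 0; clicks <= 4; clicks++)
        {
            Console.WriteLine("{0} clicks -> {1:F3}", clicks,
                Score(clickWeight, clickSaturation, clicks));
        }
    }
}
```

Under this assumption, a smaller saturation constant makes the contribution fall off faster with each extra click, while the weight scales the whole curve.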
Relevance API Summary
•   Acquiring Search Context
     ●   SearchContext ctx = SearchContext.GetContext(ServerContext.Default);
     ●   Schema ssp = new Schema(ctx);


•   Working with Managed Properties
     ●   ManagedPropertyCollection properties = ssp.AllManagedProperties;


•   Working with Ranking Parameters
     ●   SearchContext ctx = SearchContext.GetContext(ServerContext.Default);
     ●   Ranking rank = new Ranking(ctx);
     ●   RankingParameter rp = rank.RankingParameters["filetypepriordoc"];
Relevance Demos

• OOTB Relevance Configuration Options

• Extending the Platform for Search
  Administrators

• Trimming Content from the Indexer
IMPROVING THE SEARCH EXPERIENCE
Improving Search Experience Topics

•   OOTB Capabilities
•   Community-based Solutions
•   MOSS Search API
•   Demo
Out-of-the-box Capabilities

• Web Parts
  ●   Keyword Entry
  ●   Paging
  ●   Summary
  ●   Statistics
  ●   Keywords / Best Bets
  ●   High Confidence
  ●   Core Results
Community-based Solutions
•   Wildcard Search
•   Search as You Type
•   Faceted Search
•   SPAdvancedSearch

http://www.codeplex.com/SPAdvancedSearch
http://www.codeplex.com/FacetedSearch
Core Search API

• KeywordQuery
  ●   Allows developers to easily leverage search
      capabilities
  ●   Uses “Keyword” syntax (e.g. Author:Erik +SharePoint)
• FullTextSqlQuery
  ●   Allows developers to fully customize the search
  ●   Uses Enterprise Search SQL Syntax
• Web Services
  ●   Allows developers to use search in applications
      outside of SharePoint
  ●   Can use Keyword or FullText
Query Classes

• Query (base class)
   ●   KeywordQuery
   ●   FullTextSqlQuery
KeywordQuery

• Microsoft.Office.Server.Search.Query.KeywordQuery
• Constructed with an SPSite (or ServerContext)
• Properties of importance
   ●   SelectProperties (columns – SELECT clause)
   ●   QueryText (keywords – WHERE clause)
   ●   SortList (columns – ORDER BY clause)
   ●   StartRow
   ●   RowLimit
   ●   ResultTypes
KeywordQuery

• Returns a ResultTableCollection
   ●   Contains ResultTables (ResultTable implements
       IDataReader)
• Query Syntax
Query Type                    Format                                        Example
Single keyword                [keyword]                                     Finch
Multiple keyword              [keyword1] [keyword2]                         Purple Finch
Keyword inclusion/exclusion   +[keyword to include] -[keyword to exclude]   +finch -purple
Managed Property match        [property]:[value]                            Habitat:fields
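
The syntax rows above can be assembled programmatically. The sketch below is a hypothetical helper (KeywordQueryText.Build is not part of the SharePoint API) that joins included terms, excluded terms, and property filters into a single keyword string. Wrapping values that contain spaces in double quotes is an assumption about phrase handling and should be checked against your farm.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the MOSS API.
public static class KeywordQueryText
{
    // Builds a keyword string such as: +finch -purple Habitat:fields
    public static string Build(IEnumerable<string> include,
                               IEnumerable<string> exclude,
                               IDictionary<string, string> properties)
    {
        List<string> parts = new List<string>();
        foreach (string term in include) parts.Add("+" + Quote(term));
        foreach (string term in exclude) parts.Add("-" + Quote(term));
        foreach (KeyValuePair<string, string> p in properties)
        {
            parts.Add(p.Key + ":" + Quote(p.Value));
        }
        return string.Join(" ", parts.ToArray());
    }

    // Assumption: multi-word values are passed as a quoted phrase.
    private static string Quote(string value)
    {
        return value.IndexOf(' ') >= 0 ? "\"" + value + "\"" : value;
    }
}
```

The returned string would be assigned to the QueryText property of a KeywordQuery.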
KeywordQuery Example
// Requires: using Microsoft.Office.Server.Search.Query;
//           using Microsoft.SharePoint;
KeywordQuery query = new KeywordQuery(SPContext.Current.Site);
query.RowLimit = 10;
query.ResultTypes = ResultType.RelevantResults;
query.SelectProperties.Add("Title"); // select
query.QueryText = "finch"; // where
query.SortList.Add("Rank", SortDirection.Descending); // order by
ResultTableCollection results = query.Execute();
ResultTable relevantResults = results[ResultType.RelevantResults];
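
Because ResultTable implements IDataReader, the rows can be copied into a DataTable for binding or inspection. The sketch below is a minimal example; the class name ResultReader and the table name "RelevantResults" are illustrative choices, and the helper accepts any IDataReader so it can be tried without a SharePoint farm.

```csharp
using System.Data;

// Illustrative helper: works with any IDataReader, including a ResultTable.
public static class ResultReader
{
    // Reads every remaining row from the reader into a new DataTable.
    public static DataTable ToDataTable(IDataReader reader)
    {
        DataTable table = new DataTable("RelevantResults");
        table.Load(reader, LoadOption.OverwriteChanges);
        return table;
    }
}
```

With the example above, this would be called as ResultReader.ToDataTable(relevantResults), and the resulting table could be bound to a grid or repeater control.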
FullTextSqlQuery

•   Microsoft.Office.Server.Search.Query.FullTextSqlQuery
•   Inherits from Query base class (just like KeywordQuery)
•   Constructed with an SPSite (or ServerContext)
•   Properties of importance
    ●   QueryText (SQL Syntax)
         • SELECT … FROM SCOPE() WHERE … ORDER BY …
    ●   StartRow
    ●   RowLimit
    ●   ResultTypes
FullTextSqlQuery Example
// Requires: using Microsoft.Office.Server.Search.Query;
//           using Microsoft.SharePoint;
FullTextSqlQuery query = new FullTextSqlQuery(SPContext.Current.Site);
query.RowLimit = 10;
query.QueryText = "SELECT Rank, Title, Path, Habitat " +
       "FROM SCOPE() " +
       "WHERE FREETEXT(defaultproperties, 'finch') " +
       "ORDER BY Rank DESC";
query.ResultTypes = ResultType.RelevantResults;
ResultTableCollection results = query.Execute();
ResultTable relevantResults = results[ResultType.RelevantResults];
Web Service

• http://server/_vti_bin/Search.asmx
• Methods of Importance
  ●   Query(string queryPacketXml) – Returns Xml String
  ●   QueryEx(string queryPacketXml) – Returns DataSet
  ●   GetSearchMetadata() – Returns DataSet of Scopes and
      Managed Properties


• Tips
  ●   Query Packet Schema:
      http://msdn.microsoft.com/en-us/library/ms563775.aspx
  ●   Can use Keyword or MSSQLFT syntax
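
Query and QueryEx both take the query packet as an XML string. The sketch below builds a minimal packet with LINQ to XML. The element and attribute names (QueryPacket, Query, Context, QueryText, Range, StartAt, Count, and the urn:Microsoft.Search.Query namespace) reflect my reading of the Query Packet schema and should be validated against the schema link above; the class name QueryPacketBuilder is hypothetical, and optional elements defined by the schema are left out.

```csharp
using System.Xml.Linq;

// Hypothetical helper; element names should be verified against the schema.
public static class QueryPacketBuilder
{
    private static readonly XNamespace Ns = "urn:Microsoft.Search.Query";

    // isSql = false sends Keyword syntax (type STRING);
    // isSql = true sends Enterprise Search SQL syntax (type MSSQLFT).
    public static string Build(string queryText, bool isSql, int startAt, int count)
    {
        XElement packet = new XElement(Ns + "QueryPacket",
            new XElement(Ns + "Query",
                new XElement(Ns + "Context",
                    new XElement(Ns + "QueryText",
                        new XAttribute("type", isSql ? "MSSQLFT" : "STRING"),
                        new XAttribute("language", "en-us"),
                        queryText)),   // XElement escapes &, <, > in the text
                new XElement(Ns + "Range",
                    new XElement(Ns + "StartAt", startAt),
                    new XElement(Ns + "Count", count))));
        return packet.ToString(SaveOptions.DisableFormatting);
    }
}
```

Building the packet with XElement instead of string concatenation keeps user-entered text from breaking the XML.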
Search User Interface Demos

• Wildcard Search Results Web Part

• Related Searches Web Part

• Advanced Search Web Part
Recap
• Understand how to impact relevance using OOTB
  features

• Carefully test relevance changes

• Roll your own search user interface using
  KeywordQuery, FullTextSqlQuery, or other aspects of the
  SharePoint API

• Be creative!
Resources
• Search Admin Toolkit:
  http://www.codeplex.com/SPSearchAdminToolkit

• SPAdvancedSearch: http://www.codeplex.com/SPAdvancedSearch

• SPSearchBench: Query Testing
  http://www.codeplex.com/SPSearchBench

• Faceted Search: http://www.codeplex.com/FacetedSearch

• SQL Search Syntax:
  http://msdn.microsoft.com/en-us/library/ms519321.aspx

• Evaluating and Customizing Relevance:
  http://msdn.microsoft.com/en-us/library/bb499682.aspx
Your Feedback is Important

Please fill out a session evaluation form and either put it in the
basket near the exit or drop it off at the conference registration desk.

Thank you!

								