Technical Overview of FAST Search Server 2010 for SharePoint

Sezai Komur
SharePoint Solutions Architect
CSG
What is FAST Search Server 2010 for SharePoint?
• Microsoft bought FAST Search and Transfer in 2008 for US$1.2 billion.
• A port of FAST ESP, integrated with SharePoint 2010.
• FS4SP is a new, enhanced search engine integrated with SharePoint Server 2010.
• FAST product family: FS4SP, FSIS, FSIA, FIS-E
How is FAST Search better than SharePoint Server Search?
• Better search result quality
   – Better relevancy, with linguistics, stemming and lemmatisation used in search processing
• Extreme scale search
   – Search billions of documents with sub-second search times
• Search platform extensibility
• Advanced content processing
   – Property extraction
   – Document processing pipeline extensibility
• Deep refinement
   – Exact refiner counts over the entire result set; SharePoint Server Search refiners are not computed over the entire result set
• Advanced Filter Pack
   – Indexes 200+ file formats without the need to purchase numerous IFilters
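To illustrate the deep refinement point, here is a minimal Python sketch that contrasts refiner counts taken over an entire result set with counts taken over only the first few results. The sample data, the `sample_size` cut-off and the function names are illustrative assumptions, not FS4SP APIs.

```python
from collections import Counter

def deep_refiner_counts(results, prop):
    """Count values of `prop` across every result (exact counts)."""
    return Counter(r[prop] for r in results if prop in r)

def shallow_refiner_counts(results, prop, sample_size):
    """Count values of `prop` across only the first `sample_size` results."""
    return Counter(r[prop] for r in results[:sample_size] if prop in r)

# Hypothetical result set, already ordered by relevance.
results = (
    [{"title": f"Spec {i}", "filetype": "docx"} for i in range(3)]
    + [{"title": f"Deck {i}", "filetype": "pptx"} for i in range(2)]
    + [{"title": f"Scan {i}", "filetype": "pdf"} for i in range(4)]
)

deep = deep_refiner_counts(results, "filetype")
shallow = shallow_refiner_counts(results, "filetype", sample_size=4)

print(dict(deep))     # every file type, with exact counts
print(dict(shallow))  # only what appears in the first 4 results
```

In this sample the shallow counts never mention `pdf`, even though it is the most common file type in the full result set.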
How is FAST Search better than SharePoint Server Search? (continued)
• Advanced sorting
   – Sort on managed properties and rank profiles
• Tuneable relevance with multiple rank profiles
• FQL query language
• Contextual search
   – Tailor results and refinement to user profile or audience
• Rich web indexing
   – Dynamic web content and JavaScript; highly customisable connector
• Similar results detection and results collapsing
• Thumbnails and previews
   – SharePoint 2010 Word and PowerPoint results via Office Web Apps
• Visual Best Bets
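As a rough illustration of the FQL bullet, the sketch below composes FQL-style query strings in Python. It assumes FQL's `and(...)`, `or(...)` and `string("...", mode="...")` operators and the `property:expression` scoping form; the helper names and the backslash escaping are illustrative assumptions, and the output should be checked against the FQL reference before use.

```python
def fql_string(text, mode="and"):
    """Render an FQL string() operator with the given match mode."""
    # Assumption: embedded quotes and backslashes are backslash-escaped.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'string("{escaped}", mode="{mode}")'

def fql_scoped(prop, expr):
    """Scope an expression to a managed property, e.g. title:string(...)."""
    return f"{prop}:{expr}"

def fql_and(*exprs):
    """Combine expressions so that all of them must match."""
    return "and(" + ", ".join(exprs) + ")"

def fql_or(*exprs):
    """Combine expressions so that any of them may match."""
    return "or(" + ", ".join(exprs) + ")"

query = fql_and(
    fql_string("search server", mode="phrase"),
    fql_or(
        fql_scoped("title", fql_string("overview")),
        fql_scoped("title", fql_string("architecture")),
    ),
)
print(query)
```

Building the query from small functions keeps nesting explicit, which is the main practical difference between FQL and a free-text keyword query.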
FAST Search on the Internet
FAST Search for Internet Sites provides a search solution no matter what technology or platform is used to build a website.
• http://www.seek.com.au
• http://www.realestate.com.au
• http://www.domain.com.au
• http://www.smh.com.au
• http://www.drive.com.au
• http://www.nab.com.au
[Diagram: FS4SP logical architecture]
SharePoint Farm
• Web Server: Web Parts, Query Web Service, Federation Object Model
• Custom front-end
• Query SSA (Search Service Application): FAST Search Query, People Search (query/crawl)
• Content SSA (FAST Search Connector): SharePoint, BDC, Exchange, Web
• External federation sources
• User Profiles
Administration interfaces
• Site Collection Admin UI: Deployment, User Context Management, Promotion/Demotion
• PowerShell: Schema configuration, Admin configuration, Deployment configuration
• Central Administration UI: Property mapping, Property extraction, Spell-checking
FAST Search Server 2010 for SharePoint Farm
• Administration
• Query Processing, Query Matching
• Item Processing, Indexing, Web Link Analysis
• FAST Search Authorization (FSA)
• FAST Indexing Connectors
• Monitoring
External systems
• Content sources
• Active Directory
• Microsoft System Center Operations Manager
FAST Search Service Applications
Two SharePoint service applications communicate with the FAST servers:
• FAST Content Search Service Application
  – Connector configuration
  – Crawling
• FAST Query Search Service Application
  – Queries and results from associated Web Applications
  – Managed property mapping configuration
  – People Search
FAST Web Services
Services that SharePoint communicates with:
• Content Distributors
• Query Service
• Administration Service
• Resource Store
• Log Server
• See Install_Info.txt in the FAST install folder and look in IIS
Simple Conceptual Architecture
Topology Diagram
FAST Search Sizing
• Use at least one dedicated FAST server for production.
• Physical is better than virtual.
• Good disk I/O is important.
• 1 x SharePoint + 1 x SQL + 1 x FS4SP server is an ‘extra small’ deployment.
• Estimate the number and size of items crawled to work out the disk space required.
• See the Capacity Planning white paper:
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=65b799e3-825c-4398-8cd7-3311d3297997
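The disk space estimate can be sketched as simple arithmetic in Python. The `index_ratio` and `headroom` parameters, and every number below, are placeholder assumptions chosen for illustration; real ratios should come from the Capacity Planning white paper and from measurements on your own content.

```python
def estimate_index_disk_gb(item_count, avg_item_size_kb, index_ratio, headroom=1.0):
    """Estimate index disk space in GB.

    item_count       -- number of items expected to be crawled
    avg_item_size_kb -- average size of one item, in KB
    index_ratio      -- ASSUMED index size as a fraction of raw content size
    headroom         -- ASSUMED multiplier for growth and temporary files
    """
    raw_content_gb = item_count * avg_item_size_kb / (1024 * 1024)
    return raw_content_gb * index_ratio * headroom

# Hypothetical corpus: about one million items averaging 100 KB each.
estimate = estimate_index_disk_gb(
    item_count=1_048_576,
    avg_item_size_kb=100,
    index_ratio=0.5,   # placeholder, not a published figure
    headroom=1.5,      # placeholder, not a published figure
)
print(f"Estimated index disk space: {estimate:.1f} GB")
```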
Medium Farm
[Diagram: medium farm topology]
SP2010 Farm
• WFE (×2), each hosting a Query SSA
SP2010 Services Farm
• Crawl servers (×2), each running SP Crawl and People Crawl
SQL 2008 Cluster
• Crawl DB (×2)
• Search Admin DB
FAST Search for SharePoint 2010 Farm
• FAST-ADM-1: Admin, Content Distributor 1, Indexing Dispatcher 1, Web Analyzer
• FAST-ADM-2: Content Distributor 2, Indexing Dispatcher 2, Web Analyzer, 6 Docprocs+, (Enterprise Crawler)
• FAST-FSTIDX-11, FAST-FSTIDX-12, FAST-FSTIDX-13: Index (Search), 4 Docprocs+
• FAST-FSTIDX-21, FAST-FSTIDX-22, FAST-FSTIDX-23: (Index) Search, QR Server
Search Engine Basics
• Crawling
  – Gathering content to store in an index
• Indexing
  – Storing content in an index optimised for searching
• Querying
  – Users execute searches against the index
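The three steps above can be sketched in a few lines of Python. This is a toy inverted index over an in-memory dictionary, not how FAST Search is implemented; the sample documents and function names are made up for illustration.

```python
from collections import defaultdict

def crawl(source):
    """Crawling: gather (item_id, text) pairs from a content source."""
    return list(source.items())

def build_index(items):
    """Indexing: map each lower-cased term to the ids of items containing it."""
    index = defaultdict(set)
    for item_id, text in items:
        for term in text.lower().split():
            index[term].add(item_id)
    return index

def query(index, terms):
    """Querying: return the ids of items that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms.split()]
    return set.intersection(*postings) if postings else set()

# Hypothetical content source.
source = {
    "doc1": "FAST Search Server overview",
    "doc2": "SharePoint Server Search",
    "doc3": "fast search sizing",
}

index = build_index(crawl(source))
print(sorted(query(index, "fast search")))
```

The index is built once, so each query only intersects small sets instead of rescanning every document.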
Crawling
• Connecting to sources of content to download files and data for processing
• Downloading documents or files (items)
• Working through URLs
  – List or directory of items to crawl
  – Following links to other items
• Extracting information from files
  – Converting file formats to text for processing
  – Identifying properties or fields of information
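"Working through URLs" and "following links" can be sketched as a breadth-first traversal in Python. The in-memory `SITE` dictionary stands in for real HTTP fetching, and the URLs and the `fetch` callback are hypothetical; a production crawler would also handle politeness, errors and incremental crawls.

```python
from collections import deque

def crawl_links(start_urls, fetch):
    """Visit each reachable URL once, in breadth-first order.

    `fetch(url)` returns a dict with "text" and "links", or None if the
    item cannot be downloaded.
    """
    seen, queue = set(), deque()
    for url in start_urls:
        if url not in seen:
            seen.add(url)
            queue.append(url)

    items = []
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # skip items that could not be downloaded
        items.append((url, page["text"]))
        for link in page["links"]:
            if link not in seen:  # never queue the same URL twice
                seen.add(link)
                queue.append(link)
    return items

# Hypothetical site used in place of real network access.
SITE = {
    "http://example.test/": {"text": "Home", "links": ["http://example.test/a", "http://example.test/b"]},
    "http://example.test/a": {"text": "Page A", "links": ["http://example.test/b", "http://example.test/"]},
    "http://example.test/b": {"text": "Page B", "links": ["http://example.test/missing"]},
}

crawled = crawl_links(["http://example.test/"], SITE.get)
for url, text in crawled:
    print(url, "->", text)
```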
DEMO
• FAST Search Service Applications
• FAST System Directory
• FAST Web Services
• Connectors & Crawling
Processing & Indexing
Item Processing
• Format conversion
   – IFilters
   – Advanced Filter Pack (Oracle Outside In): 200+ formats
• Language and encoding detection
• Lemmatizer
   – Linguistic normalization
• Tokenizer
   – Word breaking
• Entity extraction
   – Companies, locations
• DateTimeNormalizer
   – Date normalization
• Vectorizer
   – Create document vector for similarity searching
• WebAnalyzer
   – Anchor text and link cardinality analysis
• PropertiesMapper
   – Map to crawled properties
• PropertiesReporter
   – Report detected properties
Optional content pipeline stages:
• XML Properties mapper
• Offensive content filter
• Verbatim (whole word) extractor (loads a dictionary for custom extraction, e.g. product names)
• Field Collapsing
• Entity Extraction (persons)
• Document Processing Pipeline Extension
FAST Search stores data in its search index after processing completes.
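The idea of running each item through an ordered list of stages before it is stored can be sketched as follows. The stage names echo the slide, but the implementations are toy stand-ins written for this example, and the `DD/MM/YYYY` input date format is an assumption.

```python
import re
from datetime import datetime

def tokenizer(item):
    """Word breaking: split the body into lower-cased word tokens."""
    item["tokens"] = re.findall(r"\w+", item["body"].lower())
    return item

def date_time_normalizer(item):
    """Date normalization: rewrite an assumed DD/MM/YYYY date as ISO 8601."""
    raw = item.get("modified")
    if raw:
        item["modified"] = datetime.strptime(raw, "%d/%m/%Y").date().isoformat()
    return item

def properties_reporter(item):
    """Report which properties were detected on the item."""
    item["detected_properties"] = sorted(k for k in item if k != "detected_properties")
    return item

PIPELINE = [tokenizer, date_time_normalizer, properties_reporter]

def process(item, stages=PIPELINE):
    """Run one item through every stage, in order."""
    for stage in stages:
        item = stage(item)
    return item

# Items reach the index only after the whole pipeline has run.
search_index = []
search_index.append(process({"body": "FAST Search, sub-second!", "modified": "21/11/2012"}))
print(search_index[0])
```

Because each stage takes an item and returns an item, stages can be added, removed or reordered without touching the others.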
[Diagram: Document Processing Pipeline. Stages shown: Format Conversion, Language Detection, Lemmatization, Entity Extraction, …, Mapper, …]
Property Extraction
• Extract metadata from unstructured content
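One simple way to extract metadata from unstructured text is a dictionary lookup that matches whole words only, in the spirit of the verbatim extractor listed among the optional pipeline stages. The sketch below is a plain Python illustration with a made-up product dictionary; it is not the FS4SP extractor itself.

```python
import re

def extract_entities(text, dictionary):
    """Return dictionary entries that occur in `text` as whole words or phrases."""
    found = []
    for entry in dictionary:
        # Lookarounds reject matches that sit inside a longer word.
        pattern = r"(?<!\w)" + re.escape(entry) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(entry)
    return found

# Hypothetical dictionary of product names.
PRODUCTS = ["SharePoint", "FAST ESP", "Exchange"]

text = "Our SharePoint farm indexes exchange mailboxes; SharePointless puns are excluded."
products_found = extract_entities(text, PRODUCTS)
print(products_found)
```

The extracted values could then be stored as a property on the item, ready to be used as a refiner.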
Document Processing Pipeline Extensibility
• Items are processed in the Document Processing Pipeline after they are crawled and before they are stored in the index.
• Create and alter crawled property data.
• You can run code and pass data to other systems:
  – CRM/ERP and other line-of-business systems
  – Geocoding
  – OCR
  – Audio and video transcription (ramp.com)
  – ‘Deep’ search of raw data
  … The sky is the limit!
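A custom stage that creates crawled property data might look like the Python sketch below. It assumes the stage is handed the item's crawled properties as XML and returns XML of the same shape; the `Document` and `CrawledProperty` element names, the attribute names, the property set GUID, the `wordcount` property and the `varType` values are all assumptions for illustration and should be checked against the pipeline extensibility documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical property set GUID for the new crawled property.
PROPERTY_SET = "11111111-2222-3333-4444-555555555555"

def add_word_count(input_xml):
    """Read the 'body' crawled property and emit a new 'wordcount' property."""
    doc = ET.fromstring(input_xml)

    body = ""
    for prop in doc.findall("CrawledProperty"):
        if prop.get("propertyName") == "body":
            body = prop.text or ""

    # The output document carries only the properties this stage creates.
    out = ET.Element("Document")
    new_prop = ET.SubElement(out, "CrawledProperty", {
        "propertySet": PROPERTY_SET,
        "propertyName": "wordcount",
        "varType": "20",  # assumed integer type code
    })
    new_prop.text = str(len(body.split()))
    return ET.tostring(out, encoding="unicode")

sample_input = (
    '<Document>'
    '<CrawledProperty propertySet="00000000-0000-0000-0000-000000000000" '
    'propertyName="body" varType="31">FAST Search for SharePoint</CrawledProperty>'
    '</Document>'
)
output_xml = add_word_count(sample_input)
print(output_xml)
```

The same pattern extends to the other ideas on the slide: replace the word count with a call to a geocoder, an OCR engine or a line-of-business system, and write the result back as a new crawled property.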
DEMO
• Document Processing Pipeline
• Crawled Properties
• Managed Properties
Search UI
Web Parts
Refiners
• Refinement Panel Web Part
• Add and edit the refiners displayed by changing the filter category definition XML.
• Managed property names must be specified in lower case.
• The managed property must have refinement enabled.
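The lower-case rule is easy to get wrong by hand, so the sketch below generates a filter category definition in Python and lower-cases the managed property name on the way. The `FilterCategories` and `Category` element names, the `Title`, `Type` and `MappedProperty` attributes and the generator type name are recalled from the SharePoint 2010 Refinement Panel and are assumptions here; compare with the XML exported from your own Web Part, which carries more attributes than shown.

```python
import xml.etree.ElementTree as ET

# Assumed generator type for managed-property refiners.
GENERATOR_TYPE = "Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator"

def refiner_category(title, managed_property):
    """Build one Category element; the property name is forced to lower case."""
    return ET.Element("Category", {
        "Title": title,
        "Type": GENERATOR_TYPE,
        "MappedProperty": managed_property.lower(),
    })

def filter_categories(categories):
    """Wrap Category elements in a FilterCategories root and serialise it."""
    root = ET.Element("FilterCategories")
    root.extend(categories)
    return ET.tostring(root, encoding="unicode")

definition_xml = filter_categories([
    refiner_category("Author", "Author"),
    refiner_category("File Type", "FileExtension"),
])
print(definition_xml)
```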
Rank Profiles
• Configure multiple rank profiles
• Allow selection of a rank profile in the Search UI to change sorting
• Default the rank profile based on user profile
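Conceptually, a rank profile is a named set of weights, and switching profiles reorders the same results. The Python sketch below shows that idea together with a default chosen from the user's profile. The signal names, weights, department mapping and function names are all invented for illustration; real rank profiles are configured in FS4SP itself and use its own rank components.

```python
# Hypothetical rank profiles: each weighs the same signals differently.
RANK_PROFILES = {
    "default": {"text_match": 1.0, "freshness": 0.2},
    "newest": {"text_match": 0.3, "freshness": 1.0},
}

# Hypothetical default profile per user department.
DEFAULT_PROFILE_BY_DEPARTMENT = {"News": "newest"}

def pick_profile(user, requested=None):
    """Use the profile chosen in the UI, else a default from the user profile."""
    if requested in RANK_PROFILES:
        return requested
    return DEFAULT_PROFILE_BY_DEPARTMENT.get(user.get("department"), "default")

def rank(results, profile_name):
    """Sort results by the weighted sum of their signals, best first."""
    weights = RANK_PROFILES[profile_name]

    def score(result):
        return sum(weight * result[signal] for signal, weight in weights.items())

    return sorted(results, key=score, reverse=True)

results = [
    {"title": "A", "text_match": 0.9, "freshness": 0.1},
    {"title": "B", "text_match": 0.5, "freshness": 0.9},
]

editor = {"department": "News"}
print([r["title"] for r in rank(results, pick_profile(editor))])
print([r["title"] for r in rank(results, pick_profile(editor, requested="default"))])
```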
DEMO
• Search UI
• Refiners
• Document Previews
• Visual Best Bets
QUESTIONS?
Related Links
Sezai’s Blog
http://sharepoint-sezai-moss-2007.blogspot.com/

Enterprise Search IT Professional Training
http://technet.microsoft.com/en-us/enterprisesearch/ff960998

Debugging and Tracing Pipeline Extensibility Stages
https://blogs.msdn.com/b/thomsven/archive/2010/09/23/debugging-and-tracing-fast-search-pipeline-extensibility-stages.aspx
http://techmikael.blogspot.com/2010/12/how-to-debug-and-log-fast-search.html

Shyam Nyaran’s blog: Visual Refiner Web Part
http://www.dotnetbounce.com/archive/2011/02/06/visual-refiners-for-sharepoint-server-2010-and-fast-search.aspx

Phonetic People Name Search
http://www.kowalski.ms/2010/07/09/sharepoint-server-2010-phonetic-and-nickname-search/
http://www.dotnetmafia.com/blogs/dotnettipoftheday/archive/2010/03/11/a-quick-look-at-phonetic-people-search-in-sharepoint-2010.aspx

Reasons to go with FAST Search for SharePoint instead of regular SharePoint 2010 Search
http://searchunleashed.wordpress.com/2011/04/13/reasons-to-go-with-fast-search-for-sharepoint-instead-of-regular-sharepoint-2010-search/

Three Main Reasons Why You Should Upgrade to FAST for SharePoint
http://sharepointmagazine.net/articles/business-user/three-main-reasons-why-you-should-upgrade-to-fast-for-sharepoint

				