									Enterprise Search Project




                                                Detailed Design




            for Department of Emergency Services




                    April 2008




Submitted by: Getronics Australia Pty Limited
Proprietary and Confidential

Document Control


       All rights are strictly reserved. No part of this document may be reproduced in any form or by
       any means without prior written permission from Getronics.
       This document contains Getronics Confidential and Proprietary information which is provided
       specifically for evaluation by Department of Emergency Services on the understanding that it is
       not to be disclosed to any third party except with Getronics' express permission.
       The Getronics logo and name is a trade mark owned by Getronics PinkRoccade Nederland B.V. and is used under licence
       by Getronics Australia, a UXC company.


       Document History
       Document Author            Version    Release       Status                Comment
                                             Date

       Paul Townsend              0.1        12/03/2008    Initial preparation

       Nathan Gropman             0.2        19/03/2008    Review

       Megan McDonald             0.3        25/03/2008    Review                Internal review

       Daniel Chee /              0.4        26/03/2008    Review                Content provision
       Paul Townsend

       Nathan Gropman             0.5        26/03/2008    Review                Internal review

       Ivan Wilson                0.6        27/03/2008    Review                Final review

       Nathan Gropman             0.7        27/03/2008    Draft                 Customer review

       Nathan Gropman             0.8        1/04/2008     Draft                 Following customer review

       Nathan Gropman             1.0        3/04/2008     Release               Release for approval

       Megan McDonald             1.1        8/04/2008     Review                Changes made:
                                                                                       Database and service accounts to meet
                                                                                       DES naming standards;
                                                                                       User Interface showing BSS & SPES
                                                                                       graphic elements;
                                                                                       Add start addresses for remote sites in
                                                                                       Section 5.1.3.

       Megan McDonald             1.2        15/04/2008    Release               External access file system results
                                                                                 functionality confirmed.


       Distribution List
       Name                       Title                                                   Action
                                                                                          (approve, review, information)

       Department of Emergency Services

       Nina Meyers                Executive Manager                                       Approve

       Stephanie Forster          Manager, Business Solutions                             Approve

       Trung Tran                 Manager, Development Co-ordination                      Review

       David Horton               Manager, Application Management                         Review

       Arlene Fernandez           Project Coordinator                                     Review




Getronics Australia Pty Limited                                                                                        Page 2
V1.2 April 2008                                            Proprietary and Confidential
       Getronics

       Brett Lightfoot            Solutions Manager – Microsoft                            Approve

       Megan McDonald             Project Manager                                          Review

       Paul Townsend              SharePoint Consultant                                    Review

       Daniel Chee                SharePoint Developer                                     Review





       The following signatories have been authorised to approve this document on behalf of the Project
       Board:

       For Department of Emergency Services:

       Name:                      Nina Meyers

       Title:                     Executive Manager

       Signature:

       Date:




       Name:                      Stephanie Forster

       Title:                     Manager, Business Solutions

       Signature:

       Date:



       For Getronics:


       Name:                      Brett Lightfoot

       Title:                     Solutions Manager – Microsoft

       Signature:

       Date:



       Note: Any work not explicitly included in this document is implicitly excluded from the project.





Table of Contents


       1.       Introduction
       1.1      Background
       1.2      Related Documents
       1.3      Assumptions
       1.4      Glossary Of Terms

       2.       Solution Overview
       2.1      Current State
       2.2      Future State

       3.       SharePoint Configuration
       3.1      Overall Farm Design
       3.2      Server Configuration Servers In Farm
       3.3      Hardware Configuration
       3.4      Farm / Site Design
       3.5      Server Services
       3.6      Farm Level Configuration

       4.       External Access
       4.1      File Based Content Removal

       5.       Search Configuration
       5.1      Content Sources
       5.2      Crawl Schedules
       5.3      Scopes
       5.4      File Types and iFilters

       6.       TeamSite Metadata ETL Process
       6.1      High Level Architecture
       6.2      Current Data Volumes
       6.3      SQL Server Database Design
       6.4      Open Deploy Package
       6.5      SQL Server Transformation Package
       6.6      Housekeeping
       6.7      MetaData Deployment Schedules

       7.       Top Ten Query
       7.1      High Level Architecture
       7.2      SQL Server Database Design
       7.3      Data Transformation: SSIS Package
       7.4      Process Schedules
       7.5      Query Logging
       7.6      Top Ten Query Web Part

       8.       User Interface/Experience
       8.1      Design Mock-ups
       8.2      Search User Interface
       8.3      DESPortal Simple Search
       8.4      Advanced search
       8.5      Search Results

       9.       Administration
       9.1      Backups
       9.2      Monitoring
       9.3      Antivirus

       Table 1 - Servers in Farm
       Table 2 - Web Applications
       Table 3 - Farm Level Configuration
       Table 4 - Farm Level Security
       Table 5 - Shared Services Provider Configuration
       Table 6 - Shared Services Provider Security
       Table 7 – Search Centre Configuration
       Table 8 - Search Centre Security
       Table 9 – DNS Names
       Table 10 – URL Translations
       Table 11 - TeamSite Content
       Table 12 – WorkSpace Content
       Table 13 – Windows File Share Content
       Table 14 – Crawl Schedule
       Table 15 - Scopes
       Table 16 – Property Mapping
       Table 17 - iFilters
       Table 18 – File Types
       Table 19: MSS_TS_MetaData_Staging_PDB
       Table 20: TSS_Search_MetaData
       Table 21: TSS_Import_Jobs
       Table 22: MSS_TS_MetaData_Staging_PDB – Database Security
       Table 23: MSS_TS_MetaData_PDB
       Table 24: TS_Search_MetaData
       Table 25: TS_Ref_Document_Type
       Table 26: TS_Ref_Language
       Table 27: TS_Ref_Security
       Table 28: TS_Ref_Content_Type
       Table 29: MSS_TS_MetaData_PDB – Database Security
       Table 30: MetaData Deployment Schedules
       Table 31: MSS_TopTenQuery_PDB
       Table 32: TTQ_Data
       Table 33: TTQ_Log
       Table 34: MSS_TopTenQuery_PDB – Database Security



       Figure 1 - Logical Topology of the Search Solution
       Figure 2 – Overall Farm Design
       Figure 3 - Site Hierarchy
       Figure 4 - External Access
       Figure 5 – External Access Results
       Figure 6 - Search Solution High Level Architecture
       Figure 7 – Top Ten Query Architecture
       Figure 8 – Top 10 web part on the Advanced Search page
       Figure 9 – Search Results
       Figure 10 – Advanced Search Page







       1.            Introduction

       1.1           Background
       This document provides the high-level design for the Enterprise Search implementation for the
       Department of Emergency Services (DES). Specifically, this document details:
                             Solution Overview;
                             Physical Design;
                             SharePoint Configuration;
                             Search Configuration;
                             TeamSite Metadata;
                             Design diagrams;
                             User Interface representations;
                             Backups and Administration.



       1.2           Related Documents
       Document Name                                                             Author                    Version

       Functional Specifications V1.4                                            Nathan Gropman            V1.4

       MSS2008-POC-Report-v1.1.doc                                               Nathan Gropman            V1.1

       20080311 - Development Environment Meeting Minutes.doc                    Megan McDonald            V1.0

       20080311 - Open Deploy Meeting Minutes.doc                                Megan McDonald            V1.0

       20080318 - Open Deploy Meeting Minutes - #2.doc                           Megan McDonald            V1.0

       DES - Medical Thesaurus - Proof of Concept - v1.0.doc                     Nathan Gropman            V1.0

       20080318 - Design Workshop.doc                                            Megan McDonald            V1.0




       1.3           Assumptions
       This document has been produced with the following assumptions:
              The design is to meet the requirements of the Functional Specification document, v1.4;
              The design will take into account future development dependencies, such as the Medical
              Thesaurus;
              The DES and Getronics project teams will collaborate towards the deliverables;
              Assumptions included in the Enterprise Search Project Initiation Document will apply.





       1.4           Glossary Of Terms
       Term                       Definition

       AGLS Metadata Standard     The Australian Government Locator Service (AGLS) Metadata Standard is an Australian
                                  standard for cross-domain resource description. It is a set of 19 descriptive
                                  elements which government agencies can use to improve the visibility and
                                  accessibility of their services and information over the Internet.

       DES Portal                 An intranet-based site owned by and about DES, which allows users to access corporate
                                  information, policies and procedures, web-based applications, departmental
                                  phonebooks and calendars and links to other Government and non-Government
                                  websites.

       ETL                        Extraction, Transformation and Loading. The process of extracting data from one
                                  system, transforming it and loading it into another system.

       Faceted Searching          Faceted Searching is the process of allowing users to refine a search based on a
                                  dynamic taxonomy of attributes returned in the search results. Typically this is
                                  achieved by providing a list of possible attributes (facets) to choose from, e.g.
                                  Author or Subject.

       iFilter                    Plug-ins that allow the Windows Indexing Service and Windows Desktop Search to
                                  index various file formats so that they become searchable. Without an
                                  appropriate iFilter, the contents of a file cannot be indexed.

       ISAM                       Indexed Sequential Access Method: a method for storing data for fast retrieval. In an
                                  ISAM system, data is organized into records which are composed of fixed length fields.
                                  Records are stored sequentially, originally to speed access on a tape system. A
                                  secondary set of hash tables known as indexes contains "pointers" into the tables,
                                  allowing individual records to be retrieved without having to search the entire data set.
                                  Relational databases can be easily built on an ISAM framework with the addition of
                                  logic to maintain the validity of the links between the tables. Typically the field being
                                  used as the link, the foreign key, will be indexed for quick lookup.

       Master Page                A global Microsoft ASP.NET page which is included by default in all referencing pages to
                                  provide a global layout and colour scheme.

       MSS2008                    Microsoft Search Server 2008

       Ontolica                   3rd Party Components that provide an enhanced search query experience, such as
                                  Wildcard searching.

       Protocol handler           The mechanism by which MSS2008 Enterprise understands how to gather the content
                                  to be indexed. Different Protocol Handlers exist for different content source types such
                                  as File, HTTP and Exchange Public Folders.

       Search Scope               A logical grouping of indexed items that meet specific criteria based on the indexed
                                  content. The content in the Search Scope can be limited to a single Content Source or
                                  span multiple content sources and use a Metadata attribute to group the items
                                  together.
                                  E.g. The Intranet scope can be defined as all content that has been sourced from the
                                  DESPortal content source.
                                  Alternatively, the Policy and Procedures scope can be defined as all content that has a
                                  Document Type of Policy or Procedure.

       SharePointWorks            3rd Party Universal protocol handler that integrates within the MSS2008 indexing
                                  engine. This component is required to enable the indexing of the TeamSite Meta Data.

       SQL Server Ent             Microsoft SQL Server 2005 Enterprise




       SSIS                       SQL Server 2005 Integration Services: Provides ETL functionality to transform and
                                  migrate data from one source to another.

       SSP                        Shared Services Provider (part of MOSS). Provides services for User Profiles, My Sites,
                                  Search and other shared functionality that can be used across the numerous web
                                  applications within the farm.

       STSADM                     SharePoint command line administration utility.

       TeamSite                   TeamSite is the Interwoven Content Management System. It provides a suite of web
                                  content management features to help DES staff create, manage and deliver content to
                                  be published on the DES Portal and selected DES Internet sites.

       Web Part                   A user interface element present on a web page which provides users with a discrete
                                  set of functionality. An example of a Web Part is the Search Query text box. This web
                                  part accepts the search query and passes the query to the results.

       Web Part Page              An MSS2008 web page that allows for one or more Web Parts to be added.

       WSS 3                      Microsoft Windows SharePoint Services 3.0.
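
       The Search Scope entry in the glossary above can be illustrated with a minimal Python sketch.
       It is an illustration only and is not part of the MSS2008 configuration or API: the IndexedItem
       and SearchScope classes, the FileShare content source name and the sample URLs are assumptions
       introduced here. It shows one scope limited to a single content source and one scope that spans
       content sources by grouping items on a metadata attribute.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class IndexedItem:
    """One item in the search index (hypothetical, simplified shape)."""
    url: str
    content_source: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class SearchScope:
    """A scope modelled as a name plus a rule that decides membership."""
    name: str
    rule: Callable[[IndexedItem], bool]

    def members(self, items: List[IndexedItem]) -> List[IndexedItem]:
        # Keep only the items that satisfy the scope rule, in index order.
        return [item for item in items if self.rule(item)]


# Scope limited to a single content source (the Intranet example).
intranet = SearchScope(
    name="Intranet",
    rule=lambda item: item.content_source == "DESPortal",
)

# Scope spanning content sources, grouped by a metadata attribute
# (the Policy and Procedures example).
policy_and_procedures = SearchScope(
    name="Policy and Procedures",
    rule=lambda item: item.properties.get("Document Type") in {"Policy", "Procedure"},
)

# Sample index content; paths and the "FileShare" source name are made up.
sample_items = [
    IndexedItem("http://desportal/a", "DESPortal", {"Document Type": "Policy"}),
    IndexedItem("http://desportal/b", "DESPortal", {"Document Type": "News"}),
    IndexedItem("file://share/c.pdf", "FileShare", {"Document Type": "Procedure"}),
]
```

       With these sample items, intranet.members(sample_items) returns the two DESPortal items, while
       policy_and_procedures.members(sample_items) returns the Policy item from DESPortal and the
       Procedure item from the file share.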







       2.            Solution Overview
       2.1           Current State
       DES currently uses Interwoven TeamSite (Version 6.5.0 SP2) for content management and for the
       intranet portal (DESPortal – http://desportal). TeamSite provides content creation, workflow
       approval and content management for DES. TeamSite collects metadata (using the AGLS scheme)
       to categorise content and build its taxonomy.
       TeamSite is currently hosted on a Windows 2003 Server (BNETSI01), while the DES Portal
       intranet is hosted on a Unix server located within the DES Brisbane Data Centre.
       The current search solution has the following shortcomings:
                             Only the metadata of TeamSite documents is indexed, not the entire
                             document content;
                             Embedded documents, such as PDFs, are not indexed for searching;
                             Searches can return irrelevant results or no results;
                             Popular searches are not recorded and are therefore not available to users;
                             Launching a .pdf document from the search results takes multiple steps;
                             Common words and phrases are not accommodated;
                             Misspelt words are not checked and alternative keywords are not offered;
                             Users cannot search within search results.



       2.2           Future State
       The solution to be implemented will continue to use Interwoven TeamSite (Version 6.5.0 SP2) for
       content management, with the intranet portal (DESPortal – http://desportal) as the point of access.
       The DES Enterprise Search Solution will use Microsoft Search Server 2008 (MSS2008) to index the
       content stored within TeamSite as well as other content sources within the DES environment. In
       addition, MSS2008 provides an enhanced search experience, returning relevant, fast and accurate
       results to users.
       The search portal will be delivered as part of the DES Portal and will use the corporate intranet
       as the access gateway to search.




Getronics Australia Pty Limited                                                                                  Page 10
V1.2 April 2008                                         Proprietary and Confidential
       The following figure provides the logical architecture of the Search Solution.

       [Figure: logical architecture diagram. Recoverable labels: DesPortal – Browser Application
       (DES Portal content pages and navigation; MOSS search pages; embedded MOSS content within a
       frameset; Ontolica Search web parts); Microsoft Search Server 2008 (query; search querying;
       MOSS index; indexer; SharePoint Works TeamSite indexing; SharePoint indexing; file share
       indexing); transformed metadata (SQL Server); Open Deploy metadata exporter; TeamSite
       metadata; Workspace - MOSS; file share; authentication / authorisation; DESQLD.]


       Figure 1 - Logical Topology of the Search Solution
       In addition to Microsoft Search Server 2008, the following 3rd party components will also be
       utilised:
                                           SharePoint Works Universal Connector (SPWorks)
                                           Ontolica Search

       2.2.1                   SharePoint Works Universal Connector
       The SharePoint Works Universal Connector (SPWorks) provides the protocol handler for indexing
       TeamSite metadata and associated file content. SPWorks crawls and indexes the metadata stored
       for each content item in TeamSite and then full-text indexes the actual content item stored on
       the TeamSite file server.
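
       The two-stage behaviour described above (read the metadata record, then index the content
       item it points to) can be illustrated with a minimal sketch. This is not the SPWorks API:
       the MetadataRecord fields, the build_index_entry helper and the in-memory file store are
       hypothetical names chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Hypothetical metadata row for one TeamSite content item."""
    item_id: str
    title: str
    subject: str
    file_path: str  # location of the content item on the file server


def build_index_entry(record, read_file):
    """Combine a metadata record with the full text of its content item.

    `read_file` is any callable that returns the text for a path; it stands
    in for whatever mechanism reads from the TeamSite file server.
    """
    body = read_file(record.file_path)
    return {
        "id": record.item_id,
        "title": record.title,
        "subject": record.subject,
        # Lower-cased unique tokens are enough to illustrate full-text indexing.
        "tokens": sorted(set(body.lower().split())),
    }


# Illustrative data only: an in-memory stand-in for the file server.
fake_files = {"/content/fire-safety.txt": "Fire safety plan for Kedron"}
record = MetadataRecord("42", "Fire Safety", "Safety", "/content/fire-safety.txt")
entry = build_index_entry(record, fake_files.__getitem__)
```

       The point of the sketch is that one index entry carries both the metadata fields and the
       tokens taken from the document body, so a query can match on either.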

       2.2.2                   Ontolica Search
       Ontolica Search (a third-party component) provides enhanced search functionality not present
       within the base MSS2008 product. These enhancements include wildcard searching, enhanced
       result sorting and result filtering.
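
       As a rough illustration of what wildcard searching, result sorting and result filtering
       mean in practice, the sketch below applies all three to a small in-memory result list. It
       does not use or represent the Ontolica product; the wildcard_search function, the result
       fields and the sample data are assumptions, and the wildcard rules are those of Python's
       fnmatch module.

```python
from fnmatch import fnmatchcase


def wildcard_search(results, pattern, file_type=None, sort_key="title"):
    """Filter result dicts by a wildcard pattern on the title, then sort.

    `*` matches any run of characters and `?` matches a single character,
    following Python's fnmatch rules (an assumption, not Ontolica syntax).
    """
    pattern = pattern.lower()
    matched = [
        r for r in results
        if fnmatchcase(r["title"].lower(), pattern)
        and (file_type is None or r["type"] == file_type)
    ]
    return sorted(matched, key=lambda r: r[sort_key])


# Sample data invented for the example.
results = [
    {"title": "Rescue procedures", "type": "pdf", "modified": "2008-03-01"},
    {"title": "Rescue training", "type": "doc", "modified": "2008-02-11"},
    {"title": "Fire safety", "type": "pdf", "modified": "2008-01-20"},
]

hits = wildcard_search(results, "resc*", sort_key="modified")
pdf_hits = wildcard_search(results, "resc*", file_type="pdf")
```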




       3.            SharePoint Configuration
       The following section details the architectural topology and configuration of the WSS
       implementation that supports the DES Search Project and possible future enhancements.

       3.1           Overall Farm Design




       Figure 2 – Overall Farm Design



       3.2           Server Configuration Servers In Farm
       The following table lists the servers in the farm.

       Server Name                Role / Purpose                       Comments

       BNEAPP19                   Web Front End / Application          All user interfaces will be installed on this
                                  Server / User Interface              server.

       BNEAPP16                   Indexing / Farm Administration       All items that the user does not see.

       BNESQL04                   SQL Server Database                  Content and Configuration databases

       Bneeml01.desqld.internal   Outgoing Email                       SMTP Server within the domain
       (existing server)

       Table 1 - Servers in Farm




       3.3           Hardware Configuration
       The following table lists the hardware configuration for Windows Server 2003.

       Server                        CPU                  Memory (MB)     HDD

       BNEAPP19                      2 CPU cores          2048            TBA – 15 GB

       BNEAPP16                      2 CPU cores          4096            TBA – 15 GB

       Note: The storage capacity indicated in this table is for the core operating system and program
       files and excludes index sizing.

       3.3.1         Software Versions and Pre-Requisites
       The implementation will utilise the 64-bit version of Microsoft Search Server 2008, which includes
       Windows SharePoint Services V3.
       In addition, the following software pre-requisites will be required:
                             Windows Server 2003 SP2 with IIS installed
                             Microsoft .Net Framework V2
                             Microsoft .Net Framework V3
                             Microsoft .Net Framework V2 SP1
                             Microsoft .Net Framework V3 SP1
                             Microsoft SQL Server 2005 with SP2
                             Adobe PDF Reader V8 (iFilter support)
                             Microsoft Filter Pack
                             Ontolica Search
                             SharePoint Works

       3.3.2         Storage Estimates
       The following table provides an estimate of the database and local disk storage requirements for
       the databases and full-text indexes:

       Server          Requirement                               Estimated Size      Comments
                                                                 (GB)

       BNEAPP19        Full Text Index for TeamSite content      5 GB

       BNEAPP19        Full Text Index for File Shares           500 GB              Rough estimate based on the
                       (Kedron Only)                                                 information available

                       Total                                     505 GB

       BNEAPP16        Full Text Index for TeamSite content      5 GB

       BNEAPP16        Full Text Index for File Shares           500 GB              Rough estimate based on the
                       (Kedron Only)                                                 information available

                       Total                                     505 GB

       BNESQL04        Database Metadata:                        5 GB
                       MSS_SSP_Search_PDB (TeamSite
                       Content)

       BNESQL04        Database Metadata:                        720 GB              Rough estimate based on the
                       MSS_SSP_Search_PDB (File Share                                information available
                       Content – Kedron Only)

                       Total                                     725 GB

                       Grand Total                               1,735 GB



       Note: The file share index sizing is a rough estimate based on information provided in the
       Functional Specification 1.4 document, §6.4.1.1 and §6.4.2.1.
       It is assumed that the full-text indexes will be stored on a SAN-attached drive.
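
       The totals in the storage estimate table can be cross-checked with a few lines of
       arithmetic. The sketch below simply re-adds the figures from the table; the dictionary
       layout and variable names are illustrative, not part of the design.

```python
# Estimated sizes in GB, copied from the storage estimate table.
estimates_gb = {
    "BNEAPP19": {"TeamSite full text index": 5, "File share full text index": 500},
    "BNEAPP16": {"TeamSite full text index": 5, "File share full text index": 500},
    "BNESQL04": {"TeamSite metadata database": 5, "File share metadata database": 720},
}

# Per-server subtotal, then the sum across all three servers.
server_totals = {server: sum(items.values()) for server, items in estimates_gb.items()}
grand_total = sum(server_totals.values())

for server, total in server_totals.items():
    print(f"{server}: {total} GB")
print(f"Grand total: {grand_total} GB")
```

       The per-server subtotals (505, 505 and 725 GB) and the grand total (1,735 GB) agree with the
       table. The file share figures dominate the total, so any revision to that rough estimate
       changes the overall requirement almost one for one.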

       3.3.3          Web Applications
       The following table describes the web applications required for the farm.

 Web             Purpose            Server    Server  Load Balanced URL               Load      Host Header Name           Authentication  Port  Security
 Application                                  IP                                      Balanced                             Mode                  Model
                                                                                      IP

 SharePoint      Provides           BNEAPP16  TBA                                               N/A                        Integrated      8081  NTLM
 Central         administration
 Administration  functionality
                 for the farm

 Shared          Shared Services    BNEAPP16  TBA     http://mss-ssp.desqld.internal            mss-ssp.desqld.internal    Integrated      80    NTLM
 Services        Provider
 Provider

 Search          Provides the site  BNEAPP19  TBA     http://search.desqld.internal             Search.desqld.internal     Integrated /    80    NTLM
                 collection for                                                                 Extsearch.desqld.internal  Basic
                 MSS2008

       Table 2 - Web Applications
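
       Two of the web applications listed in Table 2 use port 80, so they are told apart by server
       and host header. The following sketch shows one way such a table could be checked for
       clashing bindings. The find_binding_conflicts helper and the tuple layout are hypothetical,
       and host header names are written in lower case on the assumption that they are compared
       case-insensitively.

```python
from collections import Counter

# (name, server, port, host headers) taken from the web application table.
# None stands for "N/A" (no host header).
web_apps = [
    ("SharePoint Central Administration", "BNEAPP16", 8081, [None]),
    ("Shared Services Provider", "BNEAPP16", 80, ["mss-ssp.desqld.internal"]),
    ("Search", "BNEAPP19", 80,
     ["search.desqld.internal", "extsearch.desqld.internal"]),
]


def find_binding_conflicts(apps):
    """Return (server, port, host header) bindings claimed more than once."""
    bindings = Counter(
        (server, port, host)
        for _name, server, port, hosts in apps
        for host in hosts
    )
    return [binding for binding, count in bindings.items() if count > 1]


conflicts = find_binding_conflicts(web_apps)
```

       With the values in Table 2 the list of conflicts is empty, since every combination of
       server, port and host header is unique.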



       3.3.4          Security Account Summary
       The following table provides a summary for the required security accounts and groups:

       Account Role                       Object Type   Name                                      Comments

       Windows SharePoint Services        User          DESQLD\MSS_Search                         Service Account to start the WSS
       Search Account                                                                             Search Service (For WSS Help
                                                                                                  content)

       Office SharePoint Server           User          DESQLD\MSS_Search                         (Identical account as above)
       Search Account

       Web Application Pool Account       User          DESQLD\MSS_WebApp                         Service Account for the Web
                                                                                                  Application Identity

       SSP Service Account                User          DESQLD\MSS_SSP                            Service account for the SSP
                                                                                                  Service

       Default Content Access Account     User          DESQLD\MSS_CRAWL                          Service Account for the Default
                                                                                                  Content Access Account

       MSS2008 Install /                  User          DESQLD\MSS_Admin                          User account for the Root
       Administration Account                                                                     Administrator of the MSS Farm

       TeamSite Meta Data ETL Account     User          DESQLD\MSS_TS_MetaDataETL                 User account to access the
                                                                                                  TeamSite MetaData SQL content

       SQL Database Access Account        User          DESQLD\MSS_SQL_Admin                      Service Account for access to the
                                                                                                  SQL Servers for all Farm
                                                                                                  Database connectivity

       Top 10 Query SQL Access Account    User          DESQLD\MSS_SQL_TopTenQuery                User account to access the Top
                                                                                                  10 Search Queries Database

       Farm Administrators Group          Group         DESQLD\SEC_MSS_PR_Farm_Administrators     Security Group for Farm
                                                                                                  Administrators

       Shared Service Provider            Group         DESQLD\SEC_MSS_PR_SSP_Administrators      Security Group for SSP
       Administrators Group                                                                       administrators

       Search Site Administrators Group   Group         DESQLD\SEC_MSS_PR_Search_Administrators   Security group for Search Site
                                                                                                  administrators
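
       Before the accounts above are created, it can be useful to confirm that the proposed user
       logon names fit within the directory's length limit. The sketch below assumes a limit of
       20 characters for user logon names (the commonly cited limit for the pre-Windows 2000 logon
       name); treat that figure and the too_long helper as assumptions to verify against the
       DESQLD domain, not as part of this design.

```python
# User accounts from the security account summary (domain prefix removed).
user_accounts = [
    "MSS_Search", "MSS_WebApp", "MSS_SSP", "MSS_CRAWL", "MSS_Admin",
    "MSS_TS_MetaDataETL", "MSS_SQL_Admin", "MSS_SQL_TopTenQuery",
]

MAX_SAM_LENGTH = 20  # assumed limit for user logon names; verify for your domain


def too_long(names, limit=MAX_SAM_LENGTH):
    """Return the names that exceed the given length limit."""
    return [name for name in names if len(name) > limit]


longest = max(user_accounts, key=len)
violations = too_long(user_accounts)
```

       Under that assumption none of the listed user accounts exceeds the limit; the longest,
       MSS_SQL_TopTenQuery, is 19 characters.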




       3.3.5         E-Mail Account Summary
       The following table provides a summary of the required e-mail accounts:

       e-mail Address                           Role                            Comments

       search_admin@emergency.qld.gov.au        Outbound from/reply-to          The email address used for the from and
                                                e-mail address                  reply-to of all emails originating from the
                                                                                SharePoint Farm

       search_admin@emergency.qld.gov.au        Indexing contact e-mail         The email address that external sites can
                                                address                         contact if issues arise when indexing their
                                                                                site




       3.3.6         Database Summary
       The following table provides a summary of the Databases used in the search solution:

       Database Name                            Role                            Comments

       MSS_Sharepoint_Config_MSS_PDB            Farm configuration

       MSS_WSS_Search_BNEAPP19_PDB              Windows SharePoint Services
                                                search database

       MSS_WSS_SSP_Content_PDB                  Shared services provider
                                                content database

       MSS_SSP_PDB                              SSP configuration database

       MSS_SSP_Search_PDB                       Metadata index database

       MSS_SearchCentre_Content_PDB             Search centre web
                                                application content database

       WSS_AdminContent_<GUID>                  Central administration          This database is created automatically
                                                content database                when the farm is created. The term
                                                                                <GUID> is replaced with a globally unique
                                                                                identifier.

       MSS_TS_MetaData_Staging_PDB              TeamSite Metadata staging
                                                database

       MSS_TS_Metadata_PDB                      TeamSite Metadata
                                                transformed database for
                                                indexing

       MSS_TopTenQuery_PDB                      Top Ten search queries
                                                database
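
       The database names above appear to follow a pattern: an MSS_ prefix and a _PDB suffix, with
       the automatically created central administration database as the one exception. The
       pattern is inferred from the table rather than stated as a rule in this design, so the
       regular expression in the sketch below is an assumption.

```python
import re

# Database names copied from the database summary table.
database_names = [
    "MSS_Sharepoint_Config_MSS_PDB",
    "MSS_WSS_Search_BNEAPP19_PDB",
    "MSS_WSS_SSP_Content_PDB",
    "MSS_SSP_PDB",
    "MSS_SSP_Search_PDB",
    "MSS_SearchCentre_Content_PDB",
    "WSS_AdminContent_<GUID>",
    "MSS_TS_MetaData_Staging_PDB",
    "MSS_TS_Metadata_PDB",
    "MSS_TopTenQuery_PDB",
]

# Observed pattern (an inference from the table, not a stated rule):
# names start with "MSS_" and end with "_PDB".
CONVENTION = re.compile(r"^MSS_\w+_PDB$")

exceptions = [name for name in database_names if not CONVENTION.match(name)]
```

       A check of this kind could be run against a deployment checklist to catch a mistyped
       database name before the farm is built.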




       3.4           Farm / Site Design
       The following section details the configuration for the farm, web applications and sites.

       3.4.1         Site Hierarchy
       The following diagram shows the planned Site Hierarchy:
       [Figure: site hierarchy diagram. Recoverable labels: MSS 2008 Server Farm; Central
       Administration; MSS-SSP; Search; Standard Top Level Site; Sub Site /search.]


       Figure 3 - Site Hierarchy

       3.5           Server Services

       3.5.1         Services On Server: BNEAPP19
       The following services have been configured to run on server BNEAPP19:
                             Windows SharePoint Services Web Application
                             Ontolica Search



       3.5.2         Services On Server: BNEAPP16
       The following services have been configured to run on server BNEAPP16:
                             Central Administration
                             Office Search Service
                             SPWorks Universal Connector
                             Windows SharePoint Services Web Application
                             Windows SharePoint Services Search



       3.6           Farm Level Configuration
       The following table details the configuration for Farm Level attributes:

       Configuration         Item                      Value                                                                             Comments
       Section

       Configuration         Database Server           BNESQL04
       Database
                             Database Name             MSS_Sharepoint_Config_MSS_PDB

                             Database Access           DESQLD\MSS_SQL_Admin                                                              Passwords will be supplied during
                             Username                                                                                                    the deployment phase

       SharePoint Central    Server                    BNEAPP16
       Admin Web
       Application           Port                      8081

                             Authentication Provider   NTLM

                             Default Time Zone         GMT+10 (Brisbane)

                             Outbound Email Address    search_admin@emergency.qld.gov.au                                                 The email address used for the
                                                                                                                                         from and reply to of all emails
                                                                                                                                         originating from the SharePoint
                                                                                                                                         Farm

       Office SharePoint     Indexing Server           BNEAPP16
       Server Search
       Service               Query Server              BNEAPP19

                             Indexing contact email    search_admin@emergency.qld.gov.au                                                 The email address that external
                             address                                                                                                     sites can contact if issues arise
                                                                                                                                         when indexing their site

                             Farm Search Service       DESQLD\MSS_Search                                                                 The service account the search
                             Account                                                                                                     service will run under.
                                                                                                                                         Note: Passwords will be supplied
                                                                                                                                         during the deployment phase

                             Indexer Performance       Partly Reduced                                                                    Default value. Balance between
                                                                                                                                         load on the SQL Server and good
                                                                                                                                         indexing performance.

                             Web Front End and         Use all web front end computers for crawling
                             Crawling

                             Index Location            E:\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications    SAN Connected Local Drive

       Windows SharePoint    Service Account           DESQLD\MSS_Search                                                                 The service account the WSS
       Services Search                                                                                                                   Search service runs under.
       Service                                                                                                                           Note: Passwords will be supplied
                                                                                                                                         during the deployment phase

                             Content Access Account    DESQLD\MSS_CRAWL                                                                  The account the WSS Search
                                                                                                                                         service uses to access content
                                                                                                                                         for crawling.
                                                                                                                                         Note: Passwords will be supplied
                                                                                                                                         during the deployment phase

                             Search Database –         BNESQL04
                             Database Server

                             Search Database Name      MSS_WSS_Search_BNEAPP19_PDB

                             Search Database           Windows Authentication
                             Authentication

                             Indexing Schedule         Not configured, executed on demand.                                               As this is only for the help
                                                                                                                                         content, the schedule will be
                                                                                                                                         on demand because the content
                                                                                                                                         will not often update

       Usage Analysis        Enable Logging            Yes
       Processing
                             Log File Location         C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Logs

                             Number of log files       1

                             Enable Usage Analysis     Yes
                             Processing

                             Processing time:          Start: 3am
                                                       End: 4am


       Table 3 - Farm Level Configuration
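
       The usage analysis processing window in Table 3 runs from 3am to 4am. When scheduling other
       jobs around it (for example crawls), a simple check like the one sketched below can tell
       whether a given time falls inside the window. The in_processing_window helper is
       hypothetical, and treating the end time as exclusive is an assumption.

```python
from datetime import time

# Processing window from the farm level configuration: 3am to 4am.
WINDOW_START = time(3, 0)
WINDOW_END = time(4, 0)


def in_processing_window(moment, start=WINDOW_START, end=WINDOW_END):
    """Return True if `moment` falls in the half-open window [start, end)."""
    return start <= moment < end
```

       For example, in_processing_window(time(3, 30)) is true, while a job starting at exactly
       4am falls outside the window under the half-open interpretation.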

       3.6.1.1       Security
       The following table lists the Farm and Server administrators to be assigned to the farm:

       Display Name              Username / Group Name                    Permissions                          Comments

       MSS-Farm-Administrators   DESQLD\SEC_MSS_PR_Farm_Administrators    Full Access to the Farm
                                                                          Member of the local server
                                                                          administrators group

       A-MSS-Administrator       DESQLD\MSS_Admin                         Site Collection Administrator
                                                                          Farm Administrator
                                                                          Member of local server
                                                                          Administrators group


       Table 4 - Farm Level Security




       3.6.2          Shared Services Provider
       The Shared Services Provider (SSP) is a logical construct in Microsoft Search Server that provides
       shared services to all of the web applications and sites that exist within the boundaries of the
       SSP. A single SSP provides the following service for its members:
              Office SharePoint Server Search: indexing, querying and returning search results from
              the various content sources within and external to the organisation.
       The following section details the configuration of the Shared Services Provider:

       Configuration         Item                      Value                                             Comments
       Section

       SSP Web               IIS Web Site Name         SharePoint - mss-ssp.desqld.internal80            Default Value Set
       Application:
       IIS Web Site          Port                      80

                             Authentication Type       Integrated and Basic

                             Host Header               mss-ssp.desqld.internal

                             Path                      c:\Inetpub\Wwwroot\wss\VirtualDirectories\mss-ssp.desqld.internal80  Default Value Set

                             Authentication Provider   NTLM                                              Default Value Set

                             Allow Anonymous           No                                                Default Value Set

                             Use SSL                   No

                             Load Balanced URL         http://mss-ssp.desqld.internal                    Default Value Set

                             Zone                      Default                                           Default Value Set

                             Application Pool Name     SharePoint - mss-ssp.desqld.internal80            Default Value Set

                             Application Pool          DESQLD\S-MSS-WebApp                               Note: Passwords will
                             Username                                                                    be supplied during the
                                                                                                         deployment phase

                             Database Server           BNESQL04                                          Default Value Set

                             Database Name             MSS_WSS_SSP_Content_PDB

                             Database Access           Windows Authenticated                             Default Value Set
                             Scheme

                             Search Service Provided   Office SharePoint Server Search                   Default Value
                             By

                             Default Time Zone         GMT+10 (Brisbane)


                             Self-Service Site        Disabled                                        Default Value
                             Creation

       SSP                   SSP Name                 MSS-SSP
       Configuration
                             Web Application          SharePoint - Mss-ssp.desqld.internal80

                             SSP Service Username     DESQLD\S_MSS_SSP                                Default value

                             SSP Database Server      BNESQL04                                        Default value

                             SSP Database Name        MSS_SSP_PDB                                     Note: Passwords will
                                                                                                      be supplied during the
                                                                                                      deployment phase

                             SSP Database Access      Windows Authenticated                           Default value
                             Scheme

                             Search Database Server   BNESQL04

                             Search Database Name     MSS_SSP_Search_PDB                              Default value

                             Search Database Access   Windows Authenticated                           Default value
                             Scheme

                             Index Server             BNESQL04

                             SSL For Web Services     No                                              Default value


       Table 5 - Shared Services Provider Configuration
       The following table lists the SSP administrators:

       Display Name               Username / Group Name          Permissions                          Comments

       MSS-SSP-                   DESQLD\SEC_MSS_PR_SSP_Administrators   Full Access to the SSP Site
       Administrators                                                    Site Collection Administrator

       MSS-Admin                  DESQLD\MSS_Admin               Full access to the SSP Site
                                                                 Site Collection Administrator

       S-MSS-CRAWL                DESQLD\MSS_CRAWL               Read Access


       Table 6 - Shared Services Provider Security







       3.6.3          Search Centre
       The Search Centre Site Web Application provides the portal or site collection for the Search
       functionality. The Search site is based on the default Ontolica Search Centre.
       The following table details the configuration for the Search Centre web application:

       Configuration         Item                  Value                                                  Comments
       Section

       Search Centre         IIS Web Site          SharePoint-search.desqld.internal80                    Default Value Set
       Web Application:      Name
       IIS Web Site
                             Port                  80

                             Host Header           search.desqld.internal
                                                   extsearch.desqld.internal

                             Path                  c:\Inetpub\Wwwroot\wss\VirtualDirectories\search.desqld.internal80  Default Value Set

                             Authentication        NTLM & Basic Authentication                            Default Value Set
                             Provider

                             Allow Anonymous       No                                                     Default Value Set

                             Use SSL               No

                             Load Balanced URL     http://search.desqld.internal                          Default Value Set

                             Zone                  Default                                                Default Value Set

                             Application Pool      SharePoint-search.desqld.internal80
                             Name

                             Application Pool      DESQLD\S-MSS-WebApp                                    Note: Passwords will
                             Username                                                                     be supplied during the
                                                                                                          deployment phase

                             Database Server       BNESQL04                                               Default Value Set

                             Database Name         MSS_SearchCentre_Content_PDB

                             Database Access       Windows Authenticated                                  Default Value Set
                             Scheme

                             Search Service        Office SharePoint Server Search                        Default Value
                             Provided By

                             Default Time Zone     GMT+10 (Brisbane)



                             Self Service Site     Disabled                                               Default Value Set
                             Creation


       Table 7 – Search Centre Configuration



       3.6.3.1       Security
       The following table lists the Search Centre security permissions:

      Display Name            Username / Group Name                         Permissions                 Comments

      MOSS-Search-            DESQLD\SEC_MSS_PR_Search_Administrators Search Owners
      Administrators

      NT Authenticated        DESQLD\NT Authenticated Users                 Search Visitors             Members of this
      Users                                                                                             group can use the
                                                                                                        search system

      MOSS Admin              DESQLD\MSS_Admin                              Full Access to the
                                                                            Search Centre Web
                                                                            Application
                                                                            Search Centre site
                                                                            collection administrator

      S-MSS-CRAWL             DESQLD\MSS_CRAWL                              Read Access


       Table 8 - Search Centre Security







       4.            External Access
       External Access will be provided by DES’ current External Access solution. This solution is a
       reverse proxy that manages the caching of user credentials and request brokering to servers
       located on the trusted internal network.
       URL mapping of the MSS2008 site and its search results will be performed by a combination of the
       reverse proxy and the native Microsoft Search Server 2008 Alternate Access Mapping feature.
       The following diagram highlights the URL mapping through the External Access system:




       Figure 4 - External Access
       The reverse proxy acts as a publishing server which receives the login credentials and the query
       string from the external client and forwards them to BNEAPP19. The proxy also translates between
       the external domain name and the internal domain name.
       There are two types of Authentication used for external access:
                             Integrated authentication: Allows internal DES network clients to seamlessly
                             authenticate to the MSS2008 site;
                             Basic Authentication: Allows the reverse proxy to pass the user’s credentials on
                             to the MSS2008 site.
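       Basic Authentication carries the username and password in an HTTP Authorization header as a
       base64-encoded "username:password" pair, which is what allows an intermediary such as the
       reverse proxy to pass them on. The Python sketch below only illustrates how such a header
       is formed; the function name and the account shown are hypothetical, and the reverse proxy's
       own implementation is not described in this document.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP 'Authorization' header for Basic authentication."""
    # The scheme joins the two values with a colon and base64-encodes the result.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical account, for illustration only.
example_header = basic_auth_header("DESQLD\\jsmith", "not-a-real-password")
```

       Base64 is an encoding rather than encryption, so the credentials are only as well protected
       as the connection that carries them.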
       The following table describes the DNS names used for the Search Solution:

       DNS Name                                       Usage

       search.desqld.internal                         DES Internal Network use only and will be configured to point
                                                      to the Web Front End server




       extsearch.desqld.internal                          The reverse proxy server will forward all external search
                                                          requests to this address, which points to the Web Front End
                                                          server.
                                                          This will permit MSS to distinguish between internal and
                                                          external requests.

       https://desportal.emergency.qld.gov.au/search      DES External Access use only and will be configured to point
                                                          to the Reverse Proxy server.


       Table 9 – DNS Names
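       Because the reverse proxy forwards external requests to extsearch.desqld.internal while
       internal clients use search.desqld.internal, the host name on a request is enough to tell the
       two apart. The Python sketch below illustrates that idea only. MSS2008 does this through its
       zone configuration; the ZONE_BY_HOST table and the zone_for_host function are hypothetical
       names, and the zone assigned to each host is taken from Table 10.

```python
# Hypothetical lookup table: host header -> zone, following Table 9 and Table 10.
ZONE_BY_HOST = {
    "search.desqld.internal": "Default",      # internal DES network clients
    "extsearch.desqld.internal": "Extranet",  # requests forwarded by the reverse proxy
}


def zone_for_host(host_header: str) -> str:
    """Return the zone for a request based on its HTTP Host header."""
    # Drop an optional ":port" suffix and normalise case before the lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    if host not in ZONE_BY_HOST:
        raise ValueError(f"unknown host header: {host_header!r}")
    return ZONE_BY_HOST[host]
```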
       MSS2008 Alternate Access Mapping and External Resource Mapping provide a convenient
       mechanism to allow MSS2008 to automatically replace the URLs being returned to the end client
       from the search site.
       The following table defines the Alternate Access mappings and External Resource Mappings:

       Access Mapping Source           Access Mapping Replacement                      Zone        Comments

       http://search.desqld.internal   N/A                                             Default     This is the default zone used
                                                                                                   for internal network access.
                                                                                                   No mapping occurs for this URL.

       http://search.desqld.internal   http://extsearch.desqld.internal                Extranet    For extranet access, the URL
                                                                                                   will be mapped from the
                                                                                                   internal Default zone to the
                                                                                                   Extranet zone and vice versa
                                                                                                   by MSS2008.

       http://desportal                http://desportal.emergency.qld.gov.au           Extranet    External Resource Mapping to
                                                                                                   translate the search results
                                                                                                   URL from the internal DES
                                                                                                   Portal content to the
                                                                                                   externally accessible URL.

       File://bnefil01                 http://noserver-bnefil01.emergency.qld.gov.au   Extranet    External Resource Mapping to
                                                                                                   translate the search results
                                                                                                   URL from the internal file
                                                                                                   share location BNEFIL01
                                                                                                   content to a detectable URL.
                                                                                                   See Section 4.1, File Based
                                                                                                   Content Removal, below for
                                                                                                   an explanation.


       Table 10 – URL Translations
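       The effect of the Extranet mappings in Table 10 can be pictured as a prefix replacement on
       each URL returned in the search results. The Python sketch below is an illustration under
       that assumption, not the MSS2008 implementation; EXTRANET_MAPPINGS and translate_url are
       hypothetical names, and the prefixes are copied from the table.

```python
# Extranet-zone prefixes from Table 10: (source prefix, replacement prefix).
EXTRANET_MAPPINGS = [
    ("http://search.desqld.internal", "http://extsearch.desqld.internal"),
    ("http://desportal", "http://desportal.emergency.qld.gov.au"),
    ("file://bnefil01", "http://noserver-bnefil01.emergency.qld.gov.au"),
]


def translate_url(url: str, zone: str) -> str:
    """Rewrite a result URL for the Extranet zone; other zones are returned unchanged."""
    if zone != "Extranet":
        return url
    lowered = url.lower()
    for source, replacement in EXTRANET_MAPPINGS:
        if lowered.startswith(source):
            rest = url[len(source):]
            # Match only at a host boundary, so that "http://desportal" does not
            # also match "http://desportal.emergency.qld.gov.au".
            if rest == "" or rest[0] in "/:?#":
                return replacement + rest
    return url
```

       Matching on a host boundary means a URL that has already been translated is left alone if it
       passes through the function a second time.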







       4.1           File Based Content Removal
       When accessing the site from External Access, it is not possible to retrieve file share based
       content from the search results.
       By using the External Resource Mapping feature, the search results page can detect the URL
       http://noserver.emergency.qld.gov.au and replace the locations and hyperlinks to that content
       with a text string such as: “This file is not available from External Access. Please search again
       when you are inside the DES network.” A “No Entry” button, as shown in the screenshot below,
       will be shown for each file based content result.
       The file path will be presented to the user. All hyperlink features, such as underlining and HTML
       links, will be removed.
       Handling the file share content in this way allows the ranking algorithm and results paging to
       continue functioning while minimising the complexity of selecting search scopes based on
       internal and extranet based clients.
       An additional External Resource Mapping is required for each file share to be removed from the
       search results.
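       The replacement described above can be sketched as a check on each result URL at render time.
       The Python below is a minimal illustration, not the Search Centre code: render_result and
       NOSERVER_PREFIX are hypothetical names, the HTML layout is invented for the example, and the
       prefix "http://noserver" is an assumption chosen so that it covers both the URL quoted in this
       section and the noserver-bnefil01 mapping in Table 10. The "No Entry" button is not rendered.

```python
from html import escape

# Assumption: every mapped file share URL starts with this prefix.
NOSERVER_PREFIX = "http://noserver"
NOTICE = ("This file is not available from External Access. "
          "Please search again when you are inside the DES network.")


def render_result(title: str, url: str, display_path: str) -> str:
    """Return an HTML fragment for a single search result."""
    if url.lower().startswith(NOSERVER_PREFIX):
        # File share content: show the title, the file path and the notice as
        # plain text, with no hyperlink.
        return (f"<div>{escape(title)}<br>{escape(display_path)}"
                f"<br>{escape(NOTICE)}</div>")
    # Any other content keeps its normal hyperlink.
    return f'<div><a href="{escape(url, quote=True)}">{escape(title)}</a></div>'
```

       Because the result is still returned and only its presentation changes, the number and order
       of results on each page are unaffected.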




       Figure 5 – External Access Results
       Note: Screenshot is from POC environment developed to demonstrate functionality. Final user
       interface will be as shown in Section 8. User Interface/Experience.






       5.            Search Configuration
       The following sections detail the configurations that will be applied to the MSS2008
       implementation to satisfy the search requirements.

       5.1           Content Sources
       Content Sources provide the source of the data to be indexed. The following content sources will
       be created:

       5.1.1         TeamSite Content
       TeamSite content will be indexed via the SharePoint Works Universal Connector. The following
       table defines the configuration for this content source:

       Content Source        Item           Value                                                Comments
       Configuration

       Name and              Content        SPWORKS_DesPortal
       Type:                 Source Name

                             Content        SharePoint Works Universal Connector
                             Source Type

       Data Source:          SQL Server:    BNESQL04

                             Database       MSS_TS_MetaData_PDB                                  The transformed database.
                             Name:                                                               See Section 6.3.2 -
                                                                                                 Database:
                                                                                                 MSS_TS_MetaData

                             Content        DESQLD\MSS_TS_MetaDataETL                            Windows integrated
                             Access                                                              security
                             Account:

       Content               Item           select ID as SPW_ID,                                 Returns all of the content
       Definition            Enumerator:           D_Date_Modified as SPW_LASTUPDATE,            modified since the last
                             (SQL)                 F_Deleted as SPW_DELFLAG                      index.
                                            from TSS_Search_MetaData
                                            where D_Date_Modified > [SPW_LASTUPDATE]

                             Primary Key    [T_Path]                                             The path column will be
                             (SPW_ID):                                                           treated as the primary key
                                                                                                 to identify each record.

                             Item Data      SELECT ID,
                             (SQL)                 T_Path,
                                                   T_Title,
                                                   T_Keywords,
                                                   T_Description,
                                                   T_Subject,
                                                   T_Creator_PersonalName,
                                                   T_ContactName,
                                                   T_ContactUrl,
                                                   T_Publishing_Division,
                                                   T_Document_Type,
                                                   D_Date_Created,
                                                   D_Date_Under_Review,
                                                   T_Audience,
                                                   D_Date_Modified,
                                                   D_Date_Valid,
                                                   T_Version,
                                                   T_Child_Procedure,
                                                   T_Update_Comment,
                                                   T_Creator_CorporateName,
                                                   T_Creator_Jurisdiction,
                                                   T_Publisher_CorporateName,
                                                   T_Publisher_Jurisdiction,
                                                   T_Function,
                                                   T_Language,
                                                   T_Status,
                                                   T_Security,
                                                   T_Type
                                            FROM TSS_Search_MetaData
                                            WHERE ID = [SPW_ID]

                             Unstructured     File (network share)
                             Data Options:

                             File Location: \\bnetsi01\default\main\desportal\STAGING\[PATH]

                             Mapped URL:      http://desportal/[PATH]

                             File Extension Dim ext As String                                    Visual Basic code to return
                             (Script)       ext = HOST.GetStringValue("SPW_ID")                  the file extension of each
                                            Dim p As Integer                                     content item. File
                                            p = ext.LastIndexOf(".")                             extensions are used to
                                            p += 1                                               select the appropriate iFilter
                                            ext = ext.Substring(p)                               for the full content index.
                                            Return ext

                             Common Meta              Title: T_Title
                             Data hints
                                                      Author: T_Creator_PersonalName


       Table 11 - TeamSite Content
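       The File Extension script in Table 11 takes everything after the last "." in the SPW_ID value.
       The Python sketch below mirrors that logic so its behaviour on unusual values can be checked
       outside the connector; file_extension is a hypothetical name and the sample paths in the
       comments are invented for illustration.

```python
def file_extension(spw_id: str) -> str:
    """Mirror the Table 11 script: return the text after the last '.' in spw_id."""
    p = spw_id.rfind(".")  # -1 when there is no dot, as with LastIndexOf
    p += 1                 # step past the dot (or to 0 when no dot was found)
    return spw_id[p:]


# Edge cases worth knowing about:
#   a value with no dot is returned whole, e.g. "README"
#   a value ending in a dot gives an empty string, e.g. "archive."
#   a dot in a folder name is treated as the separator, e.g. "dir.v1/readme"
```

       If values of this kind can occur in the metadata, the script may need a guard so that an
       unexpected string is not passed on as the file extension.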







       5.1.2         Workspace – Current MOSS Collaboration Platform
       The Workspace content will be indexed via the standard Microsoft Office SharePoint Protocol
       Handler. The following defines the configuration for this content source:

      Content Source         Item                  Value                                               Comments
      Configuration

                             Content Source Name   Workspace – SharePoint Site

                             Content Source Type   SharePoint Sites

                             Start Addresses       http://bneapp13

                             Crawl Settings        Crawl everything under the hostname for each        Default Value
                                                   start address


       Table 12 – Workspace Content
       Note: This content will be indexed as part of Phase III in the search project.

       5.1.3         Windows File Shares
       The Windows File Shares within the DESQLD domain will be included in the search solution.
       The indexing will be performed using the standard File Shares protocol handler. This content
       will be indexed as part of Phase III of the search project. It has been included here for
       completeness; however, the actual shares will be confirmed and detailed analysis performed
       during that stage of the project. As part of this analysis, further content sources may be
       defined to assist in managing the indexing process.

      Content Source         Item                  Value                                               Comments
      Configuration

                             Content Source Name   File Share – Kedron Central

                             Content Source Type   File Shares

                             Start Addresses       file://bnefil01/QAS
                                                   file://bnefil01/QFRS
                                                   file://bnefil01/CDRS
                                                   file://bnefil01/SESD
                                                   file://bnefil01/BSS
                                                   file://bnefil01/DG
                                                   file://bnefil01/Home
                                                   file://bnefil01/Global
                                                   file://bnefil01/OSB

                             Crawl Settings        The folder and all subfolders of each start         Default Value
                                                   address





                             Content Source Name   File Share – Remote

                             Content Source Type   File Shares

                             Start Addresses       \\BDGROF02\Global
                                                   \\BDGROF02\Groups
                                                   \\BDGROF02\Home
                                                   \\BNLROF01\Global
                                                   \\BNLROF01\Home
                                                   \\BNLROF02\Global
                                                   \\BNLROF02\Groups
                                                   \\BNLROF02\Home
                                                   \\BNLROF03\Groups
                                                   \\CALROF01\Global
                                                   \\CALROF01\Groups
                                                   \\CALROF01\Home
                                                   \\CALROF02\Global
                                                   \\CALROF02\Groups
                                                   \\CALROF02\Home
                                                   \\CALROF03\Global
                                                   \\CALROF03\Groups
                                                   \\CALROF03\Home
                                                   \\CHLROF01\Global
                                                   \\CHLROF01\Groups
                                                   \\CHLROF01\Home
                                                   \\CNSROF01\Global
                                                   \\CNSROF01\Home
                                                   \\CNSROF02\Global
                                                   \\CNSROF02\Groups
                                                   \\CNSROF02\Home
                                                   \\CNSROF03\Global
                                                   \\CNSROF03\Groups
                                                   \\CNSROF03\Home
                                                   \\CNSROF04\Groups
                                                   \\COOROF01\Global
                                                   \\COOROF01\Groups
                                                   \\COOROF01\Home
                                                   \\EFMROF01\Global
                                                   \\EFMROF01\Groups
                                                   \\EFMROF01\Home
                                                   \\GLDROF01\Global




                                    \\GLDROF01\Groups
                                    \\GLDROF01\Home
                                    \\INFROF01\Global
                                    \\INFROF01\Groups
                                    \\INFROF01\Home
                                    \\IPSROF01\Global
                                    \\IPSROF01\Home
                                    \\IPSROF01\Groups
                                    \\KMPROF01\Global
                                    \\KMPROF01\Groups
                                    \\KMPROF01\Home
                                    \\MBOROF01\Global
                                    \\MBOROF01\Groups
                                    \\MBOROF01\Home
                                    \\MCYROF01\Global
                                    \\MCYROF01\Groups
                                    \\MCYROF01\Home
                                    \\MFDROF01\Global
                                    \\MFDROF01\Groups
                                    \\MFDROF01\Home
                                    \\MISROF01\Global
                                    \\MISROF01\Groups
                                    \\MISROF01\Home
                                    \\MKYROF01\Global
                                    \\MKYROF01\Groups
                                    \\MKYROF01\Home
                                    \\MKYROF02\Global
                                    \\MKYROF02\Groups
                                    \\MKYROF02\Home
                                    \\ROKROF01\Global
                                    \\ROKROF01\Home
                                    \\ROKROF02\Global
                                    \\ROKROF02\Groups
                                    \\ROKROF02\Home
                                    \\ROKROF03\Groups
                                    \\RSTROF01\Global
                                    \\RSTROF01\Groups
                                    \\RSTROF01\Home
                                    \\SHLROF01\Global
                                    \\SHLROF01\Groups
                                    \\SHLROF01\Home



                                                       \\SPTROF02\Global
                                                       \\SPTROF02\Groups
                                                       \\SPTROF02\Home
                                                       \\TVLROF01\Global
                                                       \\TVLROF01\Home
                                                       \\TVLROF02\Global
                                                       \\TVLROF02\Groups
                                                       \\TVLROF02\Home
                                                       \\TVLROF03\Global
                                                       \\TVLROF03\Groups
                                                       \\TVLROF03\Home
                                                       \\TVLROF04\Groups
                                                       \\TWMROF01\Global
                                                       \\TWMROF01\Home
                                                       \\TWMROF02\Global
                                                       \\TWMROF02\Groups
                                                       \\TWMROF02\Home
                                                       \\TWMROF03\Groups
                                                       \\WGBROF01\Global
                                                       \\WGBROF01\Groups
                                                       \\WGBROF01\Home
                                                       \\WILROF01\Global
                                                       \\WILROF01\Groups
                                                       \\WILROF01\Home

                             Crawl Settings            The folder and all subfolders of each       Default Value
                                                       start address


       Table 13 – Windows File Share Content



       5.2           Crawl Schedules
       Crawl schedules are an important aspect of the search design. A properly designed set of crawl
       schedules keeps the index up to date without adversely affecting environmental resources
       (e.g. network traffic, databases, search performance).
       The following crawl schedules will be implemented initially. They will need to be reviewed
       once the search solution has been running for some time.

       Content Source               Full Crawl                  Incremental Crawl                  Comments

       TeamSite                     Weekly, occurring every     Weekdays, occurring every
                                    Tuesday at 7am              30 minutes

       File Shares: Kedron Central  Initially                   At 4am, occurring every Sunday     Implemented in Phase III.
                                                                of every 2 weeks                   Note: New content will not
                                                                                                   be indexed for up to 2 weeks.

       File Shares: Remote          Initially                   At 4am, occurring every            Implemented in Phase III.
                                                                Wednesday of every 3 weeks         Note: New content will not
                                                                                                   be indexed for up to 3 weeks.

       Workspace                    Initially                   Daily, occurring every 2 hours     Implemented in Phase III.

       Table 14 – Crawl Schedule
       Note: Due to the size and number of files contained in the file shares, crawling could take
       weeks to complete. The incremental crawl schedules have therefore been spaced out to reduce
       network and server load. The file share crawl schedules are indicative only and will be
       validated during Phase III development.
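
       The "up to 2 weeks" and "up to 3 weeks" notes in Table 14 follow directly from the
       incremental crawl intervals. The following Python sketch illustrates that arithmetic only.
       The function name, the dictionary and the sample dates are assumptions made for
       illustration and are not part of the search configuration.

```python
from datetime import datetime, timedelta

# Incremental crawl intervals taken from Table 14.
# The dictionary itself is illustrative, not a configuration file format.
INCREMENTAL_INTERVALS = {
    "File Shares: Kedron Central": timedelta(weeks=2),
    "File Shares: Remote": timedelta(weeks=3),
    "Workspace": timedelta(hours=2),
}


def next_crawl(first_start: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first scheduled crawl start that falls strictly after `now`."""
    if now < first_start:
        return first_start
    # Whole intervals elapsed since the first start, plus one for the next run.
    periods = (now - first_start) // interval + 1
    return first_start + periods * interval


def worst_case_delay(source: str) -> timedelta:
    """Content added just after a crawl starts waits one full interval."""
    return INCREMENTAL_INTERVALS[source]


# Hypothetical example: first Sunday crawl at 4am on 6 April 2008,
# a file saved the following Thursday morning.
upcoming = next_crawl(
    datetime(2008, 4, 6, 4, 0),
    INCREMENTAL_INTERVALS["File Shares: Kedron Central"],
    datetime(2008, 4, 10, 9, 0),
)
```

       The delay before new content becomes searchable also includes the duration of the crawl
       itself, which the sketch does not model.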

       5.3           Scopes
       Search scopes are pre-defined groupings that allow users to target specific content. A scope
       can be populated by rules that match one or more property values, by content source, or by a
       combination of both.
       The following Search Scopes will be created:

       Scope                                  Rules

       Everywhere                             All Content

       Intranet                               Content Source = “SPWORKS_SOURCE_DesPortal Content” (Or equivalent value
                                              according to SharePointWorks Content Source)

       Medical Director                       Content source of the following DESPortal locations:
                                                      Home > Our Organisation > QAS > Medical Director
                                              http://desportal/content/Our_Organisation/QAS/Medical_Director/*
                                                      Home > Our Organisation > QAS > Deputy Commissioner
                                              http://desportal/content/Our_Organisation/QAS/Deputy_Commissioner/*
                                                      Home > Education and Research > Training > QAS Education
                                              http://desportal/content/Education_and_Research/Training/QAS_Education/*
                                                      Home > News and Events > Publications and Newsletters > Circulars >
                                                      Medical Directors Circulars
                                              http://desportal/content/News_and_Events/Publications_and_Newsletters/Circulars/Medical_Directors_Circulars/*

       QFRS Bookshelf                         Content source of DESPortal locations to be provided.

       Workspace                              Content stored on DES file systems as described in section 5.1.1 - TeamSite
                                              Content.

       File System                            Content stored on DES file systems as described in section 5.1.3 - Windows File
                                              Shares.

       Table 15 - Scopes


       These search scopes will be created at the Search site collection. The default shared scope “All
       Sites” will not be used, to avoid confusion when performing future maintenance.
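
       The folder-based rules of the Medical Director scope amount to a URL prefix match. The
       following Python sketch illustrates that rule logic only. The scopes themselves are
       configured in SharePoint, and the function name, the case-insensitive comparison and the
       sample URLs are assumptions of this sketch.

```python
from typing import Iterable

# URL patterns copied from the Medical Director row of Table 15,
# with the trailing "*" wildcard removed.
MEDICAL_DIRECTOR_PREFIXES = (
    "http://desportal/content/Our_Organisation/QAS/Medical_Director/",
    "http://desportal/content/Our_Organisation/QAS/Deputy_Commissioner/",
    "http://desportal/content/Education_and_Research/Training/QAS_Education/",
    "http://desportal/content/News_and_Events/Publications_and_Newsletters/"
    "Circulars/Medical_Directors_Circulars/",
)


def in_scope(url: str, prefixes: Iterable[str]) -> bool:
    """Return True if the URL sits under any of the scope's folder prefixes.

    The comparison ignores case, which is an assumption of this sketch.
    """
    lowered = url.lower()
    return any(lowered.startswith(prefix.lower()) for prefix in prefixes)


# Hypothetical page URLs, used only to exercise the function.
inside = in_scope(
    "http://desportal/content/Our_Organisation/QAS/Medical_Director/guideline.htm",
    MEDICAL_DIRECTOR_PREFIXES,
)
outside = in_scope(
    "http://desportal/content/Our_Organisation/QAS/index.htm",
    MEDICAL_DIRECTOR_PREFIXES,
)
```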

       5.3.1.1        Property Mapping
       The following metadata property mappings will be applied to the TeamSite metadata:

       TeamSite Property                SQL Database                       SharePoint Managed Property

       Title                            T_Title                            Title

       Keywords                         T_Keywords                         Keywords

       Description                      T_Description                      Description

       Subject                          T_Subject                          Subject

       Creator (Personal Name)          T_Creator_PersonalName             Author

       Contact Name                     T_ContactName                      Contact Name (New Property)

       Contact URL                      T_ContactUrl                       TBD

       PublishingDivision               T_Publishing_Division              PublishingDivision (New Property)

       Document Type                    T_Document_Type                    DocumentType (New Property)

       Date Created                     D_Date_Created                     Created

       Date Under Review                D_Date_Under_Review                ReviewDate (New Property)

       Audience                         T_Audience                         Audience (New Property)

       Date Modified                    D_Date_Modified                    LastModifiedTime

       Date Valid                       D_Date_Valid                       ValidDate (New Property)

       Version                          T_Version                          Version (New Property)

       Child Procedure (For Policies)   T_Child_Procedure                  TBD

       Update Comments                  T_Update_Comments                  Comments (New Property)

       Creator (Corporate Name)         T_Creator_CorporateName            CreatorCorporateName (New Property)

       Creator (Jurisdiction)           T_Creator_Jurisdiction             CreatorJurisdiction (New Property)

       Publisher (Corporate Name)       T_Publisher_CorporateName          PublisherCorporateName (New Property)

       Publisher (Jurisdiction)         T_Publisher_Jurisdiction           PublisherJurisdiction (New Property)

       Function                         T_Function                         AgencyBusinessFunction (New Property)

       Language                         T_Language                         Language (New Property)

       Status                           T_Status                           Status

       Security                         T_Security                         Security (New Property)

       Table 16 – Property Mapping
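
       Table 16 can be read as a lookup from SQL column to managed property. The following Python
       sketch shows that lookup applied to a single row. It is illustrative only: the function
       name and the sample row are hypothetical, and the two rows marked TBD in the table are
       deliberately left unmapped.

```python
from typing import Dict, Optional

# SQL column -> SharePoint managed property, transcribed from Table 16.
# None marks the rows whose managed property is still "TBD" in the table.
PROPERTY_MAP: Dict[str, Optional[str]] = {
    "T_Title": "Title",
    "T_Keywords": "Keywords",
    "T_Description": "Description",
    "T_Subject": "Subject",
    "T_Creator_PersonalName": "Author",
    "T_ContactName": "Contact Name",
    "T_ContactUrl": None,
    "T_Publishing_Division": "PublishingDivision",
    "T_Document_Type": "DocumentType",
    "D_Date_Created": "Created",
    "D_Date_Under_Review": "ReviewDate",
    "T_Audience": "Audience",
    "D_Date_Modified": "LastModifiedTime",
    "D_Date_Valid": "ValidDate",
    "T_Version": "Version",
    "T_Child_Procedure": None,
    "T_Update_Comments": "Comments",
    "T_Creator_CorporateName": "CreatorCorporateName",
    "T_Creator_Jurisdiction": "CreatorJurisdiction",
    "T_Publisher_CorporateName": "PublisherCorporateName",
    "T_Publisher_Jurisdiction": "PublisherJurisdiction",
    "T_Function": "AgencyBusinessFunction",
    "T_Language": "Language",
    "T_Status": "Status",
    "T_Security": "Security",
}


def to_managed_properties(row: Dict[str, object]) -> Dict[str, object]:
    """Rename a row's columns to managed property names.

    Columns that are unmapped (TBD), unknown, or NULL are dropped.
    """
    mapped = {}
    for column, value in row.items():
        managed = PROPERTY_MAP.get(column)
        if managed is not None and value is not None:
            mapped[managed] = value
    return mapped


# Hypothetical row, used only to exercise the function.
sample = to_managed_properties(
    {"T_Title": "Operations manual", "T_Creator_PersonalName": "J Smith",
     "T_ContactUrl": "http://desportal/contact", "T_Version": None}
)
```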




       5.4           File Types and iFilters
       The following iFilters will be installed within the search solution:

                   iFilter Name                                   Purpose

                   Standard (default) MOSS-supplied iFilters      Indexes standard Office documents and
                                                                  HTML pages

                   Adobe PDF iFilter (provided by Acrobat         Provides indexing of PDF documents
                   Reader V8)

                   Microsoft Filter Pack                          Provides enhanced iFilters for ZIP archives,
                                                                  Microsoft OneNote and Visio

       Table 17 - iFilters


       The following table includes the file types that will be supported with a full text index using an
       appropriate iFilter:

                  Description                                     Extension

                  Microsoft Office (2003) Word Document           doc, docx, docm, dot

                  HTML Document                                   htm, html, jhtml, jsp, mhtml, mht, asp, aspx

                  MS Office OneNote                               one

                  Adobe Acrobat Document                          pdf

                  MS PowerPoint                                   ppt, pptm, pptx

                  Microsoft Publisher                             pub

                  Text file                                       txt

                  Visio document                                  vdx, vsd, vss, vst, vsx, vtx

                  Microsoft Excel workbook                        xls, xlsb, xlsm, xlsx

                  XML file                                        xml

                  Compressed files                                zip

                  Image files                                     tiff, jpg

       Table 18 – File Types


       Images (e.g. tiff, jpg) are included in the base file types that are indexed. However, the
       indexer does not index the content of an image; only the base file attributes are returned
       for these file types.
       Image iFilters can be obtained from vendors and other third parties.
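
       The distinction between full text indexing and attribute-only indexing in Table 18 can be
       expressed as a lookup on the file extension. The following Python sketch is illustrative
       only. The function name and the return labels are assumptions, and the "unlisted" label
       says nothing about how the indexer treats extensions that do not appear in the table.

```python
import os

# Extensions from Table 18 that receive a full text index.
FULL_TEXT_EXTENSIONS = {
    "doc", "docx", "docm", "dot",
    "htm", "html", "jhtml", "jsp", "mhtml", "mht", "asp", "aspx",
    "one", "pdf",
    "ppt", "pptm", "pptx",
    "pub", "txt",
    "vdx", "vsd", "vss", "vst", "vsx", "vtx",
    "xls", "xlsb", "xlsm", "xlsx",
    "xml", "zip",
}

# Image types from Table 18: only base file attributes are returned.
ATTRIBUTES_ONLY_EXTENSIONS = {"tiff", "jpg"}


def index_level(filename: str) -> str:
    """Classify a file name by the level of indexing Table 18 describes."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension in FULL_TEXT_EXTENSIONS:
        return "full-text"
    if extension in ATTRIBUTES_ONLY_EXTENSIONS:
        return "attributes-only"
    return "unlisted"
```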





       6.            TeamSite Metadata ETL Process
       Interwoven TeamSite stores metadata in an ISAM database (or similar). To facilitate the
       indexing of this metadata, the Interwoven metadata store will need to be copied and
       transformed into a Microsoft SQL Server 2005 database.
       This process will be performed using the Interwoven OpenDeploy application. OpenDeploy can
       copy and transform data into a variety of output types, of which Microsoft SQL Server is
       one.
       An OpenDeploy job will be created to copy the current metadata of all approved data from
       the TeamSite staging environment. A separate Microsoft SQL Server Integration Services
       package will be created to further transform the metadata. This will include expanding
       reference data as required and converting dates.
       The SharePoint Works Universal Protocol Handler will then use the transformed Microsoft SQL
       Server 2005 database as the source for the metadata and the related content path.
       The following sub-components are defined in the sections that follow:
              High Level Architecture
              Database Design
              OpenDeploy Package
              Microsoft SQL Server Integration Services Transformation Package
              Metadata deployment schedules




       6.1           High Level Architecture
       The following diagram depicts the high level architecture for the TeamSite Metadata ETL process.




       Figure 6 - Search Solution High Level Architecture



       6.2           Current Data Volumes
       At the time of writing, there are approximately 38,000 content items occupying
       approximately 6.5 GB of storage space.

       6.3           SQL Server Database Design
       The following SQL Server 2005 databases will be utilised:
              MSS_TS_MetaData_Staging_PDB
              MSS_TS_MetaData_PDB




       6.3.1         Database: MSS_TS_MetaData_Staging_PDB
       The MSS_TS_MetaData_Staging_PDB database is used to store the raw metadata exported from
       the TeamSite store.
       The database will consist of the following tables:

       Table Name                  Role

       TSS_Search_MetaData         Staging table for the raw exported meta data from TeamSite.

       TSS_Import_Jobs             Control table to track when the Open Deploy Package has executed and whether
                                   an update is ready for processing.

       Table 19: MSS_TS_MetaData_Staging_PDB



       6.3.1.1       Table: TSS_Search_MetaData
       The table TSS_Search_MetaData will be populated with the raw TeamSite metadata when the
       OpenDeploy Package is executed. This table will be truncated (purged) at the beginning of each
       OpenDeploy Package execution.
       The following details the table design:

       TeamSite Attribute           Column Name                   Data Type             Null?        Primary Key

       N/A                          N_Content_Staging_ID          Int (Identity)        NOT NULL     Yes

       Path                         T_Path                        Varchar (900)         NOT NULL     No. A unique index
                                                                                                     will be applied to
                                                                                                     this column

       Title                        T_Title                       Varchar(1000)         NULL

       Keywords                     T_Keywords                    Varchar(1000)         NULL

       Description                  T_Description                 Varchar(1000)         NULL

       Subject                      T_Subject                     Varchar(1000)         NULL

       Creator (Personal Name)      T_Creator_PersonalName        Varchar(50)           NULL

       Contact Name                 T_ContactName                 Varchar(50)           NULL

       Contact URL                  T_ContactUrl                  Varchar(400)          NULL

       PublishingDivision           T_Publishing_Division         Varchar(10)           NULL

       Document Type                T_Document_Type               Varchar(10)           NULL

       Date Created                 D_Date_Created                Varchar(30)           NULL

       Date Under Review            D_Date_Under_Review           Varchar(30)           NULL

       Audience                     T_Audience                    Varchar(50)           NULL

       Date Modified                D_Date_Modified               Varchar(30)           NULL

       Date Valid                   D_Date_Valid                  Varchar(30)           NULL

       Version                      T_Version                     Varchar(10)           NULL



       Child Procedure (For Policies)   T_Child_Procedure             Varchar(1000)         NULL

       Update Comment                   T_Update_Comment              Varchar(1000)         NULL

       Creator (Corporate Name)         T_Creator_CorporateName       Varchar(50)           NULL

       Creator (Jurisdiction)           T_Creator_Jurisdiction        Varchar(30)           NULL

       Publisher (Corporate Name)       T_Publisher_CorporateName     Varchar(50)           NULL

       Publisher (Jurisdiction)         T_Publisher_Jurisdiction      Varchar(30)           NULL

       Function                         T_Function                    Varchar(1000)         NULL

       Language                         T_Language                    Varchar(20)           NULL

       Status                           T_Status                      Varchar(20)           NULL

       Security                         T_Security                    Varchar(20)           NULL

       Type                             T_Type                        Varchar(20)           NULL

       Table 20: TSS_Search_MetaData
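
       The purge-then-load behaviour described for this table can be sketched as follows. This is
       a minimal Python illustration under stated assumptions: SQLite stands in for SQL Server
       2005, only three of the Table 20 columns are reproduced, and the function name and sample
       rows are hypothetical. The real load is performed by the OpenDeploy Package.

```python
import sqlite3

# Cut-down version of Table 20. SQLite types are used because SQLite stands in
# for SQL Server 2005 in this sketch.
STAGING_DDL = """
CREATE TABLE TSS_Search_MetaData (
    N_Content_Staging_ID INTEGER PRIMARY KEY AUTOINCREMENT,
    T_Path               TEXT NOT NULL UNIQUE,
    T_Title              TEXT,
    D_Date_Created       TEXT
)
"""


def load_staging(conn, rows):
    """Purge the staging table, then load the freshly exported rows.

    Both steps run in one transaction, so a failed load leaves the
    previous contents in place. Returns the number of rows now staged.
    """
    with conn:
        conn.execute("DELETE FROM TSS_Search_MetaData")
        conn.executemany(
            "INSERT INTO TSS_Search_MetaData (T_Path, T_Title, D_Date_Created) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM TSS_Search_MetaData").fetchone()[0]


staging = sqlite3.connect(":memory:")
staging.execute(STAGING_DDL)

# Hypothetical export rows: (path, title, raw date text).
first_count = load_staging(
    staging,
    [("/content/a.htm", "A", "2008-03-12"), ("/content/b.htm", "B", None)],
)
second_count = load_staging(staging, [("/content/c.htm", "C", None)])
```

       Because the table is purged on every run, the staging table always reflects exactly one
       export and never accumulates rows from earlier runs.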



       6.3.1.2        Table: TSS_Import_Jobs
       The table TSS_Import_Jobs is used to trigger and track the transformation process. Once the
       OpenDeploy Package has finished loading the TeamSite metadata into the table
       TSS_Search_MetaData, it will insert a record into this table indicating that an import has
       completed and that the transformation can proceed on the new data.
       The following details the table design:

        Column Name                      Data Type          Null?         Primary Key        Default Value

        N_Job_ID                         Int (Identity)     NOT NULL      Yes

        D_TeamSiteImport                 DateTime           NOT NULL      No                 Current date and time
                                                                                             (GetDate())

        T_TeamSiteJobName                Varchar(50)        NOT NULL      No

        F_OpenDeployDataProcessed        Bit                NOT NULL      No                 0

        D_OpenDeployProcessedDate        DateTime           NULL          No

        N_OpenDeployRecordsImported      Int                NULL          No

        F_TransformProcessed             Bit                NULL          No                 0

        D_TransformDateProcessed         DateTime           NULL          No

        N_TransformRecordsProcessed      Int                NULL          No

        N_TransformRecordsError          Int                NULL          No

        T_LastError                      Varchar(8000)      NULL          No

       Table 21: TSS_Import_Jobs
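
       One way the control table could drive the hand-over between the export and the
       transformation is sketched below in Python. SQLite stands in for SQL Server 2005. The
       function names, the job name and the exact point at which each flag is set are assumptions
       of this sketch. The design above states only that a record is inserted once the load
       completes and that the transformation is tracked in this table.

```python
import sqlite3

# Table 21 expressed in SQLite types. CURRENT_TIMESTAMP replaces GetDate().
JOBS_DDL = """
CREATE TABLE TSS_Import_Jobs (
    N_Job_ID                    INTEGER PRIMARY KEY AUTOINCREMENT,
    D_TeamSiteImport            TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    T_TeamSiteJobName           TEXT NOT NULL,
    F_OpenDeployDataProcessed   INTEGER NOT NULL DEFAULT 0,
    D_OpenDeployProcessedDate   TEXT,
    N_OpenDeployRecordsImported INTEGER,
    F_TransformProcessed        INTEGER DEFAULT 0,
    D_TransformDateProcessed    TEXT,
    N_TransformRecordsProcessed INTEGER,
    N_TransformRecordsError     INTEGER,
    T_LastError                 TEXT
)
"""


def record_import(conn, job_name, records_imported):
    """Export side: register that a staging load has completed."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO TSS_Import_Jobs (T_TeamSiteJobName, "
            "F_OpenDeployDataProcessed, D_OpenDeployProcessedDate, "
            "N_OpenDeployRecordsImported) VALUES (?, 1, CURRENT_TIMESTAMP, ?)",
            (job_name, records_imported),
        )
    return cursor.lastrowid


def pending_jobs(conn):
    """Transform side: imports that are loaded but not yet transformed."""
    rows = conn.execute(
        "SELECT N_Job_ID FROM TSS_Import_Jobs "
        "WHERE F_OpenDeployDataProcessed = 1 AND F_TransformProcessed = 0 "
        "ORDER BY N_Job_ID"
    ).fetchall()
    return [row[0] for row in rows]


def mark_transformed(conn, job_id, processed, errors, last_error=None):
    """Transform side: record the outcome so the job is not picked up again."""
    with conn:
        conn.execute(
            "UPDATE TSS_Import_Jobs SET F_TransformProcessed = 1, "
            "D_TransformDateProcessed = CURRENT_TIMESTAMP, "
            "N_TransformRecordsProcessed = ?, N_TransformRecordsError = ?, "
            "T_LastError = ? WHERE N_Job_ID = ?",
            (processed, errors, last_error, job_id),
        )
```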




       6.3.1.3       Database Security
       The database will use a standard SQL Server login to allow access for the Interwoven
       OpenDeploy package. In addition, a Windows account will be granted access to the database
       so that the SQL Server transformation package can transform and synchronise the data.
       The following table summarises the security for this database:

                             Login                                    Type               Role

                             TeamSiteMetaData                        Standard            Db_datareader
                                                                     Security            Db_datawriter
                                                                                         Db_owner

                             DESQLD\MSS_TS_MetaDataETL               Integrated          Db_datareader
                                                                     Security            Db_datawriter

       Table 22: MSS_TS_MetaData_Staging_PDB – Database Security



       6.3.2         Database: MSS_TS_MetaData_PDB
       The MSS_TS_MetaData_PDB database is used to store the transformed metadata from the
       TeamSite store.
       The database will consist of the following tables:

       Table Name                          Role

       TS_Search_MetaData                  Final database table for the metadata ready to be indexed

       TS_Ref_Document_Type                Reference table to transform the Document Type field

       TS_Ref_Language                     Reference table to transform the Language field

       TS_Ref_Security                     Reference table to transform the Security field

       TS_Ref_Content_Type                 Reference table to transform the Type field

       Table 23: MSS_TS_MetaData_PDB
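
       The staging table holds dates as Varchar(30) and the document type as Varchar(10), whereas
       the final table uses DateTime and Varchar(40). This suggests that the transformation
       converts date text and expands reference codes. The following Python sketch shows what
       such a row-level transformation could look like. The date format and the reference codes
       are assumptions made for illustration: neither is specified in this design, and the real
       transformation is performed by the SQL Server Integration Services package.

```python
from datetime import datetime
from typing import Dict, Optional

# ASSUMPTION: the exported date text is ISO formatted (YYYY-MM-DD).
# The actual TeamSite export format is not stated in this document.
ASSUMED_DATE_FORMAT = "%Y-%m-%d"

# HYPOTHETICAL contents for TS_Ref_Document_Type (code -> description).
DOCUMENT_TYPES = {"POL": "Policy", "PRO": "Procedure"}

DATE_COLUMNS = (
    "D_Date_Created",
    "D_Date_Under_Review",
    "D_Date_Modified",
    "D_Date_Valid",
)


def parse_date(raw: Optional[str]) -> Optional[datetime]:
    """Convert staged date text to a datetime. Blank or NULL becomes None."""
    if raw is None or not raw.strip():
        return None
    return datetime.strptime(raw.strip(), ASSUMED_DATE_FORMAT)


def transform_row(staged: Dict[str, Optional[str]]) -> Dict[str, object]:
    """Return a copy of a staged row with dates converted and codes expanded."""
    transformed: Dict[str, object] = dict(staged)
    for column in DATE_COLUMNS:
        transformed[column] = parse_date(staged.get(column))
    code = staged.get("T_Document_Type")
    # Unknown codes pass through unchanged rather than being dropped.
    transformed["T_Document_Type"] = DOCUMENT_TYPES.get(code, code)
    return transformed


# Hypothetical staged row, used only to exercise the function.
example = transform_row(
    {"T_Path": "/content/a.htm", "T_Document_Type": "POL",
     "D_Date_Created": "2008-03-12", "D_Date_Valid": " "}
)
```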



       6.3.2.1       Table: TS_Search_MetaData
       The table TS_Search_MetaData will be populated with the transformed TeamSite metadata when the
       OpenDeploy Package is executed. This table will be synchronised at the beginning of each
       OpenDeploy Package execution.
       The following details the table design:

       TeamSite Attribute            Column Name                  Data Type           Null?           Primary Key

       N/A                           N_Content_ID                 Int (Identity)      NOT NULL        Yes

       Path                          T_Path                       Varchar (900)       NOT NULL        No, a unique index will be
                                                                                                      applied

       Title                         T_Title                      Varchar(1000)       NULL



Getronics Australia Pty Limited                                                                                        Page 41
V1.2 April 2008                                               Proprietary and Confidential
 Enterprise Search Project                                                                                       Detailed Design

                                                                                            for Department of Emergency Services


TeamSite Metadata ETL Process



       TeamSite Attribute         Column Name                    Data Type           Null?           Primary Key

       Keywords                   T_Keywords                     Varchar(1000)       NULL

       Description                T_Description                  Varchar(1000)       NULL

       Subject                    T_Subject                      Varchar(1000)       NULL

       Creator          (Personal T_Creator_PersonalName         Varchar(50)         NULL
       Name)

       Contact Name               T_ContactName                  Varchar(50)         NULL

       Contact URL                T_ContactUrl                   Varchar(400)        NULL

       PublishingDivision         T_Publishing_Division          Varchar(10)         NULL

       Document Type              T_Document_Type                Varchar(40)         NULL

       Date Created               D_Date_Created                 DateTime            NULL

       Date Under Review          D_Date_Under_Review            DateTime            NULL

       Audience                   T_Audience                     Varchar(50)         NULL

       Date Modified              D_Date_Modified                DateTime            NULL

       Date Valid                 D_Date_Valid                   DateTime            NULL

       Version                    T_Version                      Varchar(10)         NULL

       Child Procedure (For Policies) T_Child_Procedure          Varchar(1000)       NULL

       Update Comment             T_Update_Comment               Varchar(1000)       NULL

       Creator (Corporate Name)   T_Creator_CorporateName        Varchar(50)         NULL

       Creator (Jurisdiction)     T_Creator_Jurisdiction         Varchar(30)         NULL

       Publisher (Corporate Name) T_Publisher_CorporateName      Varchar(50)         NULL

       Publisher (Jurisdiction)   T_Publisher_Jurisdiction       Varchar(30)         NULL

       Function                   T_Function                     Varchar(1000)       NULL

       Language                   T_Language                     Varchar(20)         NULL

       Status                     T_Status                       Varchar(20)         NULL

       Security                   T_Security                     Varchar(20)         NULL

       Type                       T_Type                         Varchar(20)         NULL

       N/A                        D_DateModified                 DateTime            NOT NULL

       N/A                        F_Deleted                      Bit                 NOT NULL (Default 0)

       Table 24: TS_Search_MetaData
       Note: The last 2 attributes are used to assist the SharePoint Works Universal connector when
       performing incremental crawling.






              D_DateModified – This column is updated with the current date and time when the record is
              changed.
              F_Deleted – This column is set to 1 (TRUE) if the item is not found in the refreshed
              metadata. This is equivalent to a logical delete.



       6.3.2.2       Table: TS_Ref_Document_Type
       The table TS_Ref_Document_Type is used to transform the Document Type code from the
       TeamSite MetaData into a textual description.
       The following details the table design:

                             Column Name                   Data Type         Null?           Primary Key

                             C_Document_Type_ID            Varchar(10)       NOT NULL        Yes

                             T_Description                 Varchar(40)       NOT NULL

       Table 25: TS_Ref_Document_Type
       DES is currently reviewing the Document Types in use, with a view to refining the current list of
       22 document types available in TeamSite.



       6.3.2.3       Table: TS_Ref_Language
       The table TS_Ref_Language is used to transform the Language code from the TeamSite Meta
       Data into a textual description.
       The following details the table design:

                             Column Name                   Data Type         Null?           Primary Key

                             C_Language_ID                 Varchar(10)       NOT NULL        Yes

                             T_Description                 Varchar(20)       NOT NULL

       Table 26: TS_Ref_Language
       The data stored in this table will include:

                                             Language_ID                 Description

                                             EN                          English




       6.3.2.4       Table: TS_Ref_Security
       The table TS_Ref_Security is used to transform the Security code from the TeamSite Meta Data
       into a textual description.
       The following details the table design:

                             Column Name                   Data Type          Null?           Primary Key

                             C_Security_ID                 Varchar(10)        NOT NULL        Yes



                             T_Description                 Varchar(20)        NOT NULL

       Table 27: TS_Ref_Security


       The data stored in this table will include:

                                             Security_ID                 Description

                                             UC                          Unclassified

                                             IC                          In Confidence




       6.3.2.5       Table: TS_Ref_Content_Type
       The table TS_Ref_Content_Type is used to transform the Content Type code from the TeamSite
       Meta Data into a textual description.
       The following details the table design:

                             Column Name                   Data Type          Null?           Primary Key

                             C_Type_ID                     Varchar(10)        NOT NULL        Yes

                             T_Description                 Varchar(20)        NOT NULL

       Table 28: TS_Ref_Content_Type
       The data stored in this table will include:

                                             Type_ID                     Description

                                             DO                          Document

                                             SE                          Service




       6.3.2.6       Database Security
       The SQL Server Transformation package will access the database using a Windows integrated
       security account in order to transform and synchronise the data. This same account will
       be used as the crawl account for the SharePoint Works Universal connector.
       The following table summarises the security for this database:

                             Login                                  Type                 Role

                             DESQLD\MSS_TS_MetaDataETL              Integrated Security  Db_datareader, Db_datawriter

       Table 29: MSS_TS_MetaData_PDB – Database Security







       6.4           Open Deploy Package
       Two Open Deploy packages will be created by DES developers. Each will deploy the raw
       TeamSite metadata to the SQL Server database: MSS_TS_MetaData_Staging_PDB.
       One package (Full Export) will perform a full export while the other (Incremental) will ensure the
       staging database is kept in sync and up to date.
       The Full Export package will be responsible for:
                             Purging the table: TSS_Search_MetaData;
                             Exporting the metadata from the TeamSite repository;
                             Inserting a record into the table: TSS_Import_Jobs signalling the readiness of
                             the new batch of data.
       The Incremental package will be responsible for:
                             Exporting the changed metadata from the TeamSite repository into the table
                             TSS_Search_MetaData;
                             Inserting a record into the table: TSS_Import_Jobs signalling the readiness of
                             the new batch of data.
       It is envisaged that both Open Deploy deployment packages will also execute a script to start the
       SSIS Package on the SQL Server controlling the start of that process. However, this has not
       been tested and will be confirmed during the development stage.

       6.5           SQL Server Transformation Package
       A SQL Server Transformation Package (SSIS Package) will be created to transform and
       synchronise the data from the staging database to the final database table:
       MSS_TS_MetaData_PDB.dbo.TS_Search_MetaData.
       The SSIS package will be responsible for the following tasks:
              Use the control table TSS_Import_Jobs from the staging database to detect whether there
              are outstanding updates to the TeamSite MetaData;
              Transform the TeamSite MetaData from the staging database using the reference data
              tables;
              Convert the date and time fields from varchar data types to DateTime data types:
                     D_Date_Created
                     D_Date_Under_Review
                     D_Date_Modified
                     D_Date_Valid
              Synchronise the data into the final database table: TS_Search_MetaData by:
                     Inserting new records
                     Updating changed records
                     Logically deleting missing records (setting the F_Deleted column to 1 (TRUE))
              Update the control table: TSS_Import_Jobs, setting the Processed flag to 1 (TRUE) and
              setting the Processed Date.
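
       The tasks above can be sketched as follows. This is a minimal illustration only, written in
       Python against an in-memory SQLite database rather than as an SSIS package. The key column
       item_key, the control columns F_Processed and D_Processed, the staging date format and the
       reduced column set are assumptions made for the example and are not part of this design. The
       sketch also assumes the staging table holds a complete refresh of the metadata.

```python
import sqlite3
from datetime import datetime

# Assumption: the staging export stores dates as text in this format.
STAGING_DATE_FORMAT = "%d/%m/%Y %H:%M:%S"

# Reduced, hypothetical schema: only two metadata columns are carried.
SCHEMA = """
CREATE TABLE TSS_Search_MetaData (item_key TEXT PRIMARY KEY, T_Title TEXT, D_Date_Created TEXT);
CREATE TABLE TSS_Import_Jobs (job_id INTEGER PRIMARY KEY, F_Processed INTEGER DEFAULT 0,
                              D_Processed TEXT);
CREATE TABLE TS_Search_MetaData (item_key TEXT PRIMARY KEY, T_Title TEXT, D_Date_Created TEXT,
                                 D_DateModified TEXT NOT NULL,
                                 F_Deleted INTEGER NOT NULL DEFAULT 0);
"""


def to_iso(value):
    """Convert a staging varchar date to ISO date-time text (empty stays None)."""
    if not value:
        return None
    return datetime.strptime(value, STAGING_DATE_FORMAT).isoformat(sep=" ")


def synchronise(conn, now):
    """Apply the staged batch to TS_Search_MetaData; return False if nothing is pending."""
    pending = conn.execute(
        "SELECT COUNT(*) FROM TSS_Import_Jobs WHERE F_Processed = 0"
    ).fetchone()[0]
    if pending == 0:
        return False

    staged = {
        key: (title, to_iso(created))
        for key, title, created in conn.execute(
            "SELECT item_key, T_Title, D_Date_Created FROM TSS_Search_MetaData"
        ).fetchall()
    }
    current = {
        key: (title, created, deleted)
        for key, title, created, deleted in conn.execute(
            "SELECT item_key, T_Title, D_Date_Created, F_Deleted FROM TS_Search_MetaData"
        ).fetchall()
    }

    for key, (title, created) in staged.items():
        if key not in current:
            # New record.
            conn.execute(
                "INSERT INTO TS_Search_MetaData VALUES (?, ?, ?, ?, 0)",
                (key, title, created, now),
            )
        elif current[key] != (title, created, 0):
            # Changed (or previously deleted) record.
            conn.execute(
                "UPDATE TS_Search_MetaData SET T_Title = ?, D_Date_Created = ?, "
                "D_DateModified = ?, F_Deleted = 0 WHERE item_key = ?",
                (title, created, now, key),
            )

    for key, (_, _, deleted) in current.items():
        if key not in staged and not deleted:
            # Missing from the refreshed metadata: logical delete.
            conn.execute(
                "UPDATE TS_Search_MetaData SET F_Deleted = 1, D_DateModified = ? "
                "WHERE item_key = ?",
                (now, key),
            )

    conn.execute(
        "UPDATE TSS_Import_Jobs SET F_Processed = 1, D_Processed = ? WHERE F_Processed = 0",
        (now,),
    )
    conn.commit()
    return True
```

       Every change, including a logical delete, stamps D_DateModified, which is the value the
       connector relies on when performing incremental crawling.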







       6.6           Housekeeping
       A housekeeping process will be required to physically delete the logically deleted records from
       the table: TS_Search_MetaData.
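
       A minimal sketch of such a housekeeping step is shown below, in Python against SQLite for
       illustration only. The optional modified_before cut-off is an assumption added for the
       example: it would allow recently flagged records to be retained until the connector has had
       an opportunity to crawl the logical delete. It is not a requirement stated in this design.

```python
import sqlite3


def purge_logically_deleted(conn: sqlite3.Connection, modified_before=None):
    """Physically delete rows flagged F_Deleted = 1 and return how many were removed.

    modified_before is an optional, assumed safeguard: when given (ISO date-time
    text), only rows whose D_DateModified is earlier than it are removed.
    """
    sql = "DELETE FROM TS_Search_MetaData WHERE F_Deleted = 1"
    params = ()
    if modified_before is not None:
        sql += " AND D_DateModified < ?"
        params = (modified_before,)
    cursor = conn.execute(sql, params)
    conn.commit()
    return cursor.rowcount
```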

       6.7           MetaData Deployment Schedules
       The following schedules will be used to control the ETL process for the TeamSite Meta Data:

       Process                    Schedule                               Comments

       Open Deploy: Full Export   Weekly

       Open Deploy: Incremental   Triggered by content being published
                                  in TeamSite

       SQL SSIS Package           On demand executed by the Open
                                  Deploy Package

       Housekeeping – Remove      Weekly                                 The logically deleted records will be
       logically deleted records                                         removed from the database table weekly.

       Table 30: MetaData Deployment Schedules







       7.            Top Ten Query
       The Top Ten Query feature presents the user with the most commonly executed search
       parameters, displayed in a custom SharePoint web part within the Search site.
       The ten most frequent search parameters will be collated through a SQL procedure call and
       displayed as links in a tabular fashion within the web part.
       The following sub-components are defined in the sections below:
                             High Level Architecture
                             SQL Server Database Design
                             Data Transformation: SSIS Package
                             Top Ten Query Web Part

       7.1           High Level Architecture
       The following diagram depicts the high level architecture for the Top Ten Query feature.

       MSS_SSP_PDB --[Extract Query Data]--> SSIS Transformation --[Transformed Query Data]--> MSS_TopTenQuery_PDB

       Custom SQL Data Viewer SharePoint Web Part --[Access Query Data]--> MSS_TopTenQuery_PDB


       Figure 7 – Top Ten Query Architecture

       7.2           SQL Server Database Design
       The database: MSS_TopTenQuery_PDB is used to store the collated top ten search queries.
       The database will consist of the following tables:

       Table Name                           Role

       TTQ_Data                             Table for the collated top ten query search results.

       TTQ_Log                              Control table to track when the Top Ten query has executed.

       Table 31: MSS_TopTenQuery_PDB







       7.2.1         Table: TTQ_Data
       The table TTQ_Data will be populated with the collated top ten search queries. This table will
       be truncated (purged) at the beginning of each stored procedure execution.
       The following details the table design:

                             Column Name                Data Type          Null?           Primary Key

                             N_TTQ_Data_ID              Int (Identity)     NOT NULL        Yes

                             T_Scope                    Varchar(50)        NOT NULL        No

                             N_Queries                  Int                NOT NULL        No

                             D_Created                  DateTime           NOT NULL        No

       Table 32: TTQ_Data

       7.2.2         Table: TTQ_Log
       The table TTQ_Log is used as a control table to track when the table TTQ_Data has been
       updated. The following details the table design:

                             Column Name                Data Type          Null?           Primary Key

                             N_TTQ_Log_ID               Int (Identity)     NOT NULL        Yes

                             D_Processed                DateTime           NOT NULL        No

       Table 33: TTQ_Log

       7.2.3         Database Security
       The following table summarises the security for this database:

                        Login                                  Type                       Role

                        DESQLD\NT Authenticated Users          Integrated Security        Db_datareader

                        DESQLD\MSS_SQL_TopTenQuery             Integrated Security        Db_datareader
                                                                                          Db_datawriter

       Table 34: MSS_TopTenQuery_PDB – Database Security

       7.3           Data Transformation: SSIS Package
       A SQL Server Transformation Package (SSIS Package) will be created to load the data retrieved
       from a SQL Stored Procedure into the database table MSS_TopTenQuery_PDB.dbo.TTQ_Data,
       purging the current contents of that table first.
       The SSIS package will be responsible for the following tasks:
                             Collate the top ten query data through a SQL Stored Procedure that reads from
                             the reference data tables within the MSS_SSP_PDB database.
                             Synchronise the data into the database table: TTQ_Data by:
                                   Purging existing records;
                                   Inserting new records.




                             Insert a record entry in the control table: TTQ_Log setting the Processed Date.
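
       The purge, insert and log steps above can be sketched as follows. This is an illustration in
       Python against SQLite, not the SSIS package or stored procedure itself. The source table
       query_log and its column search_parameter are hypothetical stand-ins for the query data held
       in MSS_SSP_PDB, and storing the collated value in T_Scope is an assumption made because it is
       the only text column in the TTQ_Data design.

```python
import sqlite3

# TTQ_Data and TTQ_Log follow the table designs in this section;
# query_log is a hypothetical stand-in for the MSS_SSP_PDB query data.
SCHEMA = """
CREATE TABLE query_log (search_parameter TEXT NOT NULL);
CREATE TABLE TTQ_Data (N_TTQ_Data_ID INTEGER PRIMARY KEY AUTOINCREMENT,
                       T_Scope TEXT NOT NULL, N_Queries INTEGER NOT NULL,
                       D_Created TEXT NOT NULL);
CREATE TABLE TTQ_Log (N_TTQ_Log_ID INTEGER PRIMARY KEY AUTOINCREMENT,
                      D_Processed TEXT NOT NULL);
"""


def refresh_top_ten(conn: sqlite3.Connection, now: str) -> None:
    """Purge TTQ_Data, reload the ten most frequent entries, and log the run."""
    # Purge existing records.
    conn.execute("DELETE FROM TTQ_Data")
    # Insert new records: the ten most frequent search parameters.
    conn.execute(
        """
        INSERT INTO TTQ_Data (T_Scope, N_Queries, D_Created)
        SELECT search_parameter, COUNT(*) AS n, ?
        FROM query_log
        GROUP BY search_parameter
        ORDER BY n DESC, search_parameter
        LIMIT 10
        """,
        (now,),
    )
    # Record the run in the control table.
    conn.execute("INSERT INTO TTQ_Log (D_Processed) VALUES (?)", (now,))
    conn.commit()
```

       Because the table is purged on every run, TTQ_Data never holds more than ten rows, while
       TTQ_Log gains one row per execution.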

       7.4           Process Schedules
       The following schedules will be used to control the process for the Top 10 Search data updates:

       Process                       Schedule                Comments

       Top 10 Query Data Update      Daily at 2am            This schedule coincides with the Analysis reporting updates,
                                                             so that it runs after the report generation process.




       7.5           Query Logging
       Search queries are logged and stored within the MSS2008 Shared Services Provider (SSP)
       database. Housekeeping jobs can be run to remove queries that are over 30 days old.

       7.6           Top Ten Query Web Part
       A custom web part will be developed to extract and display the top ten search parameters to the
       user.
       The user will be able to click the hyperlink from the title of the search query and a new search
       will be executed.
       The following is a demo screen shot of how the custom web part will look (actual web part
       interface may differ).




       Figure 8 – Top 10 web part on the Advanced Search page




       8.            User Interface/Experience
       8.1           Design Mock-ups




       Figure 9 – Search Results








       Figure 10 – Advanced Search Page
       Note: The mock-ups are an initial representation of the user interface and may vary from the
       final implementation.

       8.2           Search User Interface
       The search user interface will be hosted within a framed page and appear integrated as part of
       the DESPortal.
       DESPortal developers will be required to create a new search page within the BEA WebLogic
       portal. This page will contain navigation breadcrumbs (as per the current DESPortal Search
       page) and will also contain the frame that will host the MSS2008 interface.
       It should be noted that the breadcrumb information will be referenced from the Search page
       within the DESPortal and there will not be any navigation passed through to the hosted MSS2008
       interface.
       This page will facilitate the following MSS2008 search pages:

       Search Page            Frame Source and Comments

       Advanced Search        Frame Source: http://search.desqld.internal/search/Pages/DefaultAdvanced.aspx
                              Comments: When the Advanced Search link is selected from the header of the portal,
                              the WebLogic Search page will navigate the contained frame to the Advanced Search
                              page.

       Search Results         Frame Source:
                              http://search.desqld.internal/search/Pages/DefaultResults.aspx?k=<queryterm>&s=<scope>
                              Comments: When the user selects a simple search from the header of the portal, the
                              WebLogic Search page will navigate the contained frame to the Search Results page,
                              passing through the query string to execute a simple search.





       Once the user is presented with the Advanced Search or Search Results page, further querying
       can then be performed without change to the frame's starting navigation address.

       8.3           DESPortal Simple Search
       The DESPortal developers will be required to modify the current Search input box located in the
       header of the DES Portal to accept a simple search query. In addition, the user will be able to
       select one of the defined search scopes. Once the user clicks the “Go” or “Search” button, the
       control will navigate the browser to a new BEA WebLogic Search page passing the query term
       and scope.
       The BEA WebLogic search page will format the search term and the scope into a standard
       MSS2008 search URL and navigate the contained frame page to the MSS2008 Search Results
       page using that URL.
       The format of the URL is:
       http://search.desqld.internal/search/Pages/DefaultResults.aspx?k=<queryterm>&s=<scope>
       Where:
                     <queryterm>: The simple search query term
                     <scope>: The selected scope. By default the “Everywhere” scope will be selected.
       The following image depicts the simple search text box:



       The available search scopes in the drop down list will include the defined scopes as listed in
       section: 5.3 - Scopes as well as the defined Intended Audiences as listed in section: 8.4.8 -
       Intended Audience.
       For the Intended Audience section, the Query Term of the search will include the Intended
       Audience Attribute: IA=<Intended Audience Value>
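
       The URL construction described in this section can be sketched as follows. This is a minimal
       Python illustration, not the WebLogic implementation. The function name build_search_url is
       hypothetical, and appending the IA attribute to the query term separated by a space is an
       assumption, since this section only states that the query term will include it.

```python
from urllib.parse import urlencode

RESULTS_PAGE = "http://search.desqld.internal/search/Pages/DefaultResults.aspx"


def build_search_url(query_term, scope="Everywhere", intended_audience=None):
    """Build the Search Results frame URL from the query term and scope.

    intended_audience, when given, is added to the query term as
    IA=<Intended Audience Value> (assumed to be space separated).
    """
    if intended_audience:
        query_term = f"{query_term} IA={intended_audience}"
    # urlencode escapes spaces and reserved characters in both parameters.
    return RESULTS_PAGE + "?" + urlencode({"k": query_term, "s": scope})
```

       For example, build_search_url("fire safety") yields a URL ending in
       ?k=fire+safety&s=Everywhere.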

       8.4           Advanced search
       The Advanced Search page will be based on the standard content of the default advanced search
       from the Ontolica search features. The following attributes will be included:

       8.4.1         Find Results
       The user will be able to locate content based on the following attributes:
                             All of these words;
                             This exact phrase;
                             Any of these words;
                             None of these words;
                             With the words in proximity.
       These attributes will allow the user to enter free text for each category.







       8.4.2         Language
       A drop down list of languages can be made available so that the user can limit the results to
       content authored in a selected language. DES will provide this list of languages. It should
       be noted, however, that all content is assumed to be in English and this attribute will
       therefore be hidden.

       8.4.3         File Type
       A drop down list will be available to assist the user in limiting the search to only the nominated
       File format type (e.g. PDF, DOC, XLS). The drop down list will contain the document format
       types (the image or icon representing that document will not be displayed in this list) as well as
       the entry “All Results”. DES will supply the list of document types to add to this drop-down list.

       8.4.4         Scope Item
       The list of defined scopes will be presented to allow the user to select a particular search scope.
       See section 5.3 - Scopes of this document for the defined scopes.
       If required, the word “Scope” in the field “Search by Scope” on the Advanced Search page can be
       replaced with a more suitable term.

       8.4.5         File Size
       The user will be able to search for content based on the size of the document. Document sizes
       will be placed in a group of sizes so that the user won’t be required to be too specific. The
       following size groups from the Ontolica Advanced Search Page options will be available:
                             Any Size
                             Tiny (0-10KB)
                             Small (10KB - 100KB)
                             Medium (100KB - 1MB)
                             Large (1MB - 10MB)
                             Huge (>10MB)
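
       The size groups overlap at their boundaries (for example, 10KB appears in both Tiny and
       Small). The sketch below shows one way a document size could be mapped to a single group. It
       is an illustration in Python only: treating each upper bound as inclusive and 1KB as 1024
       bytes are assumptions, and the behaviour of the Ontolica components may differ.

```python
KB = 1024
MB = 1024 * KB

# (group name, inclusive upper bound in bytes), smallest first.
SIZE_GROUPS = [
    ("Tiny", 10 * KB),
    ("Small", 100 * KB),
    ("Medium", 1 * MB),
    ("Large", 10 * MB),
]


def size_group(size_bytes):
    """Return the size group for a document of size_bytes bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    for name, upper_bound in SIZE_GROUPS:
        if size_bytes <= upper_bound:
            return name
    # Anything larger than 10MB.
    return "Huge"
```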

       8.4.6         Date Created
       The user will be able to search for content based on when it was created. Dates will be placed in
       a group of dates so that the user won’t be required to be too specific. The following date groups
       from the Ontolica Advanced Search Page options will be utilised:
                             Anytime;
                             Since Yesterday;
                             In the past 30 days;
                             In the past 3 months;
                             In the past 6 months;
                             This Week (this calendar week);
                             This Month (this calendar month);
                             This Year (this calendar year).
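
       The date groups mix rolling periods ("In the past 30 days") with calendar periods ("This
       Month"). The sketch below illustrates the difference by computing the earliest creation date
       each group would include. It is an illustration in Python only: starting the week on Monday
       and counting "months" as calendar months are assumptions, and the Ontolica components may
       calculate these ranges differently.

```python
import calendar
from datetime import date, timedelta


def months_before(day, months):
    """Return the date `months` calendar months before `day`, clamped to month end."""
    index = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def group_start(group, today):
    """Return the earliest creation date included in a date group (None = no limit)."""
    starts = {
        "Anytime": None,
        # Rolling periods, measured back from today.
        "Since Yesterday": today - timedelta(days=1),
        "In the past 30 days": today - timedelta(days=30),
        "In the past 3 months": months_before(today, 3),
        "In the past 6 months": months_before(today, 6),
        # Calendar periods, measured from the start of the current period.
        "This Week": today - timedelta(days=today.weekday()),
        "This Month": today.replace(day=1),
        "This Year": today.replace(month=1, day=1),
    }
    return starts[group]
```

       On the 3rd of a month, for example, "This Month" covers only three days, whereas "In the
       past 30 days" still covers thirty.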





       8.4.7         Additional Search Attributes
       The advanced search page will also allow the user to enter a value for a specific metadata
       attribute. The user will be able to enter free text against the nominated properties and choose
       whether the search engine should perform an exact match on the text or match any attribute
       value that contains the text entered. For example, “Title” contains “Ambulance” will match all
       documents where the Title metadata attribute contains the term “Ambulance”.
       The following attributes will be available:

       Attribute                  Description

       Title                      The title of the document

       Keywords                   Keywords associated with the document

       Document type              DES Document Type.
                                  It should be noted that Document Type will not be enforced on the user interface. The
                                  user will be presented with a free text box to allow for the input of these terms.

       Publishing Division        The DES division that has published the document

       Author                     The author of the document

       Intended Audience          The intended audience of the content. This will be presented as a drop down list with the
                                  values as described in section: 8.4.8 - Intended Audience




       8.4.8         Intended Audience
       The Intended Audience attribute gives the user the ability to search for TeamSite content that
       is “Relevant To Me”. That is, the Intended Audience filter allows the user to select content from
       their department as well as all content relevant to DES. For example, searching for the Intended
       Audience of BSS will return content whose Intended Audience metadata is equal to BSS as well as
       content whose Intended Audience is All Of DES.
       The Intended Audience will be configured using the Ontolica extensions for search and will
       include the following settings:

                                        Display Name                    Value

                                        All                             Default value – not set

                                        BSS                             BSS;DES

                                        QAS                             QAS;DES

                                        QFRS                            QFRS;DES

                                        EMQ                             EMQ;DES

                                        SPES                            SPES;DES

       The Intended Audience attribute will have a short name of IA to aid simpler searching.
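
       The effect of these settings can be sketched as follows. This is an illustration in Python
       only and does not show how the Ontolica extensions are configured. It assumes that the
       semicolon separated values are alternatives (a match on any one is sufficient), that each
       content item carries a single audience code, and that the code DES denotes All Of DES.

```python
# Display name -> configured value, as listed in the settings above.
AUDIENCE_VALUES = {
    "All": None,  # default value, not set: no audience filter is applied
    "BSS": "BSS;DES",
    "QAS": "QAS;DES",
    "QFRS": "QFRS;DES",
    "EMQ": "EMQ;DES",
    "SPES": "SPES;DES",
}


def matches_audience(display_name, content_audience):
    """Return True if content with the given audience code passes the filter."""
    value = AUDIENCE_VALUES[display_name]
    if value is None:
        return True
    # Assumed semantics: any one of the semicolon separated codes may match.
    return content_audience in value.split(";")
```

       Under these assumptions, selecting BSS returns content marked BSS or DES, but not content
       marked QAS.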







       8.5           Search Results
       Once the search is executed, the results will be displayed in a pageable list in the search results
       page. Users will be able to navigate forwards and backwards within the search result pages. At
       all times, the user will have a visual identification of the current page they are on and the pages
       available including buttons to navigate forwards and backwards within the pages. In addition,
       the results will indicate the records being displayed and the total number of records (e.g.
       showing results 11-20 of 200).
       The following sections describe the available functionality.

       8.5.1         No Search Results
       When no results are returned, a helpful message is displayed to the user indicating that there
       are no results and suggesting that a wider search be performed. The text of this message can
       be modified by the Search Administrator.

       8.5.2         Number Of Results Per Page
       The number of results per page will be a global setting controlled by the search
       administrators. Initially this will be set to 10 results per page. Individual users will not be
       able to modify this setting.
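
       The result range indicator described in section 8.5 follows directly from the page number, the
       total number of results and the page size. The following is a minimal sketch of that
       arithmetic, assuming 1-based page numbers and the initial page size of 10. The function names
       are illustrative and are not part of MSS2008 or Ontolica.

```python
import math


def result_window(page, total, per_page=10):
    """Return (first, last, page_count) for a 1-based page number.

    Out-of-range page numbers are clamped to the nearest valid page.
    """
    if total == 0:
        return (0, 0, 0)
    page_count = math.ceil(total / per_page)
    page = max(1, min(page, page_count))
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total)
    return (first, last, page_count)


def result_label(page, total, per_page=10):
    """Build the text shown above the result list."""
    if total == 0:
        return "no results"
    first, last, _ = result_window(page, total, per_page)
    return f"showing results {first}-{last} of {total}"
```

       With 200 results, page 2 covers records 11 to 20 and there are 20 pages in total.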

       8.5.3         Sort Results
       The Ontolica components will provide the user with the ability to sort the results based on the
       following attributes:
                             Relevance;
                             Date Last Updated;
                             Title.
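
       The three sort options can be sketched as follows. This is an illustration only: the result
       fields (rank, updated, title) and the sort directions (highest relevance first, most recent
       first, titles in ascending alphabetical order) are assumptions, and the Ontolica components may
       behave differently.

```python
from datetime import date


def sort_results(results, by="relevance"):
    """Return a new list of result dicts sorted by the chosen attribute.

    Each result is assumed to be a dict with 'rank', 'updated' (a date
    or None when the source has no date) and 'title' keys.
    """
    if by == "relevance":
        return sorted(results, key=lambda r: r["rank"], reverse=True)
    if by == "date":
        # Results without a date sort last.
        return sorted(results, key=lambda r: r["updated"] or date.min, reverse=True)
    if by == "title":
        return sorted(results, key=lambda r: r["title"].lower())
    raise ValueError(f"unknown sort attribute: {by}")
```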

       8.5.4         Alert Me
       Once a search has been executed, users will have the ability to set an alert within MSS2008
       so that they are notified by email when the search results for the query change. The “Alert Me”
       link is available from the Search Results Page and, when clicked, navigates the user to the
       standard Alert Me registration page.
       Users will be able to manage their own subscriptions by following a link (embedded in the
       alerting email) to the standard MSS2008 “My Alerts” administration page.







       8.5.5         Search Result Display Formats
       Each result item displayed will provide as much information as possible to assist the user in
       selecting the best result from the list.
       The following features will be present in the search results:

       Feature                    Description

       Visual Representation Of   The ranking of the result as a visual guide such as stars (1-5) or as a percentage
       Relevance                  (e.g. 90%). MSS2008 assigns a Rank value to each search result item. The first item in
                                  the search results will be assigned a relative ranking of 100%. The ranking of other
                                  items will depend on their Rank value compared to the first item.

       Link Headings              Each search result heading will have a hyperlink that will navigate the user to the
                                  content.

       Search Summaries           A brief summary of the text surrounding the search terms within the content, or the
                                  keywords and MetaData used to match the result to the query, will be displayed.

       Divisional Graphical       Each search result will display the Intended Audience of the item. This will be
       Elements                   represented by a GIF image located next to the title heading. Publishing Divisions
                                  such as DES or Whole of Government will be displayed using their Divisional
                                  Graphical Elements.

       Highlight Keywords         Text in titles and summaries that matches the query text will be displayed in bold.

       URL of Sourced Page        The resulting item will contain the URL of the sourced page. When the URL is clicked,
                                  the user will be navigated to that content.

       Date Last Updated          The resulting item will include the date of last update (where possible). For TeamSite,
                                  this data will be sourced from the MetaData; for file shares, it will be the Last
                                  Modified attribute of the file. Other sources may not have this information.

       File Type and Size         The file type (such as PDF, DOC) will be displayed with its associated application icon.
                                  The size of the document will also be displayed.
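
       The relative ranking described for the Visual Representation Of Relevance feature can be
       sketched as follows. The table states only that the first item is 100% and that other items
       depend on their Rank value compared to the first item. The linear scaling and the mapping of
       percentages to stars used here are assumptions for illustration.

```python
import math


def relative_relevance(ranks):
    """Convert Rank values (highest first) to whole percentages of the top Rank.

    Assumption: scaling is linear, so an item with half the top Rank
    is shown as 50%.
    """
    if not ranks or ranks[0] <= 0:
        return [0 for _ in ranks]
    top = ranks[0]
    return [round(100 * rank / top) for rank in ranks]


def stars(percent):
    """Map a percentage to 1-5 stars (assumed: 20 percentage points per star)."""
    return max(1, min(5, math.ceil(percent / 20)))
```

       For example, Rank values of 800, 720 and 400 would be shown as 100%, 90% and 50% under this
       assumption.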




       8.5.6         Faceted Searching
       The Ontolica Search components will provide the faceted searching functionality on the search
       results page. The faceted search components will group the results by the defined facets and
       allow the user to refine the search further. Examples of facets could be Author, Document
       Type or File Format.
       The list of facets will be supplied by DES.
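
       Conceptually, faceted searching counts the results under each value of a facet and then filters
       the result list when the user selects a value. The following minimal sketch illustrates the
       idea only; it is not how the Ontolica components are implemented, and the result fields used
       are hypothetical.

```python
from collections import Counter


def facet_counts(results, facet):
    """Count how many results fall under each value of the given facet."""
    return Counter(r[facet] for r in results if r.get(facet) is not None)


def refine(results, facet, value):
    """Keep only the results whose facet matches the selected value."""
    return [r for r in results if r.get(facet) == value]
```

       A facet panel would typically display each value with its count and call the refinement step
       when the user clicks a value.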







       9.            Administration
       9.1           Backups
       Backup is an important process that allows the MSS2008 deployment to be fully restored to its
       former state, without loss of data or further complications, in the event of a system failure.
       There are three parts to the backup process:
                             MSS2008 Farm;
                             SQL 2005 Custom Databases;
                             Windows File system.
       All three parts must be completed for a successful backup solution to be implemented.

       9.1.1         MSS2008 Farm Backup
       The entire Search Server 2008 farm can be backed up as a full backup or a differential backup.
       Backups will be performed using an automated backup script utilising the STSADM command-line
       tool. When backing up with STSADM, backups can be performed on individual aspects of the
       MSS2008 deployment. For example, it is possible to back up an individual site collection or to
       back up the entire farm. In addition, the STSADM backup command will pause any indexing
       currently underway before running the backup.
       Backup strategies are considered out of scope and will fall under the standard DES backup
       regime. Backup template scripts will be provided during solution deployment and Getronics will
       assist DES to implement and test the backup regime.
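
       As an illustration of what a backup script might invoke, the following sketch builds STSADM
       backup command lines for a farm backup and for a single site collection. It is not the template
       script that will be delivered. The backup share and the site URL passed in by the caller are
       hypothetical, and the STSADM location assumes a default installation.

```python
# Assumed default location of STSADM on a standard installation.
STSADM = (
    r"C:\Program Files\Common Files\Microsoft Shared"
    r"\web server extensions\12\BIN\stsadm.exe"
)


def farm_backup_command(backup_dir, method):
    """Build the argument list for a farm-level backup.

    backup_dir is the backup target directory (for example a UNC share);
    method is 'full' or 'differential'.
    """
    if method not in ("full", "differential"):
        raise ValueError("method must be 'full' or 'differential'")
    return [STSADM, "-o", "backup",
            "-directory", backup_dir,
            "-backupmethod", method]


def site_collection_backup_command(url, filename):
    """Build the argument list for backing up one site collection to a file."""
    return [STSADM, "-o", "backup",
            "-url", url,
            "-filename", filename,
            "-overwrite"]
```

       A scheduled script could pass the returned list to subprocess.run(..., check=True) so that a
       failed backup raises an error that can be logged and monitored.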

       9.1.1.1       Full Backup
       A full backup of the MSS2008 farm is recommended to be run weekly, and only over the weekend,
       as this backup includes all SharePoint databases. Each full backup will overwrite the backup
       from the previous week.
       The current database size estimations mean that this backup will be very large and could take up
       to 48 hours to run. Exact storage requirements will not be clear until Phase III of the project.
       Getronics will work with DES to implement and test the backup regime.

       9.1.1.2       Differential Backup
       A differential backup of the MSS2008 farm is recommended to be run each weekday, after hours,
       to minimise the impact on user functionality and speed.
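
       The recommended schedule (differential on weekdays, full over the weekend) can be expressed as
       a small decision function for a scheduled backup script. This is a sketch under one assumption
       that the design does not state: the full backup starts on Saturday, which leaves the whole
       weekend for a run of up to 48 hours, and no new backup starts on Sunday.

```python
from datetime import date


def backup_method_for(day):
    """Return the backup method to start on the given date, or None.

    Weekdays -> 'differential'; Saturday -> 'full' (assumed start day);
    Sunday -> None, because the full backup may still be running.
    """
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday < 5:
        return "differential"
    if weekday == 5:
        return "full"
    return None
```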







       9.1.2         SQL 2005 Custom Databases
       Within the DES Enterprise Search solution, there are three databases that are not backed up
       under the previous scripts as they are not part of the MSS2008 Farm. These databases are:
                             MSS_TS_MetaData_Staging_PDB
                             MSS_TS_Metadata_PDB
                             MSS_TopTenQuery_PDB
       These databases should be backed up each night in line with DES’ standard backup procedures.
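
       If the nightly backup of these databases is scripted rather than handled by an existing DES
       backup tool, the Transact-SQL statements could be generated as sketched below. The backup
       directory is supplied by the caller and is hypothetical, and the WITH INIT option (overwrite
       the previous backup file) is an assumption about the desired retention.

```python
# Database names as listed above.
CUSTOM_DATABASES = [
    "MSS_TS_MetaData_Staging_PDB",
    "MSS_TS_Metadata_PDB",
    "MSS_TopTenQuery_PDB",
]


def backup_statement(database, backup_dir):
    """Build a T-SQL full backup statement for one database.

    WITH INIT overwrites any existing backup sets in the target file.
    """
    path = f"{backup_dir}\\{database}.bak"
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH INIT;"


def nightly_backup_script(backup_dir):
    """Return one backup statement per custom database, one per line."""
    return "\n".join(backup_statement(db, backup_dir) for db in CUSTOM_DATABASES)
```

       The generated script could then be run by a SQL Server Agent job or by another scheduler that
       DES already uses.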

       9.1.3         Windows File System Backup
       The MSS2008 file system folder should also be included in the backup process. This folder is
       located at:
       C:\Program Files\Common Files\Microsoft Shared\web server extensions\12
       This folder should be backed up regularly using either the built-in Windows backup utility with
       shadow copy, or a third-party backup utility that can access locked files.

       9.2           Monitoring
       DES Administrators are responsible for monitoring the Enterprise Search solution.
       The following services and tasks should be monitored to ensure they are running correctly:
                             Internet Information Services
                             Windows SharePoint Services Administration
                             Windows SharePoint Services Timer
                             Windows SharePoint Services Tracing
                             Windows SharePoint Services Search
                             Microsoft Office SharePoint Search
                             Windows Event Log
                             TeamSite ETL process
                             TopTen query data exports
       It is recommended that DES use Microsoft SCOM for monitoring. SCOM provides enhanced
       monitoring packs for the SharePoint environment.
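
       Where a lightweight check is wanted alongside SCOM, the state of each Windows service can be
       read from the output of the Windows sc query command. The sketch below only parses captured
       output; it assumes the caller has already run sc query for each service and stored the text
       against the display name used in the list above. The function names are illustrative.

```python
import re

# Matches the STATE line of "sc query" output, e.g. "STATE : 4  RUNNING".
_STATE_LINE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def service_state(sc_output):
    """Return the state name (such as 'RUNNING') or None if not found."""
    match = _STATE_LINE.search(sc_output)
    return match.group(1) if match else None


def services_not_running(outputs):
    """Given {display name: sc query output}, list services that are not running."""
    return [name for name, text in outputs.items()
            if service_state(text) != "RUNNING"]
```

       The ETL process and the TopTen query exports are tasks rather than services, so they would
       need a separate check, for example on the age of their most recent output.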

       9.3           Antivirus
       No server-side antivirus agents for MSS2008 and WSS v3 have been agreed upon.
       As the Search solution will not contain documents, it is assumed that documents will be scanned
       on the local disk by the desktop antivirus engine when they are uploaded or downloaded.



