Enterprise Search Engine Security

Mark Bennett

Protecting Confidential Information in Corporate Search

Agenda
 • Business Drivers
 • Levels of Security “Granularity”
 • “Early” vs. “Late” Binding – why it matters!
   • Vendor roundup
 • Organizational and Technical Challenges
 • Patching Search Security Holes
 • Trends
 • Wrap Up / Q & A


  Business Drivers

(why you should care)



The ES Security Paradox

As Search is deployed further and further into the Enterprise, the likelihood of having a security problem increases.




An Experiment You Should Try

 • You’ll be amazed what you can find on your own company’s network. Try searching for:
      confidential
      highly confidential
      salaries
      performance review
      Excel spreadsheets (.xls)
      Access databases (.mdb)
 • Also look for:
      Obscenities
      Racial and gender slurs
Shifts in Thinking
 • From technical security to Business Viability
     IP, financial/SEC, regulatory, espionage, privacy
     Downsides include:
       Loss of competitive advantage, Degradation of company
       reputation, Impact of fraud and misuse, Decisions made on faulty
       information, Loss of access to critical information, Legal and
       contract liability, Regulatory fines, Public safety
         (Forrester interview with Michael Rasmussen)

 • From “perimeter-focused” to “distributed”
     Must protect some data internally
     Some systems must trust other security providers
         (Burton Group)

                            Enterprise Search Security
                                  Summer 2008
Enterprise Search and Corporate Security

The Current State of Affairs

  The Good: SSO, SAML, LDAP, Active Directory
  The Bad:  Spidering, Org Boundaries
  The Ugly: Holes, Lack of Awareness



Levels of Security “Granularity”

   Summary:
   • Application / Collection
   • Document
   • Field / Sub-Document
   • Sub-Field / “Redaction”
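
To make the four levels concrete, here is a minimal sketch of how they might nest in application code. The `Document` and `DocField` classes, the group names, and the SSN masking rule are all hypothetical and only illustrate the idea; no particular engine works exactly this way.

```python
import re
from dataclasses import dataclass, field

@dataclass
class DocField:
    value: str
    allowed_groups: frozenset          # field-level ACL (hypothetical model)

@dataclass
class Document:
    doc_id: str
    collection: str
    allowed_groups: frozenset          # document-level ACL
    fields: dict = field(default_factory=dict)

# Hypothetical sub-field rule: mask anything shaped like a US SSN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def visible_view(doc, user_groups, allowed_collections):
    """Return the fields a user may see, or None if the document is hidden."""
    # 1. Application / Collection level
    if doc.collection not in allowed_collections:
        return None
    # 2. Document level
    if not (doc.allowed_groups & user_groups):
        return None
    view = {}
    for name, f in doc.fields.items():
        # 3. Field / Sub-Document level: drop fields the user may not read
        if not (f.allowed_groups & user_groups):
            continue
        # 4. Sub-Field "Redaction": mask sensitive substrings in visible fields
        view[name] = SSN_PATTERN.sub("[REDACTED]", f.value)
    return view

doc = Document(
    doc_id="hr-17",
    collection="hr",
    allowed_groups=frozenset({"hr", "managers"}),
    fields={
        "title": DocField("Review for J. Jones", frozenset({"hr", "managers"})),
        "salary": DocField("95000", frozenset({"hr"})),
        "notes": DocField("SSN 123-45-6789 on file", frozenset({"hr", "managers"})),
    },
)
```

With this made-up data, a user in `managers` would see the title and the redacted notes but not the salary field, while a user outside both groups would not see the document at all.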


Granularity: Collection Level




Granularity: Document Level




Granularity: Field Level




Granularity: Sub-Field “Redaction”




“Early Binding” vs. “Late Binding” Security

This choice affects performance and security infrastructure load

Defining “Early” vs. “Late” Binding
 • Early-Binding
     Search engine Index includes ACL info
       Forrester: “Caching security credentials”
 • Late-Binding
     ALL security work done at Search Time
       Forrester: “Run-time access validation”
 • Hybrid: combines Early and Late
 • Federated: leverage indigenous engines
     May require complex security mapping
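
The difference in security infrastructure load can be sketched with a toy in-memory index. Everything here is an assumption made for illustration: the `INDEX` contents, the `SecurityService` class, and the group names. In a real late-binding deployment the yes/no answer would come from the source repository rather than from ACLs sitting next to the index.

```python
# Toy index: doc id -> (text, groups allowed to read it). All data is made up.
INDEX = {
    "d1": ("sales forecast q3", {"sales"}),
    "d2": ("sales team picnic", {"everyone"}),
    "d3": ("forecast for hiring", {"hr"}),
}

class SecurityService:
    """Stand-in for the external security infrastructure; counts its calls."""
    def __init__(self, memberships):
        self.memberships = memberships
        self.calls = 0

    def groups_for(self, user):
        self.calls += 1
        return self.memberships.get(user, set())

    def can_read(self, user, doc_id):
        # Here the ACL is read from INDEX only to keep the sketch short.
        self.calls += 1
        return bool(INDEX[doc_id][1] & self.memberships.get(user, set()))

def matches(term):
    """Unsecured full-text match on whole words."""
    return [d for d, (text, _) in INDEX.items() if term in text.split()]

def search_early(term, user, sec):
    # Early binding: ACLs were stored at index time; one group lookup per query.
    groups = sec.groups_for(user)
    return [d for d in matches(term) if INDEX[d][1] & groups]

def search_late(term, user, sec):
    # Late binding: one access check per matching document.
    return [d for d in matches(term) if sec.can_read(user, d)]
```

Both functions return the same documents for the same user, but the number of calls to the security service stays at one per query for early binding and grows with the number of matches for late binding.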
Early vs. Late Binding Security




Early Binding Security (good!)




Late Binding Security (not so good)




Security Infrastructure Interaction
 • Early Binding: Index Time
    1.   I have document “http://corp.acme.com/sales/forcast.html”,
         what are the group IDs for it? (ACLs, etc)
 • Late Binding: Index Time
    No work needed at Index time
      • Would appear to be a simpler/better design
 • Early Binding: Search Time
    1.   I have Session ID “14729834416”, which User is that for?
    2.   I have User “Jones”, which groups is he in?
    3.   Transform the list of Group IDs into a Native Query Filter (with ACLs, etc)
 • Late Binding: Search Time
    1.   I have Session ID “14729834416”, can I access document
         “http://corp.acme.com/sales/forcast.html”, Yes or No?
         (repeat for every match)
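
The three early-binding search-time steps on this slide can be sketched as follows. The lookup tables stand in for SSO and directory services, the `acl` field name is hypothetical, and the filter uses a Lucene-like syntax chosen only for illustration; each engine has its own native filter language.

```python
# Hypothetical lookup tables standing in for SSO and directory services.
SESSIONS = {"14729834416": "Jones"}
GROUPS = {"Jones": ["sales", "all-employees"]}

def user_for_session(session_id):
    # Step 1: Session ID -> User
    return SESSIONS[session_id]

def groups_for_user(user):
    # Step 2: User -> groups
    return GROUPS.get(user, [])

def acl_filter(groups, field="acl"):
    # Step 3: group IDs -> native filter clause (Lucene-like syntax, an assumption).
    # Group names are assumed not to contain quote characters.
    if not groups:
        # No groups must mean "match nothing", never "match everything".
        return f'{field}:"__none__"'
    quoted = " OR ".join(f'"{g}"' for g in sorted(groups))
    return f"{field}:({quoted})"

def secured_query(session_id, user_query):
    """Combine the user's query with the ACL filter derived from the session."""
    user = user_for_session(session_id)
    return f"({user_query}) AND {acl_filter(groups_for_user(user))}"
```

Note the empty-group case: a user with no groups gets a filter that matches nothing, which is the safer failure mode.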


Vendor Roundup

Early vs. Late Binding



Vendor: FAST Search & Transfer
 • Supports Early and Late binding
 • Can use BOTH together
     Hybrid approach “Best of both Worlds”
 • Gets along very well with
     Microsoft Active Directory
     FAST SAM = Security Access Module
     Based on Windows technology
 • Can still use your own application-level logic if you prefer

Vendor: Autonomy

 • IDOL supports both Early and Late binding:
     Hybrid approach “Best of both Worlds”
     IDOL: Early Binding = “Mapped”
     IDOL: Late Binding = “Unmapped”

 • Ultraseek
     Ultraseek is Late Binding only



  Vendor: Google Appliance

 • Google Appliance
     Late-Binding only
     “spin” is low latency – but actually a compromise...
     Could heavily load security infrastructure
       Does use some caching to lighten the load
       Caching decreases response time = good
       Caching increases latency (ACL changes take longer to apply)
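
The caching trade-off can be illustrated with a small time-to-live cache placed in front of a late-binding access check. This is a generic sketch and not a description of how the Google Appliance or any other product caches decisions; the class name, the `ttl` value, and the injectable `clock` are all assumptions.

```python
import time

class CachedAccessChecker:
    """Caches yes/no access decisions for ttl seconds."""

    def __init__(self, check_fn, ttl, clock=time.monotonic):
        self.check_fn = check_fn    # the slow call to the security infrastructure
        self.ttl = ttl
        self.clock = clock          # injectable so behaviour can be shown without sleeping
        self.cache = {}             # (user, doc_id) -> (allowed, time cached)
        self.backend_calls = 0

    def can_read(self, user, doc_id):
        key = (user, doc_id)
        now = self.clock()
        hit = self.cache.get(key)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]           # fast path, but possibly stale
        self.backend_calls += 1
        allowed = self.check_fn(user, doc_id)
        self.cache[key] = (allowed, now)
        return allowed
```

If access is revoked right after a decision is cached, the checker keeps answering "yes" until the entry expires. A longer `ttl` lightens the load on the security infrastructure and widens that window.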




  Vendor: Endeca

 • Out of the box is Early Binding only
     Mitigated by low latency for document changes
     Provides accurate document counts by user
     General term is “Record Filters”
 • Or can use “joins” to a fulltext ACL index
     RRN: Relational Record Navigation
 • Late binding via custom code
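
Why early binding makes accurate per-user counts easy can be shown with a generic sketch: when the ACL is available alongside each record, counts are computed only over records the user can read. This is not Endeca's API; the record layout, facet values, and group names are invented for illustration.

```python
from collections import Counter

# Made-up records: (topic facet value, groups allowed to read the record)
RECORDS = [
    ("Product Launch", {"everyone"}),
    ("Product Launch", {"everyone"}),
    ("Upcoming Layoff", {"executives"}),
]

def facet_counts(records, user_groups=None):
    """Count facet values; if user_groups is given, count only readable records."""
    counts = Counter()
    for topic, acl in records:
        if user_groups is None or acl & user_groups:
            counts[topic] += 1
    return dict(counts)
```

Counting without the filter would expose the existence of the restricted topic to every user, even though none of its documents could be opened.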


“Vendor” Lucene / Solr / Nutch

 • Roll your own…




Organizational and Technical Challenges

“They won’t let me in!”


Access Issues
 • Spider may need “Über Login”

 • Divisions worried about loss of control
     Worried about cached copies of data

 • Several Approaches
   1. Global Indexing – single Monolithic Search
   2. Federated Search – leverage what’s already there
   3. “Deferred Search”

Federated Search




Deferred Search


Search Engine Security Holes



Check List
 • Limit access to Disk files
     Use File / SSH restrictions
     Don’t recommend total file encryption
        (exception for password files of course)
 • Files to keep in mind
     Config files, Scripts
     LOGS
 • Search Engine Indices
     In some search engines DOCUMENTS CAN BE RECONSTRUCTED from the Words Index
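
The reconstruction risk is easy to see with a positional inverted index, the kind of structure commonly used to support phrase search. The sketch below is a toy model, not any engine's on-disk format, and the sample document is made up.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Build word -> {doc_id: [positions]} from a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def reconstruct(index, doc_id):
    """Rebuild a document's word sequence using only the index."""
    slots = {}
    for word, postings in index.items():
        for pos in postings.get(doc_id, []):
            slots[pos] = word
    return " ".join(slots[p] for p in sorted(slots))

docs = {"memo": "Layoffs planned for the Denver office"}
index = build_positional_index(docs)
```

Calling `reconstruct(index, "memo")` recovers the full word sequence, minus case, from the index alone. Anyone who can read the index files can therefore read a close copy of the content, which is why index directories belong on the access-restriction list.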

Other “Gotchas”
 • Secure the Search Admin UI!
     May require other back end changes
 • Secure the Search Analytics UI
     Can assign various “roles” as appropriate
 • Secure TCP/IP traffic where appropriate
     Searches, spider, logging, admin UI
     Overkill in some cases
 • Beware of Cached Data
     Can violate automatic retention policy



Editing Search Engine URLs
 • Form-Based Filtering:
   http://www.acme.com/go?coll=public

 • Hackable View URLs
   http://www.acme.com/go?viewdoc=100

 • DOCUMENT HIGHLIGHTING represents a potential Security Hole
     Results List Summaries
     Full-Document highlighting
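
One way to close the editable-URL hole is to treat `coll` and `viewdoc` as untrusted input and re-check them on the server for every request. The sketch below assumes hypothetical lookup tables (`USER_COLLECTIONS`, `DOC_COLLECTION`) and user names; only the two URL shapes come from the slide.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical server-side state; names and contents are illustrative only.
USER_COLLECTIONS = {"guest": {"public"}, "jones": {"public", "sales"}}
DOC_COLLECTION = {"100": "sales", "101": "public"}

def authorize(url, user):
    """Return True only if every collection or document named in the URL is allowed."""
    params = parse_qs(urlparse(url).query)
    allowed = USER_COLLECTIONS.get(user, set())
    for coll in params.get("coll", []):
        if coll not in allowed:
            return False
    for doc_id in params.get("viewdoc", []):
        # Unknown documents are denied the same way as forbidden ones.
        if DOC_COLLECTION.get(doc_id) not in allowed:
            return False
    return True
```

The same check would need to run before serving highlighted summaries or full-document highlighting, since those views return document text through a different path.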


Gotchas: Misc.
 • Results Navigators show Meta Data
     Employees see “Upcoming Layoff”, etc.

 • Detecting FAILED pages with status 200
     Some Web Servers give back nicely formatted error
     screens or redirects, instead of an HTTP error code

 • Desktop Search Holes
     Peer-to-peer may not be properly controlled
     May bypass Office file/doc passwords

 • User Data: To Log or Not to Log?
     Potential liability with either choice
       Employee Privacy Concerns
       De Facto Notification
     Disclaimer: We are not lawyers
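
For the "FAILED pages with status 200" item above, a spider can apply a simple heuristic before indexing a page. This is a rough sketch: the phrase list and the 2000-character window are assumptions that would need tuning per site, and a legitimate document that discusses these phrases would be flagged as a false positive.

```python
import re

# Hypothetical error phrases; tune for the sites being crawled.
ERROR_PATTERNS = re.compile(
    r"page not found|access denied|please log ?in|session (has )?expired|error 40[134]",
    re.IGNORECASE,
)

def looks_like_failed_page(status, final_url, requested_url, body):
    """Heuristic: flag responses that are probably errors despite the status code."""
    if status != 200:
        return True
    # A redirect that ends at a login page is suspicious.
    if final_url != requested_url and "login" in final_url.lower():
        return True
    # Check the start of the body, where titles and headings usually appear.
    return bool(ERROR_PATTERNS.search(body[:2000]))
```

Pages flagged this way could be skipped or queued for manual review rather than indexed as if they were the real document.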
Wrapping Up…




The Near Future
Enterprise Search and Corporate Security
 • Search & Security tied to SOX/HIPAA
    • Search Logs get Regulatory Interest
    • Who Saw What, When
    • Failure to Spot Trends becomes Negligence
 • Distributed Credentials Management
    • Not as big a factor in the Enterprise
    • More cooperation between e-commerce sites
    • Government employees accessing other agencies


Call to Action!
Enterprise Search and Corporate Security

 • Run some test searches!

 • Do you know your company’s current policies?

 • If confused, talk to your vendor, or get some professional help




Resources

Search Dev Newsgroup: www.SearchDev.org
Newsletter & Whitepapers: www.ideaeng.com/current
Blog: www.EnterpriseSearchBlog.com



Finish Line
Review & Questions

General Info: info@ideaeng.com
Mark Bennett: mbennett@ideaeng.com



