

Redundancy and Information Leakage in Fine Grained Access Control

Govind Kabra (Univ of Illinois, Urbana-Champaign)
Ravi Ramamurthy (Microsoft Research)
S. Sudarshan (IIT Bombay)
                                        Modified for the course by:
                                        Adil Anis Sandalwala
 April 6, 2009                                                    1
Fine Grained Access Control
    SQL authorization works at the level of tables and columns
        e.g. grant select on employee(name) to public

    Fine-grained access control examples
        Managers can see records of their employees
        Faculty have access to grades of the courses they teach

    Application-layer support for FGA
        Several limitations

    Database support for FGA
        Validity checking model
        View replacement model
View Replacement Model for FGA
    Based on query rewriting
        Create an authorization view RA for relation R
        In the user query, replace R by RA

    Auth view authL: customers can see the lineitems only for their orders
    User query:
        select * from lineitem
        where shipmode = 'express'
    [Figure: the plan σ(L) becomes σ(authL), which expands to σ(L ⋉ σ(O))]

    Several proposals
        Oracle VPD, Sybase row-level security
        LeFevre et al. [VLDB04], Agrawal et al. [ICDE05]

    Key implementation issues
        Redundancy in rewritten queries
        Information leakage through UDFs, timing analysis, exceptions
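The view replacement step can be illustrated with a small sketch. It assumes a toy schema in an in-memory SQLite database, a hypothetical helper `rewrite_with_auth_views`, and a hard-coded current user (customer 123); real systems rewrite the parsed query rather than its text.

```python
import re
import sqlite3


def rewrite_with_auth_views(query: str, auth_views: dict) -> str:
    """Replace each relation name R in the query text by its view RA.

    Hypothetical helper: plain textual substitution, good enough for a demo
    but not for queries whose string literals mention the relation name.
    """
    for relation, view in auth_views.items():
        # \b avoids touching identifiers that merely contain the name.
        query = re.sub(rf"\b{re.escape(relation)}\b", view, query)
    return query


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders(o_orderkey INTEGER, o_custkey INTEGER);
    CREATE TABLE lineitem(l_orderkey INTEGER, shipmode TEXT);
    INSERT INTO orders VALUES (1, 123), (2, 456);
    INSERT INTO lineitem VALUES (1, 'express'), (1, 'ground'), (2, 'express');
    -- Assumed authorization view for the current user, customer 123.
    CREATE VIEW authL AS
        SELECT * FROM lineitem
        WHERE l_orderkey IN
              (SELECT o_orderkey FROM orders WHERE o_custkey = 123);
""")

user_query = "select * from lineitem where shipmode = 'express'"
rewritten = rewrite_with_auth_views(user_query, {"lineitem": "authL"})
rows = conn.execute(rewritten).fetchall()
```

The user keeps writing queries against `lineitem`; only the rewritten query, which reads from `authL`, is executed.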

    Motivation
    Previous work
    Redundancy Removal
    Information Leakage
      What plans can be guaranteed not to leak information
      Techniques to find the optimal safe plan
    Integrating RR and safety
    Conclusion

Previous work

    Oracle's Virtual Private Database (VPD)
      Supports FGA through functions associated with each relation,
       which return predicates as strings
    Cell-level access control (LeFevre et al.)
      Replaces unauthorized values with null
    Two classes of models:
      Truman models: use query rewriting
      Non-Truman models: a query is valid if it can be rewritten
       using authorized views; invalid queries are rejected

Query Rewriting Model
    An authorized view:
     CREATE VIEW auth_Ri AS
          SELECT Li FROM Ri WHERE Pi
     Li contains expressions implementing cell-level access control
     Pi has the authorization predicates (may have sub-queries)
    Such authorized views are represented as:
     Ri ⋉_θi Ai
     where Ai is an expression containing the sub-queries in Pi.
     Selection conditions in Pi are folded into the semi-join condition θi
     For simplicity, from now on we assume Li to be *
    Thus a query of the form
     R1 ⋈ R2 ⋈ … ⋈ Rn
    is rewritten as
     (R1 ⋉_θ1 A1) ⋈ (R2 ⋉_θ2 A2) ⋈ … ⋈ (Rn ⋉_θn An)
       Redundancy Removal

                Most queries access only authorized data

Redundancy between queries and authorization
    Auth view authL: customers can see lineitems only for their
     own orders

    Query: customer 123 wants to see details of lineitems
     shipped using express mode, only for his orders
     select * from lineitem L, orders O
     where l_orderkey = o_orderkey
            and o_custkey = 123
            and l_shipmode = 'express'
    [Figure: the plan σ(L ⋈ O) before and after view replacement; authL expands to L ⋉ σ(O)]

    The semi-join check is redundant!
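A quick way to see the redundancy is to run the query with and without the authorization check on sample data. The sketch below assumes a toy in-memory SQLite database with made-up rows; the check is written as an EXISTS sub-query, which plays the role of the semi-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders(o_orderkey INTEGER, o_custkey INTEGER);
    CREATE TABLE lineitem(l_orderkey INTEGER, l_shipmode TEXT);
    INSERT INTO orders VALUES (1, 123), (2, 456), (3, 123);
    INSERT INTO lineitem VALUES
        (1, 'express'), (1, 'ground'), (2, 'express'), (3, 'express');
""")

# Rewritten query: the user's join plus the authorization semi-join
# (lineitems must belong to one of customer 123's orders).
with_check = """
    SELECT l.l_orderkey, l.l_shipmode
    FROM lineitem l, orders o
    WHERE l.l_orderkey = o.o_orderkey
      AND o.o_custkey = 123
      AND l.l_shipmode = 'express'
      AND EXISTS (SELECT 1 FROM orders a
                  WHERE a.o_orderkey = l.l_orderkey
                    AND a.o_custkey = 123)
    ORDER BY l.l_orderkey
"""

# Same query after redundancy removal: the user's own join already
# restricts lineitems to customer 123's orders.
without_check = """
    SELECT l.l_orderkey, l.l_shipmode
    FROM lineitem l, orders o
    WHERE l.l_orderkey = o.o_orderkey
      AND o.o_custkey = 123
      AND l.l_shipmode = 'express'
    ORDER BY l.l_orderkey
"""

rows_with_check = conn.execute(with_check).fetchall()
rows_without_check = conn.execute(without_check).fetchall()
```

Agreement on one data set does not prove equivalence; it only illustrates why the optimizer may drop the check once subsumption has been established.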

Redundancy detection and removal-I

    In general, RR is equivalent to query minimization
    Heuristic approach: eliminate redundant semi-joins
          If E2 subsumes E1, then transform E1 ⋉ E2 to E1

    [Figure: the plan E1 ⋉ E2 before applying RR, and E1 alone after]

    Added transformation rules in a rule based optimizer
          Use materialized view matching support for testing subsumptions
Redundancy detection and removal-II

    Subsumption test
          E1 is subsumed by E2 in E1 ⋉_θi E2 if
               The predicates in the selection of E2 are weaker than the corresponding
                predicates in E1
               The semi-join condition θi equates the columns of E1 and E2 that are
                equivalent under the mapping

    Rule to detect and remove redundancy:
          If E2 subsumes E1, then replace E1 ⋉_θi E2 by E1
          In case of a disjunction of sub-query expressions:
               Apply the subsumption test to each disjunct
               If any one is found to subsume E1, then discard the complete set of semi-joins

    Consider the query:
     select * from E1 where (A in (select …)) OR (B in (select …))
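The subsumption test can be sketched for a restricted case. This is a simplified illustration, not the optimizer's view-matching machinery: expressions are assumed to be conjunctive, "weaker" is approximated by set containment of predicates, the mapping between relations is the identity, and the names `Expr`, `equivalent_columns` and `subsumes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    relations: frozenset   # base relations used by the expression
    equalities: frozenset  # column = column predicates, each a frozenset({a, b})
    filters: frozenset     # column = constant predicates, each (column, value)


def equivalent_columns(expr: Expr, a: str, b: str) -> bool:
    """True if expr's column equalities force columns a and b to be equal."""
    reachable, frontier = {a}, [a]
    while frontier:
        col = frontier.pop()
        for eq in expr.equalities:
            if col in eq:
                for other in eq - {col}:
                    if other not in reachable:
                        reachable.add(other)
                        frontier.append(other)
    return b in reachable


def subsumes(e2: Expr, e1: Expr, semijoin_condition) -> bool:
    """Simplified test for replacing E1 semi-join E2 by E1.

    semijoin_condition is a list of (e1_column, e2_column) equalities.
    """
    # Every relation of E2 must map onto a relation of E1 (identity mapping).
    if not e2.relations <= e1.relations:
        return False
    # E2's predicates must be weaker: each one already holds in E1.
    if not (e2.equalities <= e1.equalities and e2.filters <= e1.filters):
        return False
    # The semi-join condition may only equate columns already equal in E1.
    return all(equivalent_columns(e1, c1, c2) for c1, c2 in semijoin_condition)


query_123 = Expr(
    relations=frozenset({"lineitem", "orders"}),
    equalities=frozenset({frozenset({"l_orderkey", "o_orderkey"})}),
    filters=frozenset({("o_custkey", 123), ("l_shipmode", "express")}),
)
query_456 = Expr(
    relations=query_123.relations,
    equalities=query_123.equalities,
    filters=frozenset({("o_custkey", 456), ("l_shipmode", "express")}),
)
auth_check = Expr(  # the sub-query of authL for customer 123
    relations=frozenset({"orders"}),
    equalities=frozenset(),
    filters=frozenset({("o_custkey", 123)}),
)
condition = [("l_orderkey", "o_orderkey")]

redundant_for_123 = subsumes(auth_check, query_123, condition)
redundant_for_456 = subsumes(auth_check, query_456, condition)
```

For customer 123's own query the check is found redundant; for a query that filters on another customer it is kept.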

Redundancy detection and removal-III

    Consider a rewritten query:
     (R1 ⋉_θ1 A1) ⋈ (R2 ⋉_θ2 A2) ⋈ … ⋈ (Rn ⋉_θn An)

    Rules applied at:
          Transformation Phase:
               Explores all possibilities of detecting redundancy
               Inefficient.
          Simplification Phase: normalized form obtained by pulling
           up semi-joins.
               Linear number of authorization checks
               Depends on the order of the Ai's
               Easy to integrate with existing optimizers.
Performance benefits of RR
    TPC-H Benchmark Queries, with authorization checks

                TPC-H Query   Execution time   Execution time
                              without RR       with RR
                Query 3       100.00           48.28
                Query 6       56.03            38.79
                Query 10      94.83            55.45
                Query 12      77.57            43.97
                Query 14      49.14            38.79

                Normalized execution times

       Information Leakage

                So you thought only the query result

Information Leakage via UDFs
    Auth view myemployees: only those employees whose
     dept_id is in A1
    Query:
     select * from employee
       where myudf(salary)
    [Figure: σ_myudf(E.salary)(myemployees) is rewritten to
     σ_myudf(E.salary)(employees ⋉ A1), which the optimizer may transform to
     σ_myudf(E.salary)(employees) ⋉ A1]

    Final query plan is not safe
          The UDF may be pushed down in the plan and executed on an
           unauthorized intermediate result
          As a side effect, the UDF may expose values passed to it [Litchfield]


Other channels of information leakage

    Exceptions
          Query: select * from employee
                   where 1/(salary-100K) = 0.23
          Query plan: Selection condition in query gets pushed below
           authorization semi-join
          Divide by zero exception if salary = 100K
          Reveals that employee has salary = 100K
    Error Messages
          to_Integer function may throw error revealing the content
    Timing Analysis
          Sub-query can perform an expensive computation only if certain
           tuples are present in its input.
          Can be partly solved using sandboxing
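The exception channel can be reproduced with a few lines of Python. The sketch assumes two made-up employee rows and models the plans as list comprehensions; `safe_plan` and `pushed_down_plan` are hypothetical names for the two evaluation orders.

```python
employees = [
    {"name": "alice", "dept_id": 1, "salary": 90_000},
    {"name": "bob", "dept_id": 2, "salary": 100_000},  # not visible to the user
]
AUTHORIZED_DEPTS = {1}  # assumed content of the authorization check


def predicate(emp):
    """The user's selection: 1/(salary-100K) = 0.23."""
    return 1 / (emp["salary"] - 100_000) == 0.23


def authorized(emp):
    return emp["dept_id"] in AUTHORIZED_DEPTS


def safe_plan(rows):
    # Authorization check first; the predicate only sees authorized tuples.
    return [e for e in rows if authorized(e) and predicate(e)]


def pushed_down_plan(rows):
    # Selection pushed below the authorization check: it runs on every
    # tuple, including the ones the user must not see.
    selected = [e for e in rows if predicate(e)]
    return [e for e in selected if authorized(e)]


safe_result = safe_plan(employees)

try:
    pushed_down_plan(employees)
    leaked = False
except ZeroDivisionError:
    # The error tells the user that some employee earns exactly 100K.
    leaked = True
```

Both plans are equivalent as relational expressions; they differ only in what the user can observe when evaluation fails.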
Preventing Information Leakage via UDFs

    UDFonTop: Keep UDFs at the top of query plan
          Definitely safe, no information leakage
          Better plans possible if UDF is selective

    [Figure: query plan with the nodes σ_myudf(E.salary), employees and A1]

    Optimal Safe plan
          When is a plan safe?
          How to search for optimal plan amongst alternative safe plans?

Safe plans w.r.t. UDFs
    Approach 1: If UDF uses attributes from R, apply
     authorization checks for R before UDF
          Not sufficient; Full expression must be authorized
          Expression that can be rewritten using authorized views [RMSR04]
          How to efficiently infer which expressions are authorized?
    Auth views: employee, (medical-record ⋉ A2)
    Query: find names of all employees having AIDS
    [Figure: alternative plans joining employees with medical-record, with
     σ_M.disease='AIDS', the semi-join with A2 and a UDF selection (udf2) at
     different positions]
Some definitions
    Authorized Expression
           An expression is authorized if it is equivalent to an expression
           defined using only authorized views.
    Safety w.r.t. unsafe functions (USFs)
           A node in a query plan is safe w.r.t. USFs if:
               There are no USFs in the node, and all inputs (if any) of the
                node are safe, or
               The node has a USF, it is not an apply operator, and all its
                inputs are safe and authorized, or
               The node is an apply operator, both its children are safe, and
                 the right child does not have any USF invocations, or
                 the left child is authorized
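The safety definition translates almost directly into a recursive check. The sketch below assumes a hypothetical `Node` class whose `authorized` flag has already been set by some earlier inference step; it is an illustration of the definition, not the optimizer's data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    has_usf: bool = False      # the operator itself invokes a USF
    is_apply: bool = False     # apply operator (left input, right sub-query)
    authorized: bool = False   # assumed to be inferred elsewhere
    inputs: list = field(default_factory=list)


def contains_usf(node: Node) -> bool:
    """True if the node or anything below it invokes a USF."""
    return node.has_usf or any(contains_usf(child) for child in node.inputs)


def is_safe(node: Node) -> bool:
    # In every case of the definition, all inputs must themselves be safe.
    if not all(is_safe(child) for child in node.inputs):
        return False
    if node.is_apply:
        left, right = node.inputs
        return (not contains_usf(right)) or left.authorized
    if node.has_usf:
        return all(child.authorized for child in node.inputs)
    return True  # no USF in this node


# UDF pushed below the authorization semi-join: runs on raw employees.
unsafe_plan = Node("semi-join", inputs=[
    Node("select myudf", has_usf=True, inputs=[Node("employees")]),
    Node("A1"),
])

# UDF on top of the authorized expression employees semi-join A1.
safe_plan = Node("select myudf", has_usf=True, inputs=[
    Node("semi-join", authorized=True,
         inputs=[Node("employees"), Node("A1")]),
])

unsafe_plan_is_safe = is_safe(unsafe_plan)
safe_plan_is_safe = is_safe(safe_plan)
```

The two example plans correspond to the UDF being pushed below the authorization check and to the UDF being kept above it.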

   Framework of rule based optimizer


   [Figure: optimizer memo for the example query before and after a transformation adds group G7; G1 holds employees, G2 medical-records and G3 Q1]
Inferring authorization of expressions
    Authorization as a logical property of group

    Start with the rewritten query

    Mark groups containing original authorization views as authorized
    Rule IA: If all the children group nodes of an operation
     node are authorized, the parent group node of that
     operation node is also marked as authorized.
    Propagate authorization upwards to the parent groups
          A node which is not authorized initially may be inferred as
           authorized later.
          This information must be propagated to the parents of the node
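Rule IA and the upward propagation amount to a fixpoint computation over the memo. The sketch below assumes a much simplified memo, a dict from group name to a list of alternative operation nodes, each given as a tuple of child groups; the group names and their contents are made up for illustration and are not taken from the slides' figure.

```python
def propagate_authorization(memo, initially_authorized):
    """Return the set of groups that can be inferred as authorized.

    memo: dict mapping a group to its alternative operation nodes, each a
    tuple of child groups (an empty tuple stands for a base relation scan).
    """
    authorized = set(initially_authorized)
    changed = True
    while changed:  # repeat until no new group gets marked
        changed = False
        for group, operations in memo.items():
            if group in authorized:
                continue
            # Rule IA: one operation node whose children are all authorized
            # is enough, since all nodes in a group are equivalent.
            # Scans (no children) never authorize a group by themselves.
            if any(children and all(c in authorized for c in children)
                   for children in operations):
                authorized.add(group)
                changed = True
    return authorized


memo = {
    "G1": [()],            # employees (an authorization view)
    "G2": [()],            # medical-record
    "G3": [()],            # A2
    "G4": [("G2", "G3")],  # medical-record semi-join A2 (an authorization view)
    "G6": [("G1", "G2")],  # employees join medical-record
    "G8": [("G7",)],       # a selection on top of G7
    "G7": [("G6", "G3"),   # (employees join medical-record) semi-join A2
           ("G1", "G4")],  # employees join (medical-record semi-join A2)
}

authorized_groups = propagate_authorization(memo, {"G1", "G4"})
```

G7 is not authorized initially but becomes so through its second alternative, and G8 is only picked up on the next pass, which is why the loop runs to a fixpoint.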
   [Figure: the memo with groups G1 to G7, before and after authorization is propagated upward from the groups holding authorization views]
    Extending optimizer to find optimal safe plan

      There are two approaches to find the optimal safe plan:

           Only Safe Transformations
                       Allow UDF push-down/pull-up only on top of authorized expressions
                       Only safe alternatives are present in the memo; pick the optimal plan

           Pick Safe Plan
                       Allow all transformations for UDFs
                       Use the “required/derived feature” to pick only plans where UDFs are on
                        top of authorized expressions

Both RR and Optimal Safe Plan are necessary: Motivation

                                 No RR            With RR

                UDF on top       100              47.83

                Safe Optimal     53.25            23.25

                     Comparing normalized execution times.

Integrating RR and Optimal safe plan

    Rule-based optimizers involve a simplification phase
     followed by a transformation phase
          RR in simplification reduces query size and optimization time
    But RR in simplification interferes with safety inference
          Optimal safe plan generation requires preserving the rewritten
           input plan, (R1 ⋉_θ1 A1) ⋈ … ⋈ (Rn ⋉_θn An), until the memo is created
          RR can possibly remove some Ai
    Possible integration:
          RR in transformation phase
          RR in simplification phase with conditioned authorization for safe
           plan generation

RR during Transformation Phase

    Introduce authorization-anchor nodes
          These prevent transformations that pull-up Ri or Ai‟s or push
           down any operation into the semi-join

    At the start of transformation, we remove these nodes and
     perform authorization propagation.

    Then RR rules are applied.

    Disadvantage:
          Increased optimization time due to multiple redundancy checks of

RR in simplification phase with conditioned authorization
    Instead of marking an expression authorized, we mark it
     as conditioned-authorized.
          E.g., we have a relation Ri with authorization Ai
               Ai could be removed or moved elsewhere by RR
               So we mark Ri as authorized conditioned on Ai,
                 i.e. conditioned on it being semi-joined/joined with Ai

    If simplification results in an empty condition, we can infer
     that the expression is unconditionally authorized.
    For a group:
          If any child is unconditionally authorized, so is the group.

    If expression E is of the form E1                      E2, where
          E1 is authorized conditioned on A1 and
          E2 is equivalent to Bj Ai, then
          We infer that resultant expression is unconditionally authorized.
Rule for propagating authorization

    The extended propagation rule is:
      If an operation has two input groups E1 and E2, authorized
       conditioned on A1 and A2 respectively, then the result is
       authorized conditioned on A1 and A2
      If A1 subsumes E2, we drop A1 from the condition.

Handling Exceptions and Error Messages

    For each built-in function, we create a safe version of the
     function that ignores exceptions and does not output error messages
    Predicates using USFs are rewritten using the
     corresponding safe version.
     We can create a safe version of the division function, which catches
      the exception and returns a null value.
     For the predicate (1/(salary-100K) == 0.2) we can use this safe division.
      This may allow unauthorized tuples to pass through. However,
      we can write the safe predicate such that it is weaker than the original condition.
    We can push down the safe predicates while retaining
     the unsafe version on top.
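The safe-version idea can be sketched as follows. The function names `safe_divide`, `safe_predicate` and `plan`, the sample rows and the use of Python's `None` for SQL null are assumptions made for illustration.

```python
def safe_divide(numerator, denominator):
    """Division that never raises: returns None (null) on division by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


def unsafe_predicate(salary):
    """Original predicate: 1/(salary-100K) == 0.2; may raise."""
    return 1 / (salary - 100_000) == 0.2


def safe_predicate(salary):
    """Weaker, exception-free version of the original predicate."""
    value = safe_divide(1, salary - 100_000)
    # A null result lets the tuple through, so every tuple accepted by the
    # original predicate is also accepted here.
    return value is None or value == 0.2


def plan(rows, authorized):
    # Safe predicate pushed below the authorization check ...
    prefiltered = [r for r in rows if safe_predicate(r["salary"])]
    visible = [r for r in prefiltered if authorized(r)]
    # ... and the original predicate retained on top, on authorized data only.
    return [r for r in visible if unsafe_predicate(r["salary"])]


rows = [
    {"name": "alice", "dept_id": 1, "salary": 100_005},
    {"name": "bob", "dept_id": 2, "salary": 100_000},  # unauthorized tuple
    {"name": "carol", "dept_id": 1, "salary": 90_000},
]
result = plan(rows, authorized=lambda r: r["dept_id"] == 1)
```

The unauthorized tuple with salary 100K passes the pushed-down safe predicate without raising and is then removed by the authorization check, so the original predicate on top never sees it.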
Performance Evaluation
    Study utility of RR and Optimal Safe Plan
    Auth: managers can see only information pertinent to
     their region
                   authNation: Nation ⋉ (σ(Region))
                   authCustomer: Customer ⋉ (Nation ⋉ (σ(Region)))
                   …
    Query: find suppliers who fulfill “important” orders

    [Figure: the query plan after view replacement]

Both RR and Optimal Safe Plan are necessary

                                 No RR            With RR

                UDF on top       100.00           47.83

                Safe Optimal     53.25            23.25

    [Figure annotations: Apply RR, UDF On Top, Safe Optimal, Apply Both]

Conclusion
    Redundancy in queries
          Transformation rules for redundancy removal
    Information leakage
          Definition of a safe plan
          Extending optimizer for generating optimal safe plan
    Preliminary performance study of proposed techniques
          Ensure safety while providing significant performance benefits
    Future:
          Study conditioned authorization to reduce optimization time
          Better solution for timing analysis based information leakage
          Add rules for handling authorizations involving nullification and

                 Thank You!!


