					Methods for Software Protection
             Prof. Clark Thomborson

             Keynote Address at the
            International Forum on
             Computer Science and
         Advanced Software Technology
            Jiangxi Normal University
                 11th June 2007
Questions to be (Partially) Answered
 What is security?
 What is software watermarking, and how
  is it implemented?
 What is software obfuscation, and how
  is it implemented?
     How does software obfuscation compare
      with encryption?
     Is “perfect obfuscation” possible?

                    SW Protection 11June07    2
 What is Security?
 (A Taxonomic Approach)
The first step in wisdom is to know the things themselves;
this notion consists in having a true idea of the objects;
objects are distinguished and known by classifying them
methodically and giving them appropriate names.
Therefore, classification and name-giving will be the
foundation of our science.

Carolus Linnæus, Systema Naturæ, 1735

(from Lindqvist and Jonsson, “How to Systematically
Classify Computer Security Intrusions”, 1997.)

    Standard Taxonomy of Security

1.   Confidentiality: no one is allowed to read, unless they
     are authorised.
2.   Integrity: no one is allowed to write, unless they are
     authorised.
3.   Availability: all authorised reads and writes will be
     performed by the system.
    Authorisation: giving someone the authority to do
    Authentication: being assured of someone’s identity.
    Identification: knowing someone’s name or ID#.
    Auditing: maintaining (and reviewing) records of
     security decisions.
    A Multi-Level Hierarchy

   Static security: the Confidentiality, Integrity, and
    Availability properties of a system.
   Dynamic security: the technical processes which
    assure static security.
      The gold standard: Authentication, Authorisation, Audit.

       Defense in depth: Prevention, Detection, Response.
   Security governance: the “people processes”
    which develop and maintain a secure system.
       Governors set budgets and delegate their responsibilities
        for Specification, Implementation, and Assurance.

    Generalized Static Security
   Confidentiality, Integrity, and Availability are properties
    of read and write operations on data objects.
   What about executable objects?
       Unix directories have “rwx” permission bits.
       XXXX-ity: all executions must be authorised.
       GuiJu FangYuan ZhiZhiYe  a new English adjective
   At the top of a taxonomy we should combine, rather
    than divide.
       Confidentiality, Integrity, and Guijuity are Prohibitions.
       Availability is a Permission.
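[Diagram: two taxonomy trees over C, I, G, and A. In the second tree, C, I, and G are grouped under P− (prohibitions) and A is placed under P+ (permissions).]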
Prohibitions and Permissions

   Prohibition: prevent an action.
   Permission: allow an action.
   There are two types of action-secure systems:
       In a prohibitive system, all actions are prohibited by
        default. Permissions are granted in special cases, e.g.
        to authorised individuals.
       In a permissive system, all actions are permitted by
        default. Prohibitions are special cases, e.g. when an
        individual attempts to access a secure system.
   Prohibitive systems have permissive subsystems.
   Permissive systems have prohibitive subsystems.
    Recursive Security
   Prohibitions, e.g. “Thou shalt not kill.”
        General rule: An action (in some range P−) is
         prohibited, with exceptions (permissions) E1, E2, E3, ...
   Permissions, e.g. a “licence to kill” (James Bond).
        General rule: An action in P+ is permitted, with
         exceptions (prohibitions) E1, E2, E3, ...
   Static security is a hierarchy of controls on actions:

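[Diagram: nested regions of actions. The outer region P+ is permitted; exceptions E1 and E2 inside it are prohibited; E11 is an exception nested inside E1.]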
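
The hierarchy of controls can be sketched in code. This is a minimal illustration and not part of the original talk: the Rule class, the action names, and the choice that the first matching exception overrides the default are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Rule:
    """A default decision over a range of actions, refined by exceptions."""
    applies: Callable[[str], bool]   # does this rule's range cover the action?
    permitted: bool                  # default decision inside the range
    exceptions: List["Rule"] = field(default_factory=list)

    def decide(self, action: str) -> bool:
        # The first exception whose range covers the action overrides the
        # default; it may have exceptions of its own, hence the recursion.
        for exception in self.exceptions:
            if exception.applies(action):
                return exception.decide(action)
        return self.permitted


# Hypothetical policy mirroring the slide: P+ permits by default,
# E1 and E2 prohibit, and E11 is a permission nested inside E1.
e11 = Rule(lambda a: a == "admin.read_status", permitted=True)
e1 = Rule(lambda a: a.startswith("admin."), permitted=False, exceptions=[e11])
e2 = Rule(lambda a: a == "delete_all", permitted=False)
p_plus = Rule(lambda a: True, permitted=True, exceptions=[e1, e2])
```

Under these assumed rules, p_plus.decide("admin.reboot") is prohibited by E1, while p_plus.decide("admin.read_status") is permitted again by E11.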
Is Our Taxonomy Complete?
   Prohibitions and permissions are properties of
    hierarchical systems, such as a judicial system.
       Most legal controls (“laws”) are prohibitive: they prohibit
        certain actions, with some exceptions (permissions).
   Contracts are non-hierarchical (agreed between
    peers), and consist mostly of requirements to act
    (with some exceptions):
       Obligations are promises to do something in the
        future.
       Exemptions are exceptions to an obligation.
   Obligations and exemptions are not well-modeled
    by action-security rules.
       Obligations arise occasionally in the law, e.g. a doctor’s
        “duty of care” or a trustee’s fiduciary responsibility.
Forbiddances and Allowances
   Obligations are forbidden inactions; Prohibitions are
    forbidden actions.
       When we take out a loan, we are obligated to repay it. We are
        forbidden from never repaying.
   Exemptions are allowed inactions; Permissions are
    allowed actions.
       In the English legal tradition, a court cannot compel a person to give
        evidence which would incriminate their spouse (husband or wife).
        This is an exemption from a general obligation to give evidence.
   We have added a new level to our hierarchy!

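[Diagram: the four rule types Pro (prohibition), Per (permission), Obl (obligation), and Exe (exemption), shown first as a flat list and then regrouped under a new top level: Forbid covers Pro and Obl; Allow covers Per and Exe.]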
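
The two-by-two structure can be written down as a small lookup table. This is only an illustrative sketch: the enum and function names are hypothetical, while the four rule names come from the slide.

```python
from enum import Enum


class Stance(Enum):
    FORBID = "forbid"
    ALLOW = "allow"


class Target(Enum):
    ACTION = "action"
    INACTION = "inaction"


# (forbiddance, allowance) x (action, inaction) gives four rule types.
RULE_TYPE = {
    (Stance.FORBID, Target.ACTION): "prohibition",
    (Stance.FORBID, Target.INACTION): "obligation",
    (Stance.ALLOW, Target.ACTION): "permission",
    (Stance.ALLOW, Target.INACTION): "exemption",
}


def rule_type(stance: Stance, target: Target) -> str:
    """Name the static security rule for a stance toward an action or inaction."""
    return RULE_TYPE[(stance, target)]
```

For example, the loan on this slide is a forbidden inaction, so rule_type(Stance.FORBID, Target.INACTION) names it an obligation.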
Reviewing our Questions
1. What is security?
     Three layers: static, dynamic, governance.
     A new taxonomic structure for static security:
      (forbiddances, allowances) x (actions, inactions).
     Four types of static security rules: prohibitions
      (including “guijus”), permissions, obligations, and
      exemptions.
2. What is software watermarking, and how is
   it implemented?
3. What is software obfuscation, and how is it
   implemented?

     Defense in Depth for Software
1.   Prevention:
   a) Deter attacks on forbiddances using obfuscation, encryption,
         watermarking, cryptographic hashes, or trustworthy computing.
   b) Deter attacks on allowances using replication or resilient
2.   Detection:
   a) Monitor subjects (user logs). Requires user ID: biometrics, ID
         tokens, or passwords.
   b) Monitor actions (execution logs, intrusion detectors). Requires
         code ID: cryptographic hashing, watermarking.
   c) Monitor objects (object logs). Requires object ID: hashing,
3.   Response:
   a) Ask for help: Set off an alarm (which may be silent –
         steganographic), then wait for an enforcement agent.
   b) Self-help: Self-destructive or self-repairing systems.
Note: “steganography” means “concealed writing” – an invisible watermark.
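
As a rough illustration of item 2(b), code can be identified by a cryptographic hash before its execution is logged. The sketch below is an assumption-laden example, not a description of any particular intrusion detector: the function names, the log format, and the table of known digests are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def code_id(code: bytes) -> str:
    """Identify a piece of code by its SHA-256 digest."""
    return hashlib.sha256(code).hexdigest()


def log_execution(log: List[Tuple[str, str]],
                  known: Dict[str, str],
                  code: bytes) -> str:
    """Append (digest, name) to the execution log; unknown code is flagged."""
    digest = code_id(code)
    name = known.get(digest, "UNKNOWN")
    log.append((digest, name))
    return name
```

Any change to the code, even a single byte, yields a different digest, so a modified program is logged as "UNKNOWN" rather than under its original name.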
Software Watermarking
Key taxonomic questions:
 Where is the watermark embedded?
     How is the watermark embedded?
   When is the watermark embedded?
   Why is the watermark embedded?
     What are its desired properties?

    Software Watermarking Systems
   An embedder E(P; W; k) → Pw embeds a message (the
    watermark) W into a program P using secret key k,
    yielding a watermarked program Pw.
   An extractor R(Pw; ... ) → W extracts W from Pw.
       In an invisible watermarking system, R (or a parameter) is a secret.
       In visible watermarking, R is well-publicised (ideally obvious).
   The attack set A and goal G model the security threat.
       For a robust watermark, the attacker’s goal is a false-negative
        extraction, usually by creating an attacked object a(Pw), with
        R(a(Pw); ... ) ≠ W such that Pw is valuable.
       For a fragile watermark, the attacker’s goal is a false-positive:
        R(a(Pw); ... ) = W such that Pw ≠ P is valuable.
       A protocol attack is a substitution of R’ for R, causing a false-
        negative or false-positive extraction.
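
A toy embedder and extractor make the notation concrete. This is a sketch under strong assumptions: a "program" is just a list of instruction strings, the watermark is stored as a masked data constant on a hypothetical "DATA " line, and the keystream construction is chosen for brevity. It is not a scheme from the talk, and it is deliberately easy to attack.

```python
import hashlib
from typing import List, Optional

MARK_PREFIX = "DATA "  # hypothetical marker for the embedded constant


def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from the key (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def embed(program: List[str], watermark: bytes, key: bytes) -> List[str]:
    """E(P; W; k) -> Pw: append W, masked with the keystream, as a data line."""
    masked = bytes(a ^ b for a, b in zip(watermark, keystream(key, len(watermark))))
    return program + [MARK_PREFIX + masked.hex()]


def extract(program: List[str], key: bytes) -> Optional[bytes]:
    """R(Pw; k) -> W: unmask the first data line, or return None if absent."""
    for line in program:
        if line.startswith(MARK_PREFIX):
            masked = bytes.fromhex(line[len(MARK_PREFIX):])
            return bytes(a ^ b for a, b in zip(masked, keystream(key, len(masked))))
    return None


def strip_attack(program: List[str]) -> List[str]:
    """a(Pw): delete the data line, giving a false-negative extraction."""
    return [line for line in program if not line.startswith(MARK_PREFIX)]
```

The attacked program still behaves like P, yet extract() returns None for it, which is the false-negative outcome a robust watermark is meant to prevent.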

Where Software Watermarks are Embedded
   Static code watermarks are stored in the
    section of the executable that contains
   Static data watermarks are stored in other
    sections of the executable
   Static watermarks are extracted without
    executing (or emulating) the code.
       A watermark extractor is a special-purpose static
       Extraction is inexpensive, but we don’t know of any
        robust static code watermarks. Attackers can
        easily modify the watermarked code to create an
        unwatermarked (false-negative) version.
Dynamic Watermarks
   Easter Eggs are revealed to any end-user
    who types a special input sequence.
   Other dynamic behaviour watermarks:
       Execution Trace Watermarks are carried in the
        instruction execution sequence of a program, when
        it is given a special input sequence (possibly null).
       Data Structure Watermarks are built by a
        program, when it is given a special input.
       Data Value Watermarks are produced by a
        program on a surreptitious channel, when it is given
        a special input.

Easter Eggs
                                 The watermark is
                                  visible – if you know
                                  where to look!
                                 Not very robust,
                                  after the secret is
                                 See

Dynamic Data Structure Watermarks
   The embedder inserts code in the program, so that it
    creates a recognisable data structure when given specific
    input (the key).
   Details are given in our POPL’99 paper, and in two
    published patent applications.
       Assigned to Auckland UniServices Ltd.
       I would very much like to find licensed uses for this technology!
   Implemented at
    (2000- )
   Experimental findings by Palsberg et al. (2001):
       JavaWiz adds less than 10 kilobytes of code on average.
       Embedding a watermark takes less than 20 seconds.
       Watermarking increases a program’s execution time by less than
       Watermark retrieval takes about 1 minute per megabyte of heap.
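
The idea of a dynamic data structure watermark can be shown with a toy program. This sketch is an assumption made for illustration and is not the encoding used in the POPL'99 paper: the key input, the bit string, the heap_roots list that stands in for the heap, and the self-loop encoding of bits are all hypothetical.

```python
from typing import List, Optional

KEY_INPUT = "open sesame"        # hypothetical secret input (the key)
WATERMARK_BITS = [1, 0, 1, 1]    # hypothetical watermark

heap_roots: List["Node"] = []    # stands in for the program's heap


class Node:
    def __init__(self) -> None:
        self.next: Optional["Node"] = None
        self.alt: Optional["Node"] = None


def build_mark(bits: List[int]) -> Optional[Node]:
    """Build a chain of nodes; a self-loop on `alt` encodes 1, None encodes 0."""
    nodes = [Node() for _ in bits]
    for i, node in enumerate(nodes):
        node.next = nodes[i + 1] if i + 1 < len(nodes) else None
        node.alt = node if bits[i] else None
    return nodes[0] if nodes else None


def program(x: str) -> int:
    """The watermarked program: ordinary behaviour plus embedded code."""
    result = len(x)                      # the program's normal output
    if x == KEY_INPUT:                   # only the key input builds the mark
        heap_roots.append(build_mark(WATERMARK_BITS))
    return result


def recognise(root: Optional[Node]) -> List[int]:
    """Extractor: walk the structure and read the bits back."""
    bits = []
    node = root
    while node is not None:
        bits.append(1 if node.alt is node else 0)
        node = node.next
    return bits
```

Running program() on ordinary input leaves heap_roots empty; running it on the key input leaves a structure from which recognise() recovers the bits.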

Thread-Based Watermarks
   A dynamic watermark is expressed in the
    thread-switching behaviour of a program,
    when given a specific input (the key).
       The thread-switches are controlled by non-nested
       NZ Patent 533208, US Patent App 2005/0262490
       Article in IH’04; Jas Nagra’s PhD thesis, 2006
   The embedder inserts tamper-proofing
    sequences which closely resemble the
    watermark sequences but which, if removed,
    will cause the program to behave incorrectly.
       This is a “self-help” response mechanism.

SW Watermarking
(Review of Taxonomic Questions)
   Where is the watermark embedded?
     How is the watermark embedded?
 When is the watermark embedded?
 Why is the watermark embedded?
     What are its desired properties?

Active Watermarks
   We can embed a watermark during a design
    step (“active watermarking”: Kahng et al.,
       IC designs may carry watermarks in place-route
       Register assignments during compilation can
        encode a software watermark; however, such
        watermarks are insecure because they can be
        easily removed by an adversary.
   Most software watermarks are “passive”, i.e.
    inserted at or near the end of the design

    Why Watermark Software?
 Invisible robust watermarks: useful for
  prohibition (of unlicensed use)
 Invisible fragile watermarks: useful for
  permission (of licensed uses).
 Visible robust watermarks: useful for
  assertion (of copyright or authorship).
 Visible fragile watermarks: useful for
  affirmation (of authenticity or validity).

A Fifth Function?
  Any watermark is useful for the
   transmission of information irrelevant to
   security (espionage, humour, …).
  Transmission Marks may involve
   security for other systems, in which
   case they can be categorised as
   Permissions, Prohibitions, etc.

Our Functional Taxonomy for
Watermarks [2002]

    Protective
        Robust
            Assertion (Visible)
            Prohibition (Invisible)
        Fragile
            Affirmation (Visible)
            Permission (Invisible)
    Non-protective
        Transmission

 But: there are no “assertions” and “affirmations”
           in our theory of static security!

    Future and Past Actions

   The Rules of static security define what a system
    should do in the future.
   Assertions (e.g. of authorship) are
    Assurances about a past
   Affirmations (e.g. of authenticity) are
    Assurances about a past
   Audit records are
   Identifications and Authentications are

[Diagram: Secure divides into Assure and Rule. Assure divides into Affirm and Assert. Rule divides into Forbid and Allow; Forbid divides into Prohibit and Obligate, and Allow divides into Permit and Exempt.]
Reviewing our Questions
1. What is Security?
2. What is software watermarking, and
   how is it implemented?
3. What is software obfuscation, and how
   is it implemented?
4. How does software obfuscation
   compare with encryption? Is “perfect
   obfuscation” possible?

What is Obfuscation?
   Obfuscation is a semantics-preserving
    transformation of computer code that
    renders it more secure against
    confidentiality attacks.

What Secrets are in Software?
   Algorithms (so competitors or attackers can’t
    build similar functionality without redesigning
    from scratch).
   Constants, such as an encryption key (typically
    hidden in code that computes obscure functions
    of this constant).
   Internal function points, such as a license-
    control predicate “if (not licensed) exit()”.
   External interfaces (to deny access by attackers
    and competitors to an intentional “service
    entrance” or an unintentional “backdoor”).
Security Boundary for Obfuscation
[Diagram: source code P (containing the algorithm, function points, secret keys, and secret interface) passes through a compiler to give executable X, which passes through an obfuscator to give executable X’.]
Executable X’
• Same behaviour as X
• Released to attackers who want to know
  secrets: source code P, algorithm,
  unobfuscated X, function points, …

Security Boundary for Encryption
[Diagram labels: source code P (secret keys, function points, secret interface); executable X; file E(X); buffer (RAM); attacker’s computer; GUI and I/O.]

Encryption v. Obfuscation
+ Strong encryption E() can be used.
    •   Security is assured if key-secrecy is maintained, and if the
        attacker is unable to look inside the “black-box” CPU.
– We need a “black box” for the key-store, decryption,
  and execution.
    •   If the black box isn’t big enough to store the entire program,
        then branches into an undecrypted block will stall the CPU.
    •   This runtime penalty is proportional to block size, but
        stronger encryption → larger blocks → larger runtime
        penalty.
    •   The RAM buffer and the decrypter must be large and fast, to
        minimize the number of undecrypted blocks, but “large and
        fast” → “expensive or insecure”.
•   “Black boxes” are obfuscations – we build them either
    from hardware or from software.
Partial Encryption
   Small portions of large executables can be
    protected with strong encryption, at reasonable
    cost.
        The remainder of the executable may be unprotected,
        or protected with cheap-but-insecure encryption.
       “Small portions” = some or all of the control transfers,
        plus a few of the variables (Maude & Maude, 1984;
        many similar articles and patents since 1984)
   The strongly-protected portions are executed in a
    secure hardware environment, e.g. a smart card.
       Extreme case: a dongle is a secure execution
        environment for just one predicate “if ( licensed(x) ) …”
   Performance penalties may be large, especially
    when more than one protected program is being
    executed.
How to Obfuscate Software?
   Lexical layer: obscure the names of variables,
    constants, opcodes, methods, classes,
    interfaces, etc. (Important for interpreted
    languages and named interfaces.)
   Data obfuscations:
       obscure the values of variables (e.g. by encoding
        several booleans in one int; encoding one int in
        several floats; encoding values in enumerable
       obscure data structures (e.g. transforming 2-d
        arrays into vectors, and vice versa).
   Control obfuscations (to be explained later)
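
Two of the data obfuscations named above can be sketched briefly: several booleans encoded in one int, and a 2-d array transformed into a vector. The function names are hypothetical, and a real obfuscator would rewrite the program's own accesses rather than call helpers with readable names.

```python
from typing import List


def pack_flags(flags: List[bool]) -> int:
    """Encode several booleans in one int, one bit per flag."""
    word = 0
    for i, flag in enumerate(flags):
        if flag:
            word |= 1 << i
    return word


def get_flag(word: int, i: int) -> bool:
    """Read flag i back out of the packed int."""
    return bool((word >> i) & 1)


def flatten(matrix: List[List[int]]) -> List[int]:
    """Transform a 2-d array into a vector, row by row."""
    return [value for row in matrix for value in row]


def get_cell(vector: List[int], n_cols: int, row: int, col: int) -> int:
    """Access element (row, col) of the original array through the vector."""
    return vector[row * n_cols + col]
```

After the transformation, the program no longer contains named boolean variables or a visibly two-dimensional structure, only an integer and an index computation.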

Attacks on Data Obfuscation
   An attacker may be able to discover the decoding
    function, by observing program behaviour
    immediately prior to output: print( decode( x ) ),
    where x is an obfuscated variable.
   An attacker may be able to discover the encoding
    function, by observing program behaviour
    immediately after input.
   A sufficiently clever human will eventually de-
    obfuscate any code. Our goal is to frustrate an
    attacker who wants to automate the de-obfuscation
   More complex obfuscations are more difficult to de-
    obfuscate, but they tend to degrade program
    efficiency and may enable pattern-matching attacks.

Cryptographic Obfuscations?
   Cloakware have patented an algebraic obfuscation on
    data, but it does not have a cryptographic secret key.
       W Zhu, in my group, fixed a bug in their division algorithm.
   An ideal data obfuscator would have a cryptographic
    key that selects one of 2^64 encoding functions.
   Fundamental vulnerability: The encoding and
    decoding functions must be included in the
    obfuscated software. Otherwise the obfuscated
    variables cannot be read and written.
       “White-box cryptography” is an obfuscated code that resists
        automated analysis, deterring adversaries who would extract
        a working implementation of the keyed functions or of the
        keys themselves.
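
A keyed data encoding, and the vulnerability described above, can be illustrated with an affine map on 64-bit words. This is an assumed construction for illustration only; it is not Cloakware's patented method. The key derivation and all function names are hypothetical.

```python
import hashlib

M = 1 << 64  # arithmetic is modulo 2^64


def derive(key: int):
    """Expand a 64-bit key into affine parameters (a, b), with a odd."""
    digest = hashlib.sha256(key.to_bytes(8, "big")).digest()
    a = int.from_bytes(digest[:8], "big") | 1    # odd, so invertible mod 2^64
    b = int.from_bytes(digest[8:16], "big")
    return a, b


def encode(x: int, key: int) -> int:
    a, b = derive(key)
    return (a * x + b) % M


def decode(y: int, key: int) -> int:
    a, b = derive(key)
    # pow(a, -1, M) computes the modular inverse (Python 3.8 or later).
    return ((y - b) * pow(a, -1, M)) % M


def add_encoded(y1: int, y2: int, key: int) -> int:
    """Add two encoded values without decoding: (a*x1+b) + (a*x2+b) - b."""
    _, b = derive(key)
    return (y1 + y2 - b) % M
```

The 64-bit key selects one of at most 2^64 encodings, but decode() and the parameters it derives must ship with the program, so an attacker who finds them can read every obfuscated variable.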

  Perfect Obfuscation?
 Function Ensemble F; Polynomial Time Bound p()
 Property π: F → {0,1}
[Diagram: a secret message and a program P for f ∈ F enter the obfuscator; the output program P’ communicates 1 bit π(f) of the secret message.]

No obfuscator can prevent this prisoner from
sending messages to an accomplice (Barak et al., 2001).
But... the de-obfuscator might have to spend non-linear effort on a
program that was obfuscated in linear time.
Practical Data Obfuscation
   Barak et al. have proved that “perfect obfuscation” is
    impossible, but “practical obfuscation” is still possible.
   We cannot build a “black box” (as required to
    implement an encryption) without using obfuscation
    somewhere – either in our hardware, or in software,
    or in both.
   In practical obfuscation, our goal is to find a cost-
    effective way of preventing our adversaries from
    learning our secret for some period of time.
       This places a constraint on system design – we must be able
        to re-establish security after we lose control of our secret.
       “Technical security” is insufficient as a response mechanism.
       Practical systems rely on legal, moral, and financial controls
        to mitigate damage and to restore security after a successful
        attack.

Control Obfuscations
 Inline procedures
 Outline procedures
 Obscure method inheritances (e.g.
  refactor classes)
 Opaque predicates:
     Dead code (which may trigger a tamper-
      response mechanism if it is executed!)
     Variant (duplicate) code
     Obscure control flow (“flattened” or
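
Control-flow flattening, one of the obfuscations listed above, can be illustrated on a small function. The sketch below is a hand-written example under the usual textbook reading of "flattened" (all basic blocks become cases of a single dispatch loop); it is not the output of any particular obfuscator, and the function names are hypothetical.

```python
def gcd(a: int, b: int) -> int:
    """Original program with structured control flow."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flat(a: int, b: int) -> int:
    """Same behaviour, with every basic block dispatched from one loop."""
    state = 0
    while True:
        if state == 0:            # loop test
            state = 1 if b != 0 else 2
        elif state == 1:          # loop body
            a, b = b, a % b
            state = 0
        elif state == 2:          # exit block
            return a
```

The flattened version hides the loop structure in the values taken by the state variable, so a static reader must track state to recover the original control-flow graph.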
    Opaque Predicates
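[Diagram: the statement sequence {A; B} and three ways of inserting an opaque predicate between A and B.
  “always true”: predicate P^T; the true branch leads to B.
  “indeterminate”: predicate P?; the true branch leads to B and the false branch to B’.
  “tamperproof”: predicate P^T; the true branch leads to B and the false branch to Bbug.]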
 Note: “always false” is not shown on this slide.
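
An "always true" predicate guarding a buggy alternative can be sketched as follows. The predicate used here, that x(x+1) is even for every integer x, is a standard number-theoretic example chosen for the illustration; the function names are hypothetical, and a practical opaque predicate would be much harder to analyse.

```python
def opaque_true(x: int) -> bool:
    """Always True: the product of two consecutive integers is even."""
    return (x * (x + 1)) % 2 == 0


def original(a: int, b: int) -> int:
    return a + b                      # block B


def obfuscated(a: int, b: int, noise: int) -> int:
    if opaque_true(noise):
        return a + b                  # block B, always taken
    else:
        return a - b                  # block Bbug, never executed
```

If an attacker removes or inverts the predicate without understanding it, execution can fall into the buggy block, which is the "tamperproof" variant on the slide.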

Opaque Predicates on Graphs
Dynamic analysis is required to deobfuscate – this is
very difficult to automate!

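[Diagram: two pointers, f and g, into a graph data structure, drawn in two states. Labels: g.Delete() and the opaque predicate “if (f == g) then …”.]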
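
A pointer-based opaque predicate can be sketched with two pointers that always stay in separate structures. This is a simplified illustration with hypothetical names: the rings, the step sizes, and the run() function are assumptions, and the sketch does not include the node insertions and deletions shown on the slide.

```python
from typing import List


class Cell:
    def __init__(self) -> None:
        self.next: "Cell" = self


def make_ring(n: int) -> Cell:
    """Build a circular linked list of n cells and return one of them."""
    cells = [Cell() for _ in range(n)]
    for i, cell in enumerate(cells):
        cell.next = cells[(i + 1) % n]
    return cells[0]


def advance(p: Cell, steps: int) -> Cell:
    for _ in range(steps):
        p = p.next
    return p


def run(inputs: List[int]) -> int:
    f = make_ring(3)          # f only ever points into the first ring
    g = make_ring(5)          # g only ever points into the second ring
    total = 0
    for x in inputs:
        f = advance(f, x % 7)     # data-dependent pointer movement
        g = advance(g, x % 11)
        if f is g:                # opaquely false: the rings never share cells
            total -= x            # never executed
        else:
            total += x
    return total
```

Deciding statically that f and g never alias requires reasoning about the heap, which is why the slide argues that de-obfuscation needs dynamic analysis.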
      History of Software Obfuscation
   “Hand-crafted” obfuscations: IOCCC (Int’l Obfuscated C Code Contest,
    1984 - ); a few earlier examples.
   InstallShield has used obfuscation since its first product (1987).
   Automated lexical obfuscations since 1996: Crema, HoseMocha, …
   Automated control obfuscations since 1996: Monden.
   Opaque predicates since 1997: Collberg, Thomborson, Low.
   Commercial vendors since 1997: Cloakware, Microsoft (in their
   Commercial users since 1997: Adobe DocBox, Skype.
   Obfuscation is still a small field, with just a handful of companies selling
    obfuscation products and services. There are only a few non-trivial
    published results, and a few patents.

   A new taxonomy of static security:
     (forbiddance, allowance) x (action, inaction) =
     (prohibition, permission, obligation, exemption).
   Progress toward a general theory for
    hierarchical and peering secure systems.
       (past, future) x (P-, P+) x (action, inaction) ??
       Existing theories of P- security for future actions
         • Bell-LaPadula, for confidentiality
         • Biba, for integrity
         • Clark-Wilson, for guijuity
   An overview of software security techniques,
    focussing on watermarking and obfuscation.
