Slide title is 30pt font

Document Sample
Slide title is 30pt font Powered By Docstoc
					    The Computational World

                    Stephen Pulman*




      University of Oxford Computing Laboratory

                   www.comlab.ox.ac.uk


                         September 2009
*with contributions from Bob Coecke, Karo Moilanen, and Nic Smith
           Brief Intro to Comlab
•   Established 1957. In 2009:
•   55 permanent faculty (35 in 2005)
•   c.110 DPhil (PhD) students
•   c.40 postdoc researchers
•   c.60 MSc students
•   c.200 undergraduates
•   c.£30m research funding (up from £6m in 2005)
                                                               Interdisciplinarity...
                              Systems
                               Biology
        Computational
                                          Verification
           Biology                                                   Numerical
   Cardiac       Virtual
                                                                      Analysis
  Modelling   Physiological
                Human
                               Climate prediction
                  Security Infrastructures Chip design
                                                Robotics       Foundations, Logic
                         Medicine Music                           and structures
             Humanities                   Animal behaviour
                       Biochemical
          Biology                     Medical Informatics
                   Web Technologies                    Physics                Game
         Fluid Dynamics             Sensor Networks              Quantum    Semantics
                       Transportation Aircraft Engineering      Information
                Building                      Commerce
                                                                 Processing
     Spatial              Power Management
   reasoning
                  Retail
                          Finance Nanotechnology
         Machine
                  Computational                                              Programming
         Learning                        InformationRequirements
                   Linguistics                                                   Tools
                                          Modelling

Information systems Knowledge                                         Programming
                     Representation Software   Engineering
                                                                        Languages
            Just three examples:


• Heart Modelling (Computer Science + Physiology)
  = better diagnosis and treatment.

• Quantum Information Processing (Computer
  Science + Physics) = potentially a whole new way
  of doing computing.

• Computational Linguistics (Computer Science +
  Linguistics) = new ways of processing and
  acquiring information.
 Modelling the Human Heart
In Comlab, David Gavaghan, Nic Smith, and Banca Rodriguez are
involved in several projects aimed at heart modelling, to:
- better understand how the heart functions/goes wrong
- better predict the outcome of particular types of treatment
- give clinicians better diagnostic tools and more information
 about individual patients than they currently have.
Modelling the Human Heart

 Heart disease is an enormous health and financial problem:


                                        Hospital
                                     inpatient care
                 Primary care             60%
                     17%

          Drugs 9%


          Outpatient
             6%




                      Hospital
                   outpatient care
                         8%
              Scales in the Heart
                           gene                        clinical
  gene       SNPs                       physiology
                         expression                   medicine
sequence




35,000     100,000 (?)    100 cell    4 tissue     10         1
genes       proteins       types       types     organs      body
       The spatial and temporal scales
• 1m           person
• 1 mm         electrical length scale of cardiac tissue
• 1 µm         cardiac sarcomere spacing
• 1 nm         pore diameter in a membrane protein
Range = 109

• 109 s (70 yrs)      human lifetime
• 106 s (10 days)     protein turnover
• 103 s (1 hour)      digest food
• 1s                  heart beat
• 1 ms                ion channel HH gating
• 1 µs                Brownian motion
Range = 1015
   Ingredients of a Whole Organ Heart Model
Tissue Structure                                 Anatomy




Tissue Properties

                                            Cellular Properties

                       Drug Discovery
                    Clinical Applications
Model Validation
Blood flow




             Coronary vessels




Stress
                      Computational Modelling Applications




Cardiac activation
                                       Ventricular               Coronary Blood flow
and contraction
                                       Blood flow




           Regional work through the
           cardiac cycle                         Stresses during contraction
                                         Volume


                240000


                230000


                220000


                210000




Volume (mm^3)
                200000


                190000


                180000


                170000


                160000


                150000
                         0   200   400         600      800   1000   1200
                                            Time (ms)
 Quantum mechanics: a bluffer's guide



• The state of a system (e.g. electron + nucleus) is described
  by elements of a complex vector space. This allows
  calculation of probabilities of measurements (e.g. position
  or direction of spin) of a particle. You can't simultaneously
  measure both accurately (Heisenberg)
• Measurement of a physical observable is represented by an
  operator on the vector space. When you measure
  something the state `collapses' to a definite value, but this
  blurs the values of other observables.
• `Superposition' of states: i.e. a system can be partly in one
  state (e.g. spin up), partly in another (spin down),
  simultaneously. A pair of particles can be in any
  superposition of pairs of positions.
 Quantum mechanics: a bluffer's guide




• It is possible to create pairs of particles that are
  `entangled': i.e. some of their their properties are
  correlated: e.g. direction of spin or charge.
• For example, both might have the same, or opposite
  directions of spin or charge. This means that (if you know
  which way the entanglement works, from the way the
  particles were constructed) measuring one particle tells you
  about the other.
• A `qubit' is the quantum analogue of a classical `bit': think
  of it as a superposition of proportions of states 0 and 1.
• Particles representing qubits can be entangled in the same
  way.
 Quantum mechanics: a bluffer's guide

• The very weird thing about entanglement is that it can
  happen over long distances: many kilometres in fact - the
  current record is from Geneva to the Canary Islands!
• In quantum cryptographic key distribution we can use this
  fact to transmit information in a completely secure way.
• We can also carry out an operation which allows us to
  transmit the state of a qubit using only two classical bits.
• This is known as `quantum teleportation' ... but not
  teleportation as we know it, Jim.
           Quantum teleportation
• Create an entangled qubit BC, where Alice (in Oxford) has
  B and Bob (in the other place) has C. Alice wants to
  transmit qubit A (in an unknown state) to Bob.
• She then jointly measures A and B and gets two classical
  bits as the result. A and B are now entangled by the
  measurement.
• B is no longer entangled with C - but C now potentially
  contains information about A.
• Alice sends the two classical bits to Bob, which encode one
  of 4 states (00,01,10,11).
• This tells Bob which operation to perform on C to
  reconstruct A.
           Quantum entanglement


• The possibility of teleportation shows that by measuring
  (i.e. observing) one can process continuous quantum data..
• This leads to a model of quantum computing where a
  classical computer controls a huge quantum state.
• Such a computer is expected to be exponentially faster
  than a classical computer on its own.
• Entanglement based communication protocols are secure
  because although an eavesdropper could intercept the
  classical communication, the quantum channel cannot be
  intercepted without destroying the quantum state.
• But how can be sure that such quantum computational
  processes have the right properties?
           Quantum entanglement
• In Comlab, Samson Abramsky and Bob Coecke have been
  developing a graphical calculus to reason about such
  quantum processes, and prove they do have the right
  properties:
             Quantum teleportation
• Here is their proof of correctness of the teleportation
  protocol:
            Quantum teleportation
• And here is the textbook proof:
             Quantum teleportation
• How do we know the graphical calculus is telling us the
  truth?
• Because it has a mathematical interpretation in Category
  Theory, one of the basic tools of Computer Science.

The Future:
• The Comlab Quantum Information Processing group
  collaborates with Materials Science and Physics to advance
  this research.
• Quantum Computing will be the next real computer
  revolution...but not for a while.
WHAT IS COMPUTATIONAL LINGUISTICS?

We aim to create computer programs that behave
  as if they had some understanding of English (or
  French, Urdu, Hixkaryana, etc...)
To enable:
Better searching for information: just ask a
  question like `How many colleges are there in
  Oxford University?'
Automatic translation, summarisation of
  documents.
You can interact naturally with the computer to
  solve problems or perform tasks that would
  otherwise be too difficult, take too long, or
  require specialist computational expertise.
             Why is this difficult?
Computer languages:
●   simple
●not ambiguous (a sentence only has one
meaning)
●   not dependent on context
    - a sentence always means the same thing.


Human languages:
●   not simple, and very ambiguous
● the same sentence can mean different things in
different contexts: e.g. `he's here now'
● require non-linguistic knowledge (`common
sense') to resolve ambiguities.
    How can we get the computer to
             understand?

• we find the syntactic or grammatical structure of
  sentences
• we use that to translate the sentence into a
  simpler language that the computer can
  understand: a `meaning representation'
• but we also have to solve the problems of
  ambiguity and `common sense'
• and the problem of context changing meaning
Syntactic analysis
 Sentences have a hierarchical structure: words
 combine into phrases, and phrases into
 sentences:

               Sentence

                          Verb Phrase


                               Noun Phrase
         Noun Phrase



        Name Name Verb Det Adjective      Noun
        Barack Obama won the presidential election
              Syntactic Analysis
We can describe the structure of sentences with rules like:

Sentence    → Noun Phrase + Verb Phrase
Noun Phrase → Name + Name
Noun Phrase → Determiner + Adj + Noun
Verb Phrase → Verb + Noun Phrase
etc.

  We call a collection of rules like this a `grammar' and we
  can use it to find the structure of new sentences
  automatically. We call this process `parsing'.
                  Ambiguity

Many sentences have more than one possible
 syntactic analysis, corresponding to different
 interpretations:

 He photographed the man with the camera

Either:
  He used a camera to photograph a man
Or:
  He photographed a man who had a camera.
                    Ambiguity

But often some of the syntactic possibilities do not
  make sense:
  He photographed the man with a bicycle

- can't (easily) mean:

He used a bicycle to photograph a man

- because we know that you can't take photographs
  with a bicycle. But it is very hard to get a
  computer to `know' this.
             Ambiguous words

• Many words are ambiguous: for example, the English
  word `bank' has several meanings: financial, and
  river-related
• Humans do not find this a problem:

    He went to the bank to get some money
   The fish jumped out of the water on to the bank

but computers do not know that fish do not jump on to
  a financial institution (yet), or that you do not
  (usually) get money from a riverbank
             The state of the art

How well do we do?

• Parsing - 70-80% on newspaper text, including
  disambiguation.

• Word sense ambiguity: around 60%

• Context - `he's here now' etc. Really hard. Lucky
  to get it right 50% of the time!
A practical application: detecting positive
  (POS), neutral (NTR), or negative (NEG)
             attitudes in text.
Used for:
• brand management (what do people think
  of your company or reputation?)
• consumer research (what do people like
  and dislike about your product?)
• monitoring government policy (will you
  get re-elected!)
• we call this Sentiment Analysis
      Automated Sentiment Analysis

Typically done by spotting key words:
• This is a great film!
• This is the worst film I've seen this year.

Not always so simple:
• publicity for the Blackberry Storm
• adverse publicity for the Blackberry Storm
• counteract adverse publicity for the Blackberry Storm
• fail to counteract adverse publicity for the Blackberry
  Storm
Using syntactic analysis we can deal with these
more sophisticated sentiment phenomena, using
a set of `sentiment logic' rules:



•   adverse + NTR = NEG (e.g. adverse circumstances)
•   counteract + POS = NEG (e.g. counteract progress)
•   counteract + NEG = POS (e.g. counteract disease)
•   fail + NEG = POS (e.g. fail to counteract progress)
•   fail + POS = NEG (e.g. fail to counteract disease)
  Syntax + sentiment logic:
             VP:NEG

                    VP:POS


                          NP:NEG


                                NOM:NTR



                                       MOD:NTR




fail to counteract adverse publicity for the Blackberry
  We can classify individual entities rather
  than whole documents:
Sentiment analysis from Twitter:
         David Cameron:            Nick Clegg:




        Gordon Brown:              David Beckham:
We can track sentiment over time, or
   across a series of documents:
It's a computational world, but not this one:
         Our computational world

• Computers are not going to go away - if
  anything, they will become ever more present.
  They are in your watch, phone, car, credit card,
  magazine page...
• They are deeply involved in every aspect of
  manufacturing, commerce, physical science and
  technology, and increasingly in the medical and
  life sciences.
• And are beginning to be indispensable in the
  humanities and the arts.
• I can't imagine how we managed for so long
  without them...

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:9
posted:8/13/2010
language:English
pages:39