Hiring in Databases

					Database Systems Research:
  Where it is (or should be)
 (aka looking for a “perfect” candidate)
  Laks V.S. Lakshmanan
  Dept. of Computer Science
  Univ. of British Columbia

                              December 6, 2001.
Disclaimers and Stage Setting

 not meant to be comprehensive
 necessarily biased
 database intended in a very broad sense
    –   e.g., relational databases, OO, object-relational, …
    –   legacy systems (hierarchical/network DBs)
    –   file system, spreadsheets, network directories
    –   text, media, maps
    –   time series, biological sequences
    –   data on the web, XML
 “data management research” is the more apt term
      DB Research Paradigms
   three major streams:
    – database theory (connections to math. logic,
      finite model theory, …)
    – principles (data modeling, design, query
      languages, query optimization, …)
    – systems (database tuning, benchmarking, …)
 all three have their place in general
 but there are limitations
DB Research: A data-driven perspective
 [Diagram; layout lost in extraction. Recoverable labels:]
    –   OO
    –   data mining, OLAP
    –   data on the web
    –   business data, scientific, biological data
    –   alphanumeric, rigid structure
    –   multi-media data: raster, video, audio
    –   semi-structured data & XML: text/doc domination; surprising e.g.: AcEDB
    –   temporal
    –   mobility
        DB Research: A process-driven perspective
 classical: e.g., transactions, triggers,
  integrity checking
 modern:
    –   richer transaction models
    –   active databases
    –   workflow
    –   data warehousing
    –   data integration
 Note: the last two have substantial data
    modeling, query answering, and
    algorithmic components.
     Some Database Theory
 what are queries?
 First (bad) answer: any computable
  IN → OUT function.
 Okay, efficiently computable ones: why is
  this still bad?
 What about the following “queries”?
    – Find the 10th tuple in relation emp.
    – Find the employees with an odd salary.
    – Find the employees whose name has an odd
     internal representation!
                More on queries
 What went wrong: representation dependence.
 Queries are computable functions that commute
  with the representation map Rep (i.e. they are generic):

      DB      ---- Q ---->    Ans
       |                       |
      Rep                     Rep
       v                       v
   Rep(DB)    ---- Q ---->  Rep(Ans)

  that is, Q(Rep(DB)) = Rep(Q(DB)).
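
A minimal sketch of representation dependence, under a deliberately narrow assumption: the only representation change considered here is the order in which tuples are stored. The relation `EMP`, its values, and the helper names `kth_tuple`, `high_earners`, and `order_independent` are all hypothetical, chosen for illustration; the full notion of genericity covers more than storage order.

```python
import itertools

# Hypothetical toy relation emp(name, salary); the values are made up.
EMP = [("ann", 50), ("bob", 61), ("cy", 72)]


def kth_tuple(stored, k=2):
    """'Find the k-th tuple': reads a position in the stored list."""
    return {stored[k - 1]}


def high_earners(stored, threshold=60):
    """A selection: looks only at tuple contents, never at positions."""
    return {t for t in stored if t[1] > threshold}


def order_independent(query, relation):
    """True if the query yields the same answer set for every storage order."""
    answers = {
        frozenset(query(list(perm)))
        for perm in itertools.permutations(relation)
    }
    return len(answers) == 1


print(order_independent(kth_tuple, EMP))     # False
print(order_independent(high_earners, EMP))  # True
```

The positional query changes its answer when the same set of tuples is stored in a different order, so it fails the check; the selection does not. Exhaustively trying all permutations is only feasible for tiny relations and is meant as an illustration, not a decision procedure.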
       Interesting Questions
 what are meaningful queries for a given
  data model/application class?
 how do you design declarative query
  languages and algebras?
 build novel indices for new data types?
 design optimal strategies for clustering data
 deal with size: data compression,
  approximation, summarization, etc.
 resource-conscious designs
 scalable algorithms for analysis queries
  (incl. data mining)
       Interesting Questions (contd.)
 liberating data mining from present-day
 answering queries using views and view
 semi-structured data management
 mixing paradigms: e.g., database style
  querying and information retrieval or media
 foundational questions in new domains:
  e.g., what does it mean to query sequences?
Profile of a perfect candidate
 some obvious desirables: is a hardcore
  system builder, architect of extensions
 has vision in traditional or new domains
  (e.g., web, biology, mobility, …)
    – vision just as important as technical skills
 raises difficult questions and provides
  surprisingly elegant and/or efficient
 complements the DB group’s strengths
 has unbounded energy and enthusiasm!!!!