          Semantic Web II:
Civic Hacking, the Semantic Web, and
         Data Visualizations
      (part 2 of “WTF is the Semantic Web?”)


      Josh Tauberer, GovTrack.us
     Greg Elin, Sunlight Foundation

          Transparency Camp, March 1, 2009
Introduction

   Who I am
   What I do with the Semantic Web
   Data Isolation versus One Big Web
   Metadata versus Beyond Metadata
   The SW, RDF, LOD, SPARQL
   From SW to Visualization
Data Isolation

   Campaign Contributions (fec.gov)
   Voting Records (house.gov)
   Voting Records (senate.gov)
   Earmarks (usaspending.gov)
   Legislation (thomas.loc.gov)
   Links between them: ?
Not What I Mean
Data Isolation

   Campaign Contributions (fec.gov)
   Voting Records (house.gov)
   Voting Records (senate.gov)
   Earmarks (usaspending.gov)
   Legislation (thomas.loc.gov)
   In between: MAPLight, CRP, GovTrack.us, ?
   How do we make this cheaper?
Data Integration

   Machines sort, search, and transform large
    data sets so we can understand them better.
   New uses can go beyond the resources,
    mandate, or vision of an agency/org.
   Why?
       Civic Education
       Journalism
       Basic Research (health, economy, ...)
Metadata

   Metadata (as we often use the term) is the first
    level in the web of data.
       Metadata is tabular.
       Metadata is isolated.

Congress  Type    Number  Sponsor          Title
110       H.R.    650     Thomas Reynolds  To provide for the Secretary of Vete
110       S.      12      Mitch McConnell  HOME Act
111       H.R.    32      Mike McIntyre    Veterans Outreach Improvement Ac
111       H.R.    1       David Obey       American Recovery and Reinvestme
111       H.Res.  11      Roscoe Bartlett  Expressing the sense of the House
Beyond Metadata

   H.R. 1 (111th) ---- Sponsor ---- David Obey

   H.R. 1 (111th): Number, Date, Title, ...
   David Obey: Name, Party, Age, ...
Beyond Metadata

   H.R. 1 (111th) ---- Sponsor ---- David Obey
   David Obey ---- Represents ---- Wisc.’s 7th

   H.R. 1 (111th): Number, Date, Title, ...
   David Obey: Name, Party, Age, ...
   Wisc.’s 7th: Latitude, Longitude, Population
Beyond Metadata

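   H.R. 1 (111th) ---- Action ---- House Vote
   H.R. 1 (111th) ---- Sponsor ---- David Obey
   David Obey ---- Represents ---- Wisc.’s 7th

   House Vote: Date, Result
   H.R. 1 (111th): Number, Date, Title, ...
   David Obey: Name, Party, Age, ...
   Wisc.’s 7th: Latitude, Longitude, Population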
Beyond Metadata

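   H.R. 1 (111th) ---- Action ---- House Vote
   H.R. 1 (111th) ---- Sponsor ---- David Obey
   House Vote ---- Choice ---- Yea
   Yea ---- Voted ---- David Obey
   David Obey ---- Represents ---- Wisc.’s 7th

   House Vote: Date, Result
   H.R. 1 (111th): Number, Date, Title, ...
   David Obey: Name, Party, Age, ...
   Wisc.’s 7th: Latitude, Longitude, Population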
Beyond Metadata

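   H.R. 1 (111th) ---- Action ---- House Vote
   H.R. 1 (111th) ---- Sponsor ---- David Obey
   House Vote ---- Choice ---- Yea
   House Vote ---- Choice ---- Nay
   Yea ---- Voted ---- David Obey
   Nay ---- Voted ---- Thomas Petri
   David Obey ---- Represents ---- Wisc.’s 7th
   Thomas Petri ---- Represents ---- Wisc.’s 6th
   Wisc.’s 7th ---- Part Of ---- Wisc.
   Wisc.’s 6th ---- Part Of ---- Wisc.

   House Vote: Date, Result
   H.R. 1 (111th): Number, Date, Title, ...
   David Obey: Name, Party, Age, ...
   Wisc.’s 7th: Latitude, Longitude, Population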
The Semantic Web

   The Semantic Web is a vision to build one big,
    interconnected database.
   But decentralized, distributed, and flexible.
   Implemented in a technology stack:
       The Web (HTTP etc.)
       RDF (Resource Description Framework)
       SPARQL
       ...
The Semantic Web

   Originally intended to describe web resources:
       Web page metadata
       Information resources: schedules, contacts, etc.
   Since 2004ish, it's become more interesting.
       Use the same framework to describe the real world!
Why am I interested in this?

Machine processing of knowledge combined with
machine processing of language is going to
radically and fundamentally transform the way we
learn, communicate, and live.


   Today's prototypes: Zemanta, OpenCalais
The Semantic Web: RDF

   What are the minimal standards to achieve the
    vision?
   The database isn’t tabular but rather a web.
       “graph”
   Nodes are labeled with URIs (and edges too).
       URIs are opaque: they're just arbitrary labels with
        no expectation that the text of the URI is related to
        the node it names (but wait...)
       Decentralized normalization
The Semantic Web: RDF: URIs

   Nodes: H.R. 1 (111th), House Vote, David Obey, Yea, Nay,
    Wisc.’s 7th, Wisc.’s 6th, Thomas Petri, Wisc.
The Semantic Web: RDF: URIs

   H.R. 1 (111th): http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1
   House Vote: #vote
   Yea: #yea
   Nay: #nay
   David Obey: http://www.rdfabout.com/rdf/usgov/congress/people... (cut off on the slide)
   Thomas Petri: ...rdfabout.com/rdf/usgov/congress/people/P000265 (cut off on the slide)
   Wisc.’s 7th: http://www.rdfabout.com/rdf/usgov/geo/us/wi/cd/111/7
   Wisc.’s 6th: http://www.rdfabout.com/rdf/usgov/geo/us/wi/cd/111/6
   Wisc.: http://www.rdfabout.com/rdf/usgov/geo/us/wi
The Semantic Web: RDF

   URIs to name things in the real world
       Need not be URLs, but current best practice is to
        use URLs. (Difference: a URL can be plugged into a web browser.)
   Graph Structure (RDF) to represent information
       Each edge represents a “triple” of information:
        subject ---- predicate (edge label) ---- object
       RDF is not inherently XML.
   rdfabout.com
   Limitation: Not well adopted, tools ~immature.
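
The triple model above can be sketched in a few lines of Python. This is only an illustration, not how rdfabout.com is implemented: the node URIs are the ones shown on the slides, but the predicate URIs under `http://example.org/vocab#` are hypothetical stand-ins, since the real vocabulary namespaces are not given here. The point is that two independently published sets of triples merge by plain set union, because shared URIs make the nodes line up.

```python
# Node URIs as shown on the slides.
BASE = "http://www.rdfabout.com/rdf/usgov/"
BILL = BASE + "congress/111/bills/h1"
OBEY = BASE + "congress/people/O000007"
WI7 = BASE + "geo/us/wi/cd/111/7"
WI = BASE + "geo/us/wi"

# Hypothetical predicate URIs (assumption: the real vocabulary is not shown).
EX = "http://example.org/vocab#"

# Two separately published data sets, each a set of
# (subject, predicate, object) triples.
legislation = {(BILL, EX + "sponsor", OBEY)}
geography = {
    (OBEY, EX + "represents", WI7),
    (WI7, EX + "partOf", WI),
}

# Merging graphs is a set union; no schema alignment step is needed.
graph = legislation | geography


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is present."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def follow(triples, start, predicates):
    """Walk a chain of predicates from start and return the nodes reached."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier
                    for o in objects(triples, node, predicate)]
    return frontier


# Which state does the sponsor of H.R. 1 come from?
path = [EX + "sponsor", EX + "represents", EX + "partOf"]
sponsor_state = follow(graph, BILL, path)
print(sponsor_state)
```

Neither data set can answer the question alone; the walk only succeeds on the merged graph.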
My Data Cloud
Linked Data Cloud

   linkeddata.org
H.R. 1 --- HTML on GovTrack
 http://www.govtrack.us/congress/bill.xpd?bill=h111-1




          <link rel="alternate" type="application/rdf+xml"
               href="http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1"/>
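
A client can discover the RDF version of a page by looking for that `<link rel="alternate">` element. The sketch below uses only Python's standard `html.parser`; the surrounding page markup in `PAGE` is a made-up stand-in for the GovTrack page, and only the `<link>` element itself is taken from the slide.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags of one MIME type."""

    def __init__(self, mime_type):
        super().__init__()
        self.mime_type = mime_type
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if ("alternate" in rels
                and attr.get("type") == self.mime_type
                and attr.get("href")):
            self.hrefs.append(attr["href"])


def find_rdf_links(html_text):
    """Return the URLs of all RDF/XML alternates advertised by the page."""
    finder = AlternateLinkFinder("application/rdf+xml")
    finder.feed(html_text)
    return finder.hrefs


# Hypothetical page body; only the <link> element comes from the slide.
PAGE = """<html><head>
<title>H.R. 1</title>
<link rel="alternate" type="application/rdf+xml"
      href="http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1"/>
</head><body></body></html>"""

print(find_rdf_links(PAGE))
```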
H.R. 1 – Linked Data URI
     http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1

<rdf:RDF ...>
    <usbill:HouseBill rdf:about="http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1">

        <usbill:congress>111</usbill:congress>
        <usbill:type>h</usbill:type>
        <usbill:number>1</usbill:number>

        <dcterms:created>2009-01-26</dcterms:created>
        <dc:title>H.R. 1: American Recovery and Reinvestment Act of 2009</dc:title>

        <usbill:introduced rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2009-01-26</usbill:introduced>

         ..................................

    </usbill:HouseBill>
</rdf:RDF>
H.R. 1 – Linked Data URI
 http://www.rdfabout.com/rdf/usgov/congress/111/bills/h1

<usbill:sponsor rdf:resource="http://www.rdfabout.com/rdf/....../people/O000007" />
<usbill:enacts rdf:resource="http://www.rdfabout.com/rdf/.../111/statutes/public/5" />

<usbill:hadAction>
    <usbill:VoteAction>
          <usbill:voteResult>pass</usbill:voteResult>
          <usbill:vote rdf:resource="http://www.rdfabout..../111/house/votes/2009-46" />
          <ns:description>On passage Passed by the Yeas and Nays: 244 - 188 (Roll no. ...</ns:description>
    </usbill:VoteAction>
</usbill:hadAction>

<usbill:hadAction>
    <usbill:VoteAction>
          <usbill:voteResult>pass</usbill:voteResult>
          <usbill:vote rdf:resource="http://www.rdfabout.../111/senate/votes/2009-61" />
          <ns:description>Passed Senate with an amendment by Yea-Nay Vote. 61 - 37. ...</ns:description>
    </usbill:VoteAction>
</usbill:hadAction>
Linked Data Nuts & Bolts

   Give dereferenceable URIs to entities.
   Accessing the URI returns information about the entity, either directly or
    via an HTTP redirect to a SPARQL DESCRIBE query that
    generates the info page automatically:
    DESCRIBE <http://www.rdfabout.com/rdf/usgov/congress/people/O000007>
   http://rdfabout.com/demo/census is a whole
    case study on putting Census data up this way
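
The redirect target can be thought of as a URL carrying the DESCRIBE query. The sketch below builds such a URL with the standard library. It assumes the endpoint accepts the query in a `query` parameter, as the W3C SPARQL Protocol specifies; whether the GovTrack endpoint named later in these slides behaves exactly this way is not confirmed here, and `describe_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint URL as given later in the slides.
ENDPOINT = "http://www.govtrack.us/sparql.xpd"


def describe_url(entity_uri, endpoint=ENDPOINT):
    """Build a URL that asks the endpoint to DESCRIBE one entity.

    Assumes the endpoint takes the query text in a 'query' parameter.
    """
    # Refuse characters that would break out of the <...> IRI syntax.
    if any(ch in entity_uri for ch in '<>" \n\t'):
        raise ValueError("not a usable URI: %r" % entity_uri)
    query = "DESCRIBE <%s>" % entity_uri
    return endpoint + "?" + urlencode({"query": query})


url = describe_url("http://www.rdfabout.com/rdf/usgov/congress/people/O000007")
print(url)
```

A server following the linked-data pattern could answer a request for the entity URI with a redirect to a URL of this shape.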
SPARQL

   The W3C's query language for RDF.
   It's the SQL for the Semantic Web.
       SELECT ... FROM ... WHERE ....
   Differences:
       Everything else about the syntax.
       Read-only (so far).
       Intended to be publicly accessible.
Stimulus Bill Vote & State Income

   Did a state’s median income predict the votes
    of Senators on H.R. 1424, the October 2008
    bill (the Emergency Economic Stabilization Act,
    referred to here as the stimulus bill)?
Stimulus Bill Vote & State Income

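   Stimulus Bill ---- Action ---- Senate Vote
   Senate Vote ---- hasOption ---- Yea
   Senate Vote ---- hasOption ---- Nay
   Yea ---- label ---- “Yea”
   Yea ---- votedBy ---- Sen. Graham
   Nay ---- votedBy ---- ...
   Sen. Graham ---- Represents ---- South Carolina
   South Carolina ---- Median Income ---- “$25,824”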
Stimulus Bill Vote & State Income

   Stimulus Bill ---- Action ---- Senate Vote
   Senate Vote ---- hasOption ---- ?a
   ?a ---- label ---- ?b
   ?a ---- votedBy ---- ?c
   ?c ---- Represents ---- ?d
   ?d ---- Median Income ---- ?e

 SPARQL Query: (almost)
 SELECT ?a ?e WHERE {
   VOTE_URI hasOption          ?a   .
   ?a       label              ?b   .
   ?a       votedBy            ?c   .
   ?c       represents         ?d   .
   ?d       medianIncome       ?e   .
 }
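
To make the variable-binding idea concrete, here is a toy pattern matcher in Python. It is a sketch of how a basic graph pattern is evaluated, not how any real SPARQL engine works: the short node names stand in for URIs, and the triples are the ones drawn on the previous slide (the Nay option has no voter listed there, so it yields no result).

```python
def is_var(term):
    """Variables are written ?name, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend bindings so pattern equals triple, or return None on conflict."""
    extended = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def query(triples, patterns):
    """Return every set of bindings that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Triples from the diagram; plain names stand in for URIs.
TRIPLES = [
    ("SenateVote", "hasOption", "YeaOption"),
    ("SenateVote", "hasOption", "NayOption"),
    ("YeaOption", "label", "Yea"),
    ("YeaOption", "votedBy", "SenGraham"),
    ("SenGraham", "represents", "SouthCarolina"),
    ("SouthCarolina", "medianIncome", "$25,824"),
]

# The five patterns of the (almost) SPARQL query, with VOTE_URI filled in.
PATTERNS = [
    ("SenateVote", "hasOption", "?a"),
    ("?a", "label", "?b"),
    ("?a", "votedBy", "?c"),
    ("?c", "represents", "?d"),
    ("?d", "medianIncome", "?e"),
]

# SELECT ?a ?e
rows = [(s["?a"], s["?e"]) for s in query(TRIPLES, PATTERNS)]
print(rows)
```

Each pattern narrows the candidate bindings left by the previous one, which is why the query reads like a path through the graph.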
Stimulus Bill Vote & State Income

   Submit SPARQL query at:
    http://www.govtrack.us/sparql.xpd
   Get results back in
       SPARQL XML Format
       HTML Table
       CSV
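
The SPARQL XML format is simple enough to read with the standard library. The sketch below parses a small hand-written sample in the W3C SPARQL Query Results XML Format; the sample is not actual endpoint output, though its values match one row of the results shown in these slides, and `parse_sparql_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def parse_sparql_xml(text):
    """Turn a SPARQL XML result document into a list of {variable: value}."""
    root = ET.fromstring(text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal>, or <bnode> child
            row[binding.get("name")] = value.text or ""
        rows.append(row)
    return rows


# Hand-written sample document (assumption: not real endpoint output).
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="option"/>
    <variable name="regionname"/>
    <variable name="medianincome"/>
  </head>
  <results>
    <result>
      <binding name="option"><literal>Yea</literal></binding>
      <binding name="regionname"><literal>New York</literal></binding>
      <binding name="medianincome"><literal>28967</literal></binding>
    </result>
  </results>
</sparql>"""

print(parse_sparql_xml(SAMPLE))
```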
SPARQL Query
SELECT ?option ?population ?medianincome ?name ?regionname ?party WHERE {
  <http://www.rdfabout.com/rdf/usgov/congress/110/senate/votes/2008-213>
    dc:date ?date ;
    vote:hasOption [
      vote:votedBy ?person ;
      rdfs:label ?option ;
    ] .

    ?person foaf:name ?name .
    ?person pol:hasRole [
        time:from [ time:at ?startdate ] ;
        time:to [ time:at ?enddate ] ;
        pol:party ?party ;
        pol:forOffice [ pol:represents ?region ] ] .
    ?region dc:title ?regionname .
    ?region census:population ?population .

    ?region census:details [
      census2:population15YearsAndOverWithIncomeIn1999 [
        census2:medianIncomeIn1999 ?medianincome
      ]
    ] .

    FILTER(str(?startdate) <= str(?date)) .
    FILTER(str(?enddate) >= str(?date)) .
}
SPARQL Query Results

 option  medianincome  regionname     population  name              party
 Yea     28967         New York       18976457    Charles Schumer   Democrat
 Yea     28090         Rhode Island   1048319     John Reed         Democrat
 Yea     29091         Ohio           11353140    Sherrod Brown     Democrat
 Yea     26142         Vermont        608827      Patrick Leahy     Democrat
 Yea     32406         Massachusetts  6349097     John Kerry        Democrat
 Yea     29723         Wisconsin      5363675     Herbert Kohl      Democrat
 Yea     27445         Utah           2233169     Robert Bennett    Republican
 Yea     26263         Missouri       5595211     Christopher Bond  Republican
 Yea     27453         Hawaii         1211537     Daniel Akaka      Democrat
 Yea     35845         Connecticut    3405565     Joseph Lieberman  Independent
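
Before handing results to a visualization tool, it often helps to aggregate them. The sketch below does this for the ten rows shown above, assuming they were downloaded as CSV with a header row matching the column names; the exact CSV layout the endpoint produces is an assumption. Because this excerpt contains only Yea votes, it illustrates the mechanics and does not answer the income question.

```python
import csv
import io
from collections import Counter, defaultdict
from statistics import mean

# The ten result rows shown on the slide, written out as CSV (layout assumed).
CSV_TEXT = """option,medianincome,regionname,population,name,party
Yea,28967,New York,18976457,Charles Schumer,Democrat
Yea,28090,Rhode Island,1048319,John Reed,Democrat
Yea,29091,Ohio,11353140,Sherrod Brown,Democrat
Yea,26142,Vermont,608827,Patrick Leahy,Democrat
Yea,32406,Massachusetts,6349097,John Kerry,Democrat
Yea,29723,Wisconsin,5363675,Herbert Kohl,Democrat
Yea,27445,Utah,2233169,Robert Bennett,Republican
Yea,26263,Missouri,5595211,Christopher Bond,Republican
Yea,27453,Hawaii,1211537,Daniel Akaka,Democrat
Yea,35845,Connecticut,3405565,Joseph Lieberman,Independent
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


def mean_income_by_option(rows):
    """Average the state median income separately for each vote option."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["option"]].append(int(row["medianincome"]))
    return {option: mean(values) for option, values in groups.items()}


def votes_by_party(rows):
    """Count how many senators of each party chose each option."""
    return Counter((row["option"], row["party"]) for row in rows)


rows = load_rows(CSV_TEXT)
income = mean_income_by_option(rows)
parties = votes_by_party(rows)
print(income)
print(parties)
```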
Visualizations

   Here's my dream....
Visualizations

   ggobi.org
   IBM’s Many Eyes
   Swivel.com
   My map server
ManyEyes

http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/senate-vote-on-hr1424-by-party-vote-
ManyEyes

http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/senate-vote-on-hr1424-map
Campaign Contribution Map

PREFIX fec: <http://www.rdfabout.com/rdf/schema/usfec/>

SELECT ?zipcode ?value WHERE {
 ?campaign fec:candidate <http://www.rdf...ongress/people/I000057> .
 ?campaign fec:cycle 2008 .
 ?zipcode fec:zipAggregatedContribution [
   fec:toCampaign ?campaign;
   fec:amount ?value
   ] .
 ?zipcode fec:zcta ?uri .
}


http://www.govtrack.us/perl/wms/upload-styles-sparql.html