Using Python for CGI programming

       Guido van Rossum
             CNRI
   (Corporation for National Research Initiatives, Reston, Virginia, USA)


        guido@python.org
         www.python.org
                                                                            1
Basic Python tutorial




                        2
                           Lists
      • a = [99, "bottles of beer", ["on", "the", "wall"]]
• Flexible arrays, not Lisp-like linked lists
• Same operators as for strings
      • a+b, a*3, a[0], a[-1], a[1:], len(a)
• Item and slice assignment
      • a[0] = 98
      • a[1:2] = ["bottles", "of", "beer"]
          -> [98, "bottles", "of", "beer", ["on", "the", "wall"]]
      • del a[-1]       # -> [98, "bottles", "of", "beer"]




                                                                    3
      More list operations
>>> a = range(5)       # [0,1,2,3,4]
>>> a.append(5)        # [0,1,2,3,4,5]
>>> a.pop()            # [0,1,2,3,4]
5
>>> a.insert(0, 5.5)   # [5.5,0,1,2,3,4]
>>> a.pop(0)           # [0,1,2,3,4]
5.5
>>> a.reverse()        # [4,3,2,1,0]
>>> a.sort()           # [0,1,2,3,4]



                                           4
              Dictionaries
• Hash tables, "associative arrays"
      • d = {"duck": "eend", "water": "water"}
• Lookup:
      • d["duck"] -> "eend"
      • d["back"] # raises KeyError exception
• Delete, insert, overwrite:
      • del d["water"] # {"duck": "eend"}
      • d["back"] = "rug" # {"duck": "eend", "back": "rug"}
      • d["duck"] = "duik" # {"duck": "duik", "back": "rug"}




                                                               5
    More dictionary ops
• Keys, values, items:
     • d.keys() -> ["duck", "back"]
     • d.values() -> ["duik", "rug"]
     • d.items() -> [("duck","duik"), ("back","rug")]
• Presence check:
     • d.has_key("duck") -> 1; d.has_key("spam") -> 0
• Values of any type; keys almost any
     • {"name":"Guido", "age":43, ("hello","world"):1,
        42:"yes", "flag": ["red","white","blue"]}




                                                         6
      Dictionary details
• Keys must be immutable:
  – numbers, strings, tuples of immutables
     • these cannot be changed after creation
  – reason is hashing (fast lookup technique)
  – not lists or other dictionaries
     • these types of objects can be changed "in place"
  – no restrictions on values
• Keys will be listed in arbitrary order
  – again, because of hashing
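The key rule above can be tried directly. This is a minimal sketch in current Python 3 syntax (the slides themselves use Python 2); the sample keys are illustrative only.

```python
# Hashable (immutable) objects work as dictionary keys.
d = {}
d[42] = "number key"
d["spam"] = "string key"
d[("hello", "world")] = "tuple key"

# A list can be changed in place, so it is unhashable and rejected as a key.
try:
    d[["hello", "world"]] = "list key"
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)

print(len(d))            # 3
print(list_key_error)    # exact wording depends on the Python version
```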



                                                          7
                    Tuples
• key = (lastname, firstname)
• point = x, y, z    # parentheses optional
• x, y, z = point
• lastname = key[0]
• singleton = (1,)        # trailing comma!
• empty = ()              # parentheses!
• tuples vs. lists; tuples immutable



                                              8
                  Variables
• No need to declare
• Need to assign (initialize)
      • use of uninitialized variable raises exception
• Not typed
      if friendly: greeting = "hello world"
      else: greeting = 12**2
      print greeting
• Everything is a variable:
      • functions, modules, classes



                                                         9
   Reference semantics
• Assignment manipulates references
     • x = y does not make a copy of y
     • x = y makes x reference the object y references
• Very useful; but beware!
• Example:
     >>> a = [1, 2, 3]; b = a
     >>> a.append(4); print b
     [1, 2, 3, 4]
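To contrast sharing with copying, here is a small sketch in Python 3 syntax (the slides use Python 2 print statements). The name c and the use of list() to copy are additions for illustration, not part of the slide.

```python
a = [1, 2, 3]
b = a            # b references the same list object as a
c = list(a)      # c is a new list holding the same items (a shallow copy)

a.append(4)

print(b)                 # [1, 2, 3, 4]  (shared with a)
print(c)                 # [1, 2, 3]     (independent copy)
print(a is b, a is c)    # True False
```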




                                                         10
Changing a shared list
 a = [1, 2, 3]     a ---> [1, 2, 3]

 b = a             a ---> [1, 2, 3] <--- b

 a.append(4)       a ---> [1, 2, 3, 4] <--- b

                                     11
Changing an integer
a = 1          a ---> 1

b = a          a ---> 1 <--- b

a = a+1        a ---> 2       new int object created by add operator (1+1)
               b ---> 1       old reference deleted by assignment (a=...)

                                  12
        Control structures
if condition:
   statements
[elif condition:
   statements] ...
else:
   statements

while condition:
   statements

for var in sequence:
   statements

break
continue


                                            13
    Grouping indentation
In Python:

for i in range(20):
  if i%3 == 0:
     print i
     if i%5 == 0:
        print "Bingo!"
  print "---"

In C:

for (i = 0; i < 20; i++)
{
    if (i%3 == 0) {
        printf("%d\n", i);
        if (i%5 == 0) {
          printf("Bingo!\n"); }
      }
      printf("---\n");
}

Output:

0
Bingo!
---
---
---
3
---
---
---
6
---
---
---
9
---
---
---
12
---
---
---
15
Bingo!
---
---
---
18
---
---


                                                               14
  Functions, procedures
def name(arg1, arg2, ...):
  "documentation"      # optional
  statements


return                 # from procedure
return expression      # from function




                                          15
         Example function
def gcd(a, b):
    "greatest common divisor"
    while a != 0:
      a, b = b%a, a             # parallel assignment
    return b


>>> gcd.__doc__
'greatest common divisor'
>>> gcd(12, 20)
4


                                                        16
                          Classes
class Stack:
  "A well-known data structure..."
  def __init__(self):              # constructor
     self.items = []
  def push(self, x):
     self.items.append(x)          # the sky is the limit
  def pop(self):
     x = self.items[-1]            # what happens if it's empty?
     del self.items[-1]
     return x
  def empty(self):
     return len(self.items) == 0   # Boolean result


                                                                   17
                 Using classes
• To create an instance, simply call the class object:
       x = Stack()       # no 'new' operator!


• To use methods of the instance, call using dot notation:
       x.empty()         # -> 1
       x.push(1)                          # [1]
       x.empty()         # -> 0
       x.push("hello")                    # [1, "hello"]
       x.pop()           # -> "hello"     # [1]


• To inspect instance variables, use dot notation:
       x.items           # -> [1]
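The comment in the Stack class asks what happens when pop() is called on an empty stack. The sketch below restates the class in Python 3 syntax and tries it. The variable names s and outcome are illustrative.

```python
class Stack:
    "Python 3 restatement of the Stack class from the slides"
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)
    def pop(self):
        x = self.items[-1]     # indexing an empty list raises IndexError
        del self.items[-1]
        return x
    def empty(self):
        return len(self.items) == 0

s = Stack()
s.push(1)
print(s.pop())      # 1
print(s.empty())    # True

try:
    s.pop()
    outcome = "no error"
except IndexError:
    outcome = "IndexError"
print(outcome)      # IndexError
```

A caller that cannot rule out an empty stack would test empty() first or catch the IndexError.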


                                                           18
                  Subclassing
class FancyStack(Stack):
  "stack with added ability to inspect inferior stack items"


  def peek(self, n):
     "peek(0) returns top; peek(1) returns item below that; etc."
     size = len(self.items)
     assert 0 <= n < size                    # test precondition
     return self.items[size-1-n]




                                                                     19
             Subclassing (2)
class LimitedStack(FancyStack):
  "fancy stack with limit on stack size"


  def __init__(self, limit):
     self.limit = limit
     FancyStack.__init__(self)             # base class constructor


  def push(self, x):
     assert len(self.items) < self.limit
     FancyStack.push(self, x)              # "super" method call




                                                                      20
Class & instance variables
class Connection:
  verbose = 0                              # class variable
  def __init__(self, host):
     self.host = host                      # instance variable
  def debug(self, v):
     self.verbose = v                      # make instance variable!
  def connect(self):
     if self.verbose:                      # class or instance variable?
        print "connecting to", self.host




                                                                   21
   Instance variable rules
• On use via instance (self.x), search order:
   – (1) instance, (2) class, (3) base classes
   – this also works for method lookup
• On assignment via instance (self.x = ...):
   – always makes an instance variable
• Class variables act as defaults for instance variables
• But...!
   – mutable class variable: one copy shared by all
   – mutable instance variable: each instance its own
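The "But...!" case is easiest to see side by side. The following is a sketch in Python 3 syntax; the class names Shared and Private and the log attribute are hypothetical, chosen only for this illustration.

```python
class Shared:
    log = []                   # mutable class variable: one list for all instances
    verbose = 0                # class variable used as a default

    def note(self, msg):
        self.log.append(msg)   # changes the shared list in place (no assignment)

    def debug(self, v):
        self.verbose = v       # assignment always makes an instance variable

class Private:
    def __init__(self):
        self.log = []          # mutable instance variable: one list per instance

    def note(self, msg):
        self.log.append(msg)

a, b = Shared(), Shared()
a.note("hello")
a.debug(1)
print(b.log)                   # ['hello']  (b sees the entry made through a)
print(a.verbose, b.verbose)    # 1 0
print(Shared.verbose)          # 0

p, q = Private(), Private()
p.note("hello")
print(q.log)                   # []
```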


                                                        22
                   Modules
• Collection of stuff in foo.py file
   – functions, classes, variables
• Importing modules:
   – import string; print string.join(L)
   – from string import join; print join(L)
• Rename after import:
   – import string; s = string; del string




                                              23
                 Packages
• Collection of modules in directory
• Must have __init__.py file
• May contain subpackages
• Import syntax:
   – from P.Q.M import foo; print foo()
   – from P.Q import M; print M.foo()
   – import P.Q.M; print P.Q.M.foo()




                                          24
 Catching exceptions
def foo(x):
  return 1.0/x


def bar(x):
  try:
     print foo(x)
  except ZeroDivisionError, message:
     print "Can't divide by zero:", message


bar(0)


                                              25
  Try-finally: cleanup
f = open(file)
try:
   process_file(f)
finally:
   f.close()         # always executed
print "OK" # executed on success only




                                         26
     Raising exceptions
• raise IndexError
• raise IndexError("k out of range")
• raise IndexError, "k out of range"
• try:
     something
  except: # catch everything
     print "Oops"
     raise # reraise



                                       27
    More on exceptions
• User-defined exceptions
  – subclass Exception or any other standard exception
• Old Python: exceptions can be strings
  – WATCH OUT: compared by object identity, not ==
• Last caught exception info:
  – sys.exc_info() == (exc_type, exc_value, exc_traceback)
• Last uncaught exception (traceback printed):
  – sys.last_type, sys.last_value, sys.last_traceback
• Printing exceptions: traceback module
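A short sketch tying these points together, in Python 3 syntax (where only class-based exceptions exist). The OutOfRange class and the pick() helper are hypothetical names used for illustration.

```python
import sys
import traceback

class OutOfRange(IndexError):
    "Hypothetical user-defined exception, subclassing a standard one"

def pick(seq, k):
    if not 0 <= k < len(seq):
        raise OutOfRange("k out of range")
    return seq[k]

try:
    pick([1, 2, 3], 5)
except IndexError:                       # also catches the subclass
    exc_type, exc_value, exc_tb = sys.exc_info()
    # the traceback module turns the same three values into printable lines
    report = traceback.format_exception(exc_type, exc_value, exc_tb)

print(exc_type.__name__)    # OutOfRange
print(exc_value)            # k out of range
print("".join(report))
```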


                                                             28
               File objects
• f = open(filename[, mode[, buffersize]])
  – mode can be "r", "w", "a" (like C stdio); default "r"
  – append "b" for binary mode (no text translation)
  – append "+" for read/write open
  – buffersize: 0=unbuffered; 1=line-buffered; larger values=buffer size
• methods:
  – read([nbytes]), readline(), readlines()
  – write(string), writelines(list)
  – seek(pos[, how]), tell()
  – fileno(), flush(), close()
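The methods listed above can be exercised on a scratch file. This is a sketch in Python 3 syntax; the temporary path is an assumption made so the example is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")                 # "w": create or truncate for writing
f.write("first line\n")
f.writelines(["second line\n", "third line\n"])
f.close()

f = open(path)                      # mode defaults to "r"
first = f.readline()                # one line, newline included
pos = f.tell()                      # position just after the first line
rest = f.readlines()                # remaining lines as a list
f.seek(pos)                         # go back to the remembered position
again = f.readline()
f.seek(0)                           # back to the start of the file
everything = f.read()
f.close()

print(repr(first))      # 'first line\n'
print(rest)             # ['second line\n', 'third line\n']
print(repr(again))      # 'second line\n'
```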

                                                            29
          Standard library
• Core:
  – os, sys, string, getopt, StringIO, struct, pickle, ...
• Regular expressions:
  – re module; Perl-5 style patterns and matching rules
• Internet:
  – socket, rfc822, httplib, htmllib, ftplib, smtplib, ...
• Miscellaneous:
  – pdb (debugger), profile+pstats
  – Tkinter (Tcl/Tk interface), audio, *dbm, ...


                                                             30
Python CGI programming




                    31
     A typical HTML form



<form method="POST" action="http://host.com/cgi-bin/test.py">
  <p>Your first name: <input type="text" name="firstname">
  <p>Your last name: <input type="text" name="lastname">
  <p>Click here to submit form: <input type="submit" value="Yeah!">
  <input type="hidden" name="session" value="1f9a2">
</form>




                                                                  32
          A typical CGI script
#!/usr/local/bin/python
import cgi


def main():
  print "Content-type: text/html\n"
  form = cgi.FieldStorage()       # parse query
  if form.has_key("firstname") and form["firstname"].value != "":
     print "<h1>Hello", cgi.escape(form["firstname"].value), "</h1>"
  else:
     print "<h1>Error! Please enter first name.</h1>"


main()



                                                                    33
     CGI script structure
• Check form fields
   – use cgi.FieldStorage class to parse query
       • takes care of decoding, handles GET and POST
       • "foo=ab+cd%21ef&bar=spam" -->
         {'foo': 'ab cd!ef', 'bar': 'spam'} # (well, actually, ...)
• Perform action
   – this is up to you!
   – database interfaces available
• Generate HTTP + HTML output
   – print statements are simplest
   – template solutions available
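The decoding step can be tried without a web server. The sketch below uses urllib.parse.parse_qs from Python 3 as a stand-in for the parsing FieldStorage does internally. It also suggests one likely reading of the "(well, actually, ...)" remark: a field may occur more than once, so each value comes back as a list. The repeated bar field is added for illustration.

```python
from urllib.parse import parse_qs

query = "foo=ab+cd%21ef&bar=spam&bar=eggs"   # 'bar' repeated on purpose
fields = parse_qs(query)

print(fields["foo"])    # ['ab cd!ef']
print(fields["bar"])    # ['spam', 'eggs']
```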


                                                                      34
    Structure refinement
form = cgi.FieldStorage()
if not form:
   ...display blank form...
elif ...valid form...:
   ...perform action, display results (or next form)...
else:
   ...display error message (maybe repeating form)...




                                                          35
    FieldStorage details
• Behaves like a dictionary:
  – .keys(), .has_key()        # but not others!
  – dictionary-like object ("mapping")
• Items
  – values are MiniFieldStorage instances
     • .value gives field value!
  – if multiple values: list of MiniFieldStorage instances
     • if type(...) == types.ListType: ...
  – may also be FieldStorage instances
     • used for file upload (test .file attribute)


                                                             36
       Other CGI niceties
• cgi.escape(s)
   – translate "<", "&", ">" to "&lt;", "&amp;", "&gt;"
• cgi.parse_qs(string, keep_blank_values=0)
   – parse query string to dictionary {"foo": ["bar"], ...}
• cgi.parse([file], ...)
   – ditto, takes query string from default locations
• urllib.quote(s), urllib.unquote(s)
   – convert between "~" and "%7e" (etc.)
• urllib.urlencode(dict)
   – convert dictionary {"foo": "bar", ...} to query string
     "foo=bar&..." # note asymmetry with parse_qs() above
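These helpers live elsewhere in Python 3: html.escape replaces cgi.escape, and the URL functions sit in urllib.parse. The following sketch, with made-up sample strings, shows each one and the asymmetry mentioned above between urlencode() and parse_qs().

```python
from html import escape
from urllib.parse import quote, unquote, urlencode, parse_qs

print(escape("<b>Tom & Jerry</b>"))    # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
print(quote("a b&c"))                  # a%20b%26c
print(unquote("%7e"))                  # ~

encoded = urlencode({"foo": "bar", "n": "a b"})
print(encoded)                         # foo=bar&n=a+b

decoded = parse_qs(encoded)
print(decoded)                         # {'foo': ['bar'], 'n': ['a b']}

# parse_qs() returns lists, so doseq=True is needed to encode them back
print(urlencode(decoded, doseq=True))  # foo=bar&n=a+b
```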


                                                              37
      Dealing with bugs
• Things go wrong, you get a traceback...
• By default, tracebacks usually go to the
  server's error_log file...
• Printing a traceback to stdout is tricky
   – could happen before "Content-type" is printed
   – could happen in the middle of HTML markup
   – could contain markup itself
• What's needed is a...



                                                     38
  Debugging framework
import cgi


def main():
  print "Content-type: text/html\n" # Do this first
  try:
     import worker     # module that does the real work
  except:
     print "<!-- --><hr><h1>Oops. An error occurred.</h1>"
     cgi.print_exception() # Prints traceback, safely


main()
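The same idea can be sketched in Python 3, where the cgi module is no longer part of recent releases. The example formats the traceback with the traceback module and escapes it with html.escape. The names run_safely and broken_worker are hypothetical, and a StringIO buffer stands in for standard output so the result can be inspected.

```python
import html
import io
import traceback

def run_safely(work, out):
    "Write the header first, then run work(out); report any error as HTML."
    out.write("Content-type: text/html\n\n")       # header plus blank line
    try:
        work(out)
    except Exception:
        # "<!-- -->" closes a comment the worker may have left open
        out.write("<!-- --><hr><h1>Oops. An error occurred.</h1>\n")
        out.write("<pre>%s</pre>\n" % html.escape(traceback.format_exc()))

def broken_worker(out):                             # hypothetical failing worker
    out.write("<h1>Result</h1>\n")
    return 1 / 0

buf = io.StringIO()                                 # stands in for sys.stdout
run_safely(broken_worker, buf)
page = buf.getvalue()
print(page)
```

Unlike the slide, which uses a bare except, this sketch catches Exception only, so KeyboardInterrupt and SystemExit still propagate.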


                                                          39
          Security notes
• Watch out when passing fields to the shell
  – e.g. os.popen("finger %s" % form["user"].value)
  – what if the value is "; cat /etc/passwd" ...
• Solutions:
  – Quote:
     • user = pipes.quote(form["user"].value)
  – Refuse:
     • if not re.match(r"^\w+$", user): ...error...
  – Sanitize:
     • user = re.sub(r"\W", "", form["user"].value)
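The three defenses can be compared on the hostile value from the slide. This sketch uses Python 3, where shlex.quote takes the place of pipes.quote. It also shows a detail worth knowing about the "Refuse" pattern: "$" matches just before a trailing newline, so re.fullmatch is the stricter test.

```python
import re
import shlex

hostile = "; cat /etc/passwd"

# Quote: the whole value becomes one harmless shell word
print(shlex.quote(hostile))                       # '; cat /etc/passwd'

# Refuse: accept only values made of word characters
print(bool(re.match(r"^\w+$", hostile)))          # False
print(bool(re.match(r"^\w+$", "guido")))          # True
print(bool(re.match(r"^\w+$", "guido\n")))        # True  ("$" allows a final newline)
print(bool(re.fullmatch(r"\w+", "guido\n")))      # False (stricter)

# Sanitize: drop every non-word character
print(re.sub(r"\W", "", hostile))                 # catetcpasswd
```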


                                                      40
  Using persistent data
• Store/update data:
  – In plain files (simplest)
     • FAQ wizard uses this
  – In a (g)dbm file (better performance)
     • string keys, string values
  – In a "shelf" (stores objects)
     • avoids parsing/unparsing the values
  – In a real database (if you must)
     • 3rd party database extensions available
     • not my field of expertise



                                                 41
                        Plain files
key = ...username, or session key, or whatever...
try:
   f = open(key, "r")
   data = f.read()                  # read previous data
   f.close()
except IOError:
   data = ""                        # no file yet: provide initial data
data = update(data, form)           # do whatever must be done
f = open(key, "w")
f.write(data)                       # write new data
f.close()
# (could delete the file instead if updated data is empty)



                                                                          42
                 (G)DBM files
# better performance if there are many records


import gdbm
key = ...username, or session key, or whatever...
db = gdbm.open("DATABASE", "w")             # open for reading+writing
if db.has_key(key):
  data = db[key]                   # read previous data
else:
  data = ""                        # provide initial data
data = update(data, form)
db[key] = data                     # write new data
db.close()



                                                                         43
                         Shelves
# a shelf is a (g)dbm file that stores pickled Python objects


import shelve
class UserData: ...
key = ...username, or session key, or whatever...
db = shelve.open("DATABASE", "w")            # open for reading+writing
if db.has_key(key):
  data = db[key]           # an object!
else:
  data = UserData(key)     # create a new instance
data.update(form)
db[key] = data
db.close()
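A runnable variant of the shelf pattern, sketched in Python 3. It makes several assumptions: a temporary directory instead of a fixed DATABASE path, the "c" flag so the file is created on first use, a plain dictionary in place of the UserData class, and a hypothetical record_visit() helper.

```python
import os
import shelve
import tempfile

# Hypothetical location; a real script would use a fixed, writable path.
path = os.path.join(tempfile.mkdtemp(), "DATABASE")

def record_visit(key):
    db = shelve.open(path, "c")          # "c": create the file if missing
    try:
        if key in db:
            data = db[key]               # unpickled back into a dict
        else:
            data = {"visits": 0}         # initial data for a new key
        data["visits"] += 1
        db[key] = data                   # store again so the change is saved
    finally:
        db.close()                       # always executed
    return data

record_visit("guido")
result = record_visit("guido")
print(result)                            # {'visits': 2}
```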


                                                                          44
                    Locking
• (G)DBM files and shelves are not protected
  against concurrent updates!
• Multiple readers, single writer usually OK
   – simplest approach: only lock when writing
• Good filesystem-based locking is hard
   – no cross-platform solutions
   – unpleasant facts of life:
      • processes sometimes die without unlocking
      • processes sometimes take longer than expected
      • NFS semantics


                                                        45
    A simple lock solution
import os, time

class Lock:

  def __init__(self, filename):
     self.filename = filename
     self.locked = 0

  def lock(self):
     assert not self.locked
     while 1:
        try:
            os.mkdir(self.filename)
            self.locked = 1
            return          # or break
        except os.error, err:
            time.sleep(1)

  def unlock(self):
     assert self.locked
     self.locked = 0
     os.rmdir(self.filename)

  # auto-unlock when lock object is deleted
  def __del__(self):
     if self.locked:
        self.unlock()

# for a big production with timeouts,
# see the Mailman source code (LockFile.py);
# it works on all Unixes and supports NFS;
# but not on Windows,
# and the code is very complex...
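For experimenting, here is a Python 3 sketch of the same mkdir-based lock. The timeout and delay parameters are additions not present on the slide (the slide's version waits forever), and the lock path is a temporary one chosen for the demonstration.

```python
import os
import tempfile
import time

class Lock:
    "mkdir-based lock, after the slide, with a hypothetical timeout added"

    def __init__(self, filename):
        self.filename = filename
        self.locked = False

    def lock(self, timeout=5.0, delay=0.1):
        assert not self.locked
        deadline = time.monotonic() + timeout
        while True:
            try:
                os.mkdir(self.filename)      # fails if the directory exists
                self.locked = True
                return True
            except FileExistsError:
                if time.monotonic() >= deadline:
                    return False             # give up instead of waiting forever
                time.sleep(delay)

    def unlock(self):
        assert self.locked
        self.locked = False
        os.rmdir(self.filename)

path = os.path.join(tempfile.mkdtemp(), "DATABASE.lock")
first, second = Lock(path), Lock(path)
r1 = first.lock()
r2 = second.lock(timeout=0.3)        # first still holds the lock
first.unlock()
r3 = second.lock(timeout=0.3)
second.unlock()
print(r1, r2, r3)                    # True False True
```

This does not address the harder problems listed on the previous slide, such as a process dying while it holds the lock.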



                                                                                       46
                 Sessions
• How to correlate requests from same user?
  – Assign session key on first contact
  – Incorporate session key in form or in URL
  – In form: use hidden input field:
     • <input type="hidden" name="session" value="1f9a2">
  – In URL:
     • http://myhost.com/cgi-bin/myprog.py/1f9a2
     • passed in environment (os.environ[...]):
         – PATH_INFO=/1f9a2
         – PATH_TRANSLATED=<rootdir>/1f9a2
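A sketch of the URL variant in Python 3: read the key from PATH_INFO, or issue a fresh one on first contact. The function name session_key, the use of secrets.token_hex, and the key length are assumptions for illustration. A real script would also check that a key taken from the URL is one it actually issued.

```python
import secrets

def session_key(environ):
    "Return (key, is_new): the key from PATH_INFO, or a fresh random one."
    key = environ.get("PATH_INFO", "").strip("/")
    if key:
        return key, False
    return secrets.token_hex(8), True      # 16 hex digits; length is arbitrary

print(session_key({"PATH_INFO": "/1f9a2"}))   # ('1f9a2', False)
key, is_new = session_key({})
print(len(key), is_new)                       # 16 True
```

In a CGI script the argument would be os.environ.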




                                                       47
                    Cookies
• How to correlate sessions from the same user?
  – Store "cookie" in browser
     • controversial, but useful
  – Module: Cookie.py (Tim O'Malley)
     • writes "Set-Cookie" headers
     • parses HTTP_COOKIE environment variable
  – Note: using cookies affects our debug framework
     • cookies must be printed as part of HTTP headers
     • cheapest solution:
         – move printing of blank line into worker module
         – (and into exception handler of debug framework)


                                                             48
                   Cookie example
import os, cgi, Cookie

c = Cookie.Cookie()
try:
   c.load(os.environ["HTTP_COOKIE"])
except KeyError:
   pass

form = cgi.FieldStorage()
try:
   user = form["user"].value
except KeyError:
   try:
       user = c["user"].value
   except KeyError:
       user = "nobody"

c["user"] = user
print c
print """
<form action="/cgi-bin/test.py"
    method="get">
<input type="text" name="user"
    value="%s">
</form>
""" % cgi.escape(user)

# debug: show the cookie header we wrote
print "<pre>"
print cgi.escape(str(c))
print "</pre>"
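In Python 3 the Cookie module became http.cookies. The sketch below shows only the two cookie steps of the example, parsing what the browser sent and producing the header to send back. The literal cookie string stands in for os.environ["HTTP_COOKIE"] and is made up for illustration.

```python
from http.cookies import SimpleCookie

# Parse what the browser sent (normally os.environ["HTTP_COOKIE"]).
incoming = SimpleCookie()
incoming.load("user=guido; theme=dark")
print(incoming["user"].value)        # guido

# Build the header to send back; it belongs with the other HTTP headers.
outgoing = SimpleCookie()
outgoing["user"] = incoming["user"].value
print(outgoing.output())             # Set-Cookie: user=guido
```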




                                                                                  49
          File upload example
import cgi
form = cgi.FieldStorage()
if not form:
  print """
  <form action="/cgi-bin/test.py" method="POST" enctype="multipart/form-data">
  <input type="file" name="filename">
  <input type="submit">
  </form>
  """
elif form.has_key("filename"):
  item = form["filename"]
  if item.file:
        data = item.file.read()     # read contents of file
        print cgi.escape(data)      # rather dumb action



                                                                            50

				