
   Using Python for CGI programming

       Guido van Rossum
             CNRI
   (Corporation for National Research Initiatives, Reston, Virginia, USA)


        guido@python.org
         www.python.org
                                                                            1
Basic Python tutorial




                        2
                           Lists
      • a = [99, "bottles of beer", ["on", "the", "wall"]]
• Flexible arrays, not Lisp-like linked lists
• Same operators as for strings
      • a+b, a*3, a[0], a[-1], a[1:], len(a)
• Item and slice assignment
      • a[0] = 98
      • a[1:2] = ["bottles", "of", "beer"]
          -> [98, "bottles", "of", "beer", ["on", "the", "wall"]]
      • del a[-1]       # -> [98, "bottles", "of", "beer"]
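
The item and slice assignments above can be replayed step by step. This is a minimal sketch in current Python 3 syntax (the slides themselves use the older print-statement syntax); the name after_slice is introduced here only to capture the intermediate state.

```python
# Replay the slide's list operations and keep the intermediate result.
a = [99, "bottles of beer", ["on", "the", "wall"]]

a[0] = 98                             # item assignment replaces one element
a[1:2] = ["bottles", "of", "beer"]    # slice assignment may change the length
after_slice = list(a)                 # shallow copy, kept for inspection

del a[-1]                             # drop the last element (the nested list)

print(after_slice)
print(a)
```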




                                                                    3
      More list operations
>>> a = range(5)       # [0,1,2,3,4]
>>> a.append(5)        # [0,1,2,3,4,5]
>>> a.pop()            # [0,1,2,3,4]
5
>>> a.insert(0, 5.5)   # [5.5,0,1,2,3,4]
>>> a.pop(0)           # [0,1,2,3,4]
5.5
>>> a.reverse()        # [4,3,2,1,0]
>>> a.sort()           # [0,1,2,3,4]



                                           4
              Dictionaries
• Hash tables, "associative arrays"
      • d = {"duck": "eend", "water": "water"}
• Lookup:
      • d["duck"] -> "eend"
      • d["back"] # raises KeyError exception
• Delete, insert, overwrite:
      • del d["water"] # {"duck": "eend"}
      • d["back"] = "rug" # {"duck": "eend", "back": "rug"}
      • d["duck"] = "duik" # {"duck": "duik", "back": "rug"}




                                                               5
    More dictionary ops
• Keys, values, items:
     • d.keys() -> ["duck", "back"]
     • d.values() -> ["duik", "rug"]
     • d.items() -> [("duck","duik"), ("back","rug")]
• Presence check:
     • d.has_key("duck") -> 1; d.has_key("spam") -> 0
• Values of any type; keys almost any
     • {"name":"Guido", "age":43, ("hello","world"):1,
        42:"yes", "flag": ["red","white","blue"]}




                                                         6
      Dictionary details
• Keys must be immutable:
  – numbers, strings, tuples of immutables
     • these cannot be changed after creation
  – reason is hashing (fast lookup technique)
  – not lists or other dictionaries
     • these types of objects can be changed "in place"
  – no restrictions on values
• Keys will be listed in arbitrary order
  – again, because of hashing
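
The immutability rule can be seen directly: a tuple of strings works as a key, while a list is rejected. A small sketch in current Python 3 syntax; the dictionary name phone and the value "x1234" are made up for illustration.

```python
# Tuples of immutables are hashable and may be keys; lists are not.
phone = {}
phone[("van Rossum", "Guido")] = "x1234"        # tuple key: accepted

try:
    phone[["van Rossum", "Guido"]] = "x1234"    # list key: rejected
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(phone[("van Rossum", "Guido")])
print(list_key_error)
```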



                                                          7
                    Tuples
• key = (lastname, firstname)
• point = x, y, z    # parentheses optional
• x, y, z = point
• lastname = key[0]
• singleton = (1,)        # trailing comma!
• empty = ()              # parentheses!
• tuples vs. lists; tuples immutable



                                              8
                  Variables
• No need to declare
• Need to assign (initialize)
      • use of uninitialized variable raises exception
• Not typed
      if friendly: greeting = "hello world"
      else: greeting = 12**2
      print greeting
• Everything is a variable:
      • functions, modules, classes



                                                         9
   Reference semantics
• Assignment manipulates references
     • x = y does not make a copy of y
     • x = y makes x reference the object y references
• Very useful; but beware!
• Example:
     >>> a = [1, 2, 3]; b = a
     >>> a.append(4); print b
     [1, 2, 3, 4]
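
To contrast sharing with copying, the sketch below (current Python 3 syntax, not the slides' older syntax) adds two names that are not in the original example: c, a shallow copy, and d, a deep copy.

```python
import copy

a = [1, 2, 3]
b = a                   # b references the same list object as a
c = list(a)             # c is a new list holding the same items
d = copy.deepcopy(a)    # d is a fully independent copy

a.append(4)

print(b)                # follows the change made through a
print(c)                # unaffected by the change
print(a is b, a is c)
```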




                                                         10
Changing a shared list
 a = [1, 2, 3]      a --> [1, 2, 3]

 b = a              a --> [1, 2, 3] <-- b

 a.append(4)        a --> [1, 2, 3, 4] <-- b

                                     11
Changing an integer
 a = 1              a --> 1

 b = a              a --> 1 <-- b

 a = a+1            a --> 2      new int object created by add operator (1+1);
                    b --> 1      old reference deleted by assignment (a=...)

                                  12
        Control structures
if condition:
   statements
[elif condition:
   statements] ...
else:
   statements

while condition:
   statements

for var in sequence:
   statements

break
continue


                                            13
    Grouping indentation
In Python:

for i in range(20):
    if i%3 == 0:
        print i
        if i%5 == 0:
            print "Bingo!"
    print "---"

In C:

for (i = 0; i < 20; i++)
{
    if (i%3 == 0) {
        printf("%d\n", i);
        if (i%5 == 0) {
            printf("Bingo!\n"); }
    }
    printf("---\n");
}

Output (both versions):

0
Bingo!
---
---
---
3
---
---
---
6
---
---
---
9
---
---
---
12
---
---
---
15
Bingo!
---
---
---
18
---
---


                                                               14
  Functions, procedures
def name(arg1, arg2, ...):
  "documentation"      # optional
  statements


return                 # from procedure
return expression      # from function




                                          15
         Example function
def gcd(a, b):
    "greatest common divisor"
    while a != 0:
      a, b = b%a, a             # parallel assignment
    return b


>>> gcd.__doc__
'greatest common divisor'
>>> gcd(12, 20)
4


                                                        16
                          Classes
class Stack:
  "A well-known data structure…"
  def __init__(self):              # constructor
     self.items = []
  def push(self, x):
     self.items.append(x)          # the sky is the limit
  def pop(self):
     x = self.items[-1]            # what happens if it’s empty?
     del self.items[-1]
     return x
  def empty(self):
     return len(self.items) == 0   # Boolean result


                                                                   17
                 Using classes
• To create an instance, simply call the class object:
       x = Stack()       # no 'new' operator!


• To use methods of the instance, call using dot notation:
       x.empty()         # -> 1
       x.push(1)                          # [1]
       x.empty()         # -> 0
       x.push("hello")                    # [1, "hello"]
       x.pop()           # -> "hello"     # [1]


• To inspect instance variables, use dot notation:
       x.items           # -> [1]


                                                           18
                  Subclassing
class FancyStack(Stack):
  "stack with added ability to inspect inferior stack items"


  def peek(self, n):
     "peek(0) returns top; peek(1) returns item below that; etc."
     size = len(self.items)
     assert 0 <= n < size                    # test precondition
     return self.items[size-1-n]




                                                                     19
             Subclassing (2)
class LimitedStack(FancyStack):
  "fancy stack with limit on stack size"


  def __init__(self, limit):
     self.limit = limit
     FancyStack.__init__(self)             # base class constructor


  def push(self, x):
     assert len(self.items) < self.limit
     FancyStack.push(self, x)              # "super" method call
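
The subclass chain can be exercised as follows. This sketch restates the classes in current Python 3 syntax (pop and empty are left out for brevity) and assumes assertions are enabled, which is the default unless Python runs with -O. The variable names s and overflow are illustrative.

```python
class Stack:
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)

class FancyStack(Stack):
    def peek(self, n):
        "peek(0) returns top; peek(1) returns the item below that; etc."
        size = len(self.items)
        assert 0 <= n < size                 # test precondition
        return self.items[size-1-n]

class LimitedStack(FancyStack):
    def __init__(self, limit):
        self.limit = limit
        FancyStack.__init__(self)            # explicit base class constructor
    def push(self, x):
        assert len(self.items) < self.limit
        FancyStack.push(self, x)             # explicit "super" method call

s = LimitedStack(2)
s.push("a")
s.push("b")
try:
    s.push("c")                              # third push exceeds the limit
    overflow = False
except AssertionError:
    overflow = True

print(s.peek(0), s.peek(1), overflow)
```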




                                                                      20
Class & instance variables
class Connection:
  verbose = 0                              # class variable
  def __init__(self, host):
     self.host = host                      # instance variable
  def debug(self, v):
     self.verbose = v                      # make instance variable!
  def connect(self):
     if self.verbose:                      # class or instance variable?
        print "connecting to", self.host




                                                                   21
   Instance variable rules
• On use via instance (self.x), search order:
   – (1) instance, (2) class, (3) base classes
   – this also works for method lookup
• On assignment via instance (self.x = ...):
   – always makes an instance variable
• Class variables serve as "defaults" for instance variables
• But...!
   – mutable class variable: one copy shared by all
   – mutable instance variable: each instance its own
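
The "But...!" case is easiest to see with two instances. The sketch below uses current Python 3 syntax and extends the Connection class with two hypothetical attributes that are not in the slides: history (a mutable class variable) and log (a mutable instance variable).

```python
class Connection:
    verbose = 0             # class variable: default for every instance
    history = []            # mutable class variable: ONE list shared by all

    def __init__(self, host):
        self.host = host    # instance variable
        self.log = []       # mutable instance variable: one list per instance

    def debug(self, v):
        self.verbose = v    # assignment via self creates an instance variable

    def connect(self):
        self.history.append(self.host)    # mutates the shared class-level list
        self.log.append(self.host)        # mutates this instance's own list

a = Connection("a.example.com")
b = Connection("b.example.com")
a.debug(1)
a.connect()
b.connect()

print(a.verbose, b.verbose, Connection.verbose)
print(Connection.history)
print(a.log, b.log)
```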


                                                        22
                   Modules
• Collection of stuff in foo.py file
   – functions, classes, variables
• Importing modules:
   – import string; print string.join(L)
   – from string import join; print join(L)
• Rename after import:
   – import string; s = string; del string




                                              23
                 Packages
• Collection of modules in directory
• Must have __init__.py file
• May contain subpackages
• Import syntax:
   – from P.Q.M import foo; print foo()
   – from P.Q import M; print M.foo()
   – import P.Q.M; print P.Q.M.foo()




                                          24
 Catching exceptions
def foo(x):
  return 1.0/x


def bar(x):
  try:
     print foo(x)
  except ZeroDivisionError, message:
     print "Can't divide by zero:", message


bar(0)


                                              25
  Try-finally: cleanup
f = open(file)
try:
   process_file(f)
finally:
   f.close()         # always executed
print "OK" # executed on success only




                                         26
     Raising exceptions
• raise IndexError
• raise IndexError("k out of range")
• raise IndexError, "k out of range"
• try:
     something
  except: # catch everything
     print "Oops"
     raise # reraise



                                       27
    More on exceptions
• User-defined exceptions
  – subclass Exception or any other standard exception
• Old Python: exceptions can be strings
  – WATCH OUT: compared by object identity, not ==
• Last caught exception info:
  – sys.exc_info() == (exc_type, exc_value, exc_traceback)
• Last uncaught exception (traceback printed):
  – sys.last_type, sys.last_value, sys.last_traceback
• Printing exceptions: traceback module
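
A short sketch of these points in current Python 3 syntax (where string exceptions no longer exist): a user-defined exception, sys.exc_info() inside the handler, and the traceback module for formatting. QuotaError and check are hypothetical names chosen for illustration.

```python
import sys
import traceback

class QuotaError(Exception):        # hypothetical user-defined exception
    "Raised when a made-up quota is exceeded."

def check(used, limit):
    if used > limit:
        raise QuotaError("used %d of %d" % (used, limit))

try:
    check(12, 10)
except QuotaError:
    # Information about the exception currently being handled
    exc_type, exc_value, exc_tb = sys.exc_info()
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))

print(exc_type.__name__, exc_value)
print(text)
```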


                                                             28
               File objects
• f = open(filename[, mode[, buffersize]])
  – mode can be "r", "w", "a" (like C stdio); default "r"
  – append "b" for binary mode (no text translation)
  – append "+" for read/write open
  – buffersize: 0=unbuffered; 1=line-buffered; >1=buffered (approximate buffer size)
• methods:
  – read([nbytes]), readline(), readlines()
  – write(string), writelines(list)
  – seek(pos[, how]), tell()
  – fileno(), flush(), close()
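
The methods listed above can be tried on a throwaway file. A minimal sketch in current Python 3 syntax; the temporary path and the file contents are arbitrary choices for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")    # throwaway file

f = open(path, "w")
f.writelines(["first\n", "second\n"])    # writelines adds no newlines itself
f.close()

f = open(path, "a")                      # append mode keeps existing contents
f.write("third\n")
f.close()

f = open(path, "r")
first = f.readline()                     # one line, including its newline
rest = f.readlines()                     # the remaining lines, as a list
f.seek(0)                                # back to the start of the file
head = f.read(5)                         # at most 5 characters
f.close()

print(repr(first), rest, repr(head))
```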

                                                            29
          Standard library
• Core:
  – os, sys, string, getopt, StringIO, struct, pickle, ...
• Regular expressions:
  – re module; Perl-5 style patterns and matching rules
• Internet:
  – socket, rfc822, httplib, htmllib, ftplib, smtplib, ...
• Miscellaneous:
  – pdb (debugger), profile+pstats
  – Tkinter (Tcl/Tk interface), audio, *dbm, ...


                                                             30
Python CGI programming




                    31
     A typical HTML form



<form method="POST" action="http://host.com/cgi-bin/test.py">
  <p>Your first name: <input type="text" name="firstname">
  <p>Your last name: <input type="text" name="lastname">
  <p>Click here to submit form: <input type="submit" value="Yeah!">
  <input type="hidden" name="session" value="1f9a2">
</form>




                                                                  32
          A typical CGI script
#!/usr/local/bin/python
import cgi


def main():
  print "Content-type: text/html\n"
  form = cgi.FieldStorage()       # parse query
  if form.has_key("firstname") and form["firstname"].value != "":
     print "<h1>Hello", form["firstname"].value, "</h1>"
  else:
     print "<h1>Error! Please enter first name.</h1>"


main()



                                                                    33
     CGI script structure
• Check form fields
   – use cgi.FieldStorage class to parse query
       • takes care of decoding, handles GET and POST
       • "foo=ab+cd%21ef&bar=spam" -->
         {'foo': 'ab cd!ef', 'bar': 'spam'} # (well, actually, ...)
• Perform action
   – this is up to you!
   – database interfaces available
• Generate HTTP + HTML output
   – print statements are simplest
   – template solutions available
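
One reason the parsed result is not a plain string-to-string dictionary is that a field may occur more than once. The sketch below uses urllib.parse.parse_qs from current Python 3 rather than cgi.FieldStorage; the extra "bar=eggs" pair is added here to show a repeated field.

```python
from urllib.parse import parse_qs

# "+" decodes to a space and "%21" to "!"; every value comes back as a list.
query = "foo=ab+cd%21ef&bar=spam&bar=eggs"
fields = parse_qs(query)

print(fields)
```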


                                                                      34
    Structure refinement
form = cgi.FieldStorage()
if not form:
   ...display blank form...
elif ...valid form...:
   ...perform action, display results (or next form)...
else:
   ...display error message (maybe repeating form)...
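
The three branches can be sketched as a function that maps a query string to an HTML body. This is an illustration in current Python 3 syntax that uses urllib.parse.parse_qs in place of cgi.FieldStorage; respond is a hypothetical helper and the "firstname" check is borrowed from the earlier example script.

```python
import html
from urllib.parse import parse_qs

def respond(query_string):
    "Return the HTML body for a query string (hypothetical helper)."
    form = parse_qs(query_string)
    if not form:
        # No fields at all: first visit, show the blank form
        return "<form>...blank form...</form>"
    names = form.get("firstname", [])
    if names and names[0].strip():
        # Valid form: perform the action and show the result
        return "<h1>Hello %s</h1>" % html.escape(names[0])
    # Fields present but invalid: show an error message
    return "<h1>Error! Please enter first name.</h1>"

print(respond(""))
print(respond("firstname=Guido"))
print(respond("lastname=van+Rossum"))
```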




                                                          35
    FieldStorage details
• Behaves like a dictionary:
  – .keys(), .has_key()        # but not others!
  – dictionary-like object ("mapping")
• Items
  – values are MiniFieldStorage instances
     • .value gives field value!
  – if multiple values: list of MiniFieldStorage instances
     • if type(...) == types.ListType: ...
  – may also be FieldStorage instances
     • used for file upload (test .file attribute)


                                                             36
       Other CGI niceties
• cgi.escape(s)
   – translate "<", "&", ">" to "&lt;", "&amp;", "&gt;"
• cgi.parse_qs(string, keep_blank_values=0)
   – parse query string to dictionary {"foo": ["bar"], ...}
• cgi.parse([file], ...)
   – ditto, takes query string from default locations
• urllib.quote(s), urllib.unquote(s)
   – convert between "~" and "%7e" (etc.)
• urllib.urlencode(dict)
   – convert dictionary {"foo": "bar", ...} to query string
     "foo=bar&..." # note asymmetry with parse_qs() above

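The escaping and quoting helpers, and the asymmetry between urlencode() and parse_qs(), can be sketched with the current Python 3 equivalents: html.escape stands in for cgi.escape, and urllib.parse for the urllib functions named on the slide. All sample strings are made up.

```python
import html
from urllib.parse import parse_qs, quote, unquote, urlencode

# HTML escaping of markup characters
escaped = html.escape("<b>Tom & Jerry</b>", quote=False)

# URL quoting and unquoting
quoted = quote("my file.html")
unquoted = unquote("%7Eguido/my%20file.html")

# urlencode takes plain values; parse_qs returns lists of values
query = urlencode({"foo": "bar", "name": "a b"})
parsed = parse_qs(query)
round_trip = urlencode(parsed, doseq=True)    # doseq=True accepts the lists

print(escaped)
print(quoted, unquoted)
print(query, parsed, round_trip)
```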

                                                              37
      Dealing with bugs
• Things go wrong, you get a traceback...
• By default, tracebacks usually go to the
  server's error_log file...
• Printing a traceback to stdout is tricky
   – could happen before "Content-type" is printed
   – could happen in the middle of HTML markup
   – could contain markup itself
• What's needed is a...



                                                     38
  Debugging framework
import cgi


def main():
  print "Content-type: text/html\n" # Do this first
  try:
     import worker     # module that does the real work
  except:
     print "<!-- --><hr><h1>Oops. An error occurred.</h1>"
     cgi.print_exception() # Prints traceback, safely


main()
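
The same idea can be sketched without the cgi module, in current Python 3 syntax: emit the header first, run the real work inside try/except, and escape the traceback before embedding it in the page. run_safely and broken are hypothetical names, and the worker is modeled as a callable that returns HTML rather than as an imported module.

```python
import html
import traceback

def run_safely(worker):
    "Call worker() and return the complete response text."
    out = ["Content-type: text/html\n"]    # header and blank line come first
    try:
        out.append(worker())
    except Exception:
        out.append("<!-- --><hr><h1>Oops. An error occurred.</h1>")
        # Escape the traceback so any markup in it is shown, not interpreted
        out.append("<pre>%s</pre>" % html.escape(traceback.format_exc()))
    return "\n".join(out)

def broken():
    return "<p>%d</p>" % (1 // 0)          # deliberately fails

page = run_safely(broken)
print(page)
```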


                                                          39
          Security notes
• Watch out when passing fields to the shell
  – e.g. os.popen("finger %s" % form["user"].value)
  – what if the value is "; cat /etc/passwd" ...
• Solutions:
  – Quote:
     • user = pipes.quote(form["user"].value)
  – Refuse:
     • if not re.match(r"^\w+$", user): ...error...
  – Sanitize:
     • user = re.sub(r"\W", "", form["user"].value)
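
The three defences can be compared on the hostile value from the slide. This sketch uses current Python 3, where shlex.quote is the documented counterpart of pipes.quote. safe_finger_command is a hypothetical helper, and it uses re.fullmatch instead of "^...$" because "$" also matches just before a trailing newline.

```python
import re
import shlex

def safe_finger_command(user):
    "Build a shell command for a user-supplied name (hypothetical helper)."
    if not re.fullmatch(r"\w+", user):        # Refuse: word characters only
        raise ValueError("invalid user name")
    return "finger %s" % shlex.quote(user)    # Quote: harmless extra safety

hostile = "; cat /etc/passwd"
quoted = shlex.quote(hostile)                 # becomes a single shell word
sanitized = re.sub(r"\W", "", hostile)        # Sanitize: drop non-word chars

print(quoted)
print(sanitized)
print(safe_finger_command("guido"))
```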


                                                      40
  Using persistent data
• Store/update data:
  – In plain files (simplest)
     • FAQ wizard uses this
  – In a (g)dbm file (better performance)
     • string keys, string values
  – In a "shelf" (stores objects)
     • avoids parsing/unparsing the values
  – In a real database (if you must)
     • 3rd party database extensions available
     • not my field of expertise



                                                 41
                        Plain files
key = ...username, or session key, or whatever...
try:
   f = open(key, "r")
   data = f.read()                  # read previous data
   f.close()
except IOError:
   data = ""                        # no file yet: provide initial data
data = update(data, form)           # do whatever must be done
f = open(key, "w")
f.write(data)                       # write new data
f.close()
# (could delete the file instead if updated data is empty)



                                                                          42
                 (G)DBM files
# better performance if there are many records


import gdbm
key = ...username, or session key, or whatever...
db = gdbm.open("DATABASE", "w")             # open for reading+writing
if db.has_key(key):
  data = db[key]                   # read previous data
else:
  data = ""                        # provide initial data
data = update(data, form)
db[key] = data                     # write new data
db.close()



                                                                         43
                         Shelves
# a shelf is a (g)dbm file that stores pickled Python objects


import shelve
class UserData: ...
key = ...username, or session key, or whatever...
db = shelve.open("DATABASE", "w")            # open for reading+writing
if db.has_key(key):
  data = db[key]           # an object!
else:
  data = UserData(key)     # create a new instance
data.update(form)
db[key] = data
db.close()
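
A runnable variant of the shelf pattern in current Python 3 syntax. It makes three assumptions that differ from the slide: the database lives in a temporary directory, it is opened with flag "c" so the file is created on first use, and a plain dictionary stands in for the UserData instance. The key "guido" and the visits counter are made up.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "DATABASE")

for _ in range(2):                    # simulate two separate requests
    db = shelve.open(path, "c")       # "c": create the file if it is missing
    key = "guido"
    if key in db:
        data = db[key]                # an unpickled Python object
    else:
        data = {"visits": 0}          # initial data for a new key
    data["visits"] += 1
    db[key] = data                    # store back explicitly
    db.close()

db = shelve.open(path, "r")
visits = db["guido"]["visits"]
db.close()
print(visits)
```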


                                                                          44
                    Locking
• (G)DBM files and shelves are not protected
  against concurrent updates!
• Multiple readers, single writer usually OK
   – simplest approach: only lock when writing
• Good filesystem-based locking is hard
   – no cross-platform solutions
   – unpleasant facts of life:
      • processes sometimes die without unlocking
      • processes sometimes take longer than expected
      • NFS semantics


                                                        45
    A simple lock solution
import os, time

class Lock:

  def __init__(self, filename):
     self.filename = filename
     self.locked = 0

  def lock(self):
     assert not self.locked
     while 1:
        try:
            os.mkdir(self.filename)
            self.locked = 1
            return          # or break
        except os.error, err:
            time.sleep(1)

  def unlock(self):
     assert self.locked
     self.locked = 0
     os.rmdir(self.filename)

  # auto-unlock when lock object is deleted
  def __del__(self):
     if self.locked:
        self.unlock()

# for a big production with timeouts,
# see the Mailman source code (LockFile.py);
# it works on all Unixes and supports NFS;
# but not on Windows,
# and the code is very complex...
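
The same mkdir-based idea can be given a simple timeout. This is a sketch in current Python 3 syntax, not the Mailman implementation; the timeout and poll parameters and the boolean return value are additions made here for illustration, and the __del__ auto-unlock is omitted.

```python
import os
import tempfile
import time

class Lock:
    "Directory-based lock: os.mkdir either creates the directory or fails."

    def __init__(self, filename):
        self.filename = filename
        self.locked = False

    def lock(self, timeout=5.0, poll=0.1):
        "Return True once acquired, or False after `timeout` seconds."
        assert not self.locked
        deadline = time.monotonic() + timeout
        while True:
            try:
                os.mkdir(self.filename)
                self.locked = True
                return True
            except FileExistsError:           # someone else holds the lock
                if time.monotonic() >= deadline:
                    return False
                time.sleep(poll)

    def unlock(self):
        assert self.locked
        self.locked = False
        os.rmdir(self.filename)

path = os.path.join(tempfile.mkdtemp(), "demo.lock")
first, second = Lock(path), Lock(path)
got_first = first.lock()
got_second = second.lock(timeout=0.3)    # held by `first`, so this gives up
first.unlock()
got_retry = second.lock(timeout=0.3)     # free again, so this succeeds
second.unlock()
print(got_first, got_second, got_retry)
```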



                                                                                       46
                 Sessions
• How to correlate requests from same user?
  – Assign session key on first contact
  – Incorporate session key in form or in URL
  – In form: use hidden input field:
     • <input type="hidden" name="session" value="1f9a2">
  – In URL:
     • http://myhost.com/cgi-bin/myprog.py/1f9a2
     • passed in environment (os.environ[...]):
         – PATH_INFO=/1f9a2
         – PATH_TRANSLATED=<rootdir>/1f9a2
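
Issuing a key and reading it back from PATH_INFO might look like the sketch below, in current Python 3 syntax. Both function names are hypothetical, and the secrets module is one possible way to generate an unguessable key; in a real CGI script the mapping passed in would be os.environ.

```python
import secrets

def new_session_key():
    "Return a fresh, hard-to-guess session key (16 hex digits)."
    return secrets.token_hex(8)

def session_from_environ(environ):
    "Extract the key from PATH_INFO such as '/1f9a2'; None if absent."
    path_info = environ.get("PATH_INFO", "")
    key = path_info.lstrip("/").split("/", 1)[0]
    return key or None

print(session_from_environ({"PATH_INFO": "/1f9a2"}))
print(session_from_environ({}))
print(len(new_session_key()))
```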




                                                       47
                    Cookies
• How to correlate sessions from the same user?
  – Store "cookie" in browser
     • controversial, but useful
  – Module: Cookie.py (Tim O'Malley)
     • writes "Set-Cookie" headers
     • parses HTTP_COOKIE environment variable
  – Note: using cookies affects our debug framework
     • cookies must be printed as part of HTTP headers
     • cheapest solution:
         – move printing of blank line into worker module
         – (and into exception handler of debug framework)
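
The header-ordering point can be sketched with http.cookies.SimpleCookie, the current Python 3 descendant of the Cookie module. The cookie string stands in for the HTTP_COOKIE environment variable, and all values are made up.

```python
from http.cookies import SimpleCookie

# Parse what the browser sent (normally os.environ["HTTP_COOKIE"])
incoming = SimpleCookie()
incoming.load("user=guido; theme=dark")
user = incoming["user"].value

# Build the cookie to send back
outgoing = SimpleCookie()
outgoing["user"] = user
header = outgoing.output()            # a complete "Set-Cookie: ..." line

# Cookie headers must precede the blank line that ends the HTTP headers
response = header + "\n" + "Content-type: text/html\n\n" + "<p>Hello</p>"
print(response)
```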


                                                             48
                   Cookie example
import os, cgi, Cookie

c = Cookie.Cookie()
try:
   c.load(os.environ["HTTP_COOKIE"])
except KeyError:
   pass

form = cgi.FieldStorage()
try:
   user = form["user"].value
except KeyError:
   try:
       user = c["user"].value
   except KeyError:
       user = "nobody"

c["user"] = user

print c

print """
<form action="/cgi-bin/test.py"
    method="get">
<input type="text" name="user"
    value="%s">
</form>
""" % cgi.escape(user)

# debug: show the cookie header we wrote
print "<pre>"
print cgi.escape(str(c))
print "</pre>"




                                                                                  49
          File upload example
import cgi
form = cgi.FieldStorage()
if not form:
  print """
  <form action="/cgi-bin/test.py" method="POST" enctype="multipart/form-data">
  <input type="file" name="filename">
  <input type="submit">
  </form>
  """
elif form.has_key("filename"):
  item = form["filename"]
  if item.file:
        data = item.file.read()     # read contents of file
        print cgi.escape(data)      # rather dumb action



                                                                            50

								