Tag Archives: python

A friendlier asynchronous twisted web, the ghetto monkey patch way

UPDATE to the UPDATE – A cleaned up and more coherent example of txweb is here
UPDATE – Github repo here

I like Twisted, and I like CherryPy. Unfortunately, just like my militant atheist friends and my more spiritual friends, neither seems to get along with the other.

What to do? MONKEY PATCH + GHETTO HACKING to the rescue!

Note, this is just a mockup of CherryPy’s routing system and not a bridge or interface to CherryPy. There is no CherryPy to be had here, just ghetto py.


from twisted.web import server, resource
from twisted.internet import reactor


def expose(func):
    func.exposed = True
    return func

class PageOne(object):

    def foo(self, request):
        return "Hello From PageOne Foo!"        
    foo.exposed = True
    
    @expose
    def delayed(self, request):
        def delayedResponse():
            request.write("I was delayed :( ")
            request.finish()
            
        reactor.callLater(5, delayedResponse)
        return server.NOT_DONE_YET
    
    
class PageTwo(object):

    @expose
    def index(self, request):
        return "Hello From PageTwo index!"
        
        
class Root(object):
    
    @expose
    def index(self, request):
        return "Hello From Index!"
    
    @expose
    def __default__(self, request):
        return "I Caught %s " % request.path
    
    pageone = PageOne()
    pagetwo = PageTwo()
        
class OneTimeResource(resource.Resource):
    """
        Monkey patch to avoid rewriting more of twisted's lower web
        layer which does a fantastic job dealing with the minute details
        of receiving and sending HTTP traffic.
        
        func is a callable and exposed property in the Root OO tree
    """
    def __init__(self, func):
        self.func = func
        
    def render(self, request):
        #Here would be a fantastic place for a pre-filter
        return self.func(request)
        #ditto here for a post filter
        
        
class OverrideSite(server.Site):
    """
        A monkey patch that short circuits the normal
        resource resolution logic @ the getResourceFor point
        
    """
    def checkAction(self, controller, name):
        """
            On success, returns a bound method from the provided controller instance
            else it return None
        """
        action = None
        if hasattr(controller, name):
            action = getattr(controller, name)
            if not callable(action) or not hasattr(action, "exposed"):
                action = None
        
        return action
        
                    
    def routeRequest(self, request):
        action = None
        response = None
        
        root = parent = self.resource
        defaultAction = self.checkAction(root, "__default__")
        
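        # Drop empty segments so a request for "/" falls through to the for-else and resolves to index
        path = [element for element in request.path.strip("/").split("/") if element]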
        
         
        
        for i in range(len(path)):
            element = path[i]
            
            parent = root
            root = getattr(root, element, None)
            request.prepath.append(element)
            
            if root is None:                
                break
            
            if self.checkAction(root, "__default__"):
                #Check for a catchall default action
                defaultAction = self.checkAction(root, "__default__")
                
                
            if element.startswith("_"):
                #500 simplistic security check
                action = lambda request: "500 URI segments cannot start with an underscore"
                break
                
            if callable(root) and hasattr(root, "exposed") and root.exposed == True:
                action = root
                request.postpath = path[i:] 
                break
            
            
                
        else:
            if action is None:
                if root is not None and self.checkAction(root, "index"):
                    action = self.checkAction(root, "index")
                
                
        #action = OneTimeResource(action) if action is not None else OneTimeResource(lambda request:"500 Routing error :(")
        if action is None:
            if defaultAction:
                action = defaultAction
            else:            
                action = lambda request:"404 :("
                
        return OneTimeResource(action)         

                
                
        
    def getResourceFor(self, request):
        return self.routeRequest(request)
        
"""
    Twisted thankfully doesn't do any type checking, so a
    dumb OO graph is A-Okay here.  It will be assigned to
    site.resource
"""
dumb = OverrideSite(Root())

reactor.listenTCP(80, dumb )
reactor.run()

I slapped this together in about 30 minutes… so there is a HIGH probability that it breaks on almost every edge case! Still, it does work (for me) and it doesn’t hijack too much of Twisted’s core, so it could be viable with a lot of unit-testing love, some additional sanity checking, and maybe some well-thought-out refactoring.
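To poke at the routing idea without a running reactor, here is a rough standalone sketch of the same object-tree walk. It is a simplified rewrite, not the code above: resolve(), exposed_action(), and the Blog/Root controllers are made-up names for illustration, and actions take plain positional arguments instead of a Twisted request.

```python
# Standalone sketch of the object-tree routing idea; no Twisted required.
# resolve(), exposed_action(), Blog and Root are illustrative names only.

def expose(func):
    """Mark a callable as reachable from a URL."""
    func.exposed = True
    return func


def exposed_action(controller, name):
    """Return controller.<name> if it is callable and exposed, else None."""
    action = getattr(controller, name, None)
    if callable(action) and getattr(action, "exposed", False):
        return action
    return None


def resolve(root, path):
    """Walk `path` through the object tree rooted at `root`.

    Returns (action, remaining_segments); action is None when nothing matches.
    """
    segments = [s for s in path.strip("/").split("/") if s]
    node = root
    default = exposed_action(root, "__default__")

    for i, segment in enumerate(segments):
        if segment.startswith("_"):
            # Refuse private/dunder names before touching the object.
            return None, segments[i:]
        child = getattr(node, segment, None)
        if child is None:
            # Fall back to the nearest catch-all, if any.
            return default, segments[i:]
        if callable(child) and getattr(child, "exposed", False):
            return child, segments[i + 1:]
        node = child
        default = exposed_action(node, "__default__") or default

    # Ran out of segments on a controller: use its index, else the catch-all.
    return exposed_action(node, "index") or default, []


class Blog(object):
    @expose
    def index(self):
        return "blog index"

    @expose
    def post(self, *args):
        return "post %s" % "/".join(args)


class Root(object):
    blog = Blog()

    @expose
    def index(self):
        return "home"

    @expose
    def __default__(self, *args):
        return "caught %s" % "/".join(args)
```

Calling resolve(Root(), "/blog/post/42") hands back the bound Blog.post method plus the leftover segment, which is roughly the job OneTimeResource gets handed in the real thing.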

PyProxy – Aka the development Helper proxy

Coming out of nothing and into supah doopa Alpha is finally a working proof of concept of my Python web proxy. I don’t really want to talk about the asinine alternatives I tried before I finally said “fuck it, time to go completely Twisted!” Lo and behold, the actual proxy part is 4 lines of code, which then expands to maybe 20-30 to allow for overloading some lower-level classes.

Originally a public announcement for this project would have been in August at the earliest, to give me time to clean things up and go from proof of concept to working concept, but apparently a lot of other people have had similar thoughts and I figured it’s better to collaborate than compete.

So some quick notes:
The ultimate goal for PyProxy (or whatever it ends up being named) is to sit between a developer and a development server. The first and most immediate idea was to automagically generate Python mechanize scripts that replicate the recorded traffic. These mechanize scripts could then be collected into a suite, with some scripts marked as requirements of others (for example, a login process). That alone would make it pretty easy to create full system-under-test unit tests. The next idea was to add regex or pattern-based hooks that let a developer dial in to a specific domain, or even a specific set of web pages.

After that, the idea was to just continually tack on support plugins and scripts, maybe tell PyProxy the name of the target application’s database, and if it’s MySQL, switch on the general log. This could allow for combining both mechanize scripts AND a SQLObject or SQLAlchemy powered unit-test suite to assert that the correct data was changed.

The final future idea was to make a Firefox/Chrome extension that would allow a developer to control some parts of the proxy from their browser and also see additional information. For Python and PHP web apps, imagine having a finalization plugin that appends a response header listing all the files used to perform a request… then imagine having a “click to edit” button that, if the dev instance is local to the workstation, would have your favorite IDE open the specified file for editing.

All in all, I think these are really subtle ideas that, if combined, would cut down some mundane parts of developing a web app.

GitHub repo here: https://github.com/devdave/PyProxy

Python pydoc module

Python documentation here

Example

$ python -m pydoc pydoc

Talk about eating your own dogfood! This is exactly like using the help() function in the python command line interpreter… except accessible from your shell prompt…. but wait!

It gets better! Not only does it make julienne fries (may not hold for every implementation), but it’s got a few versatile little secrets:

$ python -m pydoc
pydoc - the Python documentation tool

pydoc.py <name> ...
    Show text documentation on something.  <name> may be the name of a
    Python keyword, topic, function, module, or package, or a dotted
    reference to a class or function within a module or module in a
    package.  If <name> contains a '/', it is used as the path to a
    Python source file to document. If name is 'keywords', 'topics',
    or 'modules', a listing of these things is displayed.

pydoc.py -k <keyword>
    Search for a keyword in the synopsis lines of all available modules.

pydoc.py -p <port>
    Start an HTTP server on the given port on the local machine.

pydoc.py -g
    Pop up a graphical interface for finding and serving documentation.

pydoc.py -w <name> ...
    Write out the HTML documentation for a module to a file in the current
    directory.  If <name> contains a '/', it is treated as a filename; if
    it names a directory, documentation is written for all the contents.

Now, the graphical interface isn’t anything to write home about, but the -p option provides a no-frills web interface to almost everything accessible to your Python interpreter. This can make it slightly easier to trawl through foreign modules looking for undocumented submodules and classes… or to have an accessible reference doc for properly managed modules.
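The same documentation text can also be pulled from inside a script. A minimal sketch, assuming Python 3 (the renderer argument is not there on older versions); doc_text() is just a helper name I made up:

```python
import pydoc


def doc_text(name):
    """Return plain-text documentation for a dotted name such as 'json.dumps'."""
    obj = pydoc.locate(name)  # resolves a dotted path to the live object, or None
    if obj is None:
        raise ValueError("cannot locate %r" % name)
    # pydoc.plaintext avoids the overstrike "bold" characters meant for pagers.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)
```

Something like print(doc_text("json.dumps")) then gives roughly what python -m pydoc json.dumps shows, minus the pager.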

Python urllib

Python documentation here

Unfortunately there is little or no documentation on the command line behavior of urllib, but it does recognize every URL scheme that urllib can handle. So
python -m urllib http://website.com will grab the specified URL and print it to stdout.

Note: FTP works as well, but you need to follow the pattern ftp://user:password@website.com if authentication is required.
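For scripting the same "ghetto wget" trick, here is a small sketch assuming Python 3, where the fetching code lives in urllib.request. fetch() is a hypothetical helper name, not something urllib ships.

```python
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return the raw bytes behind `url` (http, https, ftp, file or data URLs)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

A data: URL makes for a handy offline smoke test: fetch("data:text/plain,hello") hands back the bytes after the comma without touching the network.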

Python module dis

Python documentation here

Example

python -m dis myFile.py provides an interesting look into a Python file’s guts.
I could easily imagine this being part of some sort of static inspection system where dis sits at the front and a parser walks down the output lines, turning the data into a dependency and symbol graph. Unfortunately the command line interface doesn’t seem to provide anything more, and is really just a test function most likely intended for unit-testing the Python stdlib.
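The dependency-graph idea does not strictly need a parser over the printed output, since dis can be driven from code. A rough sketch, assuming Python 3.4 or newer (where dis.get_instructions exists); imported_modules() is a made-up name, and it only looks at top-level imports:

```python
import dis


def imported_modules(source, filename="<string>"):
    """Return the module names imported at the top level of `source`, in order."""
    code = compile(source, filename, "exec")
    # Every `import x` / `from x import y` statement compiles to an IMPORT_NAME
    # instruction whose argument is the module name.
    return [ins.argval for ins in dis.get_instructions(code)
            if ins.opname == "IMPORT_NAME"]
```

Feeding it the text of a script gives a list of module names that could become the edges of a dependency graph. Imports buried inside functions live in nested code objects and are skipped by this sketch.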

Python SimpleHTTPServer

When working on pure JavaScript applications (canvas widgets and such), I’ve found SimpleHTTPServer disgustingly useful, as it serves the current working directory without many frills.

Python documentation here

Usage

$ python -m SimpleHTTPServer 8081 0.0.0.0
Serving HTTP on 0.0.0.0 port 8081 ...

Note that it’s not necessary to pass 0.0.0.0 as a second argument if you want the service to listen on all interfaces. It listens on everything by default… appending that is just a habit of mine.

Another useful part of this server is that it serves an Apache-style directory listing of all files present unless there is a valid index file like index.htm.
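If the server needs to live inside a script (say, a test harness for those canvas widgets), here is a sketch assuming Python 3.7 or newer, where SimpleHTTPServer became http.server and the handler accepts a directory argument. make_static_server() is a hypothetical helper name.

```python
import functools
import http.server


def make_static_server(directory, host="127.0.0.1", port=0):
    """Build (but do not start) a server that serves `directory` over HTTP.

    port=0 asks the OS for any free port; read it back from server_address.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling serve_forever() on the returned object blocks just like the command line version does, and server_close() releases the port afterwards.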

Python json.tool module

Python documentation here

Example usage:

$ echo '{"json":"obj"}' | python -mjson.tool
{
    "json": "obj"
}
$ echo '{ 1.2:3.4}' | python -mjson.tool
Expecting property name: line 1 column 2 (char 2)
   

All argument patterns are:

  • piped json string | python -m json.tool syntax checks the input, then outputs the result to stdout
  • python -m json.tool input_file.json reads the file at the given relative path and outputs the result to stdout
  • python -m json.tool input_file.json output_file.json does the same, except that the output is written to the specified file

My thought: this could be part of some sort of data validation check, looking for corrupted static JSON files.

Example

$ echo '{"a":123, "foo":"bar" }' | python -m json.tool && echo "IS valid" || echo "Is not valid"
{
    "a": 123,
    "foo": "bar"
}
IS valid
$ echo '{"a"1:123, "foo":"bar" }' | python -m json.tool && echo "IS valid" || echo "Is not valid"
Expecting : delimiter: line 1 column 4 (char 4)
Is not valid
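The same valid / not valid check is easy to do from inside Python when shelling out is overkill. A minimal sketch; is_valid_json() is a name I made up for illustration:

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```

Looping this over the contents of a directory of static files would give the corruption check described above.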

Python – batteries included

Inspired by this
AND here

I’m going to do my best to either write up some examples of how to use these or link to somewhere else on the internet where someone did a better job than my grammar-handicapped self can.

For those too lazy to click the above links, the list below is a semi-complete list of command line accessible modules that perform utility work.

So python -m calendar prints out a pretty calendar of the year, much like the cal command on GNU/Linux.

  • json.tool -> pretty prints JSON Examples
  • SimpleHTTPServer -> serve the current directory over HTTP (port 8000 by default) Examples
  • quopri / uu / binhex / base64 -> encode / decode Quoted-Printable / UUEncoded / BinHex / Base64 content.
  • telnetlib -> ghetto telnet client
  • filecmp -> directory entry comparison tool
  • ftplib -> ghetto FTP client
  • smtpd -> SMTP proxy
  • timeit -> command line profiling interface. Very handy
  • calendar -> prints year calendar
  • urllib -> ghetto wget Example
  • zipfile -> ghetto info-zip
  • aifc -> dumps some info about the provided aiff file (if given two paths, also copies path1 to path2)
  • cgi -> dumps a bunch of information as HTML to stdout
  • CGIHTTPServer -> same as SimpleHTTPServer except via the CGIHTTPRequestHandler: it will execute scripts it recognizes as CGI, instead of just sending them over (has not survived the transition to Python 3)
  • compileall -> compiles a tree of Python files to bytecode, has a bunch of options. Does not compile to stripped files (pyo)
  • cProfile -> runs the provided script file (argument) under cProfile
  • dis -> disassembles a python script Example
  • doctest -> runs doctests on the provided files (which can be python scripts or doctest files)
  • encodings.rot_13 -> rot13 encodes stdin to stdout (has not survived the transition to Python 3)
  • fileinput -> some kind of ghetto pseudo-annotate. No idea what use that thing might be of
  • formatter -> reformats the provided file argument (or stdin) to stdout: 80c paragraphs &etc
  • gzip -> ghetto gzip (or gunzip with -d), can only compress and decompress, does not delete the archive file
  • htmllib -> strips HTML tags off of an HTML file
  • imaplib -> ghetto IMAP client
  • locale -> displays current locale information
  • mimify -> converts (mail) messages to and from MIME format
  • modulefinder -> lists the modules imported by the provided script argument, and their location
  • pdb scriptfile.py -> automatically enters PDB post-mortem mode if the script crashes
  • platform -> displays the platform string
  • poplib -> dumps a bunch of info on a POP mailbox
  • profile -> see cProfile
  • pstats -> opens a statistics browser (for profile files)
  • pydoc -> same as the pydoc command Example
  • sgmllib -> see htmllib (as far as I can tell)
  • shlex -> displays tokenization result of the provided file argument (one token per line prefixed with Token:)
  • SimpleXMLRPCServer -> XMLRPC server for power and addition
  • tokenize -> dumps tokenization result of a Python file
  • webbrowser -> opens provided URL in your default web browser (options: in a new window, in a new tab)
  • whichdb -> A helpful little tool to try and tell which DB-api driver to use on a specified DB file
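Since timeit gets a "very handy" in the list, here is a sketch of the in-script counterpart to python -m timeit. time_snippet() and its default loop counts are my own choices for illustration, not what the command line tool uses.

```python
import timeit


def time_snippet(stmt, setup="pass", number=10000, repeat=3):
    """Return the best observed per-loop time, in seconds, for `stmt`."""
    runs = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    # Taking the minimum filters out noise from other processes.
    return min(runs) / number
```

For example, time_snippet("sorted(data)", setup="data = list(range(1000))") gives a per-call figure that can be compared against an alternative implementation.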