Author Archives: David

About David

A mostly professional code monkey. To contact me, translate the following "direct from blog to email" at the website ominian.net. Replace spaces with underscores or dashes, whatever is valid. A script will pick up your email, scrub it against a white list, and if you’re not a spammer I will get an email.

Project notes

http://www.stevecoursen.com/209/stackless-python-meets-twisted-matrix/
https://github.com/smira/txZMQ
https://pypi.python.org/pypi/Twistless/1.0.0

goals: evaluate speed over memory for pypy using stackless/coroutines with ZMQ and twistless ( if it’s even possible without rebuilding txZMQ ).

Initial goal would be to take github.com/devdave/pyfella and let two stacks fight it out for an hour ( assuming 10x deep othello minimax ).

Interesting way to safely debug multiprocessing python systems

I have one particular “job” that has 3 sub processes moving as fast as humanly possible to build a report. The main slowdown is an external data source which isn’t outright terrible but it’s not great either. The worst possible outcome is when this thing hangs or misses available work, which it was prone to doing a lot.

Various kill signals usually failed to give me an idea of where the workers were getting hung up and I wasn’t really excited about putting tracer log messages everywhere. Fortunately I have a dbgp enabled IDE and I found this answer on SO. http://stackoverflow.com/a/133384/9908

Taking that I modified it to look like this:

import sys, traceback, signal

#classdef FeedUserlistWorker which is managed by a custom multiprocessing.Pool implementation.

    @classmethod
    def Create(cls, feed, year_month = None):

        signal.signal(signal.SIGUSR1, FeedUserlistWorker._PANIC)

        try:
            return cls(year_month=year_month, feed=feed).run()
        except Exception:
            #print_exc takes no exception argument, it reports the one being handled
            traceback.print_exc()
            sys.stderr.flush()
            sys.stdout.flush()

The print_exc is there because there isn’t a very reliable bridge to carry exceptions from child to parent. The flushes are there because stdout/stderr are buffered on their way to the parent pool manager.

    @classmethod
    def _PANIC(cls, sig, frame):
        d={'_frame':frame}
        d.update(frame.f_globals)
        d.update(frame.f_locals)

        from dbgp.client import brk; brk("192.168.1.2", 9090)

The only thing that matters is that call to dbgp. Using that tool, I was able to step up the call stack, fire ad hoc commands to inspect variables in the stack frame, and find the exact blocking call, which turned out to be the validation/authentication part of boto s3. That was a weird one, as I had assumed the busy loop/block was in my own code ( eg while True: never break ). Fortunately it has an easy fix https://groups.google.com/forum/#!msg/boto-users/0osmP0cUl5Y/5NZBfokIyoUJ which resolved the problem, as my Pool manager doesn’t mark tasks complete and a failure will only cause the lost task to be resumed from the last point of success.
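If no dbgp capable IDE is at hand, a rougher fallback in the same spirit is to have the signal handler print where the worker currently is. This is only a sketch using the standard library; format_stack_report and install_stack_dump are names made up for illustration, and it assumes the platform has SIGUSR1 ( Windows does not ).

```python
import os
import signal
import sys
import traceback


def format_stack_report(frame):
    """Render the call stack that ends at `frame` as one printable string."""
    header = "PID {0} is currently here:\n".format(os.getpid())
    return header + "".join(traceback.format_stack(frame))


def dump_stack(sig, frame):
    """Signal handler: report where the process was interrupted."""
    sys.stderr.write(format_stack_report(frame))
    sys.stderr.flush()  # buffered output may never show up if the worker is stuck


def install_stack_dump(signum=None):
    """Register dump_stack; defaults to SIGUSR1 where the platform has it."""
    if signum is None:
        signum = getattr(signal, "SIGUSR1", None)
    if signum is None:
        return False  # no such signal here, nothing to hook into
    signal.signal(signum, dump_stack)
    return True
```

Calling install_stack_dump() at the top of a worker and then sending `kill -USR1 <pid>` from a shell should print the stack of whatever the worker is blocked on, without stopping it.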

Cassandra 1.2.x – Wide rows are freaky on AWS m1.xlarge

CQL3 is a very nice abstraction over Cassandra but it’s important to pay attention to what it is doing.

In SQL land, 1 record == 1 row. In Cassandra 1 record == 1 row, but 2+ records can ALSO be on the same row. This has to do with how CQL splits the primary key into a partition key and clustering columns. The partition key ( the first part of the primary key ) decides which row a record belongs to, while the clustering columns ( the rest of the primary key ) decide where the record sits within that row. If the primary key is a single column, that column is the partition key and 1 record == 1 row, but if you have a composite primary key ( partition key, clustering column ) every record where the partition key is the same is going on the same row.

I had a few rows that were ~30GB in size which put stress on nodes using m1.xlarge ( 8GB heap, 300MB new heap size ) with epic Compaction cycles of doom.
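To make the record versus row distinction concrete, here is a toy model in plain Python. It is not how Cassandra stores anything internally, just an illustration of records that share a partition key piling up in one row, ordered by the second part of the key ( CQL calls that part a clustering column ). The function name and the sample data are invented.

```python
from collections import defaultdict


def group_into_storage_rows(records, partition_key, clustering_key):
    """Group CQL-style records the way they would share storage rows."""
    rows = defaultdict(dict)
    for record in records:
        # Same partition key -> same storage row; the clustering value
        # is the record's position inside that row.
        rows[record[partition_key]][record[clustering_key]] = record
    return {
        pk: [cells[ck] for ck in sorted(cells)]
        for pk, cells in rows.items()
    }


events = [
    {"user": "alice", "ts": 3, "action": "logout"},
    {"user": "alice", "ts": 1, "action": "login"},
    {"user": "bob", "ts": 2, "action": "login"},
]
rows = group_into_storage_rows(events, partition_key="user", clustering_key="ts")
for user in sorted(rows):
    print(user, [event["action"] for event in rows[user]])
```

Three records go in and two rows come out. A partition key with few distinct values and a lot of writes is how a single row ends up growing without bound.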

Quick setup: Datastax Cassandra for Ubuntu 12.04

curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -
sudo sh -c 'echo "deb http://debian.datastax.com/community/ stable main" >> /etc/apt/sources.list.d/datastax.list'

sudo apt-get update
sudo apt-get install cassandra

Really wish there was a cassandra-core and cassandra-server divide as more often than not I just need the c* tools and libraries and not so much the server itself.

Also 1.2.5 is TOXIC as all hell for high read/write environments.

AWS + python + new datastax driver for python

These directions will set up openvpn ( you need that unless you’re dev’ing inside AWS )
http://sysadminandnetworking.blogspot.com/2012/12/openvpn-on-ec2aws.html

New Python cassandra driver is here https://github.com/datastax/python-driver
It’s really new so it has sharp edges.

Howto: Make a HTML desktop app with PySide

Given a project structure like:

my_project
 main.py
 www/
    index.html
    scripts/ 
        jquery-1.9.1.min.js

main.py looks like:

import sys
from os.path import dirname, join

from PySide.QtCore import QObject, Slot, Signal
from PySide.QtGui import QApplication
from PySide.QtWebKit import QWebView, QWebSettings


class Hub(QObject):

    def __init__(self):
        super(Hub, self).__init__()


    @Slot(str)
    def connect(self, config):
        print config
        self.on_client_event.emit("Howdy!")

    @Slot(str)
    def disconnect(self, config):
        print config

    on_client_event = Signal(str)
    on_actor_event = Signal(str)
    on_connect = Signal(str)
    on_disconnect = Signal(str)


myHub = Hub()

class HTMLApplication(object):

    def show(self):
        #It is IMPERATIVE that all backslashes are swapped for forward slashes, otherwise QTWebKit seems to be
        # easily confused
        kickOffHTML = join(dirname(__file__).replace('\\', '/'), "www/index.html").replace('\\', '/')

        #This is basically a browser instance
        self.web = QWebView()

        #Unlikely to matter, but I prefer to be waiting for the callback rather than try to catch
        # it in time.
        self.web.loadFinished.connect(self.onLoad)
        self.web.load(kickOffHTML)

        self.web.show()

    def onLoad(self):
        #The module level myHub created above is the object handed to the page below
        #This is the body of a web browser tab
        self.myPage = self.web.page()
        self.myPage.settings().setAttribute(QWebSettings.DeveloperExtrasEnabled, True)
        #This is the actual context/frame a webpage is running in.  
        # Other frames could include iframes or such.
        self.myFrame = self.myPage.mainFrame()
        # ATTENTION here's the magic that sets a bridge between Python to HTML
        self.myFrame.addToJavaScriptWindowObject("my_hub", myHub)
        
        #Tell the HTML side, we are open for business
        self.myFrame.evaluateJavaScript("ApplicationIsReady()")


#Kickoff the QT environment
app = QApplication(sys.argv)

myWebApp = HTMLApplication()
myWebApp.show()

sys.exit(app.exec_())

and index.html looks like ( the markup was lost from this listing, only the link text survives ):

    Tell the hub to connect

Basically class Hub is a bridge/interface pattern between Python and the HTML Javascript engines. It’s probably best to pass things between the two as JSON as that’s a language both sides are readily prepared to deal with.
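A rough sketch of what the JSON convention could look like on the Python side. The helper names and the message shape ( "command"/"args" going in, "event"/"payload" going out ) are invented for illustration and are not part of PySide; the only assumption taken from the code above is that slots and signals carry plain strings.

```python
import json


def encode_event(name, payload):
    """Python -> JS: pack an event into a JSON string a Signal(str) can carry."""
    return json.dumps({"event": name, "payload": payload}, sort_keys=True)


def decode_command(raw):
    """JS -> Python: unpack the JSON string a Slot(str) receives.

    Returns (command, args); malformed input becomes ("invalid", {}) so a
    bad message from the page cannot raise inside the slot.
    """
    try:
        message = json.loads(raw)
    except ValueError:
        return "invalid", {}
    if not isinstance(message, dict) or "command" not in message:
        return "invalid", {}
    return message["command"], message.get("args", {})
```

Inside a slot such as Hub.connect this would read as `command, args = decode_command(config)`, and replies would go out through something like `self.on_client_event.emit(encode_event("connected", {"ok": True}))`, with JSON.stringify and JSON.parse doing the matching work in the page.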

Stuff I didn’t deal with yet: The Main window title ( self.web.setWindowTitle), setting a window Icon ( no clue on this one ).

Otherwise this gives me an OS agnostic UI: I can leverage my HTML skillset without having to learn QML or QT Designer, and I can hopefully recycle some logic. Additionally I can go full fledged with the bridge logic, split it between the two, or shove all the complexity into JS and have basic Python endpoints exposed.

Second thoughts on the ZMQInline logic

Initially I was going to set the experiment with ZMQInline aside but I am starting to realize that it does have some merits.

Below is a burnout test I did using dotGraph ( current name candidate )

    @ZMQInline()
    def do_kickstart(self):
        self.log.debug("do_kickstart")
        for i in range(1, 100):
            try:
                try:
                    self.log.debug("Process Loop {0}".format(i))
                    response = yield Request(self.control, RequestKickstart(), timeout = 5000)
                except TimedoutError:
                    self.log.exception("Timedout")

                except Exception:
                    self.log.exception("Unexpected exception")
                else:
                    self.log.debug("Got "+ repr(response))

                    if RespondWithKickstart.Matches(response):
                        self.log.debug("Got Kickstart!")
            except Exception as e:
                self.log.error("Caught an exception leak {0}".format(e))

        self.loop.stop()

In DotGraph, Kickstart is the first task both clients and services need to accomplish and it’s basically “Hello I am here, who do I talk to?”. Depending on the Kickstart response, clients and services might be sent to a completely different host and port set. The idea is that if DotGraph needs to grow, it can have an array of slave hubs and then delegate new clients to the most underbooked.

One thing I realized is that the above structure would be perfect for doing back-off logic ( attempt, time out… wait, attempt N seconds later ). Of course it raises the question of WHY the other party hasn’t responded in time ( are they down or just backlogged? ). The other problem is how to do unit-testing cleanly. Still, I haven’t thrown it away yet so maybe I will find a use for it in DotGraph.
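The attempt, time out, wait, attempt again cycle can be sketched without any ZMQ at all. Everything below is illustrative: retry_with_backoff and its parameters are invented, TimedoutError is a local stand-in for whatever the transport raises, and the sleep function is injectable, which is one possible answer to the unit-testing question.

```python
import time


class TimedoutError(Exception):
    """Stand-in for whatever the transport raises when a request times out."""


def retry_with_backoff(attempt, max_attempts=5, first_wait=1.0, factor=2.0,
                       sleep=time.sleep):
    """Call attempt() until it succeeds or max_attempts is used up.

    Waits first_wait seconds after the first timeout and multiplies the
    wait by factor after each further one. The last timeout is re-raised.
    """
    wait = first_wait
    for number in range(1, max_attempts + 1):
        try:
            return attempt()
        except TimedoutError:
            if number == max_attempts:
                raise
            sleep(wait)
            wait *= factor
```

In a test, passing `sleep=waits.append` with `waits = []` records the delays instead of actually sleeping, so the back-off schedule can be asserted on directly.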

FYI on pyzmq and IOLoop

A funny little quirk has gotten me twice now ( hopefully it won’t be three times ).

In Twisted, once the reactor has been pulled in ( even inadvertently ) it becomes the de facto instance.

In PyZMQ, calling ioloop.IOLoop() creates a new loop ( and blows the old one away )… The correct approach is to always call it via

    ioloop.IOLoop.instance()
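The difference between the constructor and the shared accessor is easier to see on a toy class. Loop below is a stand-in written for this note, not pyzmq’s implementation, and it only shows the "new object versus shared object" half of the quirk.

```python
class Loop(object):
    """Toy stand-in for an event loop class; not pyzmq's implementation."""

    _shared = None

    @classmethod
    def instance(cls):
        """Create the shared loop on first use, then keep returning it."""
        if cls._shared is None:
            cls._shared = cls()
        return cls._shared


shared = Loop.instance()
fresh = Loop()  # a separate object: handlers added to `shared` are not on it
print(shared is Loop.instance(), fresh is shared)
```

Anything registered on one object is invisible to the other, which is how a handler can end up on a loop that nobody ever starts.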

I don’t want to talk about how much time I wasted working out how to make a pyzmq interface similar to Twisted’s inlineCallbacks, but it was enough!
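For anyone curious what sits underneath an inlineCallbacks style decorator, here is a stripped down sketch in Python 3. It runs every request synchronously, whereas a real version would hand the request to the event loop and resume the generator from a callback. The inline decorator and this Request class are invented for the example and are not the ZMQInline code.

```python
import functools


class Request(object):
    """Hypothetical stand-in: a unit of work the generator waits on."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def inline(gen_func):
    """Drive a generator: each yielded Request is run and its result sent back in."""
    @functools.wraps(gen_func)
    def runner(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        to_send, to_throw = None, None
        while True:
            try:
                if to_throw is not None:
                    request = gen.throw(to_throw)  # surface the failure at the yield
                else:
                    request = gen.send(to_send)    # resume with the last result
            except StopIteration as stop:
                return getattr(stop, "value", None)
            to_send, to_throw = None, None
            try:
                to_send = request.func(*request.args)
            except Exception as exc:
                to_throw = exc
    return runner


@inline
def conversation():
    greeting = yield Request(str.upper, "hello")
    try:
        yield Request(int, "not a number")
    except ValueError:
        greeting += "!"  # the failed request showed up as a normal exception
    return greeting
```

The point of the pattern is that the body of conversation reads top to bottom with ordinary try/except, even though each yield is a place where control leaves the function.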

On a lighter note: how to make my co-worker do XXX

From stack exchange – http://programmers.stackexchange.com/questions/185923/how-can-i-deal-with-a-team-member-who-dislikes-making-comments-in-code/185926#185926

I think anyone who has written software alongside anyone else has had at least one grievance with their peers’ work. Sometimes it can be benign like

if(foo == bar): do_stuff

versus

if(foo == bar) {
   do_stuff
}

And other times it can be a tad more annoying like

class Foo { function Foo(arg1,arg2){ this.arg1=arg1, this.arg2=arg2} }

To get a co-worker to change that style, from what I’ve seen it’s not going to happen by politely asking them to use more XXX and formatting. I worked with someone like this and they got immediately defensive and said “I just don’t like wasting space.” which kind of took me off guard as it didn’t really make any sense. For that case I was able to pull rank and kick back their commits, which probably isn’t the best solution. I’ve seen other solutions like implementing a CI whose first stage was an automatic style checker… which is basically what I did, only automated, and probably a worse solution as sometimes you need to write something really dirty. Maybe a better solution is code reviews but it is fairly rare to see a company institute those. At 9 out of 11[SIC] companies, the guy writing the checks for the software engineers isn’t an engineer themselves and in the short term they’re going to see that one of their senior engineers isn’t directly doing work that correlates to profits. In that last case, when it comes up in a review the lead can justify the expense by pointing out that they’re minimizing the time other staff have to waste just to comprehend something.

Finally at the end of the day, while it’s important that a person finds reward in their job… it’s still a job and not your private playpen. Instead it’s more like software survivor island: the majority has to work together to keep their company/product going, and if one goofball insists on writing software in a manner that conflicts with the majority, it doesn’t really matter if they are “right”; they need to conform or GTFO.

For scenarios where you ARE that goofball, you need to get political, and I don’t mean selling the highest non-technical person on your island but demonstrating to your team the merits of an idea and working with their feedback. For a real world example, I joined a consultancy as the goofball and started with the company architect and development manager. I didn’t sell them on the idea that I was right; I sold them on the idea itself until it became their “right” idea. From there I prodded momentum until one day the idea stopped being a minority opinion and a coup was staged. A week later the company made a fairly dramatic shift in how they did development. That’s a nice happy ending scenario but it doesn’t always happen. Twice I’ve failed to sell the virtues of unit-testing ( in one case they assigned me the task of writing unit-tests, which was just a means of getting me to shut up about them… but completely negated the idea ) and twice I walked away because I got tired of having last minute “Oh my god everything is fucked up” hack sessions to fix prod. That’s life, sometimes you lose and sometimes you win.

Consultancy, the downsides

People outside of my vocation always marvel at my “flexible” life-style and, for a much smaller number, at the amount of money I make in an hour or a week when employed by a client or on a project. These things are definitely nice. Another nice perk is that I’ve been exposed to a much larger breadth of technologies, ideas, and people, and I am probably some sort of interview guru as I can usually reverse engineer what the goal of the interview is. Then there is the fact of life that if I am being called into a project, it’s not because I am some sort of genius code monkey but because the problem is pretty terrible. I might also be thrown under an undeserved bus, months down the road, due to things completely out of my control. And lastly there is no future in being a consultant.

With interviews, things usually go poorly when prospective clients don’t know what the hell their goals are for an interview. Or worse, if they just have a bucket list they’re trying to check off. In the first case I’ve politely ended interviews ( as the interviewee ) when presented with “What’s the difference between an inner & outer join” ( for that one I usually draw two circles and shade in the relevant sections to illustrate every variation: left, right, inner, outer ) and then told I am wrong, or when asked to implement something really crazy like a b-tree on a whiteboard. Being told I am wrong is a red flag that the company is a semi-hostile environment and the interviewer is probably trying to protect their incompetence, while the whiteboard b-tree raises the question “Would I need to implement my own b-tree logic, what other wheel is being reinvented here?” I’ve seen a lot of the first kind, the “No you’re wrong”s, in my career so far and they’re the complete opposite of the second, where the interviewer has good intentions but fails to find out if the interviewee has critical thinking and problem solving skills.
The other problem with interviews is the bucket list interview, especially depending on who is doing it. Usually it’s headhunters going down a list who are completely ignorant of how insane they sound when they ask questions like “Do you have 10+ years of experience with Ruby on Rails?” That one took me off guard so I asked “Which Rails? Version 2 or 3 and is that with 3’s asset generation system or what?” to which the headhunter had no answer. That also raises the question of what the goal is. Technically I have 6-7 years of experience with Rails as I was first introduced to it in 2006 and my last working experience was the start of 2013… but I haven’t used Rails 100% of the time in between. If I had to guess I might have 4000 hours of Rails experience, which is 6000 hours short of being a master of that ever changing landscape.

Being a consultant doesn’t mean you’re always going to find world challenging work either. For an unnamed client we will call Acme, I was called in because an adjacent department knew me and knew that I was their go to guy for making something that actually worked. A lot of things had gone wrong for Acme: their project manager turned out to be a liar who sat on his hands while a “lead” contractor watched netflix and pecked out some code occasionally. I’ve seen a lot of convoluted solutions in my career and each one is special. In Acme’s case the lead was very good at aesthetics but had never read a book on design patterns or done any business software before. If pressed, I’d say the system was some sort of reversed MVP pattern but with stored procedure style application code ( implemented in the application tier ). Further, the system could only be talked to via XHR, through a global anchor tag handler that took the anchor tag’s href, wrapped it in XML, and sent that to the server. The server then built the view up, calling injects along the way until it reached the end, and sent that back as a two element HTTP body document: the first half was javascript to be executed and the second was html. The client would eval the javascript xml element and then insert/overwrite the contents of the browser’s main view-able content. There were a lot of problems with this system but fortunately the “lead” had been sacked alongside the original PM just before I arrived. Another consultant and I sat down over dinner and, after flipping a coin as to whether we were running away, we divided up the goals: the other guy was going to focus on adding new “features” to the system while I worked on figuring out how to refactor the system into an MVC pattern… or at least bring in some sort of simple model system to get rid of the functional procedures crap. We pulled that off.
Meanwhile a third “Real” project flow consultant pulled some serious smoke & glitter magic tricks and claimed whatever credit he could for several 220 hour long months of “fixing” the application. He did all this after we threatened to gang up on him if he got in the way of the refactoring process.

Also even if I do everything right… things can still go badly down the road. For the prior dept at Acme, I built a simple little MVC application with a data bridge from a proprietary database, the name of which I can’t remember. It worked, I handed it off to someone who knew how it worked and how to maintain it, and life was good. A year and change later the maintainer got fired with prejudice for insulting some executive or something along those lines. I say with prejudice because he didn’t do a hand off for the application and all of the documentation for the application disappeared alongside any backups. The replacement then went ahead and corrupted the only intact copy of the application’s database, breaking the application. They didn’t call me up until a week before it was needed again and then I sat through a fairly nasty conference call where people were blaming me: “There’s no documentation, no backups, and it doesn’t work.” In a fairly neutral voice I explained my rate had just tripled and oh by the way here’s a signed receipt by the person vilifying me showing him signing off that “A) Product meets expectations B) Documentation provided in digital and printed format” with a “will notify” consultant of preexisting clause so that I could fix any bugs for free. Even if they had agreed to 3x my normal rate, I am pretty sure pointing out that the relevant work manager was slandering me to cover his incompetence pretty much ended any chance of that dept being a client ever again. I’ve worked in hostile environments before ( people screaming at each other, on the job substance abuse, or micro-managers that are more concerned about covering their ass than actually doing something ) and it’s really not worth it to be there.
Maybe 30-40% of all contract/consultant opportunities I see are like this and it never really ends well because it ultimately comes down to the fact that the man or woman who could actually fix the problem is just some well meaning person who’s afraid of conflict ( so what was a minor leak in a pipe becomes an insane tempest flood ).

Lastly, being a consultant/code monkey for hire has no future. It will take you down the road a little bit more, cover your bills for a time, and it’s actually got some good experiences but generally a consultant is there to solve a problem while a contractor is there to give the company an emergency release valve ( oh no, we’re running out of money FIRE the contractors!). If a problem is never solved then you’re a terrible consultant and if you’re a contractor for several years in a company…well why buy the cow when you already got the milk? Sure consultants can team up into a consultancy to try and make more money, but that doesn’t scale and becomes unbelievably risky as you grow larger. Furthermore it becomes some sort of pyramid scheme where the company stops hiring experienced engineers & developers and begins hiring student intern/junior engineers but selling them as senior engineers. If done right, the client might not even realize that they’re overpaying by 70% but there will be hell to pay if an auditor shows up one day to find out how things looked below the covers.