
Python dataclass database (DCDB)


While I do use SQLAlchemy and, to some extent, peewee for my projects, I slowly got tired of having to relearn how to write SQL when I’ve known SQL since the mid-90s.

DCDB’s design also aims for simplicity and a minimum of behind-the-scenes automagical behavior. Complexity should instead be added voluntarily, and in such a way that it can be traced back.


import dataclasses as dcs
import dcdb 

@dcs.dataclass
class Foo:
    name: str
    age: int

db = dcdb.DBConnection(":memory:") # alternatively this can be a file path
db.bind(Foo)
"""
   Bind doesn't change Foo in the local scope; instead
   it creates a new class, DCDB_Foo, which is stored in the DBConnection's
   table registry.

   Behind the scenes, a table `Foo` is created in the connected database.  No changes to the name are made (e.g. pluralization). How you wrote your bound dataclasses is almost exactly how they are stored in the sqlite database.

   The exception is that an .id instance property, along with DB methods such as update/save, Create, Get, and Select, is added to the class definition.
"""

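record = db.t.Foo(name="Bob", age="44")
assert record.name == "Bob"
same_record = db.t.Foo.Get("name=?", "Bob")
assert record.age == 44
assert record.id == same_record.id

record.age = 32
record.save()  # write the change back before querying on it

same_record = db.t.Foo.Get("age=?", 32)
assert record.id == same_record.id
assert same_record.age == 32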


Note that, currently, same_record and record have the same .id
property, but they are different instances: independent copies
of the same record with no shared reference.
Changes to one copy will not be reflected in the other.
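
To make the "table named after the class" and "independent copies" points concrete, here is a rough sketch of the general idea using only the standard library. This is not DCDB's implementation: `create_table`, `insert`, `get` and the `SQL_TYPES` mapping are hypothetical helpers written for this illustration.

```python
import dataclasses as dcs
import sqlite3

# Illustrative subset of annotation -> SQLite column type mappings (assumed).
SQL_TYPES = {str: "TEXT", int: "INTEGER", float: "REAL"}

@dcs.dataclass
class Foo:
    name: str
    age: int

def create_table(conn, cls):
    """Create a table named exactly after the dataclass, plus an id column."""
    columns = ", ".join(
        "{0} {1}".format(f.name, SQL_TYPES[f.type]) for f in dcs.fields(cls))
    conn.execute("CREATE TABLE {0} (id INTEGER PRIMARY KEY, {1})".format(
        cls.__name__, columns))

def insert(conn, record):
    """Insert one dataclass instance and return its new row id."""
    names = [f.name for f in dcs.fields(record)]
    sql = "INSERT INTO {0} ({1}) VALUES ({2})".format(
        type(record).__name__, ", ".join(names), ", ".join("?" for _ in names))
    return conn.execute(sql, [getattr(record, n) for n in names]).lastrowid

def get(conn, cls, where, *params):
    """Return (id, instance) for the first matching row, or None."""
    names = [f.name for f in dcs.fields(cls)]
    sql = "SELECT id, {0} FROM {1} WHERE {2}".format(
        ", ".join(names), cls.__name__, where)
    row = conn.execute(sql, params).fetchone()
    return None if row is None else (row[0], cls(*row[1:]))

conn = sqlite3.connect(":memory:")
create_table(conn, Foo)
new_id = insert(conn, Foo(name="Bob", age=44))

# Two fetches of the same row give two unrelated Python objects.
first_id, first = get(conn, Foo, "name=?", "Bob")
second_id, second = get(conn, Foo, "name=?", "Bob")
first.age = 32  # only changes this in-memory copy
```

Both fetches report the same id, but `second.age` is still 44 after `first.age` is changed, which is the same copy semantics described in the note above.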


Github DCDB

PyQT5 QMediaPlaylist documentation snafu

QMediaPlaylist has a method, addMedia, with the signature

bool QMediaPlaylist::addMedia(const QMediaContent &content)

And the documentation suggests that, in C++, passing a QUrl straight to addMedia should work.

BUT in PyQT5 the same call errors out with an unexpected QUrl.

So….. you have to wrap the QUrl in a QMediaContent yourself in python. A bit unwieldy, but PySide2 appears to have stalled from an outsider’s perspective.

Initially I suspected that QUrl and QMediaContent had inherited from some sort of common base class and were later split apart, and that passing a bare QUrl is perhaps an untested use case that got lost along the way. I ended up making a band-aid in python with

MakePath = lambda x: QtM.QMediaContent(QtCore.QUrl.fromLocalFile(x))

where QtM is `from PyQt5 import QtMultimedia as QtM` and QtCore is `from PyQt5 import QtCore`

From a discussion on a python chat board – How to deal with criticism

The poster asked “This may be offtopic but i really want to share this.
Did you ever faced criticism with your code. How to overcome this criticism ?”

And this was my response.

Every so often I go back to something I wrote 10 years ago and I laugh my ass off. Now if someone else had done that 10 years ago, it would have been a somewhat painful experience. People have already mentioned “you are not your code”, which is mostly true, BUT at the same time you are your code at this moment in time. When a peer criticizes your code, you need to quickly detach yourself and hear out what they have to say. Sometimes their input is going to be asinine ( “I prefer naming variables after my kids” ) but hopefully it’s going to have real value ( “You should implement dash or camel casing, does ‘expertsexchange’ mean Expert Sex change or Experts exchange?” )

Before I give any advice on filtering poisonous vs constructive criticism, I’m going to address why you want criticism. If you are a small person, you will be stuck in a small world and only realize the situation when people are selling cars and you’re still crafting buggy whips for a doomed industry ( eg COBOL & mainframes vs Java & server farms ). To improve your craft, yes, you need challenging work, but you also need to be exposed to different ideas or you will not grow professionally.

Now, as far as handling criticism and deciding if it has value: #1 Check your emotions and listen. #2 Identify the problem they have with your work and clarify it, so you have concrete examples of what is and is not the problem. #3 Evaluate the value of fixing the problem: “Does this improve things for my team and our success?” or “Does this make my product better and/or more maintainable?”

If the person cannot give concrete examples for #2, tell them you don’t understand. If they cannot respond without resorting to verbal abuse, the conversation is over: escalate to a supervisor or discontinue the discourse.

If the person cannot demonstrate #3 ( eg how does using their children’s names make things “better”? ) escalate or discontinue, telling them that you don’t see an advantage to their proposal.

Finally, emotions get you into fights you may not be able to win, or that will cost your career down the line. One example is a person I found immensely influential to my career and I thought he was the bee’s knees. This person was outspoken and somewhat vitriolic but he was generally right ( or appeared to be ). 5 years down the road, he seems like a “has been” who is constantly verbally threatening to beat people up when they criticize his code or his behavior: “I am a MMA fighter, I will kick your ass”. For the most part I think that guy’s career is on its way out, as who wants to collaborate or associate with him? Also, I am not talking about Linus Torvalds… he’s a completely different kind of crazy with a completely different problem.

Project notes

goals: evaluate speed over memory for pypy using stackless/coroutines with ZMQ and twistless ( if it’s even possible without rebuilding txZMQ ).

Initial goal would be to take and let two stacks fight it out for an hour ( assuming 10x deep othello minimax ).

Interesting way to safely debug multiprocessing python systems

I have one particular “job” that has 3 sub processes moving as fast as humanly possible to build a report. The main slowdown is an external data source which isn’t outright terrible but it’s not great either. The worst possible outcome is when this thing hangs or misses available work, which it was prone to do a lot.

Various kill signals usually failed to give me an idea of where the workers were getting hung up, and I wasn’t really excited about putting tracer log messages everywhere. Fortunately I have a dbgp-enabled IDE and I found this answer on SO.

Taking that I modified it to look like this:

import sys, traceback, signal

#classdef FeedUserlistWorker which is managed by a custom multiprocessing.Pool implementation.

    @classmethod
    def Create(cls, feed, year_month = None):

        # `kill -USR1 <pid>` will now drop this worker into the debugger.
        signal.signal(signal.SIGUSR1, FeedUserlistWorker._PANIC)

        try:
            return cls(year_month=year_month, feed=feed).run()
        except Exception as e:
            from traceback import print_exc
            print_exc()
            sys.stdout.flush()
            sys.stderr.flush()

The print_exc is there because there isn’t a very reliable bridge to carry exceptions from child to parent. The flushes are there because stdout/stderr are buffered between the worker and the parent pool manager.

    @classmethod
    def _PANIC(cls, sig, frame):

        from dbgp.client import brk; brk("", 9090)

The only thing that matters is that call to dbgp. Using that tool, I was able to step up the call stack, fire ad hoc commands to inspect variables in the stack frame, and find the exact blocking call, which turned out to be the validation/authentication part of boto s3. That was a surprise, as I had assumed the busy loop/block was in my own code ( eg while True: never break ). Fortunately it has an easy fix ( boto-users thread: !msg/boto-users/0osmP0cUl5Y/5NZBfokIyoUJ ), which resolved the problem: my Pool manager doesn’t mark failed tasks complete, so a failure only causes the lost task to be resumed from the last point of success.
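
If you don’t have a dbgp-enabled IDE handy, the same signal trick still works with nothing but the standard library. The sketch below is my assumption of a minimal stand-in, not the code from the job above: `dump_stack` and `busy_worker` are made-up names, and the worker signals itself only so the example is self-contained ( normally you would run `kill -USR1 <pid>` from a shell ). Unix only, since SIGUSR1 doesn’t exist on Windows.

```python
import os
import signal
import traceback

captured = []

def dump_stack(sig, frame):
    """Signal handler: record where the process was when the signal arrived."""
    captured.append("".join(traceback.format_stack(frame)))

# Install the handler in the worker process, as Create() does above.
signal.signal(signal.SIGUSR1, dump_stack)

def busy_worker():
    # Stand-in for a hung worker. It signals itself here; in real use the
    # signal would come from outside via `kill -USR1 <pid>`.
    os.kill(os.getpid(), signal.SIGUSR1)
    return "done"

result = busy_worker()
```

After the signal is handled, `captured[0]` holds a formatted stack trace that includes the `busy_worker` frame, which is usually enough to see which call is blocking. Write it to stderr or a log file in a real worker, and remember to flush.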

Cassandra 1.2.x – Wide rows are freaky on AWS m1.xlarge

CQL3 is a very nice abstraction over Cassandra, but it’s important to pay attention to what it is doing.

In SQL land, 1 record == 1 row. In Cassandra 1 record == 1 row, but 2+ records can ALSO be on the same row. This has to do with how CQL splits the primary key into a partition key and clustering columns. The partition key ( the first part of the primary key ) decides which row a record belongs to, while the remaining clustering columns decide where the record sits within that row. If your primary key is a single column, 1 record == 1 row, but if you have a compound primary key ( partition key, clustering column ), every record where the partition key is the same goes on the same row.

I had a few rows that were ~30GB in size which put stress on nodes using m1.xlarge ( 8GB heap, 300MB new heap size ) with epic Compaction cycles of doom.
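
A toy model of that grouping, in plain Python rather than CQL, shows how quickly a row can get wide. This is only a sketch of the layout idea, not how Cassandra stores data internally; `storage_rows` and the sample `events` are invented for the illustration.

```python
from collections import defaultdict

def storage_rows(records, partition_key, clustering_key):
    """Group records the way a wide row does: one storage row per partition key."""
    rows = defaultdict(dict)
    for record in records:
        # Records sharing a partition key land in the same row, ordered
        # inside it by their clustering key.
        rows[record[partition_key]][record[clustering_key]] = record
    return rows

events = [
    {"id": 1, "user": "alice", "ts": 100, "action": "login"},
    {"id": 2, "user": "alice", "ts": 101, "action": "click"},
    {"id": 3, "user": "bob", "ts": 100, "action": "login"},
]

# Single-column key: every record gets its own row.
narrow = storage_rows(events, "id", "id")

# ( user, ts ) key: every event for a user piles onto that user's row.
wide = storage_rows(events, "user", "ts")
```

With the single-column key there are three rows of one record each; with the ( user, ts ) key there are two rows, and alice’s row keeps growing with every event she generates. Picking a partition key that spreads the data ( for example user plus a day bucket ) is the usual way to keep any one row from ballooning.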

quick setup Datastax Cassandra for Ubuntu 12.04

curl -L | sudo apt-key add -
sudo sh -c 'echo "deb stable main" >> /etc/apt/sources.list.d/datastax.list'

sudo apt-get update
sudo apt-get install cassandra

Really wish there was a cassandra-core and cassandra-server divide as more often than not I just need the c* tools and libraries and not so much the server itself.

Also 1.2.5 is TOXIC as all hell for high read/write environments.

AWS + python + new datastax driver for python

These directions will set up openvpn ( you need that unless you’re dev’ing inside AWS )

New Python cassandra driver is here
It’s really new so it has sharp edges.

Howto: Make a HTML desktop app with PySide

Given a project structure like:

        jquery-1.9.1.min.js

the Python entry point looks like:

import sys
from os.path import dirname, join

#from PySide.QtCore import QApplication
from PySide.QtCore import QObject, Slot, Signal
from PySide.QtGui import QApplication
from PySide.QtWebKit import QWebView, QWebSettings
from PySide.QtNetwork import QNetworkRequest

web     = None
myPage  = None
myFrame = None

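class Hub(QObject):

    def __init__(self):
        super(Hub, self).__init__()

    # Slots are the methods the page's JavaScript can call.
    @Slot(str)
    def connect(self, config):
        print(config)

    @Slot(str)
    def disconnect(self, config):
        print(config)

    # Signals carry events from Python back to the page.
    on_client_event = Signal(str)
    on_actor_event = Signal(str)
    on_connect = Signal(str)
    on_disconnect = Signal(str)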

myHub = Hub()

class HTMLApplication(object):

    def show(self):
        #It is IMPERATIVE that all backslashes are replaced with forward slashes, otherwise QTWebKit seems to be
        # easily confused
        kickOffHTML = join(dirname(__file__).replace('\\', '/'), "www/index.html").replace('\\', '/')

        #This is basically a browser instance
        self.web = QWebView()

        #Unlikely to matter but prefer to be waiting for the callback rather than try to catch
        # it in time.
        self.web.loadFinished.connect(self.onLoad)
        self.web.load(kickOffHTML)

        self.web.show()

    def onLoad(self):
        if getattr(self, "myHub", None) is None:
            self.myHub = Hub()
        #This is the body of a web browser tab
        self.myPage = self.web.page()
        self.myPage.settings().setAttribute(QWebSettings.DeveloperExtrasEnabled, True)
        #This is the actual context/frame a webpage is running in.  
        # Other frames could include iframes or such.
        self.myFrame = self.myPage.mainFrame()
        # ATTENTION here's the magic that sets a bridge between Python to HTML
        self.myFrame.addToJavaScriptWindowObject("my_hub", self.myHub)
        #Tell the HTML side, we are open for business

#Kickoff the QT environment
app = QApplication(sys.argv)

myWebApp = HTMLApplication()
myWebApp.show()

sys.exit(app.exec_())

and index.html looks like:


        Tell the hub to connect

Basically class Hub is a bridge/interface pattern between the Python and HTML/JavaScript engines. It’s probably best to pass things between the two as JSON, as that’s a language both sides are readily prepared to deal with.
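
As a rough idea of what the JSON convention could look like on the Python side, here is a sketch of a helper that a `Slot(str)` method on Hub could delegate to. Nothing here comes from the app above: `handle_message`, the "action" field and the reply shape are all assumptions made up for the illustration, and the sketch deliberately avoids PySide so it runs on its own.

```python
import json

def handle_message(raw):
    """Decode a JSON string from the page, act on it, and return a JSON reply."""
    try:
        message = json.loads(raw)
    except ValueError:
        return json.dumps({"ok": False, "error": "invalid JSON"})
    if not isinstance(message, dict):
        return json.dumps({"ok": False, "error": "expected a JSON object"})

    action = message.get("action")
    if action == "connect":
        # A real hub would open a connection here; the sketch just echoes.
        reply = {"ok": True, "event": "connected", "host": message.get("host")}
    else:
        reply = {"ok": False, "error": "unknown action: {0}".format(action)}
    return json.dumps(reply)

reply = handle_message('{"action": "connect", "host": "localhost"}')
```

The page would build the string with `JSON.stringify(...)` before calling the slot, and the returned string could be emitted on one of the `Signal(str)` members for the JavaScript side to `JSON.parse`. Keeping both directions as plain strings means the bridge never has to convert anything richer than `str`.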

Stuff I didn’t deal with yet: The Main window title ( self.web.setWindowTitle), setting a window Icon ( no clue on this one ).

Otherwise this gives me an OS agnostic UI, I can leverage my HTML skillset without having to learn QML or QT Designer’s UI interface, and I can hopefully recycle some logic. Additionally I can go full fledged with the bridge logic, split it between the two, or shove all the complexity into JS and have basic Python endpoints exposed.

Second thoughts on the ZMQInline logic

Initially I was going to set the experiment with ZMQInline aside but I am starting to realize that it does have some merits.

Below is a burnout test I did using dotGraph ( current name candidate )

    def do_kickstart(self):
        for i in range(1, 100):
            try:
                try:
                    self.log.debug("Process Loop {0}".format(i))
                    response = yield Request(self.control, RequestKickstart(), timeout = 5000)
                except TimedoutError:
                    # No reply within 5 seconds; try again on the next pass.
                    pass
                except Exception:
                    self.log.exception("Unexpected exception")
                else:
                    self.log.debug("Got "+ repr(response))

                    if RespondWithKickstart.Matches(response):
                        self.log.debug("Got Kickstart!")
            except Exception as e:
                self.log.error("Caught an exception leak {0}".format(e))


In DotGraph, Kickstart is the first task both clients and services need to accomplish, and it’s basically “Hello I am here, who do I talk to?”. Depending on the Kickstart response, clients and services might be sent to a completely different host and port set. The idea is that if DotGraph needs to grow, it can have an array of slave hubs and then delegate new clients to the most underbooked.

One thing I realized is that the above structure would be perfect for doing standoff logic ( attempt, time out… wait, attempt again N seconds later ). Of course it raises the question of WHY the other party hasn’t responded in time ( are they down or just backlogged? ). The other problem is how to do unit testing cleanly. Still, I haven’t thrown it away yet, so maybe I will find a use for it in DotGraph.
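
Here is a rough sketch of what that standoff logic could look like with a plain generator, and of one way to unit test it without any sockets. None of this is ZMQInline’s actual API: `Timeout`, `kickstart_with_backoff` and `drive` are hypothetical names, and the "event loop" is just a function that feeds canned replies to the coroutine.

```python
class Timeout(Exception):
    """Thrown into the coroutine when no reply arrives in time."""

def kickstart_with_backoff(max_attempts=5, base_delay=1.0):
    """Yield ("request", attempt) or ("sleep", seconds) until a reply arrives."""
    for attempt in range(max_attempts):
        try:
            reply = yield ("request", attempt)
        except Timeout:
            # Stand off a little longer after each failed attempt.
            yield ("sleep", base_delay * 2 ** attempt)
            continue
        return reply
    return None

def drive(coroutine, replies):
    """Toy event loop: feed canned replies ( None means timeout ) to the coroutine."""
    log = []
    replies = iter(replies)
    try:
        action = next(coroutine)
        while True:
            log.append(action)
            if action[0] == "sleep":
                action = coroutine.send(None)  # a real loop would wait here
            else:
                reply = next(replies)
                if reply is None:
                    action = coroutine.throw(Timeout())
                else:
                    action = coroutine.send(reply)
    except StopIteration as stop:
        return stop.value, log

result, log = drive(kickstart_with_backoff(), [None, None, "kickstart"])
```

Because the coroutine only yields descriptions of what it wants ( a request or a sleep ), a test can replay any sequence of timeouts and replies and then inspect the log, which goes some way toward the unit-testing question. It does not answer why the other side is slow; that still needs a separate health check.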