Category Archives: Uncategorized

Process finished with exit code -1073740791 (0xC0000409) – PySide2 & PyQT5

I am working on a WinAmp clone called PySongMan(ager) and kept getting a stack overflow crash. Drilling down through my code, simplifying as I went, I ended up with a script like:

import sys
import argparse
import pathlib

from PyQt5 import QtCore
from PyQt5 import QtWidgets


class Test(QtWidgets.QMainWindow):
    pass


def main(song_file):
    # BUG: the QMainWindow is instantiated before a QApplication exists
    test = Test()
    test.show()
    app = QtWidgets.QApplication(sys.argv)
    print("Player shown")
    return sys.exit(app.exec_())


if __name__ == '__main__':
    # parser = argparse.ArgumentParser()
    # parser.add_argument("song_file", nargs="?", default=None)
    # args = parser.parse_args()
    main(None)

This kept dying with the 0xC0000409 error (STATUS_STACK_BUFFER_OVERRUN, which Windows reports for stack overflows). Finally, somewhat by accident, I figured out my mistake. The good code looks like:

import sys
import argparse
import pathlib

from PyQt5 import QtCore
from PyQt5 import QtWidgets


class Test(QtWidgets.QMainWindow):
    pass


def main(song_file):
    app = QtWidgets.QApplication(sys.argv)  # QApplication now exists before any widget is created
    test = Test()
    test.show()
    print("Player shown")
    return sys.exit(app.exec_())


if __name__ == '__main__':
    # parser = argparse.ArgumentParser()
    # parser.add_argument("song_file", nargs="?", default=None)
    # args = parser.parse_args()
    main(None)

and runs without issue. So the lesson is that a QApplication must be instantiated before any Qt widget or window is created.

Data migration with SQLAlchemy and Alembic

I needed to optimize an unruly table filled with floats, but I also didn't want to lose my data. Unfortunately, the documentation on the Alembic website doesn't give any hints on how to do a data migration as opposed to a plain schema migration.

Fortunately I was able to run a symbolic debugger against Alembic and figured out that all of the `op.<method>` calls execute immediately: if you have an `add_column` call, the column is added the moment that method runs. That opened the door to data migrations.

One note before I paste the code: you don't need to declare all of the columns of the source table when using a model inside a data migration. This keeps the code a lot cleaner, as the working model only needs the columns you actually plan on touching.

Alright, no more babbling, here is the example code.

"""Convert lat/long from float to int
Revision ID: b020841d98e4
Revises: 6e741a21efc8
Create Date: 2019-07-10 20:03:38.282042
Given a source table like
class GPS(Base):
# $--RMC, hhmmss.sss, x, llll.lll, a, yyyyy.yyy, a, x.x, u.u, xxxxxx,, , v * hh < CR > < LF >
__table_args__ = (UniqueConstraint("date_time", name="uix_dt"),)
video_id = Column(Integer, ForeignKey("Video.id"))
video = relationship("Video", back_populates="coordinates")
#time = Column(Time)
date_time = Column(DateTime)
status = Column(String)
latitude = Column(Float)
north_south = Column(String)
longitude = Column(Float)
east_west = Column(String)
speed = Column(Float)
course = Column(Float)
Where I want to convert all of the floating point columns to integer AND convert the data.
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.ext.declarative import declarative_base
# revision identifiers, used by Alembic.
revision = 'b020841d98e4'
down_revision = '6e741a21efc8'
branch_labels = None
depends_on = None
Base = declarative_base()
class GPS(Base):
__tablename__ = "GPS"
id = sa.Column(sa.Integer, primary_key=True)
latitude = sa.Column(sa.Float)
_latitude = sa.Column(sa.Integer)
longitude = sa.Column(sa.Float)
_longitude = sa.Column(sa.Integer)
speed = sa.Column(sa.Float)
_speed = sa.Column(sa.Integer)
course = sa.Column(sa.Float)
_course = sa.Column(sa.Integer)
def upgrade():
with op.batch_alter_table("GPS") as batch_op:
batch_op.add_column(sa.Column("_latitude", sa.Integer))
batch_op.add_column(sa.Column("_longitude", sa.Integer))
batch_op.add_column(sa.Column("_speed", sa.Integer))
batch_op.add_column(sa.Column("_course", sa.Integer))
####
# Here is where you can connect your declarative model to the database going through the migration
bind = op.get_bind()
session = orm.Session(bind=bind)
# now that you've got a session
i = 0
c_count = session.query(GPS).count() #you can query the database table you are working on
seven = 10 ** 7
for coordinate in session.query(GPS): # type: GPS
i += 1
coordinate._latitude = int(coordinate.latitude * seven)
coordinate._longitude = int(coordinate.longitude * seven)
coordinate._course = int(coordinate.course * 1000)
coordinate._speed = int(coordinate.speed * 100)
session.add(coordinate)
if i % 3000 == 0:
print(f"\tProcessed {i}/{c_count}")
session.commit()
session.commit()
with op.batch_alter_table("GPS") as batch_op:
batch_op.drop_column("latitude")
batch_op.drop_column("longitude")
batch_op.drop_column("status")
batch_op.drop_column("speed")
batch_op.drop_column("course")
# noinspection PyProtectedMember
def downgrade():
with op.batch_alter_table("GPS") as batch_op:
batch_op.add_column(sa.Column("latitude", sa.Float))
batch_op.add_column(sa.Column("longitude", sa.Float))
batch_op.add_column(sa.Column("course", sa.Float))
batch_op.add_column(sa.Column("speed", sa.Float))
bind = op.get_bind()
session = orm.Session(bind=bind)
i = 0
for coordinate in session.query(GPS): # type: GPS
i += 1
coordinate.latitude = coordinate._latitude / 10 ** 7
coordinate.longitude = coordinate._longitude / 10 ** 7
coordinate.speed = coordinate._speed / 1000
coordinate.course = coordinate._course / 100
session.add(coordinate)
if i % 1000 == 0:
session.commit()
session.commit()
with op.batch_alter_table("GPS") as batch_op:
batch_op.drop_column("_latitude")
batch_op.drop_column("_longitude")
batch_op.drop_column("_speed")
batch_op.drop_column("_course")

A while back I downloaded my Google location and history data and ran into these strange lat7 and long7 columns (paraphrasing, as I don't remember their exact names). The data were large integers that I couldn't figure out how to decode. It suddenly became obvious when I noticed all of the latitude values started with 35 and the longitudes started with -104; (35, -104) is within a few hundred miles of where I live. By doing lat7 / 10000000 (that is, 1e7 or 10**7) I was able to recover floating point GPS coordinates.

Since then, whenever it comes time to optimize a database schema, I start by figuring out whether I can shift the decimal point out and use integers instead. In SQLite, a REAL is always stored as an 8-byte IEEE float, while an INTEGER is stored in only as many bytes as its magnitude needs (a scaled coordinate fits in 4). Throw a million records on and floats can waste 30-40% of your disk space.

Anyway, where was I? Since I wanted to get rid of all of the floats and replace the real fields with @hybrid_property and @hybrid_property.expression, I renamed latitude to _latitude, shifted the decimal point out, and used the aforementioned decorators to transform the integers back to floats on demand.
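As a minimal sketch of that pattern (assuming the same column names as the migration above; this is illustrative, not the exact model from the project):

from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property

Base = declarative_base()


class GPS(Base):
    __tablename__ = "GPS"

    id = Column(Integer, primary_key=True)
    _latitude = Column(Integer)  # the shifted integer is what actually lives on disk

    @hybrid_property
    def latitude(self):
        # Python-side reads shift the decimal point back in
        return self._latitude / 10 ** 7

    @latitude.setter
    def latitude(self, value):
        self._latitude = int(value * 10 ** 7)

    @latitude.expression
    def latitude(cls):
        # Used when the property appears in SQL, e.g. session.query(GPS).filter(GPS.latitude > 35.0)
        return cls._latitude / 10 ** 7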

Slimmed-down Twisted-compatible reloader script

Working on txweb again, I decided to give it a real Flask/Django style reloader script. Directly inspired by this blog post, https://blog.elsdoerfer.name/2010/03/09/twisted-twistd-autoreload/, I decided to cut down on the extraneous bits and also change which files it watches.

https://gist.github.com/devdave/05de2ed2fa2aa0a09ba931db36314e3e

"""
A flask like reloader function for use with txweb
In user script it must follow the pattern
def main():
my main func that starts twisted
if __name__ == "__main__":
from txweb.sugar.reloader import reloader
reloader(main)
else:
TAC logic goes here but isn't necessary for short term dev work
Originally found via https://blog.elsdoerfer.name/2010/03/09/twisted-twistd-autoreload/
NOTE - pyutils appears to be dead or a completely different project
That link lead to this code snippet - https://bitbucket.org/miracle2k/pyutils/src/tip/pyutils/autoreload.py
I didn't like how the watching logic walked through sys.modules as I was just concerned with the immediate
project files and not the entire ecosystem. Instead it starts with the current working directory via os.getcwd
and then walks downward to look over .py files
sys.exit didn't work correctly so I switched to use os._exit as hardset. I am not sure what to do if
os._exit ever gets deprecated.
I also removed the checks for where to run the reloader logic and it will always run as a thread.
"""
import pathlib
import os
import sys
import time
try:
import thread
except ImportError:
try:
import _thread as thread
except ImportError:
try:
import dummy_thread as thread
except ImportError:
try:
import _dummy_thread as thread
except ImportError:
print("Alright... so I tried importing thread, that failed, so I tried _thread, that failed too")
print("..so then I tried dummy_thread, then _dummy_thread. All failed")
print(", at this point I am out of ideas here")
sys.exit(-1)
RUN_RELOADER = True
SENTINEL_CODE = 7211
SENTINEL_NAME = "RELOADER_ACTIVE"
SENTINEL_OS_EXIT = True
try:
"""
"Reason" is here https://code.djangoproject.com/ticket/2330
TODO - Figure out why threading needs to be imported as this feels like a problem
within stdlib.
"""
import threading
except ImportError:
pass
_watch_list = {}
_win = (sys.platform == "win32")
def build_list(root_dir, watch_self = False):
"""
Walk from root_dir down, collecting all files that end with ^*.py$ to watch
This could get into a recursive hell loop but I don't use symlinks in my projects
so just roll with it.
:param root_dir: pathlib.Path current working dir to search
:param watch_self: bool Watch the reloader script for changes, some insane dogfooding going on
:return: None
"""
global _watch_list
if watch_self is True:
selfpath = pathlib.Path(__file__)
stat = selfpath.stat()
_watch_list[selfpath] = (stat.st_size, stat.st_ctime, stat.st_mtime,)
for pathobj in root_dir.iterdir():
if pathobj.is_dir():
build_list(pathobj, watch_self=False)
elif pathobj.name.endswith(".py") and not (pathobj.name.endswith(".pyc") or pathobj.name.endswith(".pyo")):
stat = pathobj.stat()
_watch_list[pathobj] = (stat.st_size, stat.st_ctime, stat.st_mtime,)
else:
pass
def file_changed():
global _watch_list
change_detected = False
for pathname, (st_size, st_ctime, st_mtime) in _watch_list.items():
pathobj = pathlib.Path(pathname)
stat = pathobj.stat()
if pathobj.exists() is False:
raise Exception(f"Lost track of {pathname!r}")
elif stat.st_size != st_size:
change_detected = True
elif stat.st_ctime != st_ctime:
change_detected = True
elif _win is False and stat.st_mtime != st_mtime:
change_detected = True
if change_detected:
print(f"RELOADING - {pathobj} changed")
break
return change_detected
def watch_thread(os_exit = SENTINEL_OS_EXIT, watch_self=False):
exit_func = os._exit if os_exit is True else sys.exit
build_list(pathlib.Path(os.getcwd()), watch_self=watch_self)
while True:
if file_changed():
exit_func(SENTINEL_CODE)
time.sleep(1)
def run_reloader():
while True:
args = [sys.executable] + sys.argv
if _win:
args = ['"%s"' % arg for arg in args]
new_env = os.environ.copy()
new_env[SENTINEL_NAME] = "true"
print("Running reloader process")
exit_code = os.spawnve(os.P_WAIT, sys.executable, args, new_env)
if exit_code != SENTINEL_CODE:
return exit_code
def reloader_main(main_func, args, kwargs, watch_self=False):
"""
:param main_func:
:param args:
:param kwargs:
:return:
"""
# If it is, start watcher thread and then run the main_func in the parent process as thread 0
if os.environ.get(SENTINEL_NAME) == "true":
thread.start_new_thread(watch_thread, (), {"os_exit":SENTINEL_OS_EXIT,"watch_self":watch_self})
try:
main_func(*args, **kwargs)
except KeyboardInterrupt:
pass
else:
# respawn this script into a blocking subprocess
try:
sys.exit(run_reloader())
except KeyboardInterrupt:
#I should just raise this because its already broken free of its rails
pass
def reloader(main_func, args=None, kwargs=None, **more_options):
"""
To avoid fucking with twisted as much as possible, the watcher logic is shunted into
a thread while the main (twisted) reactor runs in the main thread.
:param main_func: The function to run in the main/primary thread
:param args: list of arguments
:param kwargs: dictionary of arguments
:param more_options: var trash currently
:return: None
"""
if args is None:
args = ()
if kwargs is None:
kwargs = {}
reloader_main(main_func, args, kwargs, **more_options)
"""
def main():
#startup twisted here
if __name__ == "__main__":
reloader(main)
"""

DCDB post-mortem

I went into writing DCDB with little or no plan besides building it around dataclasses. The result is a bit rough and precarious.

That said, I think I am going to press onward with a DCDB2 library that will change a few things. The first would be to completely separate the DCDB tables themselves from the SQL processing logic, in a way similar to sqlalchemy's session system. I have some other changes in mind as well, notably a better separation between the ORM domain classes and business logic, plus changes to how relationships work.

On the subject of relationship handling: that one would be a bit more complicated, as the DCDB2 design idea I had was to use placeholders for the relationship (what it connects to and in what way), then have the real instrumented handlers created and assigned to a constructed domain class. That last sentence is a bit painful to read, which tells me I need to mull it over a bit more. Regardless, the hack I put together in DCDB was just way too fragile.

Unit-testing sqlalchemy with pytest

import logging
import pathlib
from collections import namedtuple

import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

import sal2  # the module holding the declarative Base and model classes

LOG = logging.getLogger(__name__)
ConnResult = namedtuple("ConnResult", ["connection", "session", "engine"])


@pytest.fixture(scope="function")
def conn(request):
    db_path_name = "db"
    db_name = f"{request.function.__name__}.sqlite3"
    filepath = pathlib.Path(__file__).parent / db_path_name / db_name

    LOG.debug(f"Test DB @ {filepath}")
    engine = create_engine(f"sqlite:///{filepath}")
    connection = engine.connect()

    sal2.Base.metadata.drop_all(bind=engine)
    sal2.Base.metadata.create_all(bind=engine)
    factory = scoped_session(sessionmaker(bind=engine))
    sal2.Base.query = factory.query_property()
    session = factory()

    yield ConnResult(connection, session, engine)

    connection.close()
    engine.dispose()

Inside my "tests" directory I added a "db" directory. Given the logic above, this spawns an entirely new database for each test function, so I can go back and inspect the database afterwards. For someone else's code, you just need to swap out "sal2" with the module holding your sqlalchemy Base and associated model classes. The only thing I wonder about is the issue with create_all; I remember there is a way to bind the metadata object without create_all, but damn if I can remember it right now.
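For illustration, a test using the fixture might look like this (sal2.User is a hypothetical model here; substitute one of your own):

def test_user_roundtrip(conn):
    # conn is the ConnResult namedtuple yielded by the fixture above
    user = sal2.User(name="bob")  # hypothetical model hanging off sal2's Base
    conn.session.add(user)
    conn.session.commit()

    assert conn.session.query(sal2.User).filter_by(name="bob").count() == 1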

Python dataclass database (DCDB)

Why

While I do use sqlalchemy, and to some extent peewee, for my projects, I slowly got tired of having to relearn each ORM's way of expressing SQL when I've known SQL itself since the mid-90s.

DCDB's design also aims for simplicity and minimal behind-the-scenes automagical behavior. Complexity should instead be added voluntarily, and in such a way that it can be traced back.

Example

import dataclasses as dcs
import dcdb 

@dcs.dataclass()
class Foo:
    name:str
    age:int

db = dcdb.DBConnection(":memory:") # alternatively this can be a file path
db.bind(Foo)
"""
   Bind doesn't change Foo in the local scope but instead
   it creates a new class DCDB_Foo which is stored to the DBConnection in it's 
   table registry.

   Behind the scenes, a table `Foo` is created to the connected database.  No changes to the name are made (eg pluralization). How you wrote your bound dataclasses is almost exactly how it is stored in the sqlite database.

   An exception is that a .id instance property along with DB methods like: update/save, Create, Get, and Select are added to the class definition.

"""
record = db.t.Foo(name="Bob", age="44")
assert record.name == "Bob"
same_record = db.t.Foo.Get("name=?", "Bob")
assert record.age == 44
assert record.id == same_record.id

record.age = 32
record.save()

same_record = db.t.Foo.Get("age=?", 32)
assert record.id == same_record.id
assert same_record.age == 32

same_record.delete()

"""
Note it is important to notice that currently same_record and 
record have the same .id # property but they are different 
instances and copies of the same record with no shared reference.   
Changes to one copy will not reflect with the other.

"""

Github DCDB

PyQT5 QMediaPlaylist documentation snafu

QMediaPlaylist has a method

addMedia

with the signature

bool QMediaPlaylist::addMedia(const QMediaContent &content)

http://doc.qt.io/qt-5/qmediaplaylist.html#addMedia

And the documentation suggests that, in C++,

playlist->addMedia(QUrl("http://example.com/movie1.mp4"));

should work.
http://doc.qt.io/qt-5/qmediaplaylist.html#details

BUT in PyQt5,

playlist.addMedia(QUrl.fromLocalFile("some/path/music.mp3"))

errors out complaining about an unexpected QUrl argument.

So… you have to do

QMediaContent(QtCore.QUrl.fromLocalFile("filePath/fileName.mp3"))

in Python. A bit unwieldy, but PySide2 appears to have stalled, from an outsider's perspective.

My initial suspicion is that QUrl and QMediaContent once inherited from some sort of common base class, were later split apart, and the use case of passing a bare QUrl went untested and got lost. I ended up making a band-aid in Python with

MakePath = lambda x: QtM.QMediaContent(QtCore.QUrl.fromLocalFile(x))

where QtM is `from PyQt5 import QtMultimedia as QtM`
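Putting it all together, a minimal sketch looks like this (the file path is illustrative, and note the QApplication-first rule from the earlier post):

import sys

from PyQt5 import QtCore
from PyQt5 import QtWidgets
from PyQt5 import QtMultimedia as QtM

app = QtWidgets.QApplication(sys.argv)  # must exist before the Qt machinery is used

# Wrap a local file path into the QMediaContent that addMedia() actually expects
MakePath = lambda x: QtM.QMediaContent(QtCore.QUrl.fromLocalFile(x))

playlist = QtM.QMediaPlaylist()
playlist.addMedia(MakePath("filePath/fileName.mp3"))  # illustrative path

player = QtM.QMediaPlayer()
player.setPlaylist(playlist)
player.play()

sys.exit(app.exec_())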

From a discussion on a Python chat board – How to deal with criticism

The poster asked “This may be offtopic but i really want to share this.
Did you ever faced criticism with your code. How to overcome this criticism ?”

And this was my response.

Every so often I go back to something I wrote 10 years ago and I laugh my ass off. Now, if 10 years ago someone had done that, it would have been a somewhat painful experience. People have already mentioned "you are not your code," which is mostly true, BUT at the same time you are your code at this moment in time. When a peer criticizes your code, you need to quickly detach yourself and hear out what they have to say. Sometimes their input is going to be asinine ("I prefer naming variables after my kids") but hopefully it's going to have real value ("You should use dashes or camel casing; does 'expertsexchange' mean Expert Sex Change or Experts Exchange?").

Before I give whatever advice I have on filtering poisonous versus constructive criticism, I am going to address why you want criticism at all. If you stay a small person, you will be stuck in a small world, and you'll only realize it when everyone else is selling cars and you're still crafting buggy whips for a doomed industry (e.g. COBOL & mainframes vs Java & server farms). To improve your craft, yes, you need challenging work, but you also need to be exposed to different ideas or you will not grow professionally.

Now, as far as handling criticism and deciding whether it has value: #1, you need to check your emotions and listen. #2, identify the problem they have with your work and clarify until you have concrete examples of what is and is not the problem. #3, evaluate the value of fixing the problem: "Does this improve things for my team and our success?" or "Does this make my product better and/or more maintainable?"

If the person cannot give concrete examples for #2, tell them you don't understand. If they cannot do this without resorting to verbal abuse, the conversation is over; escalate to supervision or discontinue the discourse.

If the person cannot demonstrate #3 (e.g. how does using their children's names make things "better"?), escalate or discontinue, telling them that you don't see an advantage to their proposal.

Finally, emotions get you into fights you may not be able to win, or that will have costs to your career down the line. One example: there was a person I found immensely influential to my career, and I thought he was the bee's knees. He was outspoken and somewhat vitriolic, but he was generally right (or appeared to be). Five years down the road, he seems like a has-been who constantly threatens to beat people up when they criticize his code or his behavior ("I am an MMA fighter, I will kick your ass"). For the most part I think that guy's career is on its way out, because who wants to collaborate or associate with him? And no, I am not talking about Linus Torvalds... he's a completely different kind of crazy with a completely different problem.

Project notes

http://www.stevecoursen.com/209/stackless-python-meets-twisted-matrix/
https://github.com/smira/txZMQ
https://pypi.python.org/pypi/Twistless/1.0.0

Goals: evaluate speed over memory for PyPy, using stackless/coroutines with ZMQ and Twistless (if it's even possible without rebuilding txZMQ).

The initial goal would be to take github.com/devdave/pyfella and let two stacks fight it out for an hour (assuming a 10-ply-deep Othello minimax).

Interesting way to safely debug multiprocessing python systems

I have one particular "job" that has 3 sub-processes moving as fast as humanly possible to build a report. The main slowdown is an external data source that isn't outright terrible, but it's not great either. The worst possible outcome is when the job hangs or misses available work, which it was predisposed to do a lot.

Various kill signals usually failed to give me an idea of where the workers were getting hung up, and I wasn't really excited about putting tracer log messages everywhere. Fortunately I have a dbgp-enabled IDE, and I found this answer on SO: http://stackoverflow.com/a/133384/9908

Taking that I modified it to look like this:

import sys
import traceback
import signal

# classdef FeedUserlistWorker, which is managed by a custom multiprocessing.Pool implementation.

    @classmethod
    def Create(cls, feed, year_month=None):

        signal.signal(signal.SIGUSR1, FeedUserlistWorker._PANIC)

        try:
            return cls(year_month=year_month, feed=feed).run()
        except Exception:
            # traceback.print_exc() reads the active exception itself; it doesn't take one as an argument
            traceback.print_exc()
            sys.stderr.flush()
            sys.stdout.flush()

The print_exc is there because there isn't a very reliable bridge to carry exceptions from child to parent. The flushes are there because stdout/stderr are buffered between the workers and the parent pool manager.

    @classmethod
    def _PANIC(cls, sig, frame):
        # Collect the frame's globals/locals; a leftover from the SO recipe, but handy for inspection
        d = {'_frame': frame}
        d.update(frame.f_globals)
        d.update(frame.f_locals)

        from dbgp.client import brk; brk("192.168.1.2", 9090)

The only thing that matters is that call to dbgp. Using that tool, I was able to step up the call stack, fire adhoc commands to inspect variables in the stack frame, and find the exact blocking call, which turned out to be the validation/authentication part of boto s3. That turned out to be a weird problem as I had assumed the busy loop/block was in my own code ( eg while True: never break ), fortunately it has an easy fix https://groups.google.com/forum/#!msg/boto-users/0osmP0cUl5Y/5NZBfokIyoUJ which resolved the problem as my Pool manager doesn’t mark tasks complete and failures will only cause the lost task to be resumed from the last point of success.