Category Archives: python

Filtering and ordering by date with SQLAlchemy

You want the `extract` function, which is documented here -> https://docs.sqlalchemy.org/en/14/core/sqlelement.html#sqlalchemy.sql.expression.extract

The list of options is generally the same regardless of dialect/SQL server, so the SQLite3 dialect works as a reference for the field types: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/dialects/sqlite/base.py#L1229

The common base mapping of extract field names to SQL arguments is here -> https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L299

Extract() is transformed for SQLite3 into `strftime` -> https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/dialects/sqlite/base.py#L1272

The base visitor/transformer for extract is here https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L2124

Finally, a basic example might be something like MyTable.query.filter(extract('year', MyTable.date_field) == 2022), which on SQLite would produce something like SELECT ...hell of a lot of columns... FROM MyTable WHERE CAST(STRFTIME('%Y', MyTable.date_field) AS INTEGER) = 2022;
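To see what a year filter of that shape does, here is a minimal sketch that runs comparable SQL directly through the standard library's sqlite3 module. The table name `my_table`, the sample rows, and the ORDER BY clause are assumptions made up for illustration; this is hand-written SQL, not SQLAlchemy's exact output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, date_field TEXT)")
conn.executemany(
    "INSERT INTO my_table (date_field) VALUES (?)",
    [("2021-12-31",), ("2022-06-15",), ("2022-01-02",), ("2023-03-01",)],
)

# Filter on the year extracted by STRFTIME, then order by the full date.
rows_2022 = conn.execute(
    "SELECT date_field FROM my_table "
    "WHERE CAST(STRFTIME('%Y', date_field) AS INTEGER) = 2022 "
    "ORDER BY date_field"
).fetchall()

print(rows_2022)
```

ISO-formatted date strings sort chronologically, which is why a plain ORDER BY on the column is enough here.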

Process finished with exit code -1073740791 (0xC0000409) – PySide2 & PyQT5

I am working on a WinAmp clone called PySongMan(ager) and kept getting what looked like a stack overflow bug. Drilling down through my code, simplifying it as I went, I got a script like:

Which kept dying with a 0xC0000409 error, which Windows defines as STATUS_STACK_BUFFER_OVERRUN (a fail-fast abort rather than a true stack overflow). Finally, somewhat by accident, I figured out my mistake. The good code looks like:

and runs without issue. So the problem is that before any Qt widget/window can be created, a QApplication instance must be created first.

Flask class routing example

I’ve written several web frameworks in my life and while I don’t have the desire to keep up with current technology trends, I still like to dabble. book.py is the best example of what this does, whereas page.py shows more advanced use cases.

Portable/reusable flask app skeleton

I have a lot of Flask apps running in the background on my home server to do various tasks (home wiki, some CRM stuff, etc) and I end up making the same structure over and over so I figured I would simplify the process and make a repo with just the skeleton of an app.

A few benefits:

As long as you use relative imports with . and .. (e.g. from .. import app), your web application is name agnostic.
The application instance of Flask() can be accessed from anywhere in the web application without the risk of circular import problems.
It’s entirely possible to copy and paste web application modules (e.g. models) into another web application and have them mostly just work (barring configuration needs).

https://github.com/devdave/skeleton_flask

import logging
import sys
from typing import Optional

from flask import Flask

# Module-level instance so submodules can do `from .. import app`.
app: Flask = Flask(__name__)
log: Optional[logging.Logger] = None


def create_app(config=None) -> Flask:
    global app, log
    # Imported here rather than at module level so `app` already exists.
    from . import conf

    log = logging.getLogger(__name__)
    fmt = logging.Formatter(app.config['APP_LOGGING_FMT'])
    hndl = app.config['APP_LOGGING_HANDLER']  # type: logging.Handler
    hndl.setFormatter(fmt)
    hndl.setLevel(app.config["APP_LOGGING_LEVEL"])
    log.propagate = False
    log.handlers.clear()  # This removes flask's default handler
    log.addHandler(hndl)
    log.debug(f"{__name__} loading components")

    # Deferred imports: each of these may import `app` from this package.
    from . import lib
    from . import models
    from . import views
    from . import settings

    return app

https://github.com/devdave/skeleton_flask/blob/master/init.py

This is the __init__.py file in the base of the web app. To use it with flask you would do something like this on the command line (note that cmd.exe's set must not have spaces around the = sign):

#>set FLASK_RUN_PORT=1234
#>set FLASK_APP=webapp:create_app()
#>set FLASK_ENV=development
#>python -m flask run
OR
#>flask run

The FLASK_APP environment variable is documented here https://flask.palletsprojects.com/en/1.1.x/cli/#application-discovery and it’s pretty straightforward: module:function_name() where function_name is defined in module/__init__.py

The reason for having the imports for lib, models, views, and settings inside create_app is to prevent a circular import and to allow sub modules like views to do from .. import app to access the Flask application instance.
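The deferred-import pattern can be tried without Flask at all. The sketch below writes a throwaway package to a temp directory and imports it; the package name `webapp_demo`, the dict standing in for the Flask instance, and the single "/" route are all assumptions for illustration, not part of the skeleton repo.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package on disk: webapp_demo/ with a views/ subpackage.
root = Path(tempfile.mkdtemp())
pkg = root / "webapp_demo"
(pkg / "views").mkdir(parents=True)

(pkg / "__init__.py").write_text(textwrap.dedent("""\
    # Stand-in for `app = Flask(__name__)`; a dict keeps the sketch dependency free.
    app = {"routes": []}

    def create_app():
        # Deferred import: by the time views runs, `app` already exists.
        from . import views
        return app
"""))

(pkg / "views" / "__init__.py").write_text(textwrap.dedent("""\
    from .. import app  # safe, the parent package is already initialised

    app["routes"].append("/")
"""))

sys.path.insert(0, str(root))
webapp_demo = importlib.import_module("webapp_demo")
app = webapp_demo.create_app()
print(app["routes"])
```

If `from . import views` were moved above the `app = ...` line in the package's __init__.py, the import inside views would fail because `app` would not exist yet; keeping it inside create_app avoids that ordering problem.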

Data migration with SQLAlchemy and Alembic

I needed to optimize an unruly table filled with floats but I also didn’t want to lose my data. Unfortunately the documentation on the Alembic website doesn’t mention anything or give any hints on how to do a data migration versus just a schema migration.

Fortunately I was able to run a symbolic debugger against alembic and figured out that all of the `op.<method>` calls are atomic, in the sense that each one runs as soon as it is called. If you have an add_column call, it adds the column when it executes that method. So that opened the door to data migrations.

One note before I paste the code: you don’t need to specify all of the columns of the source table when it is used in a data migration scope. This makes your code a lot cleaner, as the working model code is specific to the data you plan on using.

Alright, no more babbling, here is the example code.

A while back I downloaded my Google location and history data and ran into these strange lat7 and long7 columns (paraphrasing as I don’t remember their exact names). The data were these large integer numbers that I couldn’t figure out how to decode. Suddenly it became obvious when I noticed all of the latitude fields started with 35 and the longitude started with -104. 35, -104 is approximately a few hundred miles from where I live. By doing lat7 / 10000000 (1e7 or 10**7) I was able to get floating point GPS coordinates.
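The arithmetic is small enough to show in full. A minimal sketch, assuming made-up sample values near 35, -104 and hypothetical helper names `e7_to_degrees` and `degrees_to_e7`:

```python
E7 = 10**7  # scale factor: seven decimal places of a degree


def e7_to_degrees(value_e7: int) -> float:
    """Turn a scaled integer coordinate back into floating point degrees."""
    return value_e7 / E7


def degrees_to_e7(degrees: float) -> int:
    """Shift the decimal point out and round to the nearest integer."""
    return round(degrees * E7)


lat = e7_to_degrees(350843000)
lng = e7_to_degrees(-1046511000)
print(lat, lng)
```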

Since then, when it comes time to optimize database schemas, I’ve always started by figuring out whether I can shift the decimal point out and use integers instead. In sqlite3 a REAL value always takes 8 bytes, while an integer is stored in as few bytes as its magnitude needs, so a scaled integer is often smaller than the float it replaces. Throw a million records on and, in my case, it can get up to 30-40% of wasted disk space.

Anyway, where was I? Since I wanted to get rid of all of the floats and replace the real fields with @hybrid_property and @hybrid_property.expression, I renamed latitude to _latitude, shifted out the decimal point, and used the aforementioned decorators to transform the integers back to floats on demand.
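The overall shape of that migration (add the new column, then immediately fill it from the old one) can be sketched with the standard library alone. This is a stand-in that uses plain sqlite3 statements instead of Alembic's `op` calls and a plain `property` instead of `@hybrid_property`; the `location` table, its sample rows, and the `Location` class are assumptions for illustration.

```python
import sqlite3

SCALE = 10**7

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (id INTEGER PRIMARY KEY, latitude REAL)")
conn.executemany(
    "INSERT INTO location (latitude) VALUES (?)", [(35.0843,), (35.1107,)]
)

# Step 1 (schema): add the integer column. It exists as soon as this runs.
conn.execute("ALTER TABLE location ADD COLUMN _latitude INTEGER")

# Step 2 (data): fill the new column from the old float column.
conn.execute(
    "UPDATE location SET _latitude = CAST(ROUND(latitude * ?) AS INTEGER)",
    (SCALE,),
)
conn.commit()

migrated = conn.execute("SELECT _latitude FROM location ORDER BY id").fetchall()


class Location:
    """Plain-Python stand-in for a model exposing the float on demand."""

    def __init__(self, _latitude: int):
        self._latitude = _latitude

    @property
    def latitude(self) -> float:
        return self._latitude / SCALE


print(migrated, Location(migrated[0][0]).latitude)
```

Only the columns the migration touches are declared here, which mirrors the note above about not needing the full source table definition.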

Non-blocking python subprocess

I am working on a pet project to compress a terabyte of video into a slimmer format. While I have been able to automate working with ffmpeg, I didn’t like the fact that I couldn’t follow along with the subprocess running ffmpeg.

I tried a few different ideas for watching ffmpeg while keeping the script from blocking, because I wanted to be able to time and monitor its progress.

import subprocess

# cmd is the argument list for the ffmpeg call
process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() blocks until the process is finished
stdout, stderr = process.communicate()

process.stdout.readline() and process.stderr.readline() will both block until there is sufficient data. In ffmpeg’s case there is never stdout output so it will block indefinitely.

https://gist.github.com/devdave/9b8553d63e24ef19eea7e56f7cb95c78

By using threads with a Queue and constantly polling the process, I can watch the output as fast as it comes in without worrying about the main process blocking, just the threads.

A further improvement on the idea would be to have two threads (for stdout and stderr respectively) with the queue items put with sentinels like queue.put((STDERR, line_from_stderr)) and a sentinel for STDOUT.
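A minimal sketch of that two-thread variant follows. It is not the Runner class from the gist; the names `run_and_watch` and `_pump`, the tag strings, and the use of `None` as an end-of-stream marker are assumptions for illustration.

```python
import queue
import subprocess
import sys
import threading

STDOUT, STDERR = "stdout", "stderr"


def _pump(stream, tag, q):
    """Read lines from one pipe and tag them; only this thread ever blocks."""
    for line in iter(stream.readline, ""):
        q.put((tag, line.rstrip("\n")))
    stream.close()
    q.put((tag, None))  # None marks the end of this stream


def run_and_watch(cmd):
    """Yield (tag, line) pairs as the child process produces output."""
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    q = queue.Queue()
    for stream, tag in ((process.stdout, STDOUT), (process.stderr, STDERR)):
        threading.Thread(target=_pump, args=(stream, tag, q), daemon=True).start()

    open_streams = 2
    while open_streams:
        try:
            tag, line = q.get(timeout=0.1)
        except queue.Empty:
            continue  # the main thread is free to do timing or monitoring here
        if line is None:
            open_streams -= 1
        else:
            yield tag, line
    process.wait()


# A tiny child process that writes one line to each stream.
child = [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"]
lines = list(run_and_watch(child))
print(lines)
```

The order in which the stdout and stderr lines arrive is not guaranteed, since each pipe is read by its own thread.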

To use:

r = Runner(["some_long_running_process", "-arg1", "arg1 value"])

for stdout, stderr in r.start():
    print("STDOUT", stdout)
    print("STDERR", stderr)

Hacking around a hack

So, Google-friendly search terms:

Pexpect, subprocess.Popen. How to suppress quotes from Popen with Windows. Answer: You cannot BUT you can hack around it.

Note I am using delegator.py BUT it is just a very nice wrapper around Pexpect which in turn is a really nice wrapper around Python’s subprocess.Popen which is a really nice wrapper around a ball of shit called CreateProcess. Pedantry aside, you can’t make a ball of shit shine no matter how much abstraction you throw at it.

Solution:

fix = "some_exec -arg abc -arg2 123 -arg3=foo"
cmd_line = ["cmd", "/c", fix ]
result = delegator.run(cmd_line)

This looks goofy and it is. The problem I ran into is with a POSIX compliant app that wasn’t prepared for Windows compliant shenanigans https://docs.python.org/3/library/subprocess.html#converting-argument-sequence

Illustration

$some_exec -arg abc -arg2 123 -arg3=foo

That’s what I expected from crafting a subprocess call with an argument list like

delegator.run(["some_exec", "-arg abc", "-arg2 123", "-arg3=foo"])

What I got was

some_exec "-arg abc" "-arg2 123" -arg3=foo

Which is hilarious because `some_exec` was so not prepared for that and wasn’t able to parse the command line arguments. How we go from a list of command arguments to a quoted text string is all thanks to this https://github.com/python/cpython/blob/3.7/Lib/subprocess.py#L485
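The quoting can be reproduced on any platform with `subprocess.list2cmdline`, the helper the linked source line belongs to. It is an undocumented function of the standard library, so treat this as a sketch for inspection rather than an API to build on.

```python
import subprocess

# Every argument that contains a space gets wrapped in double quotes.
args = ["some_exec", "-arg abc", "-arg2 123", "-arg3=foo"]
quoted = subprocess.list2cmdline(args)
print(quoted)

# Handing the whole command to cmd /c makes it one (quoted) argument instead.
wrapped = subprocess.list2cmdline(
    ["cmd", "/c", "some_exec -arg abc -arg2 123 -arg3=foo"]
)
print(wrapped)
```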

Unless you want to write your own standard library version of subprocess.py just so you can work around one single (likely many) “non-compliant” Windows console executables, you need to abuse the cmd.exe shell’s own quirks.

To demonstrate what I mean

    fix = "some_exec -arg abc -arg2 123 -arg3=foo"
    cmd_line = ["echo", "cmd", "/c", fix]
    result = delegator.run(cmd_line)

Outputs `cmd /c "some_exec -arg abc -arg2 123 -arg3=foo"`, which is what cmd.exe in turn executes, without quoting every single argument to `some_exec`.

I walked downward from delegator.py’s run() through pexpect to get to subprocess.py’s Popen and this is it; there wasn’t a cheaper fix than this.

For more info on why this works, tldr: “cmd /c argument” is run inside of its own temporary shell and all of that sanitizing code in subprocess is skipped because it sees the `fix` part of the sequence as one gigantic argument to `cmd`. If you try it any other way, say you try to skip out on cmd, you will get `some_exec "-arg abc -arg2 123 -arg3=foo"` which put me right BACK to having my some_exec blow up as it doesn’t know to strip the quotes out (or even to look inside for the arguments).

Summary: I didn’t log a bug report with some_exec’s maintainer because I needed this to work right the hell now. I didn’t have the time or energy to push through a bug report and wait for a fix, and I don’t have time to even wade into some_exec’s code base to figure out what to patch. Never mind that it may be written in C or C++, which I could do, but I haven’t used either language in about a year, so what I did write would be some ghastly buffer overflow exploit cesspool. I don’t have time to do that. Besides, honestly, if I were the developer/maintainer of some_exec I would be skeptical about whether I even wanted to deal with this bullshit on top of everything else that is more pressing.

WSL GNU/Debian Python 3.6.5

Relevant date: 2018 May 30th
This works today but there is no guarantee this is the right way in a month and especially a year later.

After installing WSL and then Debian Linux, the first thing I did was something recommended in the reviews:

sudo apt update && sudo apt full-upgrade && sudo apt install aptitude
sudo aptitude install ~pstandard ~prequired ~pimportant -F%

Googling for how to install Python 3.6 led to some goofy answers like using a PPA (which isn’t a bad thing), but you can screw up your environment if you are not careful.

So….. time to go old school and build from source.

sudo apt-get install -y make build-essential libssl-dev zlib1g-dev   
sudo apt-get install -y libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm 
sudo apt-get install -y libncurses5-dev  libncursesw5-dev xz-utils tk-dev

Those are the basic requirements BUT I am fairly certain some things, like the libraries lxml needs (for Beautiful Soup), will be missing. That is not an issue if you don’t intend to use lxml, AND they are not required for building the python 3.6 interpreter.

wget https://www.python.org/ftp/python/3.6.5/Python-3.6.5.tgz
tar xvf Python-3.6.5.tgz
cd Python-3.6.5
./configure --enable-optimizations --with-ensurepip=install
make
sudo make altinstall

NOTE! you can do

make -j8

for a faster build time, BUT ideally use -j# where the # is the number of cores on your CPU.

NOTE! if you have a faster computer, the actual compile process is fairly quick, but then it runs an absolutely massive unit testing suite to collect profiling data for the `--enable-optimizations` flag. If you need speed for a commercial environment AND you are 100% certain your environment across all machines is the same, just re-tar the source directory with the pre-built binaries, send it to each machine, and then run `make altinstall`.

From there you can run

user@machine:~/$python3.6

Some people recommend fiddling with update-alternatives so you can just do

user@machine:~/$python3

but generally I use virtualenv and explicitly make a new environment with my version of python to keep me sane.

Python 3 – abusing annotations

Since my retirement a few years ago my habit of trying out sometimes useless or convoluted ideas has gone up a few notches.

Latest discovery is that `inspect.signature()` passes parameter annotations straight through. With a bit of function decorator hackery, you can get positional/keyword instrumented transformers.


@magic_decorator
def some_func(always_str:str):
   print(f"always_str is {type(always_str)} with a value of {repr(always_str)}")

>>> some_func(123)
always_str is <class 'str'> with a value of '123'

Another example:


def reversed_str(raw):
    if not isinstance(raw, str):
        raw = str(raw)

    return raw[::-1]

@magic_decorator
def goofy_func(bizarro:reversed_str):
    return bizarro

assert goofy_func("Hello World") == "dlroW olleH"

A working proof of concept
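A minimal sketch of what such a decorator could look like is below. It is one possible implementation under the assumption that any callable annotation should be treated as a transformer; it is not necessarily the original proof of concept.

```python
import functools
import inspect


def magic_decorator(func):
    """Pass each argument through its parameter annotation when it is callable."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in list(bound.arguments.items()):
            annotation = signature.parameters[name].annotation
            # Un-annotated parameters and non-callable annotations are left alone.
            if annotation is not inspect.Parameter.empty and callable(annotation):
                bound.arguments[name] = annotation(value)
        return func(*bound.args, **bound.kwargs)

    return wrapper


def reversed_str(raw):
    return str(raw)[::-1]


@magic_decorator
def some_func(always_str: str):
    return always_str


@magic_decorator
def goofy_func(bizarro: reversed_str):
    return bizarro


print(repr(some_func(123)))
print(goofy_func("Hello World"))
```

Note that this sketch also applies the annotation to `*args` and `**kwargs` parameters as a whole (a tuple and a dict respectively), which may not be what you want for those.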

In one of my pet projects, I have a method with a signature like `def process_request(self, action:SomeClass.FromDict)` which takes a dictionary for the `action` parameter and passes that to SomeClass.FromDict, which then returns an instance of `SomeClass`.

In another case, when dealing with Twisted in Python 3, where all strings are of type `bytes`, I used something like the magic_decorator above and a transformer `SafeStr` (e.g. `def do_something(name:SafeStr)`) to ensure that the name parameter is ALWAYS of type str. Anecdotally, Python 3 blows up if you try to do something like `.startswith()` with a str argument on one of those values.

In the grand scheme I think this is an interesting quirk, but in case my comments and wording aren’t clear: I would advise caution if using this in revenue generating code (or code intended to make you wealthy or at least provide money for pizza & beer).

Flask CRUD with sqlalchemy and jinja2 contextfilters

Quick disclaimer, the Flask CRUD thing is not public domain yet and is very volatile.

The project is here
https://github.com/devdave/wfmastery/tree/revamp_1/wfmastery
And the outline for the crud thing is in this commit https://github.com/devdave/wfmastery/commit/e249895ddc53c0696f59d3def5718e76855af5b9

https://github.com/devdave/wfmastery/blob/revamp_1/wfmastery/crud.py
https://github.com/devdave/wfmastery/blob/revamp_1/wfmastery/views.py
https://github.com/devdave/wfmastery/blob/revamp_1/wfmastery/templates/equipment_list.j2.html

First is how the crud is currently constructed

class Equipment(CrudAPI):

    def populate(self):

        self.record_cls = db.Equipment
        self.identity = "equipment"

        self.template_form = "equipment_form.j2.html"
        self.template_list = "equipment_list.j2.html"

        self._listColumn("id")
        self._listColumn("hidden")
        self._listColumn("name", magic_field="magic-string")
        self._listColumn("pretty_name", magic_field="magic-string")

        self._addRelationship("category", "name", magic_field="magic-filter")
        self._addRelationship("subcategory", "name", magic_field="magic-filter")

Both vars “template_form” and “template_list” are going to be preset once I am certain that the templates can stand on their own with the context vars provided. The “magic-” params and their use are very much magic (i.e. really toxic) and I would recommend ignoring them.

From there the CrudAPI takes over. Skipping ahead to how this relates to context filters: I had this tag mess in the template

{%-      for column_name in origin.list_columns -%}
{%-          if column_name in origin.magic_columns -%}
        {{ cell("", column_name|title, classes=origin.magic_columns[column_name]) -}}
{%-          else -%}
        {{ cell("", column_name|title) -}}
{%          endif %}
{%-      endfor %}

and was really not happy with it. So I dived into Flask and Jinja2’s documentation and code to figure out if I could apply Python code inline.

The answer is yes via jinja2’s contextfilters which are not exposed to Flask but can still be used.

@App.template_filter("render_header")
def render_header(context, column_name, value="", **kwargs):
    result = ""
    if column_name in context['origin'].magic_columns:
        result = context['cell'](value, column_name.capitalize(), classes=context['origin'].magic_columns[column_name])
    else:
        result = context['cell'](value, column_name.capitalize())


    return result

render_header.contextfilter=True

The trick to going from filter to contextfilter is just applying `my_func.contextfilter = True` outside of your function’s scope. From there you have access to almost everything (if not everything) in the template context. The var “origin” is the CrudAPI’s instance passed to the template.

This has opened a lot more opportunities to do clean up. Taking


{% macro data_attributes(data_map, prefix="data-") -%}

    {%- for name, value in data_map.items() -%}
    {{" "}}{{prefix}}{{name}}="{{value}}"
    {%- endfor -%}
{%- endmacro %}
  

{% macro cell(name, value, classes=None, data_attrs={}) %}
        
      {{- caller() if caller else value -}}
  {%- endmacro -%}

and condensing it down to

{% macro cell(name, value, classes=None, data_attrs={}) %}
        
    {{- caller() if caller else value -}}
{%- endmacro -%}

via a simple non-context filter

@App.template_filter("dict2attrs")
def dict_to_attributes(attributes, prefix=None):
    results = []
    # With a prefix, each attribute renders as prefix-key="value".
    format_str = "%s-{}=\"{}\"" % prefix if prefix else "{}=\"{}\""

    for key, value in attributes.items():
        results.append(format_str.format(key, value))

    #TODO disable autoescape
    return " ".join(results)

Just note that at the moment output is still managed by Jinja’s autoescape and I’d rather not shut that off so calls MUST be suffixed with “|safe” as used above.
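Because the output has to be marked “|safe”, nothing escapes the attribute values any more. One way to cover that, sketched here as a plain function without the Flask registration, is to escape names and values by hand with the standard library's `html.escape`. The sample dictionary is made up for illustration.

```python
from html import escape


def dict_to_attributes(attributes, prefix=None):
    """Render a dict as HTML attributes, escaping names and values manually."""
    parts = []
    for key, value in attributes.items():
        name = f"{prefix}-{key}" if prefix else str(key)
        parts.append(f'{escape(name, quote=True)}="{escape(str(value), quote=True)}"')
    return " ".join(parts)


rendered = dict_to_attributes({"id": 7, "label": 'a "quoted" <b>'}, prefix="data")
print(rendered)
```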

As for the Crud API, I feel like that is coming along nicely.