
1 Billion monthly pageviews

It’s been an odd journey these last few months, but Lijit just hit 1 billion monthly page views in the last week or so.

http://www.readwriteweb.com/archives/custom_search_startup_hits_1_billion_monthly_pagev.php

What gets me, though, is that the page view count is probably substantially lower than the actual number of HTTP requests handled per month. Wish I had the time to compute that number.

Random reoccurring thought

The only safety (anonymity) anyone has is that no one is worth the effort to be found. Of course, the random vigilante flesh searches show it doesn’t take long to be found once the effort arrives.

Instead, the real concern for me is both the public internet archives and the private caches sequestered around the world. If the human race survives long enough to breed a singularity, it would be seemingly trivial for an artificial intelligence to backtrace people from the dawn of their first public forum post to the dusk of their last.

The http CDN key value server of doom

I don’t believe I will be abusing this too much, but it struck me the other night that JSONP script payloads are, in a sense, key/value result sets.

In pseudo terms: “From server1, provide a document from store XYZ with params A,b,c to handler HandleJSONP”. Tie in a CDN and expiration headers set to a time in the future, and you’ve got a mostly inexpensive mechanism for passing dynamic data separately from static JavaScript.

In production, this has worked out well in 4 out of 5 cases, saving both my employer and our clients bandwidth and time (static content can be nearly permacached in the CDN, inline proxy caches, and browser caches). It also paves the way for eventually moving some of our client-side code from synchronous to asynchronous… but that mostly depends on other service providers not screwing up the client-side environment with post-load document.write calls, hijacking core DOM methods, or doing a lot of meth before attempting to write semi-valid HTML.
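
A minimal Python sketch of the idea, not our production code: the URL acts as the key and the cacheable script payload as the value. The function names (jsonp_url, jsonp_response), the URL layout, and the one-day max_age are all assumptions made for illustration.

```python
import json
import time
from email.utils import formatdate
from urllib.parse import urlencode


def jsonp_url(server, store, params, handler):
    """Build the 'key': a stable URL, so identical requests share one cache entry."""
    # Sorting the params keeps the URL identical regardless of dict ordering.
    query = urlencode(sorted(params.items()) + [("callback", handler)])
    return f"http://{server}/{store}?{query}"


def jsonp_response(document, handler, max_age=86400):
    """Build the 'value': a script payload plus headers a CDN can cache on."""
    body = f"{handler}({json.dumps(document, sort_keys=True)});"
    headers = {
        "Content-Type": "application/javascript",
        "Cache-Control": f"public, max-age={max_age}",
        # Expiration header set to a time in the future.
        "Expires": formatdate(time.time() + max_age, usegmt=True),
    }
    return headers, body


if __name__ == "__main__":
    print(jsonp_url("server1", "XYZ", {"a": 1, "b": 2}, "HandleJSONP"))
    print(jsonp_response({"title": "hi"}, "HandleJSONP")[1])
```

Because the callback name is part of the URL, every page that uses the same handler name shares the same cached copy.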

Git Push – Remote Rejected solution

Scenario:
You made a git repo on workstation 1, committed your changes, then from workstation 2 did a git init/pull. More changes, more commits… and it’s time to push all that mess back to workstation 1 (the original).

$git push workstation1.tld

Counting objects: 32, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (22/22), done.
Writing objects: 100% (24/24), 13.63 KiB, done.
Total 24 (delta 6), reused 0 (delta 0)
remote: error: refusing to update checked out branch: refs/heads/master
remote: error: By default, updating the current branch in a non-bare repository
remote: error: is denied, because it will make the index and work tree inconsistent
remote: error: with what you pushed, and will require 'git reset --hard' to match
remote: error: the work tree to HEAD.
remote: error:
remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
remote: error: its current branch; however, this is not recommended unless you
remote: error: arranged to update its work tree to match what you pushed in some
remote: error: other way.
remote: error:
remote: error: To squelch this message and still keep the default behaviour, set
remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
To rocket.x.TLD.com:~/code/pydbgp
 ! [remote rejected] master -> master (branch is currently checked out)
error: failed to push some refs to 'rocket.x.lijit.com:~/code/pydbgp'

What the hell?

Stack Overflow to the rescue with this answer.

So basically… on workstation 1, switch to a temporary branch so master is no longer checked out, push from workstation 2, then switch back:

Workstation1$git branch BeforeThePushBranch
Workstation1$git checkout BeforeThePushBranch
Workstation2$git push workstation1
Workstation1$git checkout master

Bam, back working again

Thank you spam bots

Most humans will figure out that I’ve banned comments on this blog, to the point that I gutted parts of WordPress to disallow posting them to the DB. Yet somehow, some way, a few are leaking through. Oddly enough, most of them are thanking me for how great my blog is… and recommending I check out their product/blog. So to the spambots of the internet: I appreciate your accolades, but I am still not clicking on anything.

Python, sometimes you scare me

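import sys
from os import path

# Expand the first sys.path entry that contains "~".
for i, entry in enumerate(sys.path):
    if "~" in entry:
        sys.path[i] = path.expanduser(entry)
        break
else:
    # The loop never hit break: no entry contained "~", so add the default.
    sys.path.insert(1, path.expanduser("~/lib/python"))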

More duct tape code to allow me to rely on ~/lib/python to store common code (e.g. Apache log parsing and analysis) when PYTHONPATH might not be set, or is set incorrectly.

Basically, if ~ is found in an entry of sys.path, the break statement skips the else block. If no entry contains ~, the loop finishes without a break and the else block runs.
Python compound statements: For loops
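
To make the for/else behaviour easier to see in isolation, here is a small sketch. The helper name first_match is made up for this example; it is not part of the snippet above.

```python
def first_match(items, predicate):
    """Return ("found", item) for the first match, else ("not found", None)."""
    for item in items:
        if predicate(item):
            result = ("found", item)
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without break,
        # which includes the case where items is empty.
        result = ("not found", None)
    return result


if __name__ == "__main__":
    has_tilde = lambda p: "~" in p
    print(first_match(["/usr/lib", "~/lib/python"], has_tilde))
    print(first_match(["/usr/lib"], has_tilde))
```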

Unnamed project: PHP doc’s on steroids

Been playing with a tool released by Facebook called xhprof, http://mirror.facebook.net/facebook/xhprof/doc.html , which is a very nice tool for figuring out what exactly is going on in your PHP code.

Meanwhile, I’ve used Doxygen a lot. It’s great for finding out how spread out calls are to individual libraries, has great call graphs, and if set up correctly can draw out the entire hierarchy of even the most complicated of projects.

What I’d like to do is create something that uses xhprof with documentation logic similar to Doxygen, to make dynamically generated documentation that shows you not only how a class is defined, but how it’s used, what depends on it, and what it depends on.

UI:
I was thinking of something that was intelligent in the ways of MVC frameworks, recording a list of all URLs called into a framework and then displaying them as a list broken down by URL components.

A simple example might be an MVC project with one controller called user.
User has a standard CRUD outline, so the URLs might be
/user/create
/user/list
/user/edit
/user/save
/user/delete

So the first screen would be:

Entry points:
User – 5 sub points

The sub points part would be an anchor leading to a digest page listing the # of recorded profiles for /user/, then maybe a digest of calls from the slowest subcomponent (say save) to the fastest (delete), highest/lowest memory consumption, a link to files common to this URI, and then links to progress further down the URL chain.

Clicking on list would lead to /user/list, again showing the above, but now specific to this URL. At this point there shouldn’t be any more child components in the URL (GET arguments are automatically stripped off), so an additional feature would be a link to show a visual graph similar to what the stock xhprof UI provides.
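
A rough Python sketch of the entry-point grouping described above, written in Python only for illustration since nothing is built yet. The function names (entry_points, summarize) and the recorded URL list are hypothetical; a real tool would pull the URLs from stored xhprof runs.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def entry_points(urls):
    """Group recorded URLs by their first path component, ignoring GET arguments."""
    tree = defaultdict(set)
    for url in urls:
        # urlsplit().path drops the query string, so GET arguments are stripped.
        parts = [p for p in urlsplit(url).path.split("/") if p]
        if not parts:
            continue
        subs = tree[parts[0]]
        if len(parts) > 1:
            subs.add("/".join(parts[1:]))
    return {controller: sorted(subs) for controller, subs in tree.items()}


def summarize(urls):
    """Render the first screen: one line per entry point."""
    points = entry_points(urls)
    return [f"{name} - {len(subs)} sub points" for name, subs in sorted(points.items())]


if __name__ == "__main__":
    recorded = ["/user/create", "/user/list?page=2", "/user/list",
                "/user/edit", "/user/save", "/user/delete"]
    for line in summarize(recorded):
        print(line)
```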

The fun part would be when you clicked on one of the listed files. Immediately this would bring up a page similar to what Doxygen produces, but it would have some additional information:
# of dependent URLs that executed this file
# of classes, functions, and methods used in this file by this URL path ( /user/list )
fastest and slowest execution speeds plus memory consumption percentage for /user/list.

Clicking on a class, method, or function would cause it to again drill down and show stats JUST for the specified scope.

Considering it’s taken me 3 months to get PyProxy to a stable, useful alpha, this project is probably going to take me a year or more… but it should be fun.