Category Archives: howto

SSH SOCKS proxy and Amazon EC2 to the rescue

I’m currently somewhere in the process of building a hadoop clouster in EC2 for one of my clients and one of the most important parts for keeping my sanity is the ability to access all of the node’s web interfaces ( jobtracker, namenode, tasktrackers’, datanodes, etc ). If you aren’t abs(crazy) all of these machines are jailed inside a locked down security group, a micro walled garden.

SSH -D 8080 someMachine.amazon-publicDNS.com

That will setup a socks between your machine and some instance that should be in the same SG as the hadoop cluster… now unless you are a saddist and like to write dozens of host file entries, the SOCKS proxy is useless.

But wait! Proxy Auto-configuration to the rescue! All you really need to get started is here at Wikipedia ( http://en.wikipedia.org/wiki/Proxy_auto-config ) but to be fair a dirt simple proxy might look like:

hadoop_cluster.pac
function FindProxyForURL(url, host) {
if (shExpMatch(host, "*.secret.squirrel.com")) {
return "SOCKS5 127.0.0.1:8080";
}
if (shExpMatch(host, "*.internal")) {
return "SOCKS5 127.0.0.1:8080";
}

return "DIRECT";
}

Save this to your harddrive then find the correct “file:///path/2/hadoop_cluster.pac” from there go into your browsers proxy configuration dialog window and paste that URL into the Proxy Auto-configuration box. After that, going to http://ip-1-2-3-4.amazon.internal in a web browser will automatically go through the SSH proxy into Amazon EC2 cloud space, resolve against Amazon DNS servers, and voila you’re connected.

NOTE: Windows users

It shouldn’t be a surprise that Microsoft has partially fucked up the beauty that is the PAC. Fortunately, they provide directions for resolving the issue here ( http://support.microsoft.com/kb/271361 ).

tl;dwrite – Microsoft’s network stack caches the results of the PAC script instead of checking it for every request. If your proxy goes down or you edit the PAC file, those changes can take sometime to actually come into play. Fortunately Firefox has a nifty “reload” button on their dialog box, but Microsoft Internet Explorer and the default Chrome for windows trust Microsofts netstack.

HTML5 Canvas windowing proof of concept/prototype

Code is here ( https://gist.github.com/a17216d5b1db068ab41b )

Taking a break from a client project today and decided to try something real quick out. For one of my unpublished pet projects ( a near real time web based MUD ) I wanted a little map window to give visual feedback to the user on where exactly they were. Fortunately the Canvas API makes this stupid easy.

       var oX = this.player.x * (this.mMX / this.rMX);
    var oY = this.player.y * (this.mMX / this.rMY);
    
    var halfX = ( this.wMX / 2 );
    var halfY = ( this.wMY / 2 );
    
    var startX = Math.max(0, oX - halfX);
    var startY = Math.max(0, oY - halfY);
    
    var endX = Math.max(startX + this.wMX);
    var endY = Math.max(startY + this.wMY);
    
    var buffer = this.ctx.getImageData(startX, startY, endX, endY);
    this.window.putImageData(buffer, 0, 0);

First block takes the player or avatars position then scales up their grid position to the map canvas position.

Second block gets middle point of the window canvas, in this case if the window is 150×150, the origin is (75,75)

Now given the player’s position in the canvas map it finds the upper left and lower right corners of the map that is going to be copied from the map canvas into the window canvas. The Math.Max’s are there to prevent the upper left corner from being an impossible point (123, -50 ).

The end*’s use of Math.Max is actually crap on my part and isn’t needed.

Profiling tools for MongoDB

As MongoDb gains more customers, it will become increasing more important to be able to understand what the hell it is doing. I ran across this post this morning, it’s short but a good place to start in learning about profiling

metaclass service bus decorator concept

While playing around with some idea’s for making PyProxy completely overridable ( and also potentially undebuggable ) I started playing around with metaclasses.

Below is from my toxic directory [ gist url ]

from collections import defaultdict
from functools import wraps

Just some preliminary basic utilities

A simple bus implementation, decoratedCalls is a dictionary of service calls, with a list
of callbacks to each service call hook.

decoratedCalls = defaultdict(list)

def call(name, *args, **kwargs):
    print name, args, kwargs
    if name in decoratedCalls:
        for cb in decoratedCalls[name]:
            cb(*args, **kwargs)

Syntactic suger to add clarity later as to what functions are being bound to what. Adding in
debug/logging hooks here could allow for easier tracing of what is called for what methods.

class Subscribe(object):
    def __init__(self, name):
        self.name = name
        
    def __call__(self, f):
        decoratedCalls[self.name].append(f)
        return f
        

Here’s the actual bus decorator, very simple just wraps the decorated function with a pre and post bus calls.


class BusDecorator(object):
    def __init__(self, name):
        self.name = name
        
    def __call__(self, f):
        
        @wraps(f)
        def decorator(inst, *args, **kwargs):
            call("%s-pre" % self.name, inst, args, kwargs)            
            retval = f(inst, *args, **kwargs)
            call("%s-post" % self.name, inst, retval)            
            return retval
        return decorator

And here’s my Bus Metaclass that combines most of the above.

The ease of wrapping the target class is accomplished by cdict which is a dictionary of
every defined attribute of the target class. As you can see it’s trivial to spin
through and decorate every callable with the BusDecorator


class BusWrap(type):
    
    def __new__(mcs, clsname, bases, cdict):

        modName = cdict.get("__module__", "unknownclass")
        
        for name in cdict.keys():
            prefix = "%s.%s.%s" % ( modName, clsname, name)
            if callable(cdict[name]):                
                cdict[name] = BusDecorator(prefix)(cdict[name])
        
        return type.__new__(mcs, name, bases, cdict)
        

Now give a dirt simple class like

            
class Foo(object):
    __metaclass__ = BusWrap
    
    def __init__(self):
        print "init'd"
        
    def bar(self):
        print "bar"
        
    def blah(self):
        print "blah"
        
    def ich(self):
        print "ich"
        
    def ego(self):
        print "lego"
        
    def say(self, *args):
        print "Saying ", args
      

And two service handlers to pre Foo.bar being called and after Foo.ego is called


@Subscribe("__main__.Foo.bar-pre")        
def preBar(inst, *args, **kwargs):
    if not hasattr(inst, "ext_info"):
        inst.ext_info = "Here"
        
@Subscribe("__main__.Foo.ego-post")
def postEgo(inst, *args, **kwargs):
    if hasattr(inst, "ext_info"):
        print "Extended info is ", inst.ext_info

Our test shows….

x = Foo()
x.bar()
x.blah()
x.ich()
x.ego()
x.say("abba", "dabba")

this as output

__main__.Foo.__init__-pre (<__main__.__init__ object at 0x02637890>, (), {}) {}
init'd
__main__.Foo.__init__-post (<__main__.__init__ object at 0x02637890>, None) {}
__main__.Foo.bar-pre (<__main__.__init__ object at 0x02637890>, (), {}) {}
bar
__main__.Foo.bar-post (<__main__.__init__ object at 0x02637890>, None) {}
__main__.Foo.blah-pre (<__main__.__init__ object at 0x02637890>, (), {}) {}
blah
__main__.Foo.blah-post (<__main__.__init__ object at 0x02637890>, None) {}
__main__.Foo.ich-pre (<__main__.__init__ object at 0x02637890>, (), {}) {}
ich
__main__.Foo.ich-post (<__main__.__init__ object at 0x02637890>, None) {}
__main__.Foo.ego-pre (<__main__.__init__ object at 0x02637890>, (), {}) {}
lego
__main__.Foo.ego-post (<__main__.__init__ object at 0x02637890>, None) {}
Extended info is  Here
__main__.Foo.say-pre (<__main__.__init__ object at 0x02637890>, ('abba', 'dabba'), {}) {}
Saying  ('abba', 'dabba')
__main__.Foo.say-post (<__main__.__init__ object at 0x02637890>, None) {}


Plugin’s for python

As the #1 google result for “python plugin”, the linked to blog post is extremely valuable for jump starting research into implementing your own plugin system ( like me ) or finding one that is viable for implementation in your project ( possibly like me ).

These resources highlight one reason why there is not a standard Python plug-in framework: there are a variety of different capabilities that a user may want, and the complexity of the framework generally increases as these new capabilities are added

http://wehart.blogspot.com/2009/01/python-plugin-frameworks.html

My client’s hate me

I’ve been using mdb-tools, a open source package for extracting Access MDB files, and entering whole new depths of
conceptual joy. Application two and three were written in ASP by a college student, they were stop gap measures meant to live for a year or so… Fast forward 9 years and my client’s hired me to “fix” this problem among others.

In 2004 I replaced a USAF MS Access program used by my squadron with a PHP5 app, now 2011 I feel like I’ve stepped into a time machine and back in the hell that is MS Access and legacy ASP.

   CREATE TABLE CredentialLog
 (
	LogID			Integer, 
	StudentID			Text (100), 
	Cycle			Text (100), 
	AppTo			Text (100), 
	AppFor			Text (100), 
	DayPhone			Text (100), 
	Letter1			Text (100), 
	Letter2			Text (100), 
	Letter3			Text (100), 
	Letter4			Text (100), 
	Letter5			Text (100), 
	Letter6			Text (100), 
	Letter7			Text (100), 
	Letter8			Text (100), 
	Letter9			Text (100), 
	Letter10			Text (100), 
	Letter11			Text (100), 
	Letter12			Text (100), 
	Letter13			Text (100), 
	Letter14			Text (100), 
	Letter15			Text (100), 
	DateRec1			DATETIME, 
	DateRec2			DATETIME, 
	DateRec3			DATETIME, 
	DateRec4			DATETIME, 
	DateRec5			DATETIME, 
	DateRec6			DATETIME, 
	DateRec7			DATETIME, 
	DateRec8			DATETIME, 
	DateRec9			DATETIME, 
	DateRec10			DATETIME, 
	DateRec11			DATETIME, 
	DateRec12			DATETIME, 
	DateRec13			DATETIME, 
	DateRec14			DATETIME, 
	DateRec15			DATETIME, 
	DateSent1			DATETIME, 
	DateSent2			DATETIME, 
	DateSent3			DATETIME, 
	DateSent4			DATETIME, 
	DateSent5			DATETIME, 
	DateSent6			DATETIME, 
	DateSent7			DATETIME, 
	DateSent8			DATETIME, 
	DateSent9			DATETIME, 
	DateSent10			DATETIME, 
	DateSent11			DATETIME, 
	DateSent12			DATETIME, 
	DateSent13			DATETIME, 
	DateSent14			DATETIME, 
	DateSent15			DATETIME, 
	Initial1			Text (8), 
	Initial2			Text (8), 
	Initial3			Text (8), 
	Initial4			Text (8), 
	Initial5			Text (8), 
	Initial6			Text (8), 
	Initial7			Text (8), 
	Initial8			Text (8), 
	Initial9			Text (8), 
	Initial10			Text (8), 
	Initial11			Text (8), 
	Initial12			Text (8), 
	Initial13			Text (8), 
	Initial14			Text (8), 
	Initial15			Text (8)
);

A web-console implemented in nodeJS

Reading over this ( http://gonzalo123.wordpress.com/2011/05/16/web-console-with-node-js/ ) , I felt this was a pretty good and relatively real world example of nodeJS’s capabilities. I did find one line near the end very poignant and reflective of my own opinions at the end of the show & tell post

I don’t know if node.js is the future or is just another hype, but it’s easy. And cool. Really cool.

New language to the toolset: NodeJS

I’ve started self-teaching myself NodeJS and not surprisingly there was a recent popular reddit link pointing to a not very short introduction to nodejs here ( http://anders.janmyr.com/2011/05/not-very-short-introduction-to-nodejs.html ) . At this point I haven’t found any good books for reference lookup… but hopefully in a week or so I’ll have gotten spooled up enough to be able to report back.

In the meantime as I use Ubuntu for everything but games, this short article here ( http://www.codediesel.com/linux/installing-node-js-on-ubuntu-10-04/ ) was enough to get a working environment running though I recommend using a pristine virtual machine instance to keep things simple.

Last point, a reliable friend & peer of mine recommended I check out this MVC like framework for NodeJS ( http://expressjs.com/ ). Preliminary it feels like a hybrid between Twistd and CherryPy in Python or Limonade PHP as far as style… aiming more for simplicity over feature/complexity.