Development notebook

Thursday, December 15, 2011

Index webpage

I like to be able to point my browser at my media collection, so I expose the content using a pair of scripts. The first opens the index and lists the contents as a set of anchors. The second serves a file for download and sets its name.

I also create a folder called 'hashes' that is accessible to my web server. In that folder I create symbolic links to the hash folders that were used in the index setup.
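Creating those sixteen links can be scripted. This is only a sketch, written in Python 3 syntax rather than the Python 2 used in the listings here. link_hash_dirs is a hypothetical helper name, and the two directories in the usage comment are the ones used elsewhere in these posts.

```python
import os

HEX_DIGITS = "0123456789abcdef"

def link_hash_dirs(by_hash_root, web_root):
    """Symlink each single-character hash folder into the web-visible folder."""
    created = []
    for digit in HEX_DIGITS:
        target = os.path.join(by_hash_root, digit)
        link = os.path.join(web_root, digit)
        # Skip links that already exist so the helper can be re-run safely
        if not os.path.lexists(link):
            os.symlink(target, link)
            created.append(link)
    return created

# Example call (not executed here):
# link_hash_dirs("/var/media/by_hash", "/var/www/hashes")
```

The helper returns the links it created, so a second run returns an empty list.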

The first script:

#!/usr/bin/python
import cgi
import pickle
import urllib

index = pickle.load(open("/var/media/index.pickle"))

print "Content-Type: text/html\n"

index = list(set([(a.split('/')[-1], b, c) for a,b,c in index]))
index = sorted(index, key=lambda x: x[0].lower())

for name, size, check in index:
    # Quote the name for the URL (fetch.py unquotes it) and escape it for HTML
    print '<a href="fetch.py?%s%s">%s (%s bytes)</a><br/>' % (
        check, urllib.quote(name), cgi.escape(name), size)

This script creates the links to the second script. It opens the index file and un-pickles the content, then prints a header for the browser (the trailing newline is important for HTTP).

Next we do some transforms on the index. First we discard the path portion of the filename. Then we do a case insensitive sort by filename.

The for loop then iterates the index and prints each tuple as an anchor, referring to fetch.py, our second script.
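To make the two transforms easier to see in isolation, here is a small sketch in Python 3 syntax. prepare_index is a hypothetical name and the sample entries are made up; only the (path, size, checksum) tuple shape comes from the index itself.

```python
def prepare_index(index):
    """Strip directories, drop duplicate rows, sort by name ignoring case."""
    rows = set((path.split('/')[-1], size, checksum)
               for path, size, checksum in index)
    return sorted(rows, key=lambda row: row[0].lower())

# Made-up entries in the (path, size, checksum) shape used by the index
sample = [
    ("/cd1/holiday/beach.jpg", 1024, "aa"),
    ("/cd2/copy/beach.jpg", 1024, "aa"),
    ("/cd1/Alps.jpg", 2048, "bb"),
]

for name, size, checksum in prepare_index(sample):
    print(name, size, checksum)
```

The two beach.jpg rows collapse into one because they are identical once the path is removed, and Alps.jpg sorts first because the comparison ignores case.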

The second script:

#!/usr/bin/python
import sys
import os
import urllib

query = os.environ['QUERY_STRING']
fileid = ''
filename = 'noname'

try:
    fileid = query[:40] # Get the checksum
    fileid[39] # Check the checksum length
    int(fileid, 16) # Check the checksum could be hexadecimal
except (IndexError, ValueError):
    print "Content-Type: text/plain\n\nBad ID"
    sys.exit(0)

try:
    filename = query[40:]
    filename = urllib.unquote(filename)
    # Replace double quotes with singles for the filename="" below
    filename = filename.replace('"', "'")
except:
    pass

print "Content-Type: application/octet-stream"
print 'Content-disposition: attachment; filename="%s"\n' % filename

inobject = open("/var/www/hashes/%s/%s" % (fileid[0], fileid), 'rb')

filebuffer = inobject.read(4096)

while len(filebuffer) != 0:
    sys.stdout.write(filebuffer)
    filebuffer = inobject.read(4096)

Again, nothing too exciting here. The main reason for this script is to provide a filename to the user agent and to force a download instead of displaying the file in the browser.

As I invoke this script by CGI I grab the query string from the environment. Now I do some checking on the checksum provided. Since it comes from the user agent and ends up in the path we open, we want to make sure that it isn't too exciting (like having a '/' or '..' in it).

Our hashes are 40 hexadecimal characters and so we chop the string at 40 characters, check there is a character in the 40th position and check if it could be a hexadecimal integer. If not we show a dull error page.
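As far as I know, int(text, 16) also tolerates a leading sign, surrounding whitespace and a 0x prefix, so a stricter check may be worth having. The sketch below (Python 3 syntax) uses a regular expression instead. split_query is a hypothetical name, and the lowercase-only rule is an assumption based on hexdigest() producing lowercase output.

```python
import re

# Exactly 40 lowercase hexadecimal characters, as produced by hexdigest()
CHECKSUM_RE = re.compile(r'[0-9a-f]{40}')

def split_query(query):
    """Return (checksum, rest), or None if the query has no valid checksum."""
    fileid = query[:40]
    if CHECKSUM_RE.fullmatch(fileid) is None:
        return None
    return fileid, query[40:]
```

A caller would show the error page whenever split_query returns None.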

The next try block gets a filename from the user agent to give back to the user agent. We grab any characters after the checksum, unquote the characters from HTTP and then replace any double quotes with singles. The quote replacement helps stop the filename argument to the content-disposition header from exploding.
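The filename handling can be sketched as a small function, here in Python 3 syntax (urllib.parse.unquote replaces Python 2's urllib.unquote). disposition_filename is a hypothetical name, and stripping CR/LF is an extra precaution that the script above does not take.

```python
from urllib.parse import unquote

def disposition_filename(raw, default="noname"):
    """Decode a percent-encoded name and make it safe inside filename="..."."""
    name = unquote(raw)
    # str.replace returns a new string, so the result has to be kept
    name = name.replace('"', "'")
    # Assumption beyond the original script: drop CR/LF so the name
    # cannot start a new header line
    name = name.replace('\r', '').replace('\n', '')
    return name or default

print(disposition_filename('my%20%22best%22%20photo.jpg'))
```

An empty name falls back to the same 'noname' default the script uses.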

We then open the file and pipe it to the user agent 4k at a time.
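The same read-and-write loop is available in the standard library as shutil.copyfileobj, which takes a buffer length as its third argument. A tiny sketch in Python 3 syntax, using in-memory streams as stand-ins for the real file and for stdout:

```python
import io
import shutil

# In-memory stand-ins for the opened media file and the response stream
source = io.BytesIO(b"x" * 10000)
sink = io.BytesIO()

# Copy 4096 bytes at a time until the source is exhausted
shutil.copyfileobj(source, sink, 4096)
```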

The End :D

Thursday, November 24, 2011

Easy media index.

Recently I started going through my old backup CDs. On these discs are folders of photos that I've taken on my digital camera or copies of photos from other people's cameras. Some of the folders have place names, others have date ranges. It's quite disorganised and there are lots of duplicates.

To try and make sure I get every unique photo without storing duplicates I devised a simple system to organise the photos.

Firstly I created a directory under /var/ called media. In this directory I do some set up:

# python
>>> import pickle
>>> pickle.dump([], open("index.pickle", "w"))
>>> ^D
# mkdir by_hash
# cd by_hash
# mkdir 0 1 2 3 4 5 6 7 8 9 a b c d e f

Now I'm going to write a helper to copy files in and update the index. Firstly, we import a few modules we will use. Then we assign some paths to variables, (so we can reuse them later.)

#!/usr/bin/python
import sys
import os.path
import hashlib
import pickle

# Some paths and directories we use
index_location = "/var/media/index.pickle"
media_root = "/var/media/by_hash/%s/%s"

The name of the file to index comes from the first command line argument. We calculate the length and a checksum for the file.

# Open the file and read contents
file_argument = os.path.abspath(sys.argv[1])
filecontent = open(file_argument, 'rb').read()
filelength = len(filecontent)
checksum = hashlib.sha1(filecontent).hexdigest()
file_entry = (file_argument, filelength, checksum)
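Reading the whole file into memory is fine for photos, but for larger media the length and checksum can be computed a chunk at a time. This is a sketch in Python 3 syntax, not what the helper above does. sha1_and_length is a hypothetical name and the 64 KiB chunk size is an arbitrary choice.

```python
import hashlib

def sha1_and_length(path, chunk_size=65536):
    """Return (length, hex SHA-1) of a file without loading it all at once."""
    digest = hashlib.sha1()
    length = 0
    with open(path, 'rb') as handle:
        # iter() with a sentinel keeps calling read() until it returns b''
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
            length += len(chunk)
    return length, digest.hexdigest()
```

The trade-off is that the file then has to be copied into by_hash separately rather than written from the buffer already in memory.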

Now we load the index and check if a file with this length and checksum is already indexed.

# Check index for any files with the same length and checksum.
index = pickle.load(open(index_location))
indexed_files = set([(entry_length, entry_checksum) 
                     for entry_name, entry_length, entry_checksum
                     in index])
indexed = (filelength, checksum) in indexed_files

If the file isn't indexed we add it to the correct by_hash subdirectory and add it to the index.

# If no file with this length and checksum exist add it.
if not indexed:
    open(media_root % (checksum[0], checksum),'wb').write(filecontent)
    # Update index, (reload as it may take some time to write a large file.)
    index = pickle.load(open(index_location))
    index.append(file_entry)
    pickle.dump(index, open(index_location,"w"))

All finished. A short and sweet way to get one copy of each unique photo.

If it is important to capture the originating path of each copy then the index update code can be moved out of the if-block and there will be an entry for each file added.

This simple design allows for a lot of flexibility. (For example I have each of the by-hash folders symbolically linked to folders on different disk partitions to better use my available space.)

This index can now be queried directly using Python or accessed by other applications. The next post will detail creating a simple web-page which will list the index data.
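As one example of querying the index directly, the sketch below (Python 3 syntax) groups originating paths by checksum. It is only interesting if every copy is recorded, that is, if the index update has been moved out of the if-block. paths_by_checksum is a hypothetical name and the entries are made up.

```python
from collections import defaultdict

def paths_by_checksum(index):
    """Map each checksum to the list of paths recorded for it."""
    groups = defaultdict(list)
    for path, length, checksum in index:
        groups[checksum].append(path)
    return dict(groups)

# Made-up entries in the (path, length, checksum) shape of the index
index = [
    ("/cd1/a.jpg", 10, "aa"),
    ("/cd2/a-copy.jpg", 10, "aa"),
    ("/cd1/b.jpg", 20, "bb"),
]

# Keep only the checksums that were seen under more than one path
duplicates = {checksum: paths
              for checksum, paths in paths_by_checksum(index).items()
              if len(paths) > 1}
```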

Tuesday, November 22, 2011

Many mice in a game.

After finishing the PyGame tutorial "Writing a game in Python with Pygame. Part II", I was reminded of the Commodore Amiga version of Lemmings, which had a two-mouse mode (recently World of Goo also included a multi-mouse mode).

At this point I started looking at the PyGame mouse API and realised it didn't support multiple independent mice, instead using the motion from all mice to move the one cursor. (It was good fun playing with two mice, one cursor and watching the tug-of-war.)

Supporting more than one mouse would require the mouse events to include a mouse number, so I resolved to implement a many-mouse mode for the Creeps game from the tutorial.

Firstly I needed the mouse input. As I'm using a Linux system I have a series of files in the folder:

/dev/input/mouse0
/dev/input/mouse1

These files carry the messages from the mice that are plugged in (one file per mouse). Dumping them showed that the message format was three bytes and seemed to conform to PS/2. The events can be printed using:

$ hexdump -C mouse0

My laptop touchpad doesn't appear to be linked to any of the files. When I plug in a USB mouse it gets its own file with its events. Some reading on PS/2 later, I was ready to implement.
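For reference, this is the packet layout as I understand it: the first byte holds the button bits (0x1 left, 0x2 right, 0x4 middle) and the sign bits for the two movement bytes (0x10 for X, 0x20 for Y), and the second and third bytes hold the X and Y movement. A stand-alone sketch of the decoding in Python 3 syntax, where decode_packet is a hypothetical name:

```python
def decode_packet(packet):
    """Decode one three-byte PS/2 mouse packet given as a bytes object."""
    flags, x_d, y_d = packet[0], packet[1], packet[2]
    # A set sign bit means the movement byte is negative (two's complement)
    if flags & 0x10:
        x_d -= 256
    if flags & 0x20:
        y_d -= 256
    return {
        'lmb': bool(flags & 0x01),
        'rmb': bool(flags & 0x02),
        'mmb': bool(flags & 0x04),
        'dx': x_d,
        'dy': y_d,
    }
```

Subtracting 256 gives the same result as the (256 - x) * -1 form used in the Mouse class.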

The mice module opens all the input files and sets the handles to non-blocking so that read will not block if there are no bytes waiting.

>>> import os
>>> import fcntl
>>> mouse_files = [(filename[5:],open('/dev/input/%s' % filename))
...     for filename in os.listdir('/dev/input')
...     if filename.startswith('mouse')]
>>> for mouse_no, mouse_file in mouse_files: # Set non-blocking on each file
...     fcntl.fcntl(mouse_file.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
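To see what non-blocking buys us without a mouse attached, the sketch below (Python 3 syntax) uses a pipe as a stand-in for a mouse file. set_nonblocking is a hypothetical helper. Unlike the loop above, it reads the current flags first and adds O_NONBLOCK to them rather than replacing them.

```python
import fcntl
import os

def set_nonblocking(fd):
    """Add O_NONBLOCK to a descriptor while keeping its other flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

# A pipe stands in for /dev/input/mouseN
read_end, write_end = os.pipe()
set_nonblocking(read_end)

try:
    waiting = os.read(read_end, 3)
except BlockingIOError:
    # Nothing to read yet: the call returns at once instead of waiting
    waiting = b''

os.write(write_end, b'\x08\x01\x00')
packet = os.read(read_end, 3)

os.close(read_end)
os.close(write_end)
```

The exception on an empty read is why the poll method wraps its read in a try block.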

We also define a Mouse class for each mouse file found which when polled tracks the byte stream from the file and generates the events with an ID associated.

>>> class Mouse(object):
...     def __init__(self, in_file, num, context=None):
...         self.num = num
...         self.f = in_file
...         self.b = ''
...         self.lmb = False
...         self.rmb = False
...         self.mmb = False
...         self.x = 0
...         self.y = 0
...         if context is None:
...             context = [-99999,99999,-99999,99999]
...         self.min_x, self.max_x, self.min_y, self.max_y = context

The b member holds any partial message that has been read (each message is three bytes). The context argument sets an absolute coordinate range for the pointer. I think the other members are fairly clear.

...     def poll(self, notify=None):
...         read_a_message = False
...         expect_n_bytes = 3
...         # Check how many bytes we have buffered
...         if len(self.b):
...             # Reduce our expected bytes by the buffer length
...             expect_n_bytes = expect_n_bytes - len(self.b)
...         try:
...             i = self.f.read(expect_n_bytes)
...         except:
...             i = '' # On error set to zero length string
...         # Was the input the length we were expecting?
...         if len(i) == expect_n_bytes:
...             read_a_message = True
...             if expect_n_bytes != 3:
...                 # Merge any buffered bytes
...                 i = self.b + i
...                 self.b = ''
...             # Dump the message; with no callback, dump uses its default printer
...             if notify is None:
...                 self.dump(i)
...             else:
...                 self.dump(i, notify)
...         else:
...             # Buffer partial message
...             self.b = self.b + i
...         return read_a_message
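The buffering idea in poll can also be expressed on its own, away from the file handling. A sketch in Python 3 syntax with a hypothetical PacketBuffer class, which accepts whatever bytes arrive and hands back only complete three-byte messages:

```python
class PacketBuffer(object):
    """Collect bytes and hand back complete three-byte packets."""
    PACKET_SIZE = 3

    def __init__(self):
        self.pending = b''

    def feed(self, data):
        """Add data to the buffer and return a list of whole packets."""
        self.pending += data
        packets = []
        while len(self.pending) >= self.PACKET_SIZE:
            packets.append(self.pending[:self.PACKET_SIZE])
            self.pending = self.pending[self.PACKET_SIZE:]
        return packets

buf = PacketBuffer()
first = buf.feed(b'\x08\x01')    # two bytes only, nothing complete yet
second = buf.feed(b'\x00\x08')   # completes one packet, one byte left over
```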

This helper is used when no callback is passed. It prints each message.

...     def _notify_print(num, message, values):
...         # Default callback: print the device, then the event details
...         print "Device %s," % num,
...         if message == 'delta':
...             # values is [x, y, x_d, y_d]; show the relative movement
...             print "(%s, %s)" % (values[2], values[3]),
...         else:
...             # values is the button name: 'lmb', 'rmb' or 'mmb'
...             print "%s" % values.upper(),
...         print "%s" % message

The dump method changes the three byte PS/2 message into an event that makes more sense to an application: mouse button down, mouse button up, or mouse move. If notify is supplied then it is assumed to be a callable and invoked with each message. Otherwise the message is printed.

...     def dump(self, i, notify=_notify_print):
...         i = (ord(i[0]), ord(i[1]), ord(i[2]))
...         move_left = i[0] & 0x10 > 0
...         move_down = i[0] & 0x20 > 0
...         lmb = i[0] & 0x1 > 0
...         rmb = i[0] & 0x2 > 0
...         mmb = i[0] & 0x4 > 0
...         x_d = i[1]
...         y_d = i[2]
...         if move_left:
...             x_d = (256 - x_d) * -1
...         if move_down:
...             y_d = (256 - y_d) * -1
...         for button, name in ((lmb,'lmb'),(rmb,'rmb'),(mmb,'mmb')):
...             if button:
...                 if not getattr(self, name):
...                     # Button is down in message, not down in member
...                     setattr(self, name, True)
...                     notify(self.num, 'down', name)
...             else:
...                 if getattr(self, name):
...                     setattr(self, name, False)
...                     # Button up
...                     notify(self.num, 'up', name)
...         if x_d != 0 or y_d != 0:
...             # Update our absolute position, clamped by context
...             new_x = self.x + x_d
...             new_y = self.y - y_d
...             self.x = new_x
...             if new_x < self.min_x: self.x = self.min_x
...             if new_x > self.max_x: self.x = self.max_x
...             self.y = new_y
...             if new_y < self.min_y: self.y = self.min_y
...             if new_y > self.max_y: self.y = self.max_y 
...             notify(self.num, 'delta', [self.x, self.y, x_d, y_d])
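The four clamping comparisons at the end of dump can be folded into one small function. A sketch in Python 3 syntax, where clamp is a hypothetical helper name and the bounds in the example are the game-area limits used further down:

```python
def clamp(value, low, high):
    """Return value limited to the inclusive range [low, high]."""
    return max(low, min(high, value))

# Example: keep a pointer inside the range 50..349
x = clamp(400, 50, 349)
y = clamp(10, 50, 349)
```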

Lastly we create a member in the module to store the Mouse instances for each file.

>>> mice = [Mouse(mouse_file, mouse_no)
...         for mouse_no, mouse_file in mouse_files]

Now to adapt Creeps to support the new mouse protocol. Firstly we need to manage the mouse as it is seen by PyGame.

>>> import pygame.mouse
>>> pygame.mouse.set_visible(0)

And in the "while True:" game loop we need to reset the mouse position each frame to stop the (invisible) mouse cursor from escaping the application.

...         time_passed = clock.tick(50) # insert below here
...         pygame.mouse.set_pos([250,200])

At the top, below where we set the cursor to invisible, we need to import the mice module, create a map of mice by ID and set each mouse to only move within the game area:

>>> import mice
>>> mice_by_id = {}
>>> for mouse in mice.mice:
...     mouse.min_x = 50
...     mouse.max_x = 349
...     mouse.min_y = 50
...     mouse.max_y = 349
...     mice_by_id[mouse.num] = mouse

Without a mouse cursor it is not possible to see where the mouse is. To minimise effort we largely lift the code for the creeps and use a stationary creep as each cursor:

>>> creeps = None # insert below here
>>> cursors = None
>>> cursors_list = None

Then to set up the cursors:

>>> # insert below the creeps loop
>>> global cursors 
>>> cursors = pygame.sprite.Group()
>>> global cursors_list
>>> cursors_list = [] 
>>> for i in range(len(mice.mice)):
...     c = Creep(  screen=screen,
...         creep_image=choice(creep_images),  
...         explosion_images=explosion_images, 
...         field=FIELD_RECT, 
...         init_position=(randint(FIELD_RECT.left, FIELD_RECT.right),
...             randint(FIELD_RECT.top, FIELD_RECT.bottom)),  
...         init_direction=(choice([-1, 1]), choice([-1, 1])), 
...         speed=0.00) 
...     cursors.add(c)
...     cursors_list += [c]

We also need to paint the cursors each frame:

...                 creep.draw()  # insert below here
...             for cursor in cursors: 
...                 cursor.update(time_passed) 
...                 cursor.draw()

With no way to click on the close button we need to add a way to quit without it:

...                 paused = not paused  # insert below here
...             if event.key == pygame.K_ESCAPE: 
...                 exit_game()

Below the event loop add a second loop to poll the mice for events, (we will define this function soon):

...         for mouse in mice.mice:
...             mouse.poll(handle_mouse)

Now, to tie it all together, the mouse handler:

>>> def handle_mouse(device, event_type, argument):
...     mouse = mice_by_id[device]
...     if event_type == 'down':
...         for creep in creeps:
...             creep.mouse_click_event([mouse.x, mouse.y])
...     if event_type == 'delta':
...         cursors_list[int(device)].pos = vec2d(mouse.x, mouse.y)

The handler either hits a creep or moves a cursor.
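One caveat: indexing cursors_list with int(device) assumes the mouse files are numbered from 0 with no gaps. A dictionary keyed by device ID avoids that assumption. The sketch below is in Python 3 syntax. FakeCursor is a hypothetical stand-in for the Creep sprite so the example runs without PyGame, and map_cursors is a hypothetical helper name.

```python
class FakeCursor(object):
    """Hypothetical stand-in for a Creep sprite used as a cursor."""
    def __init__(self):
        self.pos = (0, 0)

def map_cursors(device_ids, cursors):
    """Pair each device ID with a cursor, in sorted device order."""
    return dict(zip(sorted(device_ids), cursors))

# For example, mouse0 and mouse2 are present but mouse1 is not
device_ids = ['2', '0']
cursor_by_device = map_cursors(device_ids, [FakeCursor(), FakeCursor()])

# In the handler this would replace cursors_list[int(device)]
cursor_by_device['2'].pos = (120, 80)
```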

Multi mouse Creeps!

Getting started!

This blog is intended to cover my development work. To start with I'll put some of my existing projects online and follow up as they develop.