Posts tagged with “programming”


Tue 24 Jul

Creating and importing custom Python packages

I'm building a custom Python package for a project I'm working on, and it took me longer than it should have to figure out how to get the import behavior I wanted for that package. The directory structure looks like this:

project_dir/
    foo.py
    package/
        __init__.py
        Bar.py
        Baz.py
        Qux.py

Each of the files in the package defines some classes; for now, we can assume they each have just one eponymous class. foo.py is the driver script that the user actually runs, which mainly just accepts command line arguments and imports the package.

What I'd like to be able to do in foo.py is say:

import package
b = package.Bar("asdf")
b.something()

If all my classes were in a single Python source file called package.py, this would be the default behavior. Alternatively, I could say from package import * in foo.py, but that would import my modules directly and frankly I think it looks ugly.

The way to achieve the desired behavior is to perform the imports of each module in the package in the __init__.py file, like so:

# Explicit relative imports: these work on Python 2.5+ and are required on Python 3.
from .Bar import *
from .Baz import *
from .Qux import *

Now, when we import the package, all the associated modules come with it.
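To make this concrete, here is a minimal, self-contained sketch that builds a throwaway copy of the layout in a temporary directory and then imports it the way foo.py would. The body of Bar (its constructor argument and what something() returns) is invented for illustration, and only one of the three modules is created to keep it short. It assumes Python 3, which is why __init__.py uses the explicit relative form of the import.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Build a throwaway copy of the layout in a temporary directory.
project_dir = tempfile.mkdtemp()
package_dir = os.path.join(project_dir, "package")
os.mkdir(package_dir)

# package/Bar.py: one module defining its eponymous class.
# The class body is hypothetical; the post does not show it.
with open(os.path.join(package_dir, "Bar.py"), "w") as f:
    f.write(textwrap.dedent('''
        class Bar(object):
            def __init__(self, name):
                self.name = name

            def something(self):
                return "Bar(%s)" % self.name
    '''))

# package/__init__.py: pull the module's public names into the package.
with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write("from .Bar import *\n")

# What foo.py does, with project_dir on the import path.
sys.path.insert(0, project_dir)
package = importlib.import_module("package")
b = package.Bar("asdf")
result = b.something()
print(result)
```

One side effect worth knowing about: because the class and its module share a name, package.Bar ends up referring to the class rather than to the Bar submodule, which is exactly the behavior I was after.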

I found this post very helpful in understanding how modules work, as well as the official documentation.


Fri 2 Mar

Gitorious on Ubuntu Server 11.10

I set up a Gitorious installation for my research group today, following the quite good instructions from Lucas here. I had to make a couple minor adjustments to his setup steps, which I'll document below.

I started with a fresh installation of Ubuntu Server 11.10. To get started, run the following:

sudo apt-get install build-essential tcl-dev libgeoip-dev postfix apache2 mysql-server mysql-client apg libsqlite3-dev imagemagick libpcre3-dev zlib1g-dev libyaml-dev libmysqlclient-dev apache2-dev libonig-dev ruby-dev rubygems libmysql-ruby libdbd-mysql-ruby libmagick++-dev zip unzip memcached git-core git-svn git-doc git-cvs irb git sphinxsearch libcurl4-openssl-dev libxslt1-dev libxslt-ruby

You'll be asked to set up a mysql root password; remember what you enter.

sudo gem install rake daemons rmagick stompserver passenger bundler

cd /var/www
git clone git://gitorious.org/gitorious/mainline.git gitorious
# The submodule commands must be run from inside the new clone.
cd gitorious
git submodule init
git submodule update

cd /var/www/gitorious/doc/templates/ubuntu/ && sudo cp git-daemon git-poller git-ultrasphinx stomp /etc/init.d/ && cd /etc/init.d/ && sudo chmod 755 git-daemon git-poller git-ultrasphinx stomp
sudo update-rc.d git-daemon defaults && sudo update-rc.d git-poller defaults && sudo update-rc.d git-ultrasphinx defaults && sudo update-rc.d stomp defaults

sudo ln -s /usr /opt/ruby-enterprise

sudo $(gem contents passenger | grep passenger-install-apache2-module)

At this point, the Passenger installer started by the last command will ask you to copy some configuration code into a file. It will look something like what's in the echo below:

echo "LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-3.0.11/ext/apache2/mod_passenger.so
    PassengerRoot /var/lib/gems/1.8/gems/passenger-3.0.11
    PassengerRuby /usr/bin/ruby1.8" | sudo tee /etc/apache2/mods-available/passenger.load

Let's move on.

sudo a2enmod passenger && sudo a2enmod rewrite && sudo a2enmod ssl

# NOTE: This step requires thinking. Replace the server name appropriately.
echo "<VirtualHost *:80>
    ServerName your.website.com
    DocumentRoot /var/www/gitorious/public
</VirtualHost>" | sudo tee /etc/apache2/sites-available/gitorious

# Clearly, use a real SSL cert/key... 
echo "<IfModule mod_ssl.c>
    <VirtualHost _default_:443>
        DocumentRoot /var/www/gitorious/public
        SSLEngine on
        SSLCertificateFile    /etc/ssl/certs/ssl-cert-snakeoil.pem
        SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key
        BrowserMatch ".*MSIE.*" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
    </VirtualHost>
</IfModule>" | sudo tee /etc/apache2/sites-available/gitorious-ssl

sudo a2dissite default && sudo a2dissite default-ssl && sudo a2ensite gitorious && sudo a2ensite gitorious-ssl

mysql -u root -p

# In mysql, enter:
# mysql> GRANT ALL PRIVILEGES ON *.* TO 'gitorious'@'localhost' IDENTIFIED BY '<your password>' WITH GRANT OPTION;
# mysql> FLUSH PRIVILEGES;

cd /var/www/gitorious/ && sudo bundle install && sudo bundle pack

sudo adduser --system --home /var/www/gitorious/ --no-create-home --group --shell /bin/bash git
sudo chown -R git:git /var/www/gitorious

Alright. At this point we're almost done. The next couple steps have some room for creativity (aka, pick options and paths that make sense for how you want to deploy Gitorious).

sudo su git
cd /var/www/gitorious
mkdir .ssh && touch .ssh/authorized_keys && chmod 700 .ssh && chmod 600 .ssh/authorized_keys && mkdir tmp/pids && mkdir repositories && mkdir tarballs

cp config/database.sample.yml config/database.yml && cp config/gitorious.sample.yml config/gitorious.yml && cp config/broker.yml.example config/broker.yml

Follow the configuration instructions provided by Lucas. The next step, quoting him: "Because of an incompatibility of RubyGems with Rails < 2.3.11 you need to add the following line at the top of config/boot.rb:"

require 'thread'

(I can verify that this is still necessary in 11.10.) Back to Lucas; let's wrap up:

export RAILS_ENV=production && bundle exec rake db:create && bundle exec rake db:migrate && bundle exec rake ultrasphinx:bootstrap

# NOTE: The path to bundle has changed in 11.10! This is an update.
# Open the crontab for editing, then add the line below to it.
crontab -e
* * * * * cd /var/www/gitorious && /usr/local/bin/bundle exec rake ultrasphinx:index RAILS_ENV=production

env RAILS_ENV=production ruby1.8 script/create_admin

Now, a couple of small changes before you're ready to run. Some paths have changed in 11.10, as noted by a very helpful commenter on Lucas' article. First, we need to update /etc/init.d/stomp: edit the GEMS_HOME line so that it reads GEMS_HOME="/usr/local". Next, we need to edit /etc/init.d/git-daemon and /etc/init.d/git-poller. As provided, these are each run with /usr/bin/ruby; I had to modify each to run with bundle exec. So make the following changes to the two respective files:

# /etc/init.d/git-daemon
GIT_DAEMON="$RUBY_HOME/local/bin/bundle exec $GITORIOUS_HOME/script/git-daemon -d"

# /etc/init.d/git-poller
GIT_POLLER="$RUBY_HOME/local/bin/bundle exec $GITORIOUS_HOME/script/poller"

At this point, I believe you should be able to reboot and have a working installation of Gitorious. Please let me know if I missed anything so I can update this (I'm actually re-enabling comments on this blog just for this). A couple of gotchas I ran into while I was setting this up:

  • Gitorious relies on several background scripts, most of which are located in the script directory. It also relies on stompserver for message passing. If the site is very slow, especially if pushes don't work, repository creation takes a long time, etc., check that both stompserver and poller are running. Note also that you can see the log output of the gitorious scripts under $GITORIOUS_HOME/tmp/pids.
  • If you can't push to a repo, stompserver is probably not running.
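If you'd rather script that check than eyeball the process list, something like the following rough sketch would do it. It simply looks for the strings "stompserver" and "script/poller" in the command lines reported by ps; those match patterns are my assumption about how the processes show up, so adjust them to whatever your process list actually shows. The function names are made up for this example.

```python
import subprocess

# Substrings expected in the command lines of the background processes.
# These patterns are assumptions; adjust them for your installation.
REQUIRED = ("stompserver", "script/poller")


def running_commands():
    """Return the full command line of every running process, one per line."""
    return subprocess.check_output(["ps", "-eo", "args"], universal_newlines=True)


def missing_processes(ps_output, required=REQUIRED):
    """Return the required names that appear in no command line of ps_output."""
    lines = ps_output.splitlines()
    return [name for name in required
            if not any(name in line for line in lines)]


# Typical use on the server:
#     for name in missing_processes(running_commands()):
#         print("not running: " + name)
```

An empty result means both processes were found; anything it returns is the first thing to restart.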

I'm pretty excited about having a simple way to host our projects internally. While I'm a big fan of Github, and our group pushes most of our public code there, having a private hosting location will also be helpful for things like in-submission papers, configuration files, and sensitive data. Our particular installation takes advantage of the very reliable file storage system that our department offers, so by using it all of our work can take advantage of their replication and automated backup/recovery systems. This is really important to me after hearing a harrowing tale from our sysadmin of a grad student who lost 3 years of work due to the theft of the only server that contained his data and code. Eek!


Sat 23 Apr

File reading performance in Python

There are a few ways to read a file in Python, some of which are outlined in this page about their relative performance. I am working on a project right now that involves reading large amounts of data from text files, so I repeated the analysis on Python 2.6.6, the version currently shipping with Ubuntu 10.10. I ran three implementations (below) against a file with 1 million lines.

My test script is available here, and the functions I tested are below. Here were my results:

Script      Time (sec)   Lines read per sec
fileread1   0.1695       5,899,280
fileread2   1.6387       610,236
fileread3   0.1278       7,823,156
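# The three implementations tested (Python 2.6):
import fileinput

def fileread1():
    # Call readlines() repeatedly until it returns nothing.
    file = open("test.txt")
    while 1:
        line = file.readlines()
        if not line:
            break
    file.close()

def fileread2():
    # Iterate using the fileinput module.
    for l in fileinput.input("test.txt"):
        pass

def fileread3():
    # Iterate over the file object directly.
    file = open("test.txt")
    for l in file:
        pass
    file.close()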
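If you want to repeat the comparison yourself, here is a small sketch of a timing harness. It is not my original test script: it targets Python 3, uses the standard timeit module, generates its own 10,000-line temporary file (much smaller than the 1 million lines I used), and compares only the readlines() and direct-iteration approaches. The function names are invented for this example, and the timings you get will depend on your machine and Python version.

```python
import os
import tempfile
import timeit


def read_with_readlines(path):
    """Load every line into a list at once and return how many there were."""
    with open(path) as handle:
        return len(handle.readlines())


def read_by_iteration(path):
    """Iterate over the file object one line at a time and count the lines."""
    count = 0
    with open(path) as handle:
        for _ in handle:
            count += 1
    return count


# Generate a temporary test file (assumed size: 10,000 short lines).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    for i in range(10000):
        handle.write("line %d\n" % i)

line_counts = {}
for func in (read_with_readlines, read_by_iteration):
    # Sanity check first: both approaches should see the same number of lines.
    line_counts[func.__name__] = func(path)
    # Best of 3 runs, each making 5 passes over the file.
    best = min(timeit.repeat(lambda: func(path), number=5, repeat=3)) / 5
    print("%-20s %.6f sec per pass" % (func.__name__, best))

os.remove(path)
```

Taking the minimum of several repeats is the usual way to reduce noise from whatever else the machine is doing.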

Sun 12 Dec

On making a minified Click Modular Router driver for OpenWRT

My group is using the Click Modular Router for a project we're doing. We've written several custom elements for our configuration, and we're attempting to run it on space-constrained devices: Ubiquiti NanoStation M5s, which have only 4MB of ROM. Thus, we need to build a minified version of Click that includes only the elements we actually use. You can do this by passing a series of elements to click-mkmindriver in the form "-E <element> -E <element> ...". To extract the elements we're actually using in our Click configuration files, we can run this bash one-liner from our configurations directory:

cat *.click | tr ' ' '\n' | tr '(' '\n' | egrep "^[A-Z]" | grep "[a-z]$" | sort | uniq | sed "s/^/ -E /g "

Note that this assumes that all your Click elements begin with an upper-case letter. Fortunately, it's simple to remove false positives.
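For anyone who finds the pipeline hard to read, here is roughly the same logic as a Python sketch. It splits on the same characters as the two tr calls, applies the same two filters, and produces the same sorted, de-duplicated list of -E flags. The function names and the sample configuration line are made up for illustration, and it shares the pipeline's assumption that element names start with an upper-case letter and end with a lower-case one.

```python
import re


def extract_elements(config_text):
    """Return the sorted, de-duplicated candidate element names in config_text."""
    # Same splitting as the two tr calls: break on spaces, '(' and newlines.
    tokens = re.split(r"[ (\n]", config_text)
    # Same filters as egrep/grep: starts with A-Z and ends with a-z.
    names = {t for t in tokens
             if re.match(r"[A-Z]", t) and re.search(r"[a-z]$", t)}
    return sorted(names)


def to_flags(names):
    """Format the names as a '-E Name -E Name ...' argument string."""
    return " ".join("-E " + name for name in names)


# Hypothetical one-line configuration, just to exercise the functions.
sample = "src :: FromDevice(eth0) -> Queue(100) -> ToDevice(eth1);\n"
flags = to_flags(extract_elements(sample))
print(flags)
```

Like the shell version, this will pick up any capitalized word that happens to appear in a comment, so the output still needs a quick look before use.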


Sun 3 Oct

Instant Runoff Voting: A bash implementation

I love shell scripts. If I have something simple to write that a normal person would try to do in a scripting language like Perl or Python, I try to do it with bash. And, as much as possible, I try to avoid using fancy stuff like variables, sed, or awk.

I didn't succeed in the latter goal this time, but I did succeed in implementing Instant Runoff Voting as a bash script. It takes as input a file called irv.txt, which lists each ballot (ranked set of choices) on a separate line as a comma-delimited ordered list.

Here's a sample input file, and here's the script itself. It could be much simpler and probably cleaner, but I tried to make it clear what was going on at each step (and demonstrate the power of piped commands).
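For the normal people mentioned above, here is a minimal Python sketch of the same algorithm over the same input format (one comma-delimited ballot per line). It is not a translation of my bash script: the function names are invented, the candidate names in the sample are placeholders, and the tie-breaking rule (when several candidates share the fewest votes, the alphabetically first one is eliminated) is an assumption made purely so the result is deterministic.

```python
def parse_ballots(text):
    """Turn comma-delimited lines into a list of ranked-choice lists."""
    ballots = []
    for line in text.splitlines():
        choices = [c.strip() for c in line.split(",") if c.strip()]
        if choices:
            ballots.append(choices)
    return ballots


def instant_runoff(ballots):
    """Return the IRV winner, or None if every ballot is exhausted."""
    remaining = {c for ballot in ballots for c in ballot}
    while remaining:
        # Count each ballot for its highest-ranked remaining candidate.
        tally = dict.fromkeys(remaining, 0)
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        total = sum(tally.values())
        leader = max(remaining, key=lambda c: tally[c])
        # A strict majority of the ballots still in play wins outright.
        if tally[leader] * 2 > total:
            return leader
        # Otherwise eliminate the candidate with the fewest votes
        # (assumed tie-break: alphabetically first among those tied).
        loser = min(sorted(remaining), key=lambda c: tally[c])
        remaining.discard(loser)
    return None


# Placeholder ballots in the irv.txt format.
sample = "A,B,C\nA,C,B\nB,C,A\nB,C,A\nC,B,A\n"
winner = instant_runoff(parse_ballots(sample))
print(winner)
```

In this sample nobody has a majority in the first round, so C (one first-choice vote) is eliminated and that ballot transfers to B, who then wins 3 to 2.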
