Win a free book at the February Python Book Contest

This month is a special month. It’s not because of Valentine’s Day or even the exciting day where we see groundhogs. No, this month is special because I’m holding a book contest where you, the reader, get to win something free for doing absolutely nothing more than posting a comment saying that you want one of the several books I have available in the contest.

So without getting into boring details I’ll keep this short. I’ve been reviewing a lot of books lately and I think it’s time to get some books into people’s hands to enjoy themselves. This month the giveaways are all Python oriented.

So, all you have to do is take a look at the following titles and post a comment here saying that you want one of them. At the end of the month two readers will be chosen via a random list-sorting Python script I’ve whipped up for just this purpose. The winners will then get an email from the publisher, who will send a brand new e-copy of the book free of charge. I’ll also be reviewing these books at a later date for those that do not win the contest.



Python Text Processing with NLTK 2.0 Cookbook


Python 2.6 Text Processing: Beginners Guide


Python 2.6 Graphics Cookbook

Post a comment now and tell me which book you want!
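
The selection script itself isn’t shown in this post, so here is only a minimal sketch of how a random draw like the one described above could work, written for Python 3. The function name `pick_winners`, its parameters, and the sample names are hypothetical, not the actual contest script.

```python
import random

def pick_winners(commenters, count=2, seed=None):
    """Shuffle the list of commenters and return the first `count` names."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    unique = sorted(set(commenters))   # one entry per person, however many comments
    rng.shuffle(unique)                # the "random list sorting" step
    return unique[:count]

if __name__ == "__main__":
    # hypothetical commenter names, for illustration only
    print(pick_winners(["ann", "bob", "cy", "ann"], count=2))
```

Removing duplicates before shuffling keeps someone who comments several times from improving their odds.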


Simple Python: a job queue with threading

Every so often you need to use a queue to manage operations in an application. Python makes this very simple. Python also, as I’ve written about before, makes threading very easy to work with. So in this quick program I’ll describe, via comments, how to make a simple queue where each job is processed by a thread. Integrating this code to read jobs from a MySQL database would be trivial as well; simply replace the “jobs = [...]” code with a database call to a row select query.

#!/usr/bin/env python
## DATE: 2011-01-20
## FILE: queue.py
## AUTHOR: Matt Reid
## WEBSITE: http://themattreid.com
import sys
from Queue import Queue
from threading import Thread

def processor():
    '''this function will process one item from the queue'''
    if queue.empty():
        print "the Queue is empty!"
        sys.exit(1) # inside a thread this only ends the current thread
    try:
        job = queue.get()
        print "I'm operating on job item: %s"%(job)
        queue.task_done()
    except Exception:
        print "Failed to operate on job"
'''set variables'''
queue = Queue()
threads = 4

'''a list of job items. you would want this to be more advanced,
like reading from a file or database'''
jobs = [ "job1", "job2", "job3" ]

'''iterate over jobs and put each into the queue in sequence'''
for job in jobs:
    print "inserting job into the queue: %s"%(job)
    queue.put(job)

'''start some threads, each one will process one job from the queue,
so threads must be at least len(jobs) or queue.join() never returns'''
for i in range(threads):
    th = Thread(target=processor)
    th.setDaemon(True)
    th.start()

'''wait until all jobs are processed before quitting'''
queue.join()
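
The listing above hands exactly one job to each thread. If you have more jobs than threads, one common variation is to let each worker loop until it is told to stop. What follows is a minimal sketch of that variation, not the script above: it is written for Python 3 (where the module is named `queue`), and `run_jobs`, `worker_count`, and the `None` sentinel are names and choices made up for this illustration.

```python
import queue
import threading

def run_jobs(jobs, worker_count=4):
    """Process every job with a fixed pool of worker threads."""
    job_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()
            try:
                if job is None:        # sentinel: no more work for this worker
                    return
                with results_lock:     # the results list is shared between threads
                    results.append("processed %s" % job)
            finally:
                job_queue.task_done()  # always balance the get() above

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for job in jobs:
        job_queue.put(job)
    for _ in workers:
        job_queue.put(None)            # one sentinel per worker
    job_queue.join()                   # wait until every item is marked done
    for w in workers:
        w.join()
    return sorted(results)

if __name__ == "__main__":
    for line in run_jobs(["job1", "job2", "job3", "job4", "job5"], worker_count=2):
        print(line)
```

Because each worker keeps pulling from the queue, the number of threads no longer has to match the number of jobs.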

A simple load test script in Python

Lately I’ve had to do some environment load testing so I wrote this quick script. It can be modified as needed, but the basic idea is that it spawns $x threads (--threads=) and then sends two connections (or however many you want with --per-connection=) per thread to the URL (--url=). You can have it wait a configurable time between connections as well (--wait=).

A 32-character randomized string is appended to the URL so that any database or cache on the backend of the site isn’t serving data from a warm cache. To change the length, look for the 32 passed to randstr() in the script. Feel free to change and use as needed, just keep my info at the top.
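
The cache-busting idea can be looked at on its own before reading the full script. This is a minimal Python 3 sketch, not part of the script that follows: `random_token` and `cache_busting_url` are hypothetical helper names, and the `/search/` path is simply the one the script uses for my site.

```python
import random
import string

def random_token(length=32):
    """Return a random string of ASCII letters of the given length."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))

def cache_busting_url(base_url, length=32):
    """Append a random search term so each request asks for a different page."""
    return "%s/search/%s" % (base_url.rstrip("/"), random_token(length))

if __name__ == "__main__":
    # example.com is a placeholder host
    print(cache_busting_url("http://example.com"))
```

Since every request carries a different token, the backend is unlikely to answer from a cached copy of an earlier response.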

#!/usr/bin/python
################################################################################
## DATE: 2010-10-26
## AUTHOR: Matt Reid
## MAIL: mreid@kontrollsoft.com
## SITE: http://kontrollsoft.com
## LICENSE: BSD http://www.opensource.org/licenses/bsd-license.php
################################################################################

from __future__ import division
import threading
import sys
import urllib2
import select
import random
import string
import getopt
import time

class threader(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        # u, per and wait are module-level settings assigned in __main__
        for i in range(per):
            if wait > 0:
                time.sleep(wait)
            token = randstr(32)
            # IMPORTANT: this is where we append the search string to the main URL
            # you might need to change this for your site.
            # a local name is used so threads do not overwrite each other's URL
            target = "%s/search/%s"%(u,token)
            print "polling url: %s"%(target)
            urllib2.urlopen(target)

def randstr(length):
    # build the string two letters at a time from a table of letter pairs
    twoletters = [c+d for c in string.letters for d in string.letters]
    r = random.random
    n = len(twoletters)
    l2 = length//2
    lst = [None] * l2
    for i in xrange(l2):
        lst[i] = twoletters[int(r() * n)]
    # an odd length needs one extra single letter, added once after the loop
    if length & 1:
        lst.append(random.choice(string.letters))

    return "".join(lst)

def init_thread():
    backgrounds = []
    for thread in range(threads):
        print "Spawning thread: %s"%(thread)
        background = threader()
        background.start()
        backgrounds.append(background)
    for background in backgrounds:
        background.join()

def print_help():
    print '''loader.py - URL load test script
==================================================
Date: 2010-08-26
Website: http://themattreid.com
Author: Matt Reid
Email: themattreid@gmail.com
License: new BSD license
==================================================
Use the following flags to change default behavior

   Option                 Description
   --url=                 URL to test
   --per-connection=      Number of sequential requests per connection (default 2)
   --threads=             Number of threads for url connections (required)
   --wait=                Time to wait in-between requests
   --help                 Print this message

   -u                     Same as --url
   -p                     Same as --per-connection
   -t                     Same as --threads
   -w                     Same as --wait
   -h                     Same as --help
   '''

def main():
    init_thread()
    sys.exit(0)

if __name__ == "__main__":
    # threads: num threads/connections to open
    # u: url to hit
    # per: per connection url hits
    try:
        # a colon after a short option means it takes an argument
        options, remainder = getopt.getopt(
            sys.argv[1:], 'p:t:u:w:h', ['per-connection=',
                                        'threads=',
                                        'url=',
                                        'wait=',
                                        'help'])
    except getopt.GetoptError, err:
        print str(err)
        sys.exit(2)

    for opt, arg in options:
        if opt in ('-p', '--per-connection'):
            per = int(arg)
        elif opt in ('-t', '--threads'):
            threads = int(arg)
        elif opt in ('-u', '--url'):
            u = arg
        elif opt in ('-w', '--wait'):
            wait = int(arg)
        elif opt in ('-h', '--help'):
            print_help()
            sys.exit(2)

    try:
        threads
    except NameError:
        print "No thread quantity specified."
        print_help()
        sys.exit(2)
    try:
        per
    except NameError:
        per = 2
    try:
        u
    except NameError:
        print "No URL Specified"
        print_help()
        sys.exit(2)
    try:
        wait
    except NameError:
        wait=0

    main()

Easy Python: multi-threading MySQL queries

There are many times when writing an application that single-threaded database operations are simply too slow. In these cases it’s a matter of course that you’ll use multi-threading or forking to spawn secondary processes to handle the database actions. In this simple example for Python multi-threading you’ll see how simple it is to improve the performance of your Python app.

#!/usr/bin/python
## DATE: 2010-08-30
## AUTHOR: Matt Reid
## WEBSITE: http://themattreid.com
## LICENSE: BSD http://www.opensource.org/licenses/bsd-license.php
## Copyright 2010-present Matt Reid

import threading
import sys
import MySQLdb

class threader(threading.Thread):
    def __init__(self,db,quant):
        threading.Thread.__init__(self)
        self.db = db #each thread gets its own connection
        self.quant = quant #number of inserts this thread runs
    def run(self):
        for i in range(self.quant):
            run_insert(self.db)

def run_insert(db):
    sql = "INSERT INTO `table` (`id`,`A`,`B`,`C`) VALUES (NULL,'0','0','0');"
    try:
        cursor = db.cursor()
        cursor.execute(sql)
        db.commit()
    except MySQLdb.Error:
        print "insert failed"

def init_thread():
    backgrounds = []
    quant = tx // THREADS #split the inserts evenly across the threads
    for db in connections:
        print "Spawning thread: %s"%(db)
        background = threader(db,quant)
        background.start()
        backgrounds.append(background)
    for background in backgrounds:
        background.join()

def main():
    try:
        init_thread()
    except Exception, e:
        print "failed to initiate threads: %s"%(e)
        sys.exit(1)

    sys.exit(0)

if __name__ == "__main__":
    mysql_host = "localhost" #default localhost
    mysql_pass = "pass" #default dbbench
    mysql_user = "user" #default dbbench
    mysql_port = 3306 #default 3306
    mysql_db = "schema" #default dbbench
    THREADS = 4 #must be INT not STR
    tx = 100 #total inserts to run; an assumed value, pick your own

    #create connection pool
    connections = []
    for thread in range(THREADS):
        try:
            connections.append(MySQLdb.connect(host=mysql_host, user=mysql_user, passwd=mysql_pass, db=mysql_db, port=mysql_port))
        except MySQLdb.Error, e:
            print "Error %d: %s"%(e.args[0], e.args[1])
            sys.exit(1)

    main()
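
If you want to try the one-connection-per-thread pattern without a MySQL server at hand, here is a minimal Python 3 sketch of the same idea. Note the swap: it uses the standard library’s sqlite3 module and a temporary file instead of MySQLdb, and the `demo` table, `insert_rows`, and `run_threads` are hypothetical names for this illustration only.

```python
import os
import sqlite3
import tempfile
import threading

def insert_rows(db_path, rows_per_thread):
    """Open a private connection in this thread and insert some rows."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait if another thread holds the lock
    try:
        for _ in range(rows_per_thread):
            conn.execute("INSERT INTO demo (a, b, c) VALUES (0, 0, 0)")
            conn.commit()
    finally:
        conn.close()

def run_threads(thread_count=4, rows_per_thread=5):
    """Run the inserts in parallel and return the final row count."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, a, b, c)")
        setup.commit()
        setup.close()

        workers = [threading.Thread(target=insert_rows, args=(db_path, rows_per_thread))
                   for _ in range(thread_count)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()

        check = sqlite3.connect(db_path)
        total = check.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
        check.close()
        return total

if __name__ == "__main__":
    print(run_threads())
```

SQLite serializes writers, so this shows the structure of the pattern rather than any speedup; a MySQL server is where the parallel connections actually pay off.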
    

Easy Python: display LVM details in XML

If you need to work with LVM in your scripts but haven’t found a good method to access details about logical volumes and their volume groups, here’s a simple Python script that will print the details about any volumes on your system. This could be useful for writing a partition check script for your MySQL data directory (if you’re not using a standard monitoring system like Nagios).

import sys
import os
import subprocess

def lvm():
    print ""
    LVM_PATH = "/sbin"
    LVM_BIN = os.path.join(LVM_PATH, 'lvm')
    argv = list()
    argv.append(LVM_BIN)
    argv.append("lvs")
    argv.append("--nosuffix")
    argv.append("--noheadings")
    argv.append("--units")
    argv.append("b")
    argv.append("--separator")
    argv.append(";")
    argv.append("-o")
    argv.append("lv_name,vg_name,lv_size")

    process = subprocess.Popen(argv, stdout=subprocess.PIPE)
    # read everything lvs prints, not just the first line
    output = process.communicate()[0]
    lines = output.splitlines()
    for line in lines:
        line = line.strip()
        words = line.split(";")

        lvname = words[0].strip()
        vgname = words[1].strip()
        lv_size = int(words[2])
        print '''
    %s
    %s
    %s
  '''%(lvname, vgname, lv_size)

    print ""

lvm()
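
For the XML side of things, here is a minimal Python 3 sketch that turns the same semicolon-separated `lvs` output into XML with the standard library’s ElementTree. It works on a string, so it can be tried without LVM installed. The element names (`lvm`, `volume`, `lv_name`, `vg_name`, `lv_size`), the function name `lvs_to_xml`, and the sample data are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def lvs_to_xml(lvs_output):
    """Convert 'lv_name;vg_name;lv_size' lines into an XML string."""
    root = ET.Element("lvm")
    for line in lvs_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        lv_name, vg_name, lv_size = [w.strip() for w in line.split(";")]
        volume = ET.SubElement(root, "volume")
        ET.SubElement(volume, "lv_name").text = lv_name
        ET.SubElement(volume, "vg_name").text = vg_name
        ET.SubElement(volume, "lv_size").text = str(int(lv_size))  # size in bytes
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    # made-up sample in the format produced by the lvs flags used above
    sample = "  root;vg0;10737418240\n  swap;vg0;2147483648\n"
    print(lvs_to_xml(sample))
```

Building the tree with ElementTree rather than string formatting means volume names containing characters such as `&` or `<` are escaped for you.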

Easy MySQL: how to backup databases to a remote machine

Here’s a simple answer to a simple question. “How do I run a backup of MySQL to another machine without writing to the local server’s filesystem?” – this is especially useful if you are running out of space on the local server and cannot write a temporary file to the filesystem during backups.

Method one – this writes a remote file.
mysqldump [options] [db_name|--all-databases]| gzip -c | ssh user@host.com "cat > /path/to/new/file.sql.gz"

Method two – this writes directly into a remote mysql server
mysqldump [options] [db_name|--all-databases]| mysql --host=[remote host] --user=root --password=[pass] [db_name]
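
If you would rather drive method one from a script than from the shell, the same pipeline can be chained with Python’s subprocess module. This is a minimal Python 3 sketch under stated assumptions: `build_remote_backup` and `run_pipeline` are hypothetical helper names, the mysqldump options are left out, and the host and path in the example are placeholders.

```python
import shlex
import subprocess

def build_remote_backup(db_name, ssh_target, remote_path):
    """Return the three commands of method one as argument lists."""
    return [
        ["mysqldump", db_name],
        ["gzip", "-c"],
        ["ssh", ssh_target, "cat > " + shlex.quote(remote_path)],
    ]

def run_pipeline(commands):
    """Chain the commands with pipes, like `a | b | c` in a shell."""
    procs = []
    prev_stdout = None
    for index, argv in enumerate(commands):
        last = index == len(commands) - 1
        proc = subprocess.Popen(argv, stdin=prev_stdout,
                                stdout=None if last else subprocess.PIPE)
        if prev_stdout is not None:
            prev_stdout.close()  # the child owns the pipe now
        prev_stdout = proc.stdout
        procs.append(proc)
    return [p.wait() for p in procs]  # exit code of every stage

if __name__ == "__main__":
    # placeholder database, host and path
    for argv in build_remote_backup("mydb", "user@host.com", "/path/to/new/file.sql.gz"):
        print(" ".join(argv))
```

Checking every exit code matters here: a shell pipeline by default reports only the last command’s status, so a failed mysqldump can otherwise go unnoticed.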


How to: rotate wordpress posts into headline/feature status

If you’re using the new Arthemia theme for WordPress you might notice that there are two areas of the theme that articles can be promoted to, namely the Headline and Featured sections. This is controlled by category association. Basically, if you want a post in the Headline area of the theme you attach the category “headline” to it, and similarly for the Featured section. Now, let’s say you don’t want to change this manually all the time, since promoting posts to those categories by hand is time consuming if you want rotating content.

Here’s a simple solution. In this bash script I connect to MySQL and remove the current associations from posts and then randomly choose posts to be promoted to the Headline and Featured categories. This can be modified for other ideas you might have involving categories/posts/randomized associations in WordPress.

The queries contain IDs for the Headline and Featured categories. My installation, which will be different from yours, has the Headline category as ID='103' and Featured as ID='104'; replace as needed. I’m also doing some matching (see the WHERE sections) so that I don’t promote posts with certain IDs that are specific to the site for this script. You’ll want to customize the queries as needed for your site. You can find the script here: http://pastebin.com/1QqiM5rh


N900 – control all of your accounts with this script

If you own a Nokia N900 cellular device you might be interested in the ability to control all of your IM accounts from the command line. For those that do not know, the N900 runs Maemo Linux and is capable of running MySQL embedded if you so choose. Here’s a quick script I wrote to provide that functionality for IM accounts. It’s at the bottom of the page, called “im-connections”.

wiki: http://wiki.maemo.org/N900_Mission_Control#Set_all_SIP_accounts_to_online_or_offline
pastebin: http://pastebin.com/qAC57E1N


Reviewed: Python Testing by Daniel Arbuckle

I’ve recently had the pleasure of reading “Python Testing: An easy and convenient approach to testing your python projects” from Packt Publishing. It’s been a quick read but a solid set of instructions on the different methods of testing.

The book starts out very quickly with details about the various methods that are available, the means of automation for testing, and of course the environment you’d want to be in for working on the subjects that the book covers. It then, in the second chapter, moves into the guts of testing by describing the basics of doctest via syntax and some simple examples, and then moves on to a real-world example via the AVL tree. It’s all very basic testing until chapter three, where the author gets into unit testing, which is probably the most useful method in my opinion, and he goes on to prove its usefulness with examples of its use in different parts and stages of the development process. Later in the book the Python mocker is used to separate unit sections, and then the actual unittest framework is discussed with more examples and enough detail that if you don’t understand it by then, you may never. By chapter six we are into the Nose app that drives unittest, which is very useful of course.

The most useful part of the book comes toward the end, where the author discusses and then walks through the method used to create a test-driven application, and even shows examples via a whole chapter dedicated to making a testable web application frontend. Very impressive for such a quick read. Integration testing and system testing are also covered, thankfully. The final chapter covers some useful tools and techniques, of which I particularly enjoyed the section on version control hooks. If you are not using version control in your development process you need to start now; as such, the hooks for integration with the test framework are rather useful to know.

Overall this is a very nice book that discusses Python application testing from the ground up. It’s perfect for a beginner or an intermediate Python programmer who has little to no experience in automated testing methods. More advanced programmers who have already used these methods will probably not find the book too useful, except for the last chapter, which covers extra tools and techniques that they might not have seen before. If I didn’t have this book and needed to learn about Python testing, it would be my first choice and my only recommendation so far. Well written and very useful.

If there is one thing I do not like about the book, it would be the reliance on the Python CLI for running commands. I am a CLI kind of person and I keep lots of terminals open at the same time, so I prefer to write my code in an editor or IDE in one term tab, then switch to another and execute the script; I do not use the Python command line to do much of anything. Following some of the steps in the book requires that you use the CLI method, and that gets old for me. It’s a personal preference but one worth noting as there is a lot of it in the book. That’s the only thing I did not enjoy in a book that was otherwise basically perfect for the subject.


Is emacs not coloring your Python comments?

This is a simple matter with a simple solution that might help someone save time and confusion. Emacs wasn’t coloring my comments correctly so I went ahead and had it change them to red italic. If you are having similar issues you can drop the following into the .emacs file in your home directory. Enjoy. Keep in mind that if you are using Emacs in a terminal session, as opposed to the X server GUI, you will not see the italics.


(global-font-lock-mode 1)
(custom-set-variables
'(gud-gdb-command-name "gdb --annotate=1")
'(large-file-warning-threshold nil))
(custom-set-faces
'(font-lock-comment-face ((((class color) (background light)) (:foreground "red" :slant italic)))))
