Win a free book at the February Python Book Contest

This month is a special month. It’s not because of Valentines day or even the exciting day where we see groundhogs. No, this month is special because I’m have a book contest where you, the reader, get to win something free for doing absolutely nothing more than posting a comment saying that you want one of the several books I have available in the contest.

So without getting into boring details I’ll keep this short. I’ve been reviewing a lot of books lately and I think it’s time to get some books into people’s hands to enjoy themselves. This month the giveaways are all Python oriented.

So, all you have to do is take a look at the following titles and post a comment here saying that you want one of them. At the end of the month two readers will be chosen via a random list sorting python script I’ve whipped up for just this purpose. You will then get an email from the publisher who will send a brand new e-copy of the book free of charge. I’ll also be reviewing these books at a later date for those that do not win the contest.



Python Text Processing with NLTK 2.0 Cookbook


Python 2.6 Text Processing: Beginners Guide


Python 2.6 Graphics Cookbook

Post a comment now and tell me which book you want!

Read More

Easy Python: multi-threading MySQL queries

There are many times when writing an application that single threaded database operations are simply too slow. In these cases it’s a matter of course that you’ll use multi-threading or forking to spawn secondary processes to handle the database actions. In this simple example for Python multi-threading you’ll see the how simple it is to improve the performance of your python app.

#!/usr/bin/python
## DATE: 2010-08-30
## AUTHOR: Matt Reid
## WEBSITE: http://themattreid.com
## LICENSE: BSD http://www.opensource.org/licenses/bsd-license.php
## Copyright 2010-present Matt Reid

from __future__ import division
from socket import gethostname;
import threading
import sys
import os
import MySQLdb

class threader(threading.Thread):
    def __init__(self,method):
        threading.Thread.__init__(self)
        self.tx =
        self.method = method
    def run(self):
        run_insert()

def run_insert():
    sql = "INSERT INTO table (`id`,`A`,`B`,`C`) VALUES (NULL,'0','0','0');")
        try:
            cursor.execute(sql)
            db.commit()
        except:
            print "insert failed"

def init_thread(): backgrounds = []
    for db in connections:
       logger("Spawning thread: %s"%(db),"d")
       quant = tx / THREADS
       background = threader(method,quant,db)        
       background.start()
       backgrounds.append(background)
    for background in backgrounds:
       background.join()

def main():
    try:
        init_thread()
    except:
        print "failed to initiate threads"

    sys.exit(0)

if __name__ == "__main__":
    mysql_host = "localhost" #default localhost
    mysql_pass = "pass" #default dbbench
    mysql_user = "user" #default dbbench
    mysql_port = 3306 #default 3306
    mysql_db = "schema" #default dbbench
    threads = 4 #must be INT not STR #create connection pool

    connections = []
    for thread in range(THREADS):
      try:
       connections.append(MySQLdb.connect(host=mysql_host, user=mysql_user, passwd=mysql_pass, db=mysql_db, port=mysql_port))
      except MySQLdb.Error, e:
       print "Error %d: %s"%(e.args[0], e.args[1])
       sys.exit (1)

    main()
    
Read More

DDR-Drive gets 300,000 IOPS

Sure it’s expensive and hard to justify for most budgets but this is the future of drive technology happening now. http://www.ddrdrive.com

Read More

themattreid – RSS change – now feedburner

Update for all of the RSS subscribers of this site: please update your RSS feed to the new feedburner address: http://feeds.feedburner.com/Themattreid?format=xml

I’ll keep the original feed link active for a while but please change to the new one when you get a chance. Oh, why not just do it now since you’re on the internet and thinking about it, eh? It’s a good idea and a win-win situation. :)

Read More

Update: OSS MySQL Monitoring Solution

Progress on the Open Source enterprise grade MySQL monitoring system;  the schema for Monolith version 2 has been designed. Due to the many suggestions for features and the interest it has received I’ve put this on the front burner. That said, here is some more info on the next steps I’ll be taking.

  • Monitored servers will use a command line agent (called remotely) to pull information from both MySQL and the OS.
  • Historical information will include all values from global status and global variables, as well as CPU/Memory/IO/Disk usage.
  • Standard graphing functions – the ones in the list from the previous post – will gather information from various view tables that contain historical data that is collected from the agent script. User defined graphing will allow you to look at historical values over time for any of the various global status or global variable settings (integer based ones anyway).
  • The cnf file for each server will be stored in the monitoring database and available for viewing for historical purposes. 
Read More

Request: What do you want in a OpenSource MySQL Monitoring solution?

What would you like to see in a free enterprise-grade monitoring system for your daily MySQL needs?I’m rewriting Monolith – MySQL DBA Console from the ground up. This will be version 2 and I would like to get some input from the global MySQL community.So far I am going with the following; comment with any improvements/additions.

  • Variable interval polling of server statistics
  • Over 50 different alerts (see list below)
  • Graphing of various server statistics (see list below)
  • Tuning recommendations with cnf file changes to apply to server
  • Change control documents for recommended performance/security tuning
  • Threshold based alerting with multiple alert groups: info,warn,critical
  • Sorting/ordering of servers via groups. ie: client -> dev,stage,prod
  • RSS feeds for each alert group
  • XML export with user defined fields for external applications (API of sorts)
  • Slave server alerts, IO thread / SQL thread, seconds behind master (with threshold)
  • Various general stats: version, system ram size, # of schema, default table type
  • Ability to view current cnf file from server
  • Ability to view current global variables as a list
  • Email / pager reporting for alerts (choose warn/crit alerts to report on)

Alerts - can be enabled or disabled for reporting on a per server basis

  • number of connection failures (threshold)
  • number of connections errors (threshold)
  • binary logging not enabled
  • sync_binlog not set
  • max_used_connections too high compared to max_connections (ratio over 85% utilization)
  • query cache size too small: improper utilization
  • query cache size too large: improper utilization
  • memory usage over 85% of system ram
  • table scans excessive
  • tmp table to disk ratio too high
  • innodb buffer too small
  • innodb buffer too large (resources can be used elsewhere)
  • key buffer too large (resources can be used elsewhere)
  • key buffer too small
  • sort buffer too small
  • sort buffer too large (resources can be used elsewhere)
  • join buffer too small
  • join buffer too large (resources can be used elsewhere)
  • open_files_limit too small compared to open_files usage ratio
  • table locks too high
  • table cache too small
  • table cache too large (resources can be used elsewhere)
  • thread cache too small
  • thread cache too large (resources can be used elsewhere)
  • binlog size too small
  • excessive disk tmp table usage
  • flush time not zero value
  • indexes not being utilized properly
  • innodb doublewrite buffer enabled
  • innodb flush method not properly set
  • innodb transaction isolation level not properly set for ACID
  • innodb log wait time too high
  • conncurrent insert not set correctly
  • query cache not enabled
  • table locks too high
  • thread cache not enabled
  • concurrent queries too high (threshold)
  • binary logs not set to auto-purge
  • binary logging not enabled
  • slave not active if server is set to slave
  • slave sql thread stopped
  • slave io thread stopped
  • slave behind master too long (threshold)
  • slave not set as read-only
  • slave relay logs not automatically purged
  • account has global privs
  • account set to old_password
  • accounts able to be added with empty password
  • account has grant option
  • root can login remotely
  • root has no password
  • blank username detected
  • test database exists
  • usage user exists
  • general query log enabled
  • CPU usage over threshold
  • Load Average over threshold

Graphs - default

  • Query cache utilization, in/out hit ratio
  • Temp tables usage
  • Temp files usage
  • Open tables usage
  • Thread cached / created
  • Connections
  • Query rate
  • Data size, Index size, overall size
  • Uptime
  • CPU usage
  • I/O usage
  • Load average
  • Data transfer in/out
  • Memory allocation
  • Swap usage
  • Slave delay (if configured as a slave)

As mentioned before, this is a OSS application that has no license or subscription fees and will be updated on a regular basis. It’s a mix of PHP for the web side and Perl for the system polling and reporting processes. The installer will be new, web based, and a simple process, much improved from the current method.Please add your comments now, as I am drawing up plans this week for version 2. :)

Read More

The Rack

Everyone likes pictures and everyone loves pictures of server racks. This is my development environment at the home office. Consisting of top to bottom. Then there’s a pic of my desk. Gotta love the 32″ HDTV.

Sun v40z – workstation via VNC / FreeNX
SGI 1200 – MySQL master
3com 24port GigE switch
Compaq DL360 – nodemaster
Compaq DL360 – radio.9600baud.com
Cisco 3620 router – self explanatory
Compaq DL360 – MySQL slave 1
Compaq DL360 – MySQL slave 2
Compaq DL360 – Utility
APC Smart UPS 1500
APC UPS 1000

Rack From Afar Rack Close Up Desk Setup

Read More

Perl Parallel Processing for data collection 2

Figured I’d update and say that I solved the issue with threading. (Thanks Steve). The agent.pl script is now using the Proc::Queue module to automatically manage the number of spawned processes via forking. Here’s the run down.

use Proc::Queue;
Proc::Queue::size(10); # changing limit to 10 concurrent processes
Proc::Queue::trace(0); # trace mode off
Proc::Queue::debug(0); # debug is off
Proc::Queue::delay(0.5); # set 500 miliseconds as minimum
# delay between fork calls

…..

$count = @Array;
# loop throuhg the results and spawn processes with Proc::Queue
while($i < $count) {
my $f=fork;
if(defined ($f) and $f==0) {
thread_exec(); #call the poller kickoff function
exit(0)
}
$i++;
}

Read More

Standard Issue

Some common issues I run into.

1. Disk space usage – usually a result of binary logs hogging up space. Usually alleviated by setting expire_logs_day=n but in some cases the server sees so much traffic it can fill up before n days is reached. Hence the usual method of “mysql> purge master logs to ‘log-name.xxx’;”

2. Replication issues: seconds behind master too high, duplicate key errors, local data infile associated errors

3. Users need more accounts and don’t have grant options to create new ones. Trivial but menial labor. Meh.

4. Connection limits – someone decides not to use connection pooling or persistent connections, therefore we see some stale connections just hanging out and using up the max_connections variable. Please use connection pooling / persistence.

5. Latency – usually comes into play with MySQL Cluster setups where the management node misses heartbeats from the data/storage nodes. Solution: faster interconnects and vlans that separate traffic – or bonded NICs to aggregate bandwidth. Easier said than done depending on who runs the network.

6. RAM: please get as much ram as you can afford, then you can complain to your DBA about performance.

7. Using MyISAM when you should be using InnoDB. Sure your application may not use transactions but in many cases, almost all, InnoDB will perform better – especially if you have lots of RAM. Crash recovery? Yes, InnoDB is better at that as well.

8. Backup  processes taking too long. I need to have alternate scripts that make use of mysql-parallel-dump. End of story.

Read More