SAN vs Local-disk :: innodb_flush_method performance benchmarks

If you’ve been tuning your MySQL database and have wondered what effect the innodb_flush_method settings have on write performance, then this information might help. I’ve recently been doing a lot of baseline load tests to show performance differences between localdisk and the new SAN we’re deploying. Since we run InnoDB for everything in production, and writes are very heavy, I decided to run comparison tests between two identical servers to find the best setting for innodb_flush_method. We have the following specs for the hardware:

  • Dell R610
  • 24 core Intel Xeon X5670 @ 2.93ghz
  • 72GB ECC RAM
  • Brocade 825 HBA
  • Local disk: RAID-10 15K SAS Ext3 (ugh)
  • SAN: Oracle 7420 with four Intel Xeon X7550 @ 2.00GHz, 512GB RAM, 2TB read-cache(MLC-SSD), 36GB write cache (SLC-SSD), 3 disk shelves populated with 60x2TB 7200RM SATA drives setup in mirrored format with striped logs, dual 8Gb FC links to redundant fabric, connected to Brocade DCX 8510-4.
  • The my.cnf file being used for the tests: click-click

I’m using the following sysbench command to run the tests. On each server the same commands are used. I ran a 1B row prepare prior to the 1B row test.

sysbench –db-driver=mysql –num-threads=64 –max-requests=1000000000 –max-time=3600 –test=oltp –verbosity=3 –validate=off –oltp-test-mode=complex –oltp-read-only=off –oltp-table-name=sbtest –oltp-table-size=1000000000 –oltp-dist-type=special –mysql-host=localhost –mysql-port=3306  –mysql-table-engine=innodb run

On the server that is utilizing SAN paths there are two LUNS presented for MySQL use. /db/data01 for InnoDB data files, /db/logs01 for InnoDB logs. These filesystems are both formatted as XFS. The server running local-disk tests is running Ext3. I might run some more tests later with the local-disk setup as XFS if time allows.

Here are the results. Clearly a well designed SAN infrastructure is superior to even RAID-10 15K SAS drives. And of course you can see the different performance values from using O_DIRECT for the innodb_flush_method for the different data storage mediums.

1B Row Complex Transactional Test, 64 threads

  • SAN O_DIRECT: read/write requests: 31560140 (8766.61 per sec.)
  • SAN O_DSYNC: read/write requests: 5179457 (1438.52 per sec.)
  • SAN fdatasync: read/write requests: 9445774 (2623.66 per sec.)
  • Local-disk O_DIRECT: read/write requests: 3258595 (905.06 per sec.)
  • Local-disk O_DSYNC: read/write requests: 3494632 (970.65 per sec.)
  • Local-disk fdatasync: read/write requests: 4223757 (1173.04 per sec.)
Read More

MySQL Community – what do you want in a load testing framework?

So I’ve been doing a fair number of automated load tests these past six months. Primarily with Sysbench, which is a fine, fine tool. First I started using some simple bash based loop controls to automate my overnight testing, but as usually happens with shell scripts they grew unwieldy and I rewrote them in python. Now I have some flexible and easily configurable code for sysbench based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain about “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it via the BSD license. The main goal here is to have a standardized modular framework, based on sysbench, that allows anyone to compare their server performance via repeatable tests. It’s fun to see other people’s benchmarks but it’s often difficult to repeat and compare since most tests aren’t fully documented in their blog posts – this could be a solution to that.

Currently I have the harness doing iterations based on:

  • incrementing (choose a global dynamic variable, ie: sync_binlog=0-1000) system values
  • storage engine vs storage engine for the same workload
  • thread quantity increments for read-only or read+write
  • N-nodes in a cluster workloads with WRR traffic distribution (need to code WLC and others)
  • QPS testing for connection pool vs open/close connection
  • multi-table vs single-table workloads

Outputs available: CSV, XML, JSON for easy integration into any number of the various graphing frameworks available. I’ll probably code up a light weight python http server preloaded with Highcharts and Sparklines so you can see your benchmarks easily without having to roll your own graphs.

Quick now, tell me what you’d like me to code for you!

Read More

Python for Automation: using pdsh for a menu-driven command execution environment

I’ve been playing around with some quick system automation scripts that are handy to use when you don’t want / need to setup a chef or puppet action. I like to keep all of my hostnames and login details in a MySQL database (a cmdb actually) but for this example we’ll just use a couple of nested lists. This script executes commands in parallel across the hosts you choose in the menu system via the “pdsh” command, so make sure you have that installed before running. Alternately you can change the command call to use ssh instead of pdsh for a serialized execution, but that’s not as fun or fast. With some customizations here and there you can expand this to operate parallelized jobs for simplifying daily work in database administration, usage reporting, log file parsing, or other system automation as you see fit. Here’s the code. Comments welcome as always!

#!/usr/bin/env python
## NAME: menu_parallel_execution.py
## DATE: 2011-02-04
## AUTHOR: Matt Reid
## WEBSITE: http://kontrollsoft.com
## EMAIL: mreid@kontrollsoft.com
## LICENSE: BSD http://www.opensource.org/licenses/bsd-license.php

import commands
import sys
import os
import operator

ver = sys.version.split(' ')[0].split(".")
major=ver[:1]
minor=ver[1:2]
version="%s.%s"%(major[0],minor[0])
if version in ('2.4','2.3','2.2','2.1','2.0'):
	pyver = "old"
else:
	from subprocess import Popen, PIPE, STDOUT, call
	pyver = "new"

def main():
        #This section of host definitions could be replaced by a database
        #call if you are storing your hosts in that manner, but we'll keep
        #this script simple for the time being.
	dc0 = [
		["host0-0","10.0.0.1","ssh-username"],
		["host0-1","10.0.0.2","ss-username"]]

	dc1 = [
		["host1-0","10.1.0.1","ssh-username"],
		["host1-1","10.1.0.2","ss-username"]]

	dc2 = [
		["host2-0","10.2.0.1","ssh-username"],
		["host2-1","10.2.0.2","ss-username"]]

	print '''[1] Datacenter-1
[2] Datacenter-2
[3] Datacenter-3
'''
	dc = int(raw_input("Datacenter ID: "))
	if dc == 1:
		hosts = dc0
	elif dc == 2:
		hosts = dc1
	elif dc == 3:
		hosts = dc3

	nodes = []
	stay = True
	while stay == True:
		i = 0
		nest = sorted(hosts, key=operator.itemgetter(0))
		for x in range(len(nest)):
			print "[%i] %s | %s"%(i,nest[i][1],nest[i][0])
			i+=1

		print "\nSelect node to add to execution list"
		ch = int(raw_input("ID: "))
		xx = [nest[ch][1],nest[ch][2]]
		nodes.append(xx)
		s = str(raw_input("\nAdd another node? [Y/n] "))
		if s == "n" or s == "N":
			stay = False

	if(pyver == "new"):
		addrs = ""
		for node in nodes:
			address = node[1]+"@"+node[0]+","
			address = address.rstrip("\n")
			addrs = addrs+address

		addrs = addrs.strip(",")
		cmd = str(raw_input("\nEnter the command to execute: "))				

		try:
			c = "pdsh -w %s %s"%(addrs,cmd)
			print "Executing: %s"%(c)
			call(c,shell=True)
		except:
			print "Failed to execute pdsh command: %s"%(c)
			sys.exit(1)

		sys.exit(0)

	if(pyver == "old"):
		print "Please upgrade to Python 2.6+"
		sys.exit(1)

## START
if __name__ == "__main__":
	try:
		retval = main()
	except (KeyboardInterrupt, SystemExit):
		sys.exit(1)
Read More

Event based programming vs threading by Rob von Behren, Jeremy Condit and Eric Brewer

Saw this interesting paper about highly concurrent programming methods and figured the word should be spread! It’s not new material but it’s a good read. See the full article here: http://www.usenix.org/events/hotos03/tech/full_papers/vonbehren/vonbehren_html/

“Highly concurrent applications such as Internet servers and transaction processing databases present a number of challenges to application designers. First, handling large numbers of concurrent tasks requires the use of scalable data structures. Second, these systems typically operate near maximum capacity, which creates resource contention and high sensitivity to scheduling decisions; overload must be handled with care to avoid thrashing. Finally, race conditions and subtle corner cases are common, which makes debugging and code maintenance difficult.

Threaded servers have historically failed to meet these challenges, leading many researchers to conclude that event-based programming is the best (or even only) way to achieve high performance in highly concurrent applications. “

Read More