Archive for September, 2008

To headline or to summarize 1000+ articles per week?

Ok, here’s your chance to make a recommendation on the front page layout for relationalnews.com, the new website that features news for all of the major relational database systems. Check out the three options below and leave a comment or email me with your own solution. If selected, you will be credited on the site for your influence.

The problem: we’re getting just over 1000 new articles per week from the feeds, and the current front page displays the title and summary of only the latest 10 articles. This means lots of pagination clicking to see even the approximately 150 new articles for the current day.

Option 1: display only the headlines (no article summaries) for the past 24 hours, each linking to the full article in the archive. Sort by published date and show everything from the past 24 hours regardless of the number of posts. Paginate the rest.

Option 2: display blocks of headlines sorted by category tags, limit to 10 headlines. Paginate the rest.

Option 3: leave the layout how it currently is.

Option 4: make a suggestion?


In search of database related RSS feeds

If you have a database-related blog and would like to add it to the relationalnews.com website for syndication, please post a comment or email your feed to contact@relationalnews.com.


A better solution to utf-8 and encoding… it’s so easy…

In my previous post I had a function to remove characters greater than ASCII code 126. Josh Sled and William Newton emailed me and some ideas went back and forth. As a result, here’s a much easier function to solve the issue. It came down to multi-byte encoding not being considered.

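function mb_sanity($text)
{
    // Convert non-ASCII UTF-8 characters to HTML entities.
    // Note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2.
    return mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');
}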
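Since the 'HTML-ENTITIES' target is deprecated on newer PHP versions, here is a minimal sketch of one possible substitute. It assumes the mbstring extension is loaded and the input is valid UTF-8. The function name mb_sanity_numeric is made up for illustration and is not part of the original code.

```php
<?php
// Hypothetical alternative to mb_sanity(); the name is illustrative only.
// Assumes the mbstring extension is loaded and $text is valid UTF-8.
function mb_sanity_numeric(string $text): string
{
    // Map every code point from U+0080 to U+10FFFF to a decimal
    // numeric entity; plain ASCII passes through unchanged.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $convmap, 'UTF-8');
}

// "\u{E9}" is e-acute, "\u{2014}" is a long dash.
echo mb_sanity_numeric("caf\u{E9} \u{2014} ok"), "\n"; // caf&#233; &#8212; ok
```

Like mb_sanity, this leaves characters such as < and & alone, so it is not a replacement for htmlspecialchars when the text is untrusted.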

Battling XHTML :: Storing UTF-8 data in MySQL

In the XML parser that I’ve been writing for RSS/Atom feeds I’ve encountered what many people have found: bizarre encoding issues when displaying data from the database on a webpage. Since this is not well explained by the searches I did on Google, I’ll explain it here.

Issue: you have UTF-8 data coming from a source, and you put it into a utf8_general_ci column of a MySQL database table. You read the data from the database and display it as HTML/XHTML. Instead of getting things like curly double quotes or long dashes, you get euro signs or umlaut-type characters, usually a string of them in place of each correct character.
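The symptom can be reproduced in a few lines. This is only a sketch: the post does not say which layer misread the bytes, so the Windows-1252 assumption below is illustrative, and it requires the mbstring extension.

```php
<?php
// A left curly double quote (U+201C) is three bytes in UTF-8: E2 80 9C.
$original = "\u{201C}";

// Simulate a layer that wrongly assumes the bytes are Windows-1252
// and converts them to UTF-8 a second time (assumed cause, for illustration).
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

echo bin2hex($original), "\n";           // e2809c
echo mb_strlen($garbled, 'UTF-8'), "\n"; // 3
echo $garbled, "\n"; // a-circumflex, euro sign, oe ligature
```

One correct character turns into three wrong ones, which matches the strings of euro signs and accented letters described above.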

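Potential solution: use utf8_encode and htmlentities in PHP to clean the data before it goes into the database. This does not work. Why? Those characters are above ASCII code 126 and arrive as multi-byte UTF-8 sequences, which these functions mishandle: utf8_encode assumes ISO-8859-1 input, so data that is already UTF-8 gets encoded twice, and htmlentities defaults to ISO-8859-1 on PHP versions before 5.4 unless told otherwise. See here for the full code chart: http://www.ascii.cl/htmlcodes.htm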

Solution: clean out the invalid characters when the data is displayed on a browser. Similar to this post, but with changes: htmlentities


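function htmlfriendly2($txt){
    // Replace every byte at or above 127 with a space, one byte at a time.
    // Note: a multi-byte UTF-8 character becomes several spaces.
    $len = strlen($txt);
    $res = "";
    for($i = 0; $i < $len; ++$i) {
        // $txt[$i] replaces the $txt{$i} syntax, which was removed in PHP 8.0.
        $ord = ord($txt[$i]);
        if($ord >= 127) {
            $res .= " ";
        }
        else {
            $res .= $txt[$i];
        }
    }
    return $res;
}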
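To see what the byte-by-byte loop above does to multi-byte text, here is a rough equivalent written as a single regular expression. The helper name strip_high_bytes is hypothetical, and this is a sketch of the same effect rather than the original code.

```php
<?php
// Hypothetical helper with the same effect as a byte-by-byte loop that
// replaces every byte with value 127 or higher by a single space.
function strip_high_bytes(string $txt): string
{
    // No /u modifier: the pattern matches raw bytes, not characters.
    return preg_replace('/[\x7F-\xFF]/', ' ', $txt);
}

// e-acute is 2 bytes in UTF-8 and the long dash is 3 bytes.
$out = strip_high_bytes("caf\u{E9} \u{2014} ok");
echo strlen($out), "\n"; // 12
echo "[$out]\n";
```

The byte length is unchanged, but the two non-ASCII characters come out as five spaces in total, which is the multi-byte problem the later post fixes.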


RelationalNews.com is online

Good news, fellow DBAs: adding to the already packed list of RSS/Atom aggregation sites out there on the internet, there is a new site catering to DBAs called RelationalNews. Feel free to add your feed(s) for aggregation, because what else do bloggers want but more visibility to search engines, right? This was basically a coding project to get familiar with CodeIgniter as well as RSS and Atom XML feed processing in PHP. Pretty simple looking back on it, and it was generally a fun project. I’ll probably add more features to the site at a later time, free time being what it is… So read the news! http://relationalnews.com


MySQL, PHP, XML = mysql-dba.com

This is a basic heads up post, perhaps even blatant self marketing. So, please continue reading. :)

If anyone recalls the website http://mysql-dba.com, they would know that it’s based on the planet.py codebase, which is written in Python. I originally wrote a simple PHP script that used the lastRSS.php class to parse feeds on the backend and archive them for use at a later date. I say archival and later date because the site itself did not use any of the relational data storage to run. The site’s Python code and cache were updated by cron scripts every 15 minutes, and new data was scp’d from my dev server to my webhost’s servers. Eventually this process ran only sporadically, since my development server rack in the garage at home gets really hot during the summer months and I ended up taking the servers offline unless I was actively using them for other purposes. You could say the priority of the site came below an overheating Sun v40z server.

Things are a changin’ now. I’ve been working with Roy Lindauer on a newly designed layout with a CodeIgniter backend that we will be porting to phpsyndicate.com. Or rather the other way around: we initially designed and planned the code and layout for the phpsyndicate site, but figured we could have multiple sites using the same code for different markets. Expect more sites with this codebase in addition to mysql-dba and phpsyndicate.

In addition to the new MVC layout, I’ve written a brand new XML RSS/Atom parser and MySQL loader for the aggregation functions. It’s been tiring to cover the differences between the various feed formats, but also a good learning experience to write it all by hand.

As such, expect to see http://mysql-dba.com totally redesigned, up and running very soon!
