MySQL – inner join vs. exists() may yield better performance

Here’s another example of why it’s always a good idea to check execution plans and test different query options.  The query below is one I was running as part of some manual data work:

select ua.* from vn_user_achievements ua
where exists (
select 1 from _users_achievements_set_to_notified m
where ua.user_id = m.user_id
)

Given that this was running in a manual data work context, vs. being part of our actual product, I wasn’t too worried about how slowly it was performing.  However, before too long I was running this query against bigger data sets and, as a result, I was waiting over 30 seconds for it to return; that got me curious.  A quick look at its execution plan (shown below) revealed 2 big negatives:

  1. it wasn’t using an index
  2. it was executing a correlated subquery

As for the index – I knew there was one on vn_user_achievements.user_id, so I was curious why it wasn’t being used.  In addition, correlated subqueries are well known to be painfully slow due to the number of additional lookups they perform.  So as a next step I decided to rewrite the query using an inner join …

select ua.* from vn_user_achievements ua
inner join _users_achievements_set_to_notified m on ua.user_id = m.user_id

… to see if MySQL would use a smarter execution plan.

It did!  The key and ref columns show that MySQL used the user_id portion of the primary key on vn_user_achievements, and the rewritten query no longer results in a correlated subquery.  As a result, this version returned an identical result set in just 1.5 seconds, or about 20 times faster!
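
If you want to reproduce this kind of comparison yourself, prefixing each version of the query with EXPLAIN is all it takes; here’s a sketch against the same tables (the plan you get back will obviously depend on your own schema and data):

explain select ua.* from vn_user_achievements ua
where exists (
select 1 from _users_achievements_set_to_notified m
where ua.user_id = m.user_id
);

explain select ua.* from vn_user_achievements ua
inner join _users_achievements_set_to_notified m on ua.user_id = m.user_id;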

So, if you’re working on tuning a slow query that happens to use exists(), consider rewriting it to use an inner join, especially if the execution plan is dismissing an index you suspect should be used, or if it’s forcing a correlated subquery to be executed.

It’s also worth noting that you need to verify any exists() vs. inner join rewrite returns an identical result set.  In my case this wasn’t a problem given the 1:many relationship between the tables, but if the data had been many:many, the inner join would have produced duplicate rows (something exists() never does) and, with them, a different result set.
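
If you’re not sure which situation you’re in, 2 quick checks help before swapping the rewrite in; here’s a sketch against the same tables (keep in mind DISTINCT adds its own cost, so it’s a fallback rather than a free fix):

-- the join's row count should match the count returned by the exists() version
select count(*) from vn_user_achievements ua
inner join _users_achievements_set_to_notified m on ua.user_id = m.user_id;

-- or de-duplicate the join output so it behaves like the exists() version
select distinct ua.* from vn_user_achievements ua
inner join _users_achievements_set_to_notified m on ua.user_id = m.user_id;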


MySQL – Got error 28 from storage engine

Just after a late night deployment I received this error message while running queries against the production database I just deployed to:

Got error 28 from storage engine

I could connect to the database fine, but any database query returned that error. To make things worse, that error had also prevented some of the deployment process I just ran from completing successfully, so our game was down until I found a fix. Luckily that didn’t take long. Some quick Googling revealed that the database server was out of disk space:

From http://drupal.org/node/44160

This happened to me this morning =)

The error come when MySQL doesn’t have any free hard disk space to write to. Check your /tmp directory, that’s where I had run into problems. I had forgot to make my backup script delete older backups and it kept writing backups until the disk was full.

That post, and most others I found, suggest checking and clearing the server’s /tmp directory, but ours was fine. It had 20 or 30 session files in it and little else – that wasn’t the problem. Instead, our deployment process backs up the database on each publish and doesn’t clean out old backups. W/multiple publishes sometimes happening w/in the same day, each creating a new database backup, we had clearly hit the point where our backups were consuming too much disk space. Once those were deleted, we were back up.

It’s worth noting that whether it’s the /tmp directory or other files, the underlying problem is disk space.  Free some up, regardless of how you do it, and you’ll get past this error.
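
If it’s not obvious where the space went, a couple of commands narrow it down quickly (the backup path below is just a placeholder for wherever your backups land):

# confirm which filesystem is actually full
df -h

# then size up the usual suspects
du -sh /tmp /path/to/your/backups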

I hope this saves someone some time and panic. Now to see about flushing old backups. 🙂


Magento: There was an error processing your request

I recently jumped into a Magento project and, while setting everything up in my local environment, I encountered the following error when I hit the index page:

“There has been an error processing your request. Exception printing is disabled by default for security reasons.”

It took me a little over half an hour of weeding through forum posts to find a fix.  It turns out a lot of users hit that generic error for a number of different reasons, so before the real debugging could begin, I needed to figure out what the actual problem was, which turned out to be pretty simple.

If you hit that error, the first thing you should do is rename errors/local.xml.sample to errors/local.xml and refresh your page.  Once that’s done, you should see a more detailed error message that points to the real problem.  In my case it was database permissions, which was easily fixed, but it could be something different for someone else.  Either way, the win is trading the generic message for one that actually tells you what’s wrong, and that’s exactly what the XML file change gives you.
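
For reference, the change itself is a one-liner from the Magento root; a copy works just as well as a rename and keeps the sample file around:

cp errors/local.xml.sample errors/local.xml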


PlainCart login problem, and why are they still on MySQL versions < 4.1?

A client shot me an email asking for help w/a PlainCart installation he’s trying to set up. The symptom was that the stock admin/admin login credentials weren’t working. Reproducing the problem was simple enough: going to the login page and trying to log in as admin/admin failed, sure enough. This was a little odd given that it was a fresh install.

I’m completely unfamiliar w/PlainCart, so the first place I looked was the tbl_user table to make sure it had users – it did.  Next was to examine the application’s login function. Nothing out of the ordinary there; the login query was simple enough:

$sql = "SELECT user_id FROM tbl_user
WHERE user_name = '$userName' AND user_password = PASSWORD('$password')";

I decided to revisit the tbl_user table and look at the hashed passwords. At first glance nothing seemed out of the ordinary, but after running the following SQL statement to verify the hash:

 select PASSWORD('admin');

… I saw that the hash returned was *4ACFE3202A5FF5CF467898FC58AAB1D615029441, whereas the stock installation had already written a hash of 43e9a4ab75570f5b to tbl_user.user_password. Why the difference in password hashes?

According to the MySQL documentation, versions of MySQL older than 4.1 produced 16-byte hashes, whereas newer versions produce 41-byte hashes. The fix was simple enough; I:

  1. altered the user_password column to varchar(41) so that it could hold the longer 41-byte hashes; it was previously typed as varchar(32)
  2. updated the passwords via a call to MySQL’s PASSWORD() function (sketched below)
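
In SQL, those 2 steps look roughly like this; I’m assuming the tbl_user schema referenced in the login query above, and that resetting the stock admin account back to admin is the goal:

-- widen the column so the newer 41-byte hashes fit
alter table tbl_user modify user_password varchar(41);

-- re-hash the stock admin password with the current server's PASSWORD()
update tbl_user
set user_password = PASSWORD('admin')
where user_name = 'admin';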

Overall this was a pretty easy fix, but since MySQL 4.1 was released in early 2005, it does raise the question: why is PlainCart stuck on versions of MySQL older than 4.1?


Ubuntu 10.10: Installing XDebug and KCacheGrind with PHP 5

After doing a fresh Ubuntu 10.10 install I spent some time re-installing some much-relied-on dev tools, 2 of which are XDebug and KCacheGrind. Together they add enhanced debugging and profiling capabilities to any dev environment. If you’re serious about seeing what your applications are doing at a lower level, or about performance, then leveraging these 2 tools is a must. Neither is hard to install, and once you get used to them they’ll both become tools you keep reaching for.

Here’s how to install and configure them in an Ubuntu 10.10 environment. While a couple of the steps are Ubuntu specific, it should be easy to do the same in any other distro: there aren’t many steps involved overall, and several of them apply to PHP and XDebug generally rather than to Ubuntu.

First, install XDebug:

sudo apt-get install php5-xdebug

… and configure PHP to recognize XDebug and its profiling capabilities. To do so, you’ll need to find where xdebug.so was installed:

dpkg -L php5-xdebug | grep xdebug.so

Once you have the full path to xdebug.so, add the following lines to your php.ini file to enable XDebug and have it output cachegrind files to a specified directory:

zend_extension="[path to your xdebug.so file]"
xdebug.profiler_enable=1
xdebug.profiler_output_dir="[directory to write cachegrind output files]"
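
For reference, on a stock Ubuntu 10.10 / PHP 5.3 box the finished block tends to look something like the following; the extension path and output directory here are examples only, so use whatever dpkg reported and any directory the web server can write to:

; example values only – substitute the path dpkg reported and your own writable directory
zend_extension="/usr/lib/php5/20090626+lfs/xdebug.so"
xdebug.profiler_enable=1
xdebug.profiler_output_dir="/tmp/cachegrind"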

Now just restart Apache:

sudo service apache2 restart

To confirm correct configuration, hit a phpinfo() page and check for 2 things:

  • that you have a section titled XDebug
  • that you have a cachegrind.out file in the xdebug.profiler_output_dir directory you specified above

If those 2 are confirmed then the install is working as expected.
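
A couple of command line checks work too, if you’d rather not load a page (the output directory is whatever you configured above):

# xdebug should show up in PHP's loaded modules
php -m | grep -i xdebug

# and cachegrind.out.* files should start appearing after a page request
ls [directory you specified for xdebug.profiler_output_dir]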

Next up is to install KCacheGrind. Just a word of warning that this will take a few minutes since it’s a pretty big install:

sudo apt-get install kcachegrind

Once installed, you should be able to open KCacheGrind via Applications > Programming > KCacheGrind. Once running, choose File > Open, and browse to any of the cachegrind.out files to inspect their profile.


PHPUnit in Kohana 2.3.4

I wanted to add unit test support to a current Kohana 2.3.4 project I’m working on.  Unfortunately, most of the posts I found on how to integrate unit testing into Kohana were for version 3.x, not 2.3.x.  The few hits I did find for the 2.x branch had dead links, or weren’t entirely sufficient to get me to the point where I could run tests.  Luckily this wasn’t my first time using PHPUnit, so I went ahead and tried to see what it would take to get PHPUnit working directly w/out having to enlist Kohana’s unittest module.  Surprisingly, it was pretty straightforward.

The first thing I did was install PHPUnit (I’m working on Ubuntu, hence the use of apt-get):

sudo apt-get install phpunit

Second, I created the following 2 directories to house all the test related code:

  • application/tests
  • application/tests/classes

From here, the work begins.  You’ll need 3 more things to get to the point where you can write tests:

  1. a phpunit.xml file
  2. a simple php test file
  3. a PHPUnit specific Bootstrap file

The phpunit.xml file is the config file for PHPUnit.  Create it in the application/tests directory w/content along the lines below, being sure to point the bootstrap attribute at your index.php file (the testsuite markup here is a minimal example):

<?xml version="1.0" encoding="UTF-8"?>
<phpunit bootstrap="[path to your index.php file]">
    <testsuites>
        <testsuite name="Application Tests">
            <directory>./</directory>
        </testsuite>
    </testsuites>
</phpunit>

Next, add a simple test file.  Create application/tests/classes/BaseTest.php w/the following content:

<?php

require_once 'PHPUnit/Framework.php';

class BaseTest extends PHPUnit_Framework_TestCase
{
    public function testBase()
    {
        $this->assertTrue(true);
        $this->assertEquals(1, 1);
    }
}

That’s obviously not a worthy test for a real application, but it’ll serve as an indicator that the setup is correct when we try to run tests.

Next, cd to your application/tests directory and run phpunit.  You’ll probably get a bunch of HTML output, but at the very bottom you should see a message that reads: OK (1 test, 2 assertions). If so, then you’re ready to start adding your own tests.

Now, the 3rd item I listed above, the PHPUnit specific Bootstrap file, is what gets rid of all the HTML output I just mentioned.  There are 2 reasons you’ll want to do this:

  1. To get rid of the output nuisance when running tests from the command line
  2. To avoid problems when running unit tests from an IDE.  I was happy to find PHPUnit support in NetBeans, but while tests ran fine from the command line, running those tests from an IDE always returned the message No Tests Executed.  It wasn’t until I removed the HTML output that NetBeans was able to find my tests and parse the results.

So for item 3, we’ll need to do 3 things:

  1. copy system/core/Bootstrap.php as application/tests/BootstrapPHPUnit.php.
  2. comment out these 3 calls in that file:

    Event::run('system.routing');
    Benchmark::stop(SYSTEM_BENCHMARK.'_system_initialization');
    Event::run('system.execute');
  3. In your index.php file, you’ll need to pull in either the core Bootstrap file or the PHPUnit specific Bootstrap file, depending on whether your app is being run by PHPUnit or by a real user.  For that, I replaced:
    require SYSPATH.'core/Bootstrap'.EXT;

    with…

    if (strpos($_SERVER['SCRIPT_NAME'], 'phpunit') !== false && empty($_SERVER['SERVER_NAME']))
    {
        require APPPATH.'tests/BootstrapPHPUnit'.EXT;
    }
    else
    {
        require SYSPATH.'core/Bootstrap'.EXT;
    }

Once that’s done, you should be able to run phpunit again from the command line and not have to put up with the HTML output.


Some Numbers Regarding Latency and Geolocation

This won’t uncover anything new to anyone already familiar with server latency: latency increases as the distance to a given server increases.  This post isn’t much more than that, except that there are numbers and data attached to back the point up.  Hopefully the illustration helps some folks realize that you have to consider distance when evaluating server latency.

The site I used to run this test was my own.  At the time I wrote this, I had my site deployed on 2 servers: one in Dallas, another in Missouri.  I first bought the Missouri server as my 1st step away from shared hosting.  After deploying my site to it, I was surprised to find it was being served more slowly than from my shared hosting server located in California.  That prompted me to purchase another server closer to my users.  After a brief search, I went with Tailor Made Servers in Dallas.

Before I start, a few things should be noted about each server:

  • the Dallas server runs an Intel 3.1 GHz dual-core CPU, 4 GB of RAM, and 7200 RPM drives
  • the Missouri server runs an AMD 1.8 GHz dual-core CPU, 8 GB of RAM, and 15k RPM drives
  • both servers run PHP 5.3.3, APC (configured identically), and CentOS 5.5
  • Apache and other configs are identical on both servers; as a sanity check, they score identically in YSlow and PageSpeed

Overall, I’ve done everything I can to make sure both servers are configured identically.  I don’t consider the hardware differences that significant given the trivial amount of resources it takes to load my home page (which is the only page I tested against).  Additionally, the ping and traceroute tests illustrated below barely tax a server’s hardware, if at all.

The first thing I tested was ping latency, which provided a pretty quick glimpse at the impact distance has on latency.  Results show about 77ms for the Missouri server vs. about 18ms for the Dallas server.

Next up, I wanted to see how many network hops it took to reach each server, which traceroute can tell us.  The traceroute results show that the Missouri server requires 16 network hops while the Dallas server only requires 12.

Finally, I pointed siege at both instances to measure differences in first-byte latency and the requests per second each instance can process:

Missouri: 27 requests per second at around 160ms first-byte latency

Dallas: 100 requests per second at around 50ms first-byte latency
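
For anyone wanting to run the same kind of comparison, a siege invocation along these lines will do it; the concurrency and duration here are illustrative, and the server address is a placeholder:

siege -c25 -t1M http://[server address]/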

So with both instances set up almost identically, save the CPU difference, a change in location alone resulted in about a 4x speedup for users viewing my site from Austin.  I’ve since changed DNS to point solutionfactor.net to the Dallas server, so it’s worth noting that users viewing my site from areas close to Missouri, or further north, will probably experience higher latency than before.  However, given that 90% of my users are based in Austin, I’ll take that hit in favor of speeding things up locally.


MySQL: Index Column Order Matters

I was recently asked about whether column order matters in composite indexes, and if so, to what extent.  I recall this being the case when working with SQL Server 2000 and 2005, so I became curious how MySQL handled scenarios where a query touched different columns in a composite index.  The short answer is yes, column order does matter.  If you’re interested in seeing how, and when, follow along.

Before running some tests I loaded test data, chose 1 table, and ran a baseline query with no indexes on the table (except for its primary key). The data I used is one table from one of MySQL’s sample databases, sakila.  You can download the table’s dump here. It’s also worth noting that the table uses the InnoDB storage engine.

Next, I ran the following query to ensure no indexes were being used for my test query. Its execution plan, shown below, confirms this, indicated by the key column being null:

select first_name, last_name, last_update
from actor
where last_name = 'Witherspoon'
and first_name = 'Angela'
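
If you’re following along, the plans referenced throughout this post come from prefixing each query with EXPLAIN, e.g.:

explain select first_name, last_name, last_update
from actor
where last_name = 'Witherspoon'
and first_name = 'Angela'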

This is where the useful tests start.  First, I added an index on last_name and first_name:

create index IX_actor_last_name_first_name on
    actor (last_name, first_name)

Next, I re-ran EXPLAIN on the baseline query above.  This time, you can see that the new index on last_name and first_name is being used (again, by referencing the key column).

With use of the index established, I moved on to test dependence on column order. I looked at 3 scenarios, each illustrated with an execution plan.

The first involved modifying the where clause such that it only references last_name.  In this case the index can still be used since last_name is in the left-most position of the index:

select first_name, last_name, last_update
from actor
where last_name = 'Witherspoon'

The second involved modifying the where clause such that it only referenced first_name.  In this case the index is not used. It’s worth noting that SQL Server would probably use an index scan operation here (vs. an index seek, which is the faster of the 2): it would still leverage the index, but would have to scan through it to compare against the first_name data, since first_name is not in the left-most position of the index. Unfortunately, MySQL doesn’t behave similarly; it dismisses the index entirely.

select first_name, last_name, last_update
from actor
where first_name = 'Angela'

The third test was done out of curiosity, but demonstrates intelligence on the optimizer’s part prior to executing the query.  I modified the where clause such that both columns were included, but the order of the columns cited in the query is the opposite of the column order in the index.  As you can see, the order in the where clause makes no difference, since the optimizer is smart enough to know which order should be applied to make use of the index.

select first_name, last_name, last_update
from actor
where first_name like 'Angela'
and last_name like 'Witherspoon'

So, all this reinforced 2 things:

  1. Index column order does matter
  2. Know what’s going on with your queries!  Run EXPLAIN and study execution plans so that you know what your queries are doing instead of making a potentially costly guess.  That matters here because indexes aren’t free, and as demonstrated above, there are cases where they’re added but the optimizer simply dismisses them.  In such cases you only make things worse, since DML operations become slower when they touch indexed columns.

Finally, note that all of this only covers non-clustered indexes, but the same effect shows up in tests I’ve run against clustered indexes.  Plus, column order in clustered indexes also has another impact on the data itself, which I’ll follow up on in another post that digs into clustered indexes in more detail.


Enabling compression for faster web pages

If you’re familiar with Steve Souders, or use tools like YSlow or PageSpeed to optimize your sites, you probably already know that enabling compression for your web sites is a proven way to speed up response times. Even though that’s well known by now, I remember having a hard time finding a resource that illustrated how to do it when I first set this up for my own sites. It’s a very simple procedure; here’s what’s involved.

First, make sure you have the deflate_module installed.  To do that:

  1. find your Apache config file (for CentOS – /etc/httpd/conf/httpd.conf)
  2. open it and search for lines beginning with LoadModule.  If you see a line starting with LoadModule deflate_module, the module is installed and ready to use; otherwise, you’ll need to add a line for it (a quick check, and the line itself, are sketched below)
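
Here’s what that check looks like from the shell, along with the line you’d expect to find (or add), assuming the CentOS paths mentioned above:

# does the config already load mod_deflate?
grep -i deflate_module /etc/httpd/conf/httpd.conf

# if not, this is the line to add to httpd.conf
LoadModule deflate_module modules/mod_deflate.so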

Next, you’ll need to add AddOutputFilterByType directives to the Apache config file.   These basically inform Apache which content types you want the deflate_module to act on.  Here are the ones I’ve added:

AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript

Keep in mind there are 2 ways of adding these directives. You can add them anywhere in your Apache config file to enable compression globally (I appended mine to the end of my conf file). However, if there are only specific apps or sites you want to target, you can instead add the directives inside their respective virtual host blocks.
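
For the per-site case, here’s a minimal sketch of what that looks like; the server name and document root are placeholders:

<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/example

    # compress only for this site
    AddOutputFilterByType DEFLATE text/html
    AddOutputFilterByType DEFLATE text/css
    AddOutputFilterByType DEFLATE application/javascript
</VirtualHost>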

You can see the impact this change made, even to simple pages like my blog.  The entire page is now 30% smaller and 131 ms have been shaved from the total download time.

Before

After
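
If you want to verify compression is actually reaching clients, asking for gzip with curl and checking the response headers is a quick test (swap in your own URL):

# a Content-Encoding: gzip header means the deflate module is doing its job
curl -s -o /dev/null -D - -H "Accept-Encoding: gzip,deflate" http://www.example.com/ | grep -i content-encoding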


WordPress: call to undefined function get_option()

So I finally got around to installing WordPress; all seemed fine until I deployed to my live server and tried to log into the Admin panel.  After doing so, I hit the following error:

Fatal error: Call to undefined function get_option() in ../wp-admin/admin.php on line xyz.

After doing some research, it seems like a lot of folks have hit this problem.  Googling the error surfaced 2 common fixes:

  • deleting trailing spaces from the wp-config.php and functions.php files
  • re-uploading the entire WordPress installation (it seems as if WordPress files have a way of becoming corrupted during upload, which sounds odd)

Neither of those worked for me, but after I began to suspect the issue had something to do with PHP 5.3 and calls to include_once, my Google-fu uncovered this post:

http://wordpress.org/support/topic/wp-301-apc-314-php-533-white-wp-adnmin-page

So, it turns out that my issue was due to having PHP 5.3 and a specific version of APC installed.  After setting this APC directive in my apc.ini:

apc.include_once_override = 0

… all is well again! Hopefully this saves someone the hours of research, debugging, and frustration that I went through!
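
If you want to confirm the directive actually took effect after restarting Apache, a quick check of the loaded configuration works; just keep in mind the CLI can read a different set of ini files than Apache does, so a phpinfo() page is the surer bet:

# the local and master values should both show the override disabled
php -i | grep apc.include_once_override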
