NetBeans is Slow on Ubuntu 10.10 – Switch Your JDK

NetBeans can be slow, very slow.  After trying out everything on the NetBeans Performance FAQ to no avail, I finally reached the point where I decided to switch to another IDE instead of trying to manage NetBeans’ performance problems.  Before doing so, I got a tip from a friend to check my JDK and switch to the Sun JDK if I wasn’t using it already.  Turns out I wasn’t, I was using the OpenJDK all along, so I decided to give the Sun JDK a shot. It actually turned out to be the silver bullet I needed.  After installing it, NetBeans returned to being snappy and responsive again … here’s how to make the change on Ubuntu 10.10.

First, you may need to add the sun-java6 repository before hand, but that’s not that big of a deal. You can run the following 3 commands below to add that repository and install the Sun Java 6 JDK:

sudo add-apt-repository ppa:sun-java-community-team/sun-java6
sudo apt-get update
sudo apt-get install sun-java6-jre sun-java6-bin sun-java6-jdk

Once that’s done, you’ll need to update your Java alternatives so that everything that requires the JDK will use the Sun Java 6 SDK vs. whatever other JDK you have installed. To do so, run the following command and choose the Sun JDK:

sudo update-alternatives –config java

Once your JDK is updated, you’ll still need to update the JDK that NetBeans uses. To do so, open the netbeans.conf file and update the netbeans_jdkhome directive to point to the new JDK:

Once that’s done restart NetBeans and you’ll be welcomed with a snappy, responsive IDE!

Posted in General, Performance | 2 Comments

MySQL – WordPress & Avoiding unneeded Group By clauses

A client recently reached out in regard to a reproducible performance problem on their WordPress site.  The use case and conditions were as follows:

  • Performance was fine until a change happened to the wp_posts or wp_term_relationships tables (i.e. when a new blog post is created, or an existing post is modified)
  • Once those tables changed, related result sets in the query cache were cleared, rebuilt, and re-cached
  • During re-caching, the site was brought to it’s knees due to (otherwise performant) queries executing against an empty cache .  Once re-caching completed, those queries returned their result sets much faster, and everything returned to normal.

Fortunately, there had already been some investigation by the time I began troubleshooting.  The client had their MySql long_query_time set to 10 seconds, and the slow query log was reporting tens of variants of this query during problematic conditions:

SELECT wp_posts.* FROM wp_posts
INNER JOIN wp_term_relationships ON wp_posts.ID = wp_term_relationships.object_id
WHERE 1=1
AND wp_term_relationships.term_taxonomy_id IN (32)
AND wp_posts.post_type = ‘post’
AND wp_posts.post_status = ‘publish’
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_date DESC
LIMIT 0, 12;

Unfortunately, that query’s execution plan cites Using temporary and Using filesort operations, which are 2 very expensive operations due to the potential disc activity required (creating temp table(s), sorting the result, etc.).  The fact that the query was taking 1.5 seconds to return 12 records (on a cold query cache) supplements that claim.

Furthermore, something I questioned even before the execution plan was why the query made use of a group by clause on the table’s primary key w/out the use of any aggregate functions.  The join conditions and where clause used in the query assured a 1:1 join result, so using distinct, whether explicitly or implicitly via group by, wasn’t needed.  After removing it, I confirmed Using temporary and Using filesort were no longer cited in the execution plan:

As a result, we now have a query that performs 1,500 times faster.  The original query completed in about 1.5 seconds, whereas the revised query completes in less than a millisecond!  Even better news … after the optimization, the query is no longer reported in the MySQL slow query log.

Posted in Database, Performance | Leave a comment

MySQL – inner join vs. exists() may yield better performance

Here’s another example of why it’s always a good idea to refer to execution plans and test different query options.  Here’s a query I was running as part of some manual data work:

SELECT ua.* FROM vn_user_achievements ua
WHERE EXISTS (
SELECT 1 FROM _users_achievements_set_to_notified m
WHERE ua.user_id = m.user_id
)

Given that this was running under manual data work context, vs. being part of our actual product, I wasn’t too worried about how slow it was performing.  However, before too long I was running this query against bigger data sets and, as a result, I was waiting for over 30 seconds for it to return; that got me curious.  A quick look at it’s execution plan (shown below) revealed 2 big negatives:

  1. it wasn’t using an index:
  2. it was executing a correlated subquery

As for the index – I knew there was one on vn_user_achievements.user_id, so I was curious why it wasn’t being used.  In addition, correlated subqueries are well known to be painfully slow due to the amount of additional lookups they perform.  So as a next step I decided to rewrite the query using an inner join …

SELECT ua.* FROM vn_user_achievements ua
INNER JOIN _users_achievements_set_to_notified m ON ua.user_id = m.user_id

… to see if MySQL would use a smarter execution plan.

It did!  The key and ref columns show that MySQL used the user_id portion of the primary key on vn_user_achievements, and the rewritten query is no longer resulting in a correlated subquery.  As a result, this version returned an identical result set in just 1.5 seconds, or about 20 times faster!

So, if you’re working on tuning a slow query that happens to use exists(), consider rewriting it to use an inner join, especially if the execution plan is dismissing an index you suspect should be used, or if it’s forcing a correlated subquery to be executed.

It’s also worth noting that you need to test any exists() vs. inner join comparisons for identical result sets.  In my case, this wasn’t a problem due to a 1:many relationship, but if the data was many:many, the inner join rewrite would have resulted in a different result set.

Posted in Database, Performance | Leave a comment

MySQL – Got error 28 from storage engine

Just after a late night deployment I received this error message while running queries against the production database I just deployed to:

Got error 28 from storage engine query

I could connect to the database fine, but any database query returned that error. To make things worse, that error had also prevented some of the deployment process I just ran from completing successfully, so our game was down until I found a fix. Luckily that didn’t take long. Some quick Googling revealed that the database server was out of disk space:

From http://drupal.org/node/44160

This happened to me this morning =)

The error come when MySQL doesn’t have any free hard disk space to write to. Check your /tmp directory, that’s where I had run into problems. I had forgot to make my backup script delete older backups and it kept writing backups until the disk was full.

That post, and most others I found, suggest to check and clear the server’s /tmp directory, but ours was fine. It had 20 – 30 session files in it and little else – that wasn’t the problem. Instead, our deployment process backs up the database on each publish, and doesn’t clean out old backups. Needless to say, w/multiple publishes happening w/in the same day at times, w/each creating a new database backup, we clearly hit a point where our backups were consuming too much disk space. Once those were deleted, we were back up.

It’s worth noting that whether it be the /tmp directory, or removing other files, the problem is disk space.  Free some up, regardless of how you do it, and you’ll get past this error.

I hope this saves someone some time and panic. Now to see about flushing old backups. :)

Posted in Database | 1 Comment

Magento: There was an error processing your request

I recently jumped into a Magento project and, while setting everything up in my local environment, I encountered the following error when I hit the index page:

“There has been an error processing your request. Exception printing is disabled by default for security reasons.”

It took me a little over half an hour weeding through forum posts to find a fix.  It turns out a lot of users hit that generic error for a number of different reasons, so before the real debugging can begin, I needed to figure out what the real problem was, which turned out to be pretty simple.

If you hit that error, the first thing you should do is change errors/local.xml.sample to errors/local.xml and refersh your page.  Once that’s done, you should see a more detailed error message that points to the real problem.  In my case it was database permissions, which was easily fixed, but again, it could be something else entirely for someone else.  Regardless, the win here is getting to a helpful error message that points you to the real problem, which is what the XML file change allows.

Posted in General | Leave a comment

PlainCart login problem, and why are they still on MySQL versions < 4.1?

A client shot me an email asking for help w/a PlainCart installation he’s trying to setup. The symptom was that the supposed login credentials of admin, admin weren’t working. Reproducing the problem was simple enough, going to the login page and trying to login as admin, admin failed, sure enough. This was a little odd given that it was a fresh install.

I’m completely unfamiliar w/PlainCart, so the first place I looked was in the tbl_users table to make sure it had users – it did.  Next was to examine the application’s login function. Nothing out of the ordinary there, the login query was simple enough:

$sql = "SELECT user_id FROM tbl_user
WHERE user_name = ‘$userName‘ AND user_password = PASSWORD(‘$password‘)"
;

I decided to revisit the tbl_users table and look at the hashed passwords. At first glance nothing seemed out of the ordinary, but after running the following sql statement to verify the hash:

SELECT PASSWORD(‘admin’);

… I saw that the hash returned was *4ACFE3202A5FF5CF467898FC58AAB1D615029441, whereas the stock installation had already written a hash of 43e9a4ab75570f5b in tbl_users.user_password. Why the difference in password hashes?

According to the MySQL documentaion, it turns out that versions < MySQL 4.1 produced 16 byte hashes whereas newer versions produce 41 byte hashes. The fix was simple enough, I:

  1. altered the user_password column to varchar(41) so that it could hold 41 bytes; it was previously typed as varchar(32)
  2. updated the passwords via a call to MySQL’s PASSWORD() function

Overall this was a pretty easy fix, but since MySQL 4.1 was released in early 2005, it does beg the question: Why is PlainCart stuck on versions of MySQL older than 4.1??

Posted in Database, General | Leave a comment

Ubuntu 10.10: Installing XDebug and KCacheGrind with PHP 5

After doing a fresh Ubuntu 10.10 install I spent some time re-installing some much relied on dev tools, 2 of which are XDebug and KCacheGrind. Their purpose is to add enhanced debugging and profiling capabilities to any dev environment. If you’re serious about seeing what your applications are doing at a lower level, or about performance, then leveraging these 2 tools are a must. They’re not very hard to install and once you get used to them, they’ll both turn into tools you continue to rely on.

Here’s how to install and configure them in an Ubuntu 10.10 environment. While some of these steps are Ubuntu specific, it should be easy to do the same in any other distro since there aren’t many steps involved (overall), and some of them aren’t specific to Ubuntu, but rather, PHP and XDebug.

First, install XDebug:

sudo apt-get install php5-xdebug

… and configure PHP to recognize XDebug and it’s profiling capabilities. To do so, you’ll need to find where xdebug.so was installed:
sudo updatedb && locate xdebug.so

dpkg -L php5-xdebug | grep xdebug.so

credit to S. Tomatis for providing a much faster command!

Once you have the full path to xdebug.so, add the following lines to your php.ini file to enable XDebug and have it output cachegrind files to a specified directory:

zend_extension="[path to your xdebug.so file]"
xdebug.profiler_enable=1
xdebug.profiler_output_dir="[directory to write cachegrind output files]"

Now just restart Apache:

sudo service apache2 restart

To confirm correct configuration, hit a phpinfo() page and check for 2 things:

  • that you have a section titled XDebug
  • that you have a cachegrind.out file in the xdebug.profiler_output_dir directory you specified above

If those 2 are confirmed then the install is working as expected.

Next up is to install KCacheGrind. Just a word of warning that this will take a few minutes since it’s a pretty big install:

sudo apt-get install kcachegrind

Once installed, you should be able to open KCacheGrind via Applications > Programming > KCacheGrind. Once running, choose File > Open, and browse to any of the cachegrind.out files to inspect their profile.

Posted in Performance, PHP | 2 Comments

PHPUnit in Kohana 2.3.4

I wanted to add unit test support to a current Kohana 2.3.4 project I’m working on.  Unfortunately, most of the posts I found on how to integrate unit testing into Kohana were for version 3.x, not 2.3.x.  The few hits I did find for the 2.x branch had dead links, or weren’t entirely sufficient to get me to the point where I could run tests.  Luckily this wasn’t my first time using PHPUnit, so I went ahead and tried to see what it would take to get PHPUnit working directly w/out having to enlist Kohana’s unittest module.  Suprisingly, it was pretty straight forward.

The first thing  did was install PHPUnit (I’m working on Ubuntu, hence the use of apt-get):

sudo apt-get install phpunit

Second, I created the following 2 directories to house all the test related code:

  • application/tests
  • application/tests/classes

From here, the work begins.  You’ll need 3 more things to get to the point where you can write tests:

  1. a phpunit.xml file
  2. a simple php test file
  3. a PHPUnit specific Bootstrap file

The phpunit.xml file is the config file for PHPUnit.  Create the file in the application/tests directory w/this content, be sure to add the path to your index.php file:

<?xml version="1.0" encoding="UTF-8"?>
<phpunit colors="true" bootstrap="[path to your index.php file]">
  <testsuites>
    <testsuite name="Base Tests">
      <directory>./</directory>
    </testsuite>
  </testsuites>
</phpunit>

Next, add a simple test file.  Create application/tests/classes/BaseTest.php w/the following content:

require_once ‘PHPUnit/Framework.php’;
class BaseTest extends PHPUnit_Framework_TestCase
{
    public function testBase()
    {
        $this->assertTrue(true);
        $this->assertEquals(1, 1);
    }
}

That’s obviously not a worthy test for a real application, but it’ll serve as an indicator that the setup is correct when we try to run tests.

Next, cd to your application/tests directory and run phpunit.  You’ll probably get a bunch of HTML output, but at the very bottom you should see a message that reads: OK (1 test, 2 assertions). If so, then you’re ready to start adding your own tests.

Now, the 3rd item I listed above is to get rid of all the HTML output I just mentioned.  There are 2 reasons you’ll want to do this:

  1. To rid of the output nuisance when running tests from the command line
  2. To avoid problems when running unit tests from an IDE.  I was happy to find PHPUnit support in NetBeans, but while tests ran fine from the command line, running those tests from an IDE always returned the message No Tests Executed.  It wasn’t until I removed the HTML output that NetBeans was able to find my tests and parse the results.

So for item 3, we’ll need to do 3 things:

  1. copy system/core/Bootstrap.php as application/tests/BootstrapPHPUnit.php.
  2. comment out these 3 calls in that file:
    Event::run(‘system.routing’);
    Benchmark::stop(SYSTEM_BENCHMARK.‘_system_initialization’);
    Event::run(‘system.execute’);
  3. In your index.php file, you’ll need to route to either the core Bootstrap file, or the PHPUnit specific Bootstrap file depending on whether PHPUnit is running your app vs. a real user.  For that, I replaced:
    require SYSPATH.‘core/Bootstrap’.EXT;

    with…

    if(strpos($_SERVER[‘SCRIPT_NAME’], ‘phpunit’) && empty($_SERVER[‘SERVER_NAME’]))
    {
        require APPPATH.‘tests/BootstrapPHPUnit’.EXT;
    }
    else
    {
        require SYSPATH.‘core/Bootstrap’.EXT;
    }

Once that’s done, you should be able to run phpunit again from the command line and not have to put up with the HTML output.

Posted in PHP | 3 Comments

Some Numbers Regarding Latency and Geolocation

This won’t uncover anything most people familiar with server latency already know:  latency increases as the distance to a given server increases.  This post isn’t much more than that, except that there are numbers and data attached to supplement that point.  Hopefully this illustration will help some folks realize that you have to consider distance when evaluating server latency.

The site I used to run this test was my own.  At the time I wrote this, I had my site deployed on 2 servers: one in Dallas, another in Missouri.  I first bought the Missouri server as my 1st step to get away from shared hosting.  After deploying my site to that server I was surprised to find out that it was served slower than from my shared hosting server located in California.  That prompted me to purchase another server nearby.  After a brief search, I went with Tailor Made Servers in Dallas.

Before I start, a few things should be noted about each server:

  • the Dallas server runs an Intel 3.1 ghz dual core cpu, 4gb of RAM, and 7200 rpm drives
  • the Missouri server runs an AMD 1.8 ghz dual core cpu, 8gb of RAM, and 15k drives
  • both servers run PHP 5.3.3, APC (configured identically), and CentOS 5.5
  • Apache and other configs are identical on both servers; to supplement, they score identically in YSlow and PageSpeed.

Overall, I’ve done everything I can to make sure both servers are identical.  I don’t consider the hardware differences that significant given the trivial amount of resources it takes to load my home page (which is the only page I tested against).  Additionally, ping and traceroute tests, which I’ll illustrate below, don’t tax a server’s hardware very much either, if at all.

The first thing I tested was ping latency, which provided a pretty quick glimpse at the impact distance has on latency.  Results show about 77ms for the Missouri server vs. about 18ms for the Dallas server:

Next up, I wanted to see how many network hops it took to reach each server, which traceroute can tell us.  As shown, the Missouri server requires 16 network hops while the Dallas server only requires 12:

Finally, I pointed siege to both instances to measure differences in first-byte latency and requests per second each instance can process:

Missouri: 27 requests per second at around 160ms first-byte latency

Dallas: 100 requests per second at around 50ms first-byte latency

So with both instances setup almost identically, save a cpu difference, a change in location alone resulted in about a 4x speedup for users viewing my site from Austin.  I’ve since changed DNS to point solutionfactor.net to the Dallas server, so it’s important to note that other users viewing my site from areas close to Missouri, or further north, will probably experience higher latency.  However, given that 90% of my users are based in Austin, I’ll take that hit in favor of speeding things up locally.

Posted in Performance | Leave a comment

MySQL: Index Column Order Matters

I was recently asked about whether column order matters in composite indexes, and if so, to what extent.  I recall this being the case when working with SQL Server 2000 and 2005, so I became curious how MySQL handled scenarios where a query touched different columns in a composite index.  The short answer is yes, column order does matter.  If you’re interested in seeing how, and when, follow along.

Before running some tests I loaded test data, chose 1 table, and ran a baseline query with no indexes on the table (except for it’s primary key). The data I used is one table from one of MySQL’s sample databases, sakila.  You can download the table’s dump here. It’s also worth noting that the table uses the InnoDB storage engine.

Next, I ran the following query to ensure no indexes were being used for my test query. It’s execution plan is below to confirm this, indicated by the key column being null:

SELECT first_name, last_name, last_update
FROM actor
WHERE last_name LIKE ‘Witherspoon’
AND first_name LIKE ‘Angela’

This is where the useful tests start.  First, I added an index on lastname and firstname:

CREATE INDEX IX_actor_last_name_first_name ON actor (last_name, first_name)

Next, I re-ran EXPLAIN on the baseline query above.  This time, you can see that the new index on last_name and first_name is being used:

With use of the index established, I moved on to test dependence on column order. I looked at 3 scenarios, each illustrated with an execution plan.

The first involved modifying the where clause such that it only references last_name.  In this case the index can still be used since last_name is in the left-most position of the index:

SELECT first_name, last_name, last_update
FROM actor
WHERE last_name LIKE ‘Witherspoon’

The second involved modifying the where clause such that it only referenced first_name.  In this case, you can see the index is not used. It’s worth noting that in this case, SQL Server would probably use an index scan operation (vs. an index seek, which is the faster of the 2). It would still leverage the index, but would have to scan into the index to compare against the first_name data since it’s not in the left-most position of the index. Unfortunately, MySQL doesn’t act similar, it dismisses the index entirely.

SELECT first_name, last_name, last_update
FROM actor
WHERE first_name LIKE ‘Angela’

The third test was done out of curiosity, but demonstrates intelligence on the optimizer’s part prior to executing the query.  I modified the where clause such that both columns were included, but the order of the columns cited in the query are opposite of the column order in the index.  As you can see, the order in the where clause makes no difference since the optimizer is smart enough to know which order should be applied to make use of the index.

SELECT first_name, last_name, last_update
FROM actor
WHERE first_name LIKE ‘Angela’
AND last_name LIKE ‘Witherspoon’

So, all this reinforced 2 things:

  1. Index column order does matter
  2. Know what’s going on with your queries!  Run EXPLAIN and study execution plans so that you know what your queries are doing vs. possibly making a costly guess.  That relates in this case because indexes aren’t free, and as demonstrated above, there are cases where they’re added but the optimizer simply dismisses them.  In such cases you only make things worse since DML operations actually become slower when operating on columns that are indexed.

Finally, note that all this only covers non-clustered indexes, but the same effect happens on tests I’ve ran against clustered indexes.  Plus, column order in clustered indexes also has another impact on the data itself, which I’ll follow up on in another post that digs into clustered indexes in more detail.

Posted in Database | Leave a comment