You know, assuming everything goes according to plan…

Which never ever happens.

Right after posting about this fantastic large MySQL file upload script…  it stalled out around insert #2,650,000.  Thus, much fiddling with file offsets and start positions ensued.  So, fair warning – the script has zero upload recovery modes.

That said, it’s still significantly easier than using any of the methods I had tried earlier.

Anyhow, if your BigDump <shudder – what a name!> script fails on you, there is a way to force it to resume the upload.  I was performing the upload on a GZipped file – which makes the recovery process more difficult.  Nevertheless, here’s how you do it:

  1. Using Firefox and the Firebug plugin (you are using Firefox, right?), examine the last AJAX call
  2. Copy and paste the URL of the last AJAX call into a new browser tab
  3. Look for the GET variables in the URL named “start” and “foffset”
  4. This is the tedious part.  You’ll need to play with those two numbers until your script restarts (see the sketch after this list).  When you get these numbers wrong, BigDump will be kind enough to show you the last bad MySQL line – it will almost certainly be a partial line.
  5. Looking at what has already been uploaded into your database, estimate whether the current MySQL error falls before or after the last entry.
  6. If it falls after, crank the “foffset” number way down and retry.  From there, ratchet “foffset” back up – but always starting from a point below the last good entry.  That way you don’t accidentally resume the upload past the last good entry and end up with missing rows.
  7. When you hit the correct “foffset” value, the browser window will take a while to pull up the page.  When it does, you will see the standard status screen with the file transfer data resumed.

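To make step 6 less abstract, here’s a hypothetical little helper for generating candidate resume URLs to paste into the browser one at a time. Only the “start” and “foffset” parameter names come from BigDump’s AJAX call; the base URL, numbers, and step size are placeholders I made up for illustration (the real AJAX URL carries other parameters too – copy yours from Firebug and change only these two):

```php
<?php
// Hypothetical helper for step 6: print candidate resume URLs so I can try
// them in the browser one at a time.  Only "start" and "foffset" are real
// BigDump parameter names; everything else below is a placeholder.
$base    = 'http://example.com/bigdump.php';  // placeholder path to the script
$start   = 2650001;      // a guess at the next query/insert number
$foffset = 401000000;    // a byte offset safely BELOW the last good entry
$step    = 250000;       // how far to ratchet the offset up on each attempt

for ($i = 0; $i < 10; $i++) {
    echo $base . '?start=' . $start . '&foffset=' . ($foffset + $i * $step) . "\n";
}
```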
I’ve used this process exactly once to resume a transfer of a GZipped file.  There’s no guarantee it will work for you, so use the above at your own risk.

Sweet PHP script, tragically terrible name

I’ve probably mentioned before how much of a PHP/MySQL/WordPress nerd I am.  If I haven’t so far, well, now I have.

One of the sites I operate requires numerous tables with about five million rows altogether.  Automated backups are easy with any number of WordPress plugins.  But what do you do when you’ve got a HUGE SQL file, really small upload limits in phpMyAdmin, and short server timeouts preventing a single script from restoring said huge file?

The brilliant scheme I came up with (I’m being sarcastic here) was to take those large files and split them up into smaller files of about 3MB each.  This approach was problematic for a number of reasons.  It required me to download a huge file and slice it up into small files along MySQL insert boundaries.  Copying and pasting 25,000 rows at a time was giving my poor laptop fits.  It also meant I had to manipulate the full plaintext MySQL file rather than a GZipped version.  On top of all of that, I was paranoid that I would accidentally omit a single row and have to start all over again.  To get around the server timeouts, I wrote a script that would load one small file, finish, then call itself again, incrementing to the next file to be loaded, LRR.
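For what it’s worth, that serial-upload scheme looked roughly like this – a minimal sketch of the idea, not my original script, with the file names, credentials, and naive chunk-splitting all assumed:

```php
<?php
// Rough sketch of the serial-upload scheme described above: import one small
// SQL chunk per request, then have the page call itself for the next chunk so
// no single request hits the server timeout.  All names/credentials are placeholders.

$n    = isset($_GET['n']) ? (int) $_GET['n'] : 1;
$file = sprintf('chunks/dump_part_%04d.sql', $n);   // e.g. dump_part_0001.sql

if (!file_exists($file)) {
    exit('Done – no more chunk files.');
}

$db = new mysqli('localhost', 'db_user', 'db_pass', 'db_name');

// Naive split on ";\n" – assumes one statement per line and no semicolons
// inside the data, which is exactly the kind of fragility that made this
// approach so error prone.
foreach (explode(";\n", file_get_contents($file)) as $query) {
    if (trim($query) !== '') {
        $db->query($query);
    }
}

// Kick off the next chunk before this request times out.
header('Location: ' . $_SERVER['PHP_SELF'] . '?n=' . ($n + 1));
```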

There was nothing good about this approach: it was extremely time- and labor-intensive, and highly error prone.

I wrote the above serial-file-upload scheme about a year and a half ago.  My PHP/MySQL kung fu is now much stronger.  Back then I never used scripts written by others – foolishly thinking that it would take me longer to understand how to integrate their scripts than to build a serviceable script from scratch.  I went so far as to write my own AJAX library.  It was functional, but nowhere near the quality and reliability of jQuery.

Much humility and less hubris later, I’m more confident in my ability to read, understand, and interface with scripts written by others.  This lets me focus on writing only those things that are truly unique and critical to my projects, without having to re-invent the wheel.

What’s the name of this magnificent script I’m using for bulk uploads of really large MySQL databases?  Well, this is where the “tragically terrible name” bit from the subject line of this post comes in.  The script is called <wince cringe>“BigDump.”</wince cringe>  This script just automated a HUGE chunk of work for me.  Instead of struggling with more than 4 million lines of MySQL, I uploaded a SQL file, changed some variables in the BigDump script, and ran it.
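If memory serves, “changed some variables” amounts to editing a handful of settings near the top of bigdump.php – roughly like the excerpt below. The variable names are as I recall them from the copy I used, so double-check your own copy; the values are placeholders:

```php
<?php
// Excerpt-style sketch of the settings edited near the top of bigdump.php.
// Variable names as I remember them – verify against your own copy, since
// they may differ between versions.  Values are placeholders.
$db_server   = 'localhost';
$db_name     = 'my_wordpress_db';
$db_username = 'db_user';
$db_password = 'db_pass';
$filename    = 'hugedump.sql.gz';   // the uploaded dump file (mine was GZipped)
```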

The total upload time is probably about the same as with the script I was using.  The benefit is in not having to wrangle a ton of code before getting around to running the script.