Sunday, April 28, 2013

A tale of one or more kludges

Let me tell you about a special incident I had while maintaining one of our projects. I wanted to write this while it was fresh in my mind, so that I could appropriately convey the various subtle levels of fail, not all of which I bothered to correct.

Please keep in mind this is for posterity; don't take it as a recommendation of how to go about things.

One of our customers has a business process that involves a MySQL database table which contains a list of users who are operating under a given department's license for a particular piece of software. The department wanted to provide a mass update of all their user records in that table, and the way this is accomplished is with a CSV file. They e-mail a CSV to my customer, who then massages it into a particular format, and uses a web script which we maintain to "validate" and update the database table.

So far, WTF meters should be active, but in context, it's mostly shrugworthy. I mean, they need to do a bulk update, so why not provide a CSV update feature? As long as we validate the data, right?

Now the reason I even had to touch this code, which was written by a Master's student under the supervision of my group, was that it wasn't working. The uploaded CSV would not validate. Well, it seems that at some point someone decided that the table should have a Timestamp column added to it. No problem, right?

Not exactly. You see, whoever wrote the code using vanilla Perl DBI was unfamiliar with Perl hashes, or at least how to use them with DBI. Every column was indexed by its numerical position. And yes, of course, the Timestamp column was added at the beginning of the table definition.
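
For anyone who hasn't had the pleasure: DBI will happily hand you rows keyed by column name, so a new column costs you nothing. A minimal sketch of the difference (the connection details, table, and column names here are all invented):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Connection parameters, table, and column names are invented for illustration.
my $dbh = DBI->connect('DBI:mysql:database=licenses;host=localhost',
                       'user', 'password', { RaiseError => 1 });

# Fragile: positional indexing silently breaks the moment a column is
# added or reordered -- exactly what the new Timestamp column did.
my $rows = $dbh->selectall_arrayref('SELECT * FROM license_users');
for my $row (@$rows) {
    my $user_id = $row->[17];    # which column is this, again?
}

# Robust: fetchrow_hashref keys each field by column name, so the
# table definition can grow without touching this loop.
my $sth = $dbh->prepare('SELECT * FROM license_users');
$sth->execute;
while (my $row = $sth->fetchrow_hashref) {
    my $user_id = $row->{user_id};
}
```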

It seems the programmer who did this tried to modify the supporting code. They added a check to make sure column zero had a valid date and time in it. They didn't change the numerical indices for the range of columns on which to operate (so our records were truncated), nor did they change the indices used for validation. So when the code checked whether column 17 was non-null, I had to actually count the columns to see that this was a check for the primary key being non-null. Yes, you read that correctly: for some reason the primary key was at index 17... which means now it's at position 18.

While fixing this, I noticed that the code which exports the data is duplicated... approximately 15 lines apart. I don't know why. At this point, my ability to care was minimal; I just fixed both copies.

So, having made these fixes, I tried again. The "validation" was still failing, even though my columns were properly numerically indexed at this point (did I just say "properly numerically indexed"? ...). I scrolled through the output the web-based tool was providing, and it seems that some of the rows had extra columns inserted in the middle for some reason. Of course, I had an inkling as to why this was. You see, some people hear "CSV" and they believe "comma-separated" is the complete specification for that data format. I checked the export code, and sure enough it did nothing but dump the data from each field verbatim with commas separating them. And of course some fields had internal commas.

Now if I had reported this to my customer at this point, I have no doubt he would have some of his student employees set to the task of eliminating commas from his database, rather than "bother me" with proper field handling. So I went ahead and fixed that, too.
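
For the curious, proper field handling is not exactly a research problem. The fix looks roughly like the following sketch with Text::CSV; I'm not claiming this is verbatim what went into the script, and the sample rows are invented:

```perl
use strict;
use warnings;
use Text::CSV;

# Invented sample rows; note the embedded comma in the first field.
my @rows = (
    [ 'Smith, Jane', 'jsmith@example.edu', 'CS'   ],
    [ 'Doe, John',   'jdoe@example.edu',   'Math' ],
);

# Text::CSV quotes any field containing commas, quotes, or newlines,
# so "Smith, Jane" stays one field on the way back in.
my $csv = Text::CSV->new({ binary => 1, eol => "\n" })
    or die Text::CSV->error_diag;

open my $fh, '>', 'export.csv' or die "export.csv: $!";
$csv->print($fh, $_) for @rows;
close $fh;
```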

So, all set, I attempt an import again, and the validation says... every blank field is now set to the literal string "nullstring." And the fields are still being split on comma. Whaaaa....? So I look back at the import code, and of course it is simply splitting on comma. But it looks like it was written to take quotes into account? Well, of course, I should have known better; the import would never have had quotes, so why would it handle them properly? The behavior was to split on commas first, then strip leading and trailing quotation marks from each field, and if any quotes were actually stripped and the field is now empty, set the field value to the literal "nullstring".
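
For the record, the whole split-and-strip-quotes dance can be replaced by letting a real CSV parser do the reading. A sketch, with a made-up file name:

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1 })
    or die Text::CSV->error_diag;

# getline() honors quoting, so a quoted comma stays inside its field,
# and an empty field comes back as '' -- no "nullstring" sentinel required.
my @records;
open my $fh, '<', 'upload.csv' or die "upload.csv: $!";
while (my $fields = $csv->getline($fh)) {
    push @records, $fields;    # arrayref of field values, in order
}
close $fh;
```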

Sigh.

Okay, so I fix that. Now it seems to be working. Out of curiosity, I look at the rest of the code base. Horrors. This is only one of at least four separate but virtually identical scripts that handle import/export in the same way. Copy/Paste FTL! ...

At this point, I remind myself that one of the guys on my team is currently engaged in writing the replacement for this system. All I have to do is put duct tape over the band-aid holding the rubber bands in place, and move on. There is more work to do on other projects.

Reluctantly, I inform the customer that the script is fixed and he can now proceed, and it works great... You might be curious how exactly it works. Well, I've saved the best for last. You see, first the import script generates all of the SQL necessary to run the import. But, rather than run the SQL through Perl DBI, it uses something I like to call SQL Injection as a Service. Yes, the SQL is then embedded in JavaScript and output back to the user's browser. What good does it do there? Well, you see, when the "validation" script reports that the data is good, the user clicks on "Import", and the system runs a JS loop which sends a single GET request for every row, where the parameter is the literal SQL to run on the server. And the server dutifully runs each literal SQL query as the JavaScript commands.
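
For contrast, the boring version keeps the SQL on the server and lets the browser submit nothing but data, with DBI placeholders doing the quoting. A sketch only, and every name in it (connection details, table, columns, sample record) is invented:

```perl
use strict;
use warnings;
use DBI;

# All names and sample data below are invented for illustration.
my $dbh = DBI->connect('DBI:mysql:database=licenses;host=localhost',
                       'user', 'password', { RaiseError => 1 });

my @records = (
    { user_id => 42, email => 'jsmith@example.edu', dept => 'CS' },
);

# The browser submits field values only; the SQL text never leaves
# the server, and placeholders make injection a non-issue.
my $sth = $dbh->prepare(
    'UPDATE license_users SET email = ?, dept = ? WHERE user_id = ?'
);
for my $rec (@records) {
    $sth->execute($rec->{email}, $rec->{dept}, $rec->{user_id});
}
```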

Like I said, seems to be working. I'm going to go sit in a corner and cry now.