Well, to all of you out there in the IT world: just understand that it's not rocket science. I am not trying to discourage anyone, but that's the truth, and the sooner it is accepted the better. Here simple layman ideas find a better place, because the people you are serving don't like rockets and scientists. All they are interested in is that 2 + 2 = 4, and NOT how it is calculated. And when time, budget and resources are burying the project, and your tired IT arse, further down into the ground, you don't think about rockets; you think about how to escape.
It sounds like sh**, but I am talking from experience. We are going through a data migration exercise in which we have to migrate addresses from one system into another. In the source system the addresses are held as free text, which can hold 4 lines and 160 characters in all. The target system accepts the data only in a particular format - Street in one field, Building in another, then Town, Region, Post Code and so on - and can hold only up to 120 characters. No existing functionality was to be changed in the target system, so the source addresses had to be reformatted to fit the target format.
And since the source addresses were free-form text, and the users were ethnically, psychologically and emotionally different, each of those 20,000 addresses had been written in its own way.
So yours truly thought: why can't we clean these addresses and migrate them across to the target system by automating the cleaning, instead of asking the users to clean the whole data set? I thought like a rocket scientist. How does a person usually think while entering an address? He writes the city last, before that the street, before that the building, and before that a first line such as c/o So and So. Oh no. There can be just two lines of address; not all the lines will necessarily be present. Wait, some lines may be more than 30 characters, some more than 20. Some lines will name an explicit house (Flaunders House, Westminster House..) or street (Fleet Street, Dame Street, Park Road..). Phew!
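To give a flavour of that line of thinking, here is a minimal sketch in Python (not the original REXX routine, which was far fatter). The field names, the keyword lists and the `guess_fields` function are all my own assumptions for illustration; the only rules taken from the story are "town comes last" and "some lines name a house or a street".

```python
# Hypothetical keyword lists; a real routine would need far more of these.
BUILDING_WORDS = ("house", "court", "tower")
STREET_WORDS = ("street", "road", "lane", "avenue")


def guess_fields(free_text):
    """Guess structured fields from a free-text address of up to 4 lines."""
    lines = [ln.strip(" ,\t") for ln in free_text.splitlines()]
    lines = [ln for ln in lines if ln]  # drop empty lines
    fields = {"care_of": "", "building": "", "street": "", "town": "", "unparsed": []}
    if not lines:
        return fields

    # Assumption from the post: people write the town last.
    fields["town"] = lines.pop()

    leftovers = []
    for line in lines:
        last_word = line.split()[-1].lower()
        if last_word in STREET_WORDS and not fields["street"]:
            fields["street"] = line
        elif last_word in BUILDING_WORDS and not fields["building"]:
            fields["building"] = line
        else:
            leftovers.append(line)

    # Unrecognised lines fall back to position, filling from the bottom up:
    # street first, then building, then the c/o line.
    for name in ("street", "building", "care_of"):
        if leftovers and not fields[name]:
            fields[name] = leftovers.pop()

    # Whatever is still left over could not be placed and needs a human.
    fields["unparsed"] = leftovers
    return fields


print(guess_fields("c/o John Smith\nWestminster House\nFleet Street\nLondon"))
```

Even this toy version already carries guesses that nobody on the business side signed up for, and every new exception in the data adds another branch.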
And so I started building this cleaner routine in REXX, and the code kept becoming fatter and fatter. In the end the routine cleaned up around 90% of those 20,000 addresses. Phew! I thought this would be my magnum opus of the year. Two days before production rollout, the business users tell me - we are not able to map the addresses of the source system properly to the target system. I said what the f*** (in my mind) - that will not always be possible, because the addresses are not structured properly and need a lot of readjustment before they are migrated across. They say - no, we want one-to-one mapping, because our dim brains cannot understand what you are talking about. Somehow we escaped by saying, "You didn't raise this during UAT and the previous testing sprints, and now suddenly you are waking up from your slumber when we are two days away from going live. We cannot change code at this stage, and this has to go the way it is now." We had to hold them to ransom and get them to accept it the way it was.
But it triggered a thought in me: we shouldn't really think so much when it comes to data migrations. There should be one-to-one mapping. Forget about intelligent address-deciphering routines that might have taken inputs from psychology, ergonomics, ethnicity or even evolution. Ultimately what matters is that your audience understands what's really happening.
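For contrast, a one-to-one mapping can be sketched in a handful of lines. Again this is Python and only an illustration: the order of `TARGET_FIELDS` and the `map_one_to_one` function are hypothetical, and the one figure taken from the story is the 120-character capacity of the target. Source line N always lands in target field N, and anything that does not fit is flagged for a human rather than cleverly rewritten.

```python
# Hypothetical target fields, in the order the source lines are mapped to them.
TARGET_FIELDS = ("building", "street", "town", "region")
TARGET_MAX_CHARS = 120  # total capacity of the target, as stated in the post


def map_one_to_one(free_text):
    """Map source line N straight to target field N, with no guessing."""
    lines = [ln.strip() for ln in free_text.splitlines()]
    lines = (lines + [""] * 4)[:4]  # the source holds at most 4 lines
    record = dict(zip(TARGET_FIELDS, lines))
    # Do not truncate or rearrange; just flag records that will not fit.
    needs_review = sum(len(v) for v in record.values()) > TARGET_MAX_CHARS
    return record, needs_review


record, needs_review = map_one_to_one("Westminster House\nFleet Street\nLondon")
print(record, needs_review)
```

A business user can check this against the source screen line by line, which is exactly what they were asking for.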
It's true. Because now when I look at my magnum opus, I myself cannot understand what it does!
Keeping the logic simple and understandable is very important, so that in future if someone looks at the code things become clear in just an instant. That's the reason, I think, someone once said - it is very easy to write in difficult words, but very difficult to write in simple words.