It sounds sh**, but I am talking from experience. We are going through a data migration exercise in which we have to migrate addresses from one system into another. In the source system the addresses are held as free text, which can hold 4 lines and 160 characters in all. The target system accepts the data only in a particular format (Street in one field, Building in another, then Town, Region, Post Code and so on) and can hold only up to 120 characters. No existing functionality in the target system could be changed, so the source addresses had to be reformatted to fit the target format.
And since the source addresses were free-form text, and the users who typed them were ethnically, psychologically and emotionally different, each of those 20,000 addresses had been written in its own way.
So yours truly thought: why can't we automate the cleaning and migrate the addresses across to the target system, instead of asking the users to clean the whole data set? I thought like a rocket scientist. How does a person usually think while entering an address? The city goes last, the street before that, the building before that, and before that a first line such as c/o So and So. Oh no. There can be just two lines of address; not all the lines will necessarily be present. Wait, some lines may be more than 30 characters, some more than 20. Some lines will explicitly name a house (Flaunders House, Westminster House..) or a street (Fleet Street, Dame Street, Park Road..) phew!

But it triggered a thought in me: we shouldn't really think so much when it comes to data migrations. There should be a one-to-one mapping. Forget about intelligent address-deciphering routines that might have taken inputs from psychology, ergonomics, ethnicity or even evolution. Ultimately what matters is that your audience understands what is really happening.
It's true, because now when I look at my magnum opus, I myself cannot understand what it does!
Keeping the logic simple and understandable is very important, so that if someone looks at the code in future, things become clear in an instant. That, I think, is why someone once said: it is very easy to write in difficult words, but very difficult to write in simple words.
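To make the one-to-one idea concrete, here is a minimal sketch of what such a mapping could look like. It is not the routine I actually wrote. The field names, the order in which source lines are copied into them, and the 30-character per-field limit (120 characters split across four fields) are all assumptions made for illustration. Source line N goes into target field N, and anything that does not fit is flagged for a user to fix rather than guessed at.

```python
from dataclasses import dataclass, field

# Hypothetical target fields, in the order source lines are copied into them.
TARGET_FIELDS = ["building", "street", "town", "region"]
# Assumed per-field limit: 120 characters split evenly across four fields.
MAX_FIELD_LEN = 30


@dataclass
class MigratedAddress:
    fields: dict = field(default_factory=dict)
    needs_review: bool = False
    reasons: list = field(default_factory=list)


def migrate_address(free_text: str) -> MigratedAddress:
    """Copy source line N into target field N; flag anything that does not fit."""
    # Drop blank lines and surrounding whitespace, nothing cleverer than that.
    lines = [ln.strip() for ln in free_text.splitlines() if ln.strip()]
    result = MigratedAddress(fields={name: "" for name in TARGET_FIELDS})

    if len(lines) > len(TARGET_FIELDS):
        result.needs_review = True
        result.reasons.append("more lines than target fields")

    for name, line in zip(TARGET_FIELDS, lines):
        if len(line) > MAX_FIELD_LEN:
            result.needs_review = True
            result.reasons.append(f"{name} exceeds {MAX_FIELD_LEN} characters")
        # The value is stored as-is; it is never truncated silently.
        result.fields[name] = line

    return result


if __name__ == "__main__":
    migrated = migrate_address("Westminster House\nFleet Street\nLondon")
    print(migrated.fields, migrated.needs_review)
```

Anyone reading this can tell in an instant where a given source line ends up, and the flagged records form a short, explicit list for the users to clean by hand.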