Wednesday, July 8, 2015

Whether you like it or not, no one should ever claim to be a data analyst until he or she has done string manipulation.

I am reading Gaston Sanchez' book Handling and Processing Strings in R (pdf).

In the preface, I found the following quote, to which I wholeheartedly agree:

Perhaps even worse is the not so uncommon believe that string manipulation is a secondary non-relevant task. People will be impressed and will admire you for any kind of fancy model, sophisticated algorithms, and black-box methods that you get to apply. Everybody loves the haute cuisine of data analysis and the top notch analytics. But when it comes to processing and manipulating strings, many will think of it as washing the dishes or pealing and cutting potatos. If you want to be perceived as a data chef, you may be tempted to think that you shouldn’t waste your time in those boring tasks of manipulating strings. Yes, it is true that you won’t get a Michelin star for processing character data. But you would hardly become a good data cook if you don’t get your hands dirty with string manipulation. And to be honest, it’s not always that boring. Whether you like it or not, no one should ever claim to be a data analyst until he or she has done string manipulation.

No comments:

Post a Comment