Back in May, writer Jun Wu explained on her blog how Perl excels at text manipulation. She often uses it to tidy data sets, since data is often collected with variations and has to be cleaned up before use. She goes through many one-liners that help make that easy.
Having old reliables is my key to success. Ever since I learned Perl during the dot com bubble, I knew that I was forever beholden to its powers to transform.
You heard me. Freedom is the word here with Perl.
When I'm coding freely at home on my fun data science project, I rely on it to clean up my data.
In the real world, data is often collected with loads of variations. Unless you are using someone's "clean" dataset, you better learn to clean that data real fast.
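As a rough illustration of the kind of cleanup described above, here is a minimal sketch in Python rather than one of the Perl one-liners from the blog post. The sample values and the `clean_field` helper are made up for illustration; real data would likely need more rules than these.

```python
import re

def clean_field(value: str) -> str:
    """Trim the ends, collapse runs of whitespace, and lowercase a text field."""
    return re.sub(r"\s+", " ", value.strip()).lower()

# Hypothetical sample: one city name collected with three variations.
raw = ["  New York", "new  york ", "NEW YORK"]

# After cleaning, the variations collapse into a single distinct value.
cleaned = sorted({clean_field(v) for v in raw})
print(cleaned)
```

Once the variations are normalized this way, counting or grouping by the field treats all three spellings as the same entry.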
(Score: 2) by gringer on Friday September 20 2019, @03:04AM
R is terrible for text processing. Extracting matches from regular expressions involves compiling the expressions and parsing a list. Some of the newest tidyverse packages are greatly improving the syntax, but the speed for text manipulation still remains an issue.
Ask me about Sequencing DNA in front of Linus Torvalds [youtube.com]