I love it when an applied scientist asks me how to get into data science or machine learning.
Sure, any scientist, but geologists, biologists, oceanographers and environmental scientists especially. Applied disciplines that are commonly looked down upon by fields like physics or math. Those are the diamonds in the rough.
I hear the outcry already:
But math! But statistics! Those are mud people!
But here's the thing.
Few people have an intuitive understanding of complexity the way applied scientists do. Few people can make a decision and gain insight in an environment where new data isn't available and uncertainty is high.
Applied scientists know data.
It's a marvel to watch an applied scientist dig through a data set. Depending on the scientist, it may also be good to keep the kids out of earshot, but that's another story.
Data is messy.
Nothing pokes a hole in well-laid plans like getting the real-world data set. Andrew Ng, after teaching the world machine learning, now teaches the world how important data is. And for good reason: your neural network, your random forest, they're just random numbers in a computer until they are conditioned on data.
Applied scientists have seen it before.
An applied scientist has seen it before, even if imposter syndrome creeps in because this knowledge is hard to quantify. Analyzing the genes from the bone marrow of a dinosaur that died tens of millions of years ago, or working out how a region formed over millions of years just by looking at a weathered cliff: those are awe-inspiring feats.
Those are hard skills. Those are skills that teach you to handle data. Those are skills that are hard to teach and hard to learn.
Teaching basic statistical concepts is easy after that.
That’s why I love it when applied scientists are interested in data science.