Like many others, I've recently enjoyed chess a lot.

Like many others, I've recently enjoyed data science a lot.

Like many others, I enjoy shoe-horned analogies, but hear me out on this one.

Learning about chess has taught me a lot about data science.

The Rules

Don't worry. I am not going to explain the rules of chess to you.

However, the rules are relatively straightforward, and anyone above Elo 500 arguably understands how the pieces move.

At that point, people have learned all the individual movement types and probably castling, too.

Then, around an Elo of 1000, people learn about en passant and will never miss a chance to use it. Ever. Probably until they have played another 100 games and gotten slightly bored of the move.

That's all there is to chess, right?

I mean, yes, sure, except for all the strategy and tactics.

Playing someone outside of your Elo feels like having your mind read. For every move you deliberate on, they have an immediate and significantly better response.

Levels of Understanding

Professional chess players study something called theory. Yes, chess theory.

How does a particular move sequence impact the long-term outcome of the game? How can I incrementally improve my position and maybe even win a pawn?

I, on the other hand, sometimes lose a queen, the most powerful piece, in a one-move blunder.

Would reading a 400-page book on the intricacies of the move Queen to B6 help me? Absolutely not.

I don't have the necessary context that this book could live in. There is no fertile ground this new-found knowledge could fall upon.

Finally, the data science analogy starts. You made it!

The Intricacies of Queen to B6

This is the reason we teach strong fundamentals to beginners in data science.

Focus on fundamentals like:

  • Keep models simple at first to avoid overfitting.

  • Always use proper model validation, as I lay out in my Understanding Machine Learning Validation ebook.

  • Use exploratory data analysis to get a feel for the data and test how well it was cleaned.

A beginner data scientist has no context for the intricacies of the difference between a K-Fold and a Group Shuffle Split validation scheme.
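For the curious, here is a minimal sketch of that difference in plain Python. The player names, the 12-game dataset, and the helper functions are all made up for illustration; in practice you would likely reach for scikit-learn's `KFold` and `GroupShuffleSplit` instead. The idea is that a plain K-Fold can put games by the same player into both training and test data, while a group-aware split holds out whole players.

```python
import random


def k_fold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train, test) index lists for a shuffled K-Fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds, like dealing cards.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


def group_shuffle_split(groups, test_fraction=0.25, seed=0):
    """Return one (train, test) split that holds out whole groups together."""
    unique_groups = sorted(set(groups))
    random.Random(seed).shuffle(unique_groups)
    n_test = max(1, round(test_fraction * len(unique_groups)))
    test_groups = set(unique_groups[:n_test])
    train = [i for i, g in enumerate(groups) if g not in test_groups]
    test = [i for i, g in enumerate(groups) if g in test_groups]
    return train, test


def shared_groups(groups, train, test):
    """Groups that appear on both sides of a split (a sign of leakage)."""
    return {groups[i] for i in train} & {groups[i] for i in test}


# Hypothetical data: 12 games played by 4 players, 3 games each.
groups = ["anna"] * 3 + ["ben"] * 3 + ["cleo"] * 3 + ["dev"] * 3

# Plain K-Fold ignores who played which game.
kfold_leaks = [
    shared_groups(groups, train, test)
    for train, test in k_fold_indices(len(groups), n_splits=5)
]

# The group-aware split keeps each player entirely in train or in test.
group_train, group_test = group_shuffle_split(groups)
group_leak = shared_groups(groups, group_train, group_test)

print("K-Fold, players on both sides per fold:", kfold_leaks)
print("Group split, players on both sides:", group_leak)
```

If the model can memorise a player's style, the leaky folds will make it look better than it is. That is the kind of nuance that only lands once the fundamentals are in place.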

A beginner chess player benefits most from learning solid openings and not blundering queens in one move.

Good habits.

Solid foundations.

Constant Improvements - Continuous Learning

Data science is a practice. Just like chess, you get better by studying, but in the end, you need practice.

How many times have I sat through a GothamChess recap video where Levy says:

"and this move has never been played before!"

No recorded game has reached that state of the board, despite millions of people playing this game over the decades.

Every analysis is ever so slightly different, just as many chess games will teach you a new way of losing that you don't want to repeat.

This is why I advocate for learning with applied projects and incrementally more challenging pieces of work.

Replace me with a Computer

My current chess skill is at a level where a chess program on a graphing calculator could beat me.

Can you automate large parts of data science analysis?

Yes.

Are those the bits that will replace a team of experienced data scientists?

No.

Data Science is more than just the analysis.

It is about asking whether a question is even worth asking. It is about finding the hidden patterns in data, like finding a systematic pattern in missing variables.
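As a small, hedged illustration of what hunting for a systematic pattern in missing data can look like, here is a sketch in plain Python. The station names, the `temp_c` field, and the readings are invented for this example; the point is only that comparing missing rates across groups is a quick first check.

```python
from collections import defaultdict


def missing_rate_by_group(records, group_key, value_key):
    """Return the fraction of missing (None) values per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [missing, total]
    for record in records:
        tally = counts[record[group_key]]
        tally[0] += record[value_key] is None  # True counts as 1
        tally[1] += 1
    return {group: missing / total for group, (missing, total) in counts.items()}


# Hypothetical readings from two made-up weather stations.
records = [
    {"station": "coast", "temp_c": 11.2},
    {"station": "coast", "temp_c": 12.0},
    {"station": "coast", "temp_c": None},
    {"station": "coast", "temp_c": 10.7},
    {"station": "summit", "temp_c": None},
    {"station": "summit", "temp_c": None},
    {"station": "summit", "temp_c": -3.1},
    {"station": "summit", "temp_c": None},
]

rates = missing_rate_by_group(records, "station", "temp_c")
print(rates)
```

A big gap between groups hints that the values are not missing at random. Working out why (a sensor that freezes on the summit, say) is the part that still needs a human asking questions.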

While machine learning and data science principles are important and powerful tools, it takes skill to iterate on the EDA or model architectures in the machine learning setup.

And don't get me started on any problem that isn't a table or a regularly sampled cube. Looking at you, weather data!

Move 37

I love machine learning. I loved watching AlphaGo play Move 37 in a game I barely grasped, with every commentator losing their mind over the creativity of a computer program in the game of Go.

I love automating everything I can.

I think many jobs will slowly be automated over time, including some parts of data science.

In the end, there will be space for human innovation and understanding, particularly in complex topics, for quite some time.

Also, who would want to miss the fun drama around chess tournaments or the incredibly entertaining TikToks from Ilona Maher of the US Rugby Team? Humans can be very entertaining.

Human creativity should not be underestimated. Not in TikToks, not in chess, and not in data science.

Is it worth it?

Yes.

Chess is a lot of fun.

Yes.

Data Science can be very fulfilling.

Yes.

Machine Learning is what I work in now and it feels like my brain is being challenged every day.

Learning any of these is a fantastic journey, and each has enriched my life.