Every week, I shared 3 links about Python, machine learning, and AI in my newsletter.
Which of them were the favourites of my 1000 subscribers in 2022?
Explainer Dashboard
Explainability is becoming increasingly important.
This package makes it easy to deploy a dashboard that shows various model explainability metrics and visualizations, including:
- SHAP values
- Permutation importances
- Partial dependence plots
- SHAP interaction values
- Visualisation of individual decision trees
- Precision plots, confusion matrix, ROC AUC plot, PR AUC plot, for classifications
- Goodness-of-fit plots and residual plots for regressions.
github.com/oegedijk/explainerdashboard
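To give a feel for one of the metrics in that list, here is a minimal sketch of permutation importance in plain Python. It is not how explainerdashboard computes it. The toy model, the data, and the function names are all made up for illustration. The idea: shuffle one feature column, re-score the model, and treat the drop in score as that feature's importance.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with the shuffled value in position `col`.
            permuted = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(rows, shuffled)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy "model": it only ever looks at feature 0.
def toy_model(row):
    return int(row[0] > 0.5)


data_rng = random.Random(42)
rows = [[i / 19, data_rng.random()] for i in range(20)]  # feature 1 is pure noise
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels)
print(importances)  # feature 1 scores exactly 0.0: the toy model never reads it
```

Shuffling the noise column cannot change any prediction, so its importance is zero, while shuffling feature 0 breaks the link between input and label and the accuracy drops.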
LOFO Importance
Feature importance can be a little misleading.
Instead of asking whether a feature is important at all, LOFO asks how important a feature is relative to the other features that are present, even when those features are correlated.
This can be an important difference!
Leave One Feature Out (LOFO) importance is a way to make feature selection in machine learning more robust.
github.com/aerdem4/lofo-importance
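The leave-one-feature-out idea can be sketched in a few lines of plain Python. This is a toy illustration, not the lofo-importance package's API or implementation. The lookup-table "model", the contiguous folds, and the data are assumptions chosen so the result is easy to check by hand. Feature 1 is an exact copy of feature 0, and the label is feature 0 AND feature 2.

```python
from collections import Counter, defaultdict


def fit_lookup(rows, labels):
    """Toy 'model': remember the majority label for each distinct feature tuple."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[tuple(row)][label] += 1
    fallback = Counter(labels).most_common(1)[0][0]
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    return lambda row: table.get(tuple(row), fallback)


def cv_accuracy(rows, labels, n_folds=4):
    """Mean accuracy over contiguous, equally sized folds."""
    fold_size = len(rows) // n_folds
    scores = []
    for k in range(n_folds):
        lo, hi = k * fold_size, (k + 1) * fold_size
        predict = fit_lookup(rows[:lo] + rows[hi:], labels[:lo] + labels[hi:])
        hits = sum(predict(r) == y for r, y in zip(rows[lo:hi], labels[lo:hi]))
        scores.append(hits / fold_size)
    return sum(scores) / n_folds


def lofo_importance(rows, labels):
    """Baseline CV score minus the CV score with each feature left out in turn."""
    baseline = cv_accuracy(rows, labels)
    result = []
    for col in range(len(rows[0])):
        reduced = [row[:col] + row[col + 1:] for row in rows]
        result.append(baseline - cv_accuracy(reduced, labels))
    return result


# Toy data: feature 1 duplicates feature 0; the label is a AND b.
combos = [(0, 0), (0, 1), (1, 0), (1, 1)]
rows = [[a, a, b] for _ in range(4) for a, b in combos]
labels = [a & b for _ in range(4) for a, b in combos]

print(lofo_importance(rows, labels))  # [0.0, 0.0, 0.25]
```

Features 0 and 1 each score zero because the other copy carries the same information, while dropping feature 2 costs accuracy. The label clearly depends on feature 0, yet relative to the features that are present it adds nothing.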
100 ML Tips
I am very happy to say that my 100 tips for machine learning video was really popular!
Stream Processing
Almost any ML practitioner works on historical data.
The switch to real-time stream processing with tools like Kafka can be quite the leap.
But Chip Huyen, ML engineer extraordinaire, shared insights into how the workflow changes.
huyenchip.com/.../stream-processing-for-data-scientists
Awesome Diffusion Models
Diffusion models are all the rage right now.
Keeping up is basically impossible.
So here's at least an Awesome List of Diffusion Models!
github.com/heejkoo/Awesome-Diffusion-Models
What's going on with the big tech hiring freeze
That big hiring freeze was weird.
Especially when you were looking around, there seemed to be a bit of a divide between seniority and software roles.
This analysis was really interesting!
Language Generalization
Generalization... that elusive goal of every machine learning modeller.
Turns out language is very difficult to generalize and I loved some of the insight into different modalities of generalization in this article.
To Understand Language is to Understand Generalization
Machine Learning MOOC in Weather and Climate
People were really interested in the machine learning in weather and climate MOOC that we're building at ECMWF.
That is lovely to hear, since a ton of work is going into this project!
Subscribe to receive insights from Late to the Party on machine learning, data science, and Python every Friday.
Python's match-case
Apparently, I shared the PEP for the Python match-case statement.
Good to hear!
Throughout Advent of Code, I have been loving the feature more and more!
Transformers United
Stanford uploaded their transformer course to YouTube.
Here's the playlist:
Idiot proof git aliases
I had a phase when I wanted to improve my git game.
And I did!
This article was really important in this process:
softwaredoug.com/ ...idiot-proof-git-aliases
Elon Code Review
This website was such a lovely troll!
Parsr
Parsr works as a minimal document extraction tool.
Layout Parser
Subscribers really liked document extraction tools.
No wonder. So much information is tucked away in PDFs.
github.com/Layout-Parser/layout-parser
Explainpaper
This webapp went viral.
Upload a paper.
Get an explanation.
Books about ML Ops
My post about books for MLOps somehow got really popular.
I'm glad it resonates!
dramsch.net/articles/mlops-books/
Quantus
Since shap is in a bit of a state currently, people are looking for other explainability tools.
Quantus is one of these tools.
github.com/understandable-machine-intelligence-lab/Quantus
Dangit Git?!
We all mess up sometimes.
Luckily we have git to revert those errors.
Dangit Git is a short collection of the most common "help, I messed up, fix this please"-commands.
Machine Learning Engineering Flashcards
Anyone at NormConf can now cross off another Bingo square.
Regardless, these machine learning engineering flashcards were quite popular.
github.com/b7leung/MLE-Flashcards
Easier matplotlib subplots with mosaic
I was as excited about this one as the subscribers.
An easier way to make complicated subplots in matplotlib?!
Yes, please!
matplotlib.org/ ...mosaic.html
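As a rough sketch of the idea, assuming a reasonably recent matplotlib (subplot_mosaic arrived in version 3.3): the layout is written as a small text picture, and every letter becomes a named Axes. The layout string, data, and titles here are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Each letter names one Axes; repeating a letter makes that Axes span several cells.
fig, axes = plt.subplot_mosaic(
    """
    AB
    CC
    """,
    figsize=(6, 4),
)

axes["A"].plot([0, 1, 2], [0, 1, 4])
axes["A"].set_title("A: top left")
axes["B"].bar(["x", "y"], [3, 5])
axes["B"].set_title("B: top right")
axes["C"].plot([0, 1, 2], [2, 1, 2])
axes["C"].set_title("C: full-width bottom row")

fig.tight_layout()
# Call fig.savefig(...) or plt.show() from here to look at the result.
```

The returned axes object is a dictionary keyed by the letters in the layout, so there is no index arithmetic to work out which panel is which.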
Conclusion
It has been a pretty fantastic year for open-source and real-world machine learning.
People still want to wrangle information out of their PDFs.
Explainability has been a huge topic.
We all seem to struggle with git sometimes.
And some of my content even made the list, which is lovely!
On to 2023!