I share 3 links about Python, machine learning, and AI in my newsletter every week.

For 2022, what were the favourites of my 1000 subscribers?

Explainer Dashboard

Explainability is becoming increasingly important.

This package makes it easy to deploy a dashboard that shows various model explainability metrics and visualizations, including:

  • SHAP values
  • Permutation importances
  • Partial dependence plots
  • SHAP interaction values
  • Visualisation of individual decision trees
  • Precision plots, confusion matrix, ROC AUC plot, and PR AUC plot for classifiers
  • Goodness-of-fit plots and residual plots for regression models

github.com/oegedijk/explainerdashboard

LOFO Importance

Feature importance can be a little misleading.

Instead of asking whether a feature is important at all, LOFO asks how important a feature is relative to the other features that are present, even when those features are correlated.

This can be an important difference!

Leave One Feature Out (LOFO) importance is a robust approach to feature selection in machine learning.

github.com/aerdem4/lofo-importance
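To make the idea concrete, here is a minimal pure-Python sketch of leave-one-feature-out importance. It does not use the lofo-importance package or its API. The function names, the toy data, and the leave-one-out 1-nearest-neighbour scorer are all assumptions chosen for illustration, and a real setup would use a proper model with cross-validation.

```python
def loo_1nn_accuracy(rows, targets, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given features (illustrative scorer)."""
    hits = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in features),
        )
        hits += targets[nearest] == targets[i]
    return hits / len(rows)


def lofo_importance(rows, targets, feature_names, score_fn):
    """Importance of a feature = score with all features
    minus score with that one feature left out."""
    baseline = score_fn(rows, targets, feature_names)
    importances = {}
    for name in feature_names:
        remaining = [f for f in feature_names if f != name]
        importances[name] = baseline - score_fn(rows, targets, remaining)
    return importances


# Hypothetical toy data: the label depends on both "s" and "u",
# but "s_copy" duplicates "s" exactly (a perfectly correlated feature).
rows, targets = [], []
for s_bit in (0, 1):
    for u_bit in (0, 1):
        for jitter in (0.0, 0.1):
            s, u = s_bit + jitter, u_bit + jitter
            rows.append({"s": s, "s_copy": s, "u": u})
            targets.append(2 * s_bit + u_bit)

importances = lofo_importance(rows, targets, ["s", "s_copy", "u"], loo_1nn_accuracy)
print(importances)  # {'s': 0.0, 's_copy': 0.0, 'u': 1.0}
```

In this toy setup "s" is needed to predict the label, yet it scores zero because "s_copy" carries the same information. That is the "relative to the other features that are present" part.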

100 ML Tips

I am very happy to say that my video with 100 tips for machine learning was really popular!


Stream Processing

Almost every ML practitioner works on historical data.

The switch to real-time stream processing with tools like Kafka can be quite the leap.

But Chip Huyen, ML engineer extraordinaire, shared insights into how the workflow changes.

huyenchip.com/.../stream-processing-for-data-scientists

Awesome Diffusion Models

Diffusion models are all the rage right now.

Keeping up is basically impossible.

So here's at least an Awesome List of Diffusion Models!

github.com/heejkoo/Awesome-Diffusion-Models

What's going on with the big tech hiring freeze

That big hiring freeze was weird.

Especially when you were looking around, there seemed to be a bit of a divide between seniority and software roles.

This analysis was really interesting!

blog.interviewing.io

Language Generalization

Generalization... that elusive goal of every machine learning modeller.

Turns out language is very difficult to generalize, and I loved some of the insights into different modalities of generalization in this article.

To Understand Language is to Understand Generalization

Machine Learning MOOC in Weather and Climate

People were really interested in the machine learning in weather and climate MOOC that we're building at ECMWF.

That is lovely to hear, since it's a ton of work that is going into this project!

ecmwf.int/mlwc-mooc

Subscribe to receive insights from Late to the Party on machine learning, data science, and Python every Friday.

Python's match-case

Apparently, I shared the PEP for the Python match-case statement.

Good to hear!

Throughout Advent of Code, I have been loving the feature more and more!

PEP 0622

Transformers United

Stanford uploaded their transformer course to YouTube.

Here's the playlist:


Idiot proof git aliases

I had a phase when I wanted to improve my git game.

And I did!

This article was really important in this process:

softwaredoug.com/ ...idiot-proof-git-aliases

Elon Code Review

This website was such a lovely troll!

eloncodereview.com

Parsr

Parsr works as a minimal document extraction tool.

github.com/axa-group/Parsr

Layout Parser

My subscribers really liked document extraction tools.

No wonder. So much information is tucked away in PDFs.

github.com/Layout-Parser/layout-parser

Explainpaper

This webapp went viral.

Upload a paper.

Get an explanation.

explainpaper.com

Books about ML Ops

My post about books for MLOps somehow got really popular.

I'm glad it resonates!

dramsch.net/articles/mlops-books/

Quantus

Since shap is in a bit of a state currently, people are looking for other explainability tools.

Quantus is one of these tools.

github.com/understandable-machine-intelligence-lab/Quantus

Dangit Git?!

We all mess up sometimes.

Luckily we have git to revert those errors.

Dangit Git is a short collection of the most common "help, I messed up, fix this please"-commands.

dangitgit.com

Machine Learning Engineering Flashcards

Anyone at NormConf can now cross off another Bingo square.

Regardless, these machine learning engineering flashcards were quite popular.

github.com/b7leung/MLE-Flashcards

Easier matplotlib subplots with mosaic

I was as excited about this one as the subscribers.

An easier way to make complicated subplots in matplotlib?!

Yes, please!

matplotlib.org/ ...mosaic.html
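Here is a minimal sketch of what a mosaic layout can look like, assuming matplotlib 3.3 or newer. The panel names and the plotted numbers are placeholders I made up. The Agg backend is only selected so the snippet also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Each string names a panel; repeating a name makes that panel span cells.
layout = [
    ["wide", "wide"],
    ["left", "right"],
]

fig, axes = plt.subplot_mosaic(layout, figsize=(6, 4))

# axes is a dict keyed by the panel names used in the layout.
axes["wide"].plot([0, 1, 2, 3], [0, 1, 4, 9])
axes["wide"].set_title("wide")
axes["left"].bar(["a", "b"], [3, 5])
axes["left"].set_title("left")
axes["right"].scatter([1, 2, 3], [3, 1, 2])
axes["right"].set_title("right")

fig.tight_layout()
```

Getting a dict of named axes back, instead of indexing into a grid, is what makes the more complicated layouts so much easier to read.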

Conclusion

It has been a pretty fantastic year for open-source and real-world machine learning.

People still want to wrangle information out of their PDFs.

Explainability has been a huge topic.

We all seem to struggle with git sometimes.

And some of my content even made the list, which is lovely!

On to 2023!
