Today I learned about git flow.
It's a little utility that sits on top of git and makes my development flow much easier.
It will also make you forget git pull --rebase and git merge --squash branch immediately, because you won't need them again... maybe...
Oh and it has a VS Code extension!
Basically, the idea is that you have your main branch with the release, develop for staging, and all other branches carry a prefix that makes clear what they're for, such as feature/, release/, and hotfix/.
Then git flow enables easy rebasing in your feature branches while you're developing. You can do mediocre git commits, as we do, and fix those nasty typos. And when the feature is ready, we can just squash those commits into a single big commit with the history preserved.
If you're developing the feature together, you can publish the feature and collaborate as per usual with git. Git flow just makes sure you get all the latest changes from both develop and the feature branch.
When you're done, git flow helps you merge the feature and tidy up any remnant branches.
Source: Git Kraken
Just incredibly convenient, especially when working together on a software project.
Get it on Github
I got a new toy for my computer.
It's an audio interface that has shiny knobs and buttons.
But I was disappointed right away. If you go into Spotify, there is no way to select an output device. So how do I route my Spotify music and other audio separately?!
I searched for tools and nifty open source thingamajigs. But nothing was doing quite what I wanted.
Until...
I found an obscure Microsoft Windows Help article.
Sound Settings 🠒 Advanced sound options 🠒 App volume and device preference, and then you can select the output device per app.
You might not see all your apps at first; they tend to pop up when you press play or open them up. After that, they remember your selection!
Incredibly useful if you have more advanced audio tools connected to your Windows machine.
Search-engine optimization is a mystery to me.
But if you build an awesome thing and make a Jupyter book, you may as well help people find it, right?!
I was working on ml.recipes, pythondeadlin.es, and data-science-gui.de, and they have a ton of different pages with content. Things I want people to see.
A sitemap is a file that contains a list of all the pages and URLs on a website. This file supposedly helps search engines to crawl and index a website more effectively.
There are two types of sitemaps: XML and HTML. XML sitemaps are machine-readable and intended for search engines, while HTML sitemaps are intended for human visitors to navigate a website.
When a search engine crawls a website, it starts by looking for a sitemap. If the sitemap exists, the search engine will use it to discover all the pages on the website and add them to its index. This ensures that all the pages on the website are indexed and can potentially show up in search results. Pretty useful, right?!
Sitemaps are particularly helpful for websites that have a complex structure, or that contain pages that are difficult to discover through regular crawling, such as pages with limited internal links or pages that are blocked by robots.txt.
Having a sitemap also helps with SEO (search engine optimization) because it provides search engines with a clear view of the website's structure and content. This makes it easier for search engines to understand what a website is about, and can help to improve the website's visibility in search results.
Jupyter books have a ton of different pages.
So it would be great to have them indexed properly by Google and Bing. But Jupyter Books do not natively build a sitemap!
But Jupyter books are based on Sphinx, so we can make use of the Sphinx ecosystem. Namely, extensions!
You can pip install sphinx-sitemap to get the sitemap plugin, or add sphinx-sitemap to your requirements.txt if you use Github actions or other CI to deploy your book. Then, in your _config.yml, you need to find your sphinx entry and modify it to contain your html_baseurl and extra_extensions, basically looking like this if you have no other sphinx entries:
```yaml
sphinx:
  config:
    html_baseurl: 'https://path_to_book/'
  extra_extensions:
    - sphinx_sitemap
```
If you're curious about the discussion of why this isn't in Jupyter book from the beginning, you can check out this Github issue.
Now my pages are a little bit more optimized for the modern web!
Check out an example sitemap here.
My day has been a bit of a rollercoaster.
I was scrolling Mastodon, when suddenly a post appeared that made it clear I need to switch browsers.
I had been using Brave Browser for a while; I wanted to be more security- and privacy-conscious, so I had switched away from pure Google Chrome. That felt good... Until it didn't.
Turns out the CEO of Brave, Brendan Eich, donated to the campaign in California to make same-sex marriage illegal. That was just the last straw for me, but a big straw that had me act immediately. Beyond that, there was some weirdness in Brave: pushing their cryptocurrency BAT on users, prominently adding their in-browser crypto-wallet without asking first, the weird "give artists crypto, but they only get it when they sign up and we won't tell them" scheme, and the whole thing about Chrome not exactly being fast.
Should I give Firefox another try?
I had tried Firefox again a few years back. And I hated the experience.
Nothing was where it was supposed to be. Nothing worked how I expected. It added so much unwanted friction, and there was no benefit to using it.
But it was time to give it another shot.
So I busted out my favourite package managers: brew install firefox and choco install firefox, and I had Firefox on all my devices.
It imported all my bookmarks right away, so it was time to look at extensions.
They're all there!
None of the extensions I used were Chrome-exclusive!
But not just that: Firefox makes it easy to set its privacy protection to "strict", which locks out fingerprinters and other trackers natively. The first thing I noticed compared to my last attempt was that Firefox has finally merged the navigation and search bar like Chrome does (but gives you the choice to separate them).
Now... let's clear one thing up first. Firefox has a couple of settings I did not appreciate.
Those were checked by default.
I don't like facebook, nor do I want my browser to perform studies on me, so I unchecked those. If you want to go the whole nine yards, you can create a privacy config file for free here, but that may be a bit more advanced. In that same tab, however, you can also set the privacy level to strict, which seems like a pretty drastic move, as it might break some pages. But so far all websites have been ok.
We can start out with a classic ad blocker: uBlock Origin
And then we'll get Privacy Badger to cover any trackers that slip through the cracks of everything we set up before.
Then I found all those other extensions that improve online privacy.
A lot of websites (this one included) use so-called "Content Delivery Networks" as a way to get scripts to run in your browser faster. These are also exceptional for tracking you across websites...
So instead LocalCDN intercepts those calls and serves those scripts directly from your browser. Pretty neat honestly.
Then we can also build a lovely cage around facebook, because I'm still not over Cambridge Analytica.
And finally, we can scrub the links we click. Those links often contain some information about where you came from, so ClearURLs takes care of that.
One final really neat privacy tool that is similar to the facebook container extension:
Firefox Multi-account Containers
Here the idea is that you can basically build groups of "online personas" that are presented to the web-trackers that might make it past your settings.
Here's a basic approach I took:
I have a container for social media profiles to group those together. My banking is in a specific container. My shopping is in a container. And something of a work profile. Basically considering which specific trackers shouldn't interact with the other trackers to build a full profile of my online activities.
Isn't that neat?!
You can even force some containers to only open websites you added previously to make sure there are no accidental spill-overs by absent-mindedly browsing amazon.
And there are also fun bits Firefox has that Chrome didn't. Every list I looked at pointed out Dark Reader, which forces any page you want into dark mode.
Let me be honest here. This one almost made me stop using Firefox again.
If you're used to adding search engines to Chrome, you are used to simply adding a URL with %s as the query string. This is actually also how Firefox on Android does it! Firefox on Desktop doesn't, and I had to Google how to change the default search engine to something other than Google or Duck Duck Go.
Googling... gave a ton of wrong results!
But I figured it out by trying out the solutions suggested, and here is how you can add a custom search engine.
Let's say you want to add the privacy-aware meta-search engine searx, like searx.work. Follow these steps:
1. On searx.work, use the 🔎 "Add searx.work" option in the address bar.
2. Open about:preferences#search in the settings, so Settings 🠒 Search 🠒 Default Search Engine.
3. Select searx.work as your default search engine.
It's a bit counter-intuitive, but it's possible!
Try it out!
It was a bit of work getting all the fun extensions I use set up again. Luckily they mostly have some kind of data export and import.
It feels like I could set up Firefox to be extremely customized and privacy-focused (with a bit of coercing sometimes).
I can't wait to slowly make this the mess I like to live in!
I was trying to solve Day 16 of the Advent of Code when I realised that sorting values can speed up your code.
But that turned out to be the wrong choice of data structure for this problem.
Normally, I would use the set() data structure, but I was caching the inputs and needed an immutable data structure. I knew frozenset() was a thing, but when .add() didn't work, I decided to work with tuples instead.
But the values were supposed to be unique, and the sort order didn't matter.
In fact, the sort order got in the way!
While frozensets are a bit cumbersome to "add" values to, which I simply did with frozenset(tuple(old_frozen_set) + (new_val,)), we get an interesting property: the frozenset of (0, 1) is equal to the frozenset of (1, 0).
So it's immutable and a bit cumbersome to work with...
But it is extremely useful for checking membership and equality of unique items.
Why?
Sets are implemented as hash tables, so checking for membership is an O(1) operation.
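Here's a minimal sketch of how that plays out (the union operator is just a tidier alternative to the tuple-concatenation workaround above):

```python
old = frozenset((0, 1))

# "Adding" a value means building a new frozenset:
new = frozenset(tuple(old) + (2,))  # the workaround from above
new = old | {2}                     # a set union does the same thing more directly

# Order doesn't matter, so these hash and compare as equal:
assert frozenset((0, 1)) == frozenset((1, 0))

# Membership checks are O(1), just like with a regular set:
assert 2 in new
```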
How neat!
I was working on the Advent of Code puzzle 2022 day 16.
It's one of those "which one is the highest value path" problems, where recursion comes into play.
It's one of those "this problem is huge and you want to cache your function" problems.
Usually, one would use lru_cache(maxsize=None) from the functools library for this, or if you're on a newer version of Python, you can use the alias for that call, which is just cache().
This cache works fairly simply: take all the inputs to a function, use them as keys into a dictionary, and save the output as the value.
Then you can simply build a look-up table for your function call, which significantly speeds up redundant paths.
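As a minimal sketch of that idea (slow_square is just a made-up stand-in for an expensive function, not the actual puzzle code):

```python
from functools import cache  # Python 3.9+; use functools.lru_cache(maxsize=None) otherwise

CALLS = 0

@cache
def slow_square(n):
    """Stand-in for an expensive function: each distinct input is computed only once."""
    global CALLS
    CALLS += 1
    return n * n

slow_square(4)  # computed, CALLS == 1
slow_square(4)  # answered from the cache, CALLS is still 1
```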
When you traverse a path and save the visited nodes on a graph, you have to build some sort of hashable function input (because of the caching). This can be a tuple or a frozenset, for example. But frozensets lose a lot of the functionality that I like about sets, like adding and removing items. So I chose to build up a tuple.
Was that the best decision?
But I learned something that is useful for the future regardless, for when we don't just have unique items in an input variable.
Sometimes sorting values is faster, despite the cost of a sorting algorithm.
Normally, my intuition is that sorting should be avoided, as it's a relatively expensive operation. There's a ton of research into sorting algorithms for a reason.
But when we have a tuple of ('A', 'A', 'B') and a tuple of ('A', 'B', 'A') and the order doesn't matter, these should be treated as "the same" input to a function. Sorting normalises the input, therefore giving us a cache entry for the first call and a lookup for the second. Otherwise, we would treat them as disparate inputs and re-calculate the entire function.
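A small sketch of the effect (score is a hypothetical stand-in for the expensive cached function):

```python
from functools import cache

@cache
def score(items: tuple) -> int:
    """Pretend this is expensive; the result doesn't depend on the order of items."""
    return sum(ord(c) for c in items)

# Without normalisation these create two separate cache entries:
score(('A', 'A', 'B'))
score(('A', 'B', 'A'))

# Sorting before the call normalises the input,
# so the second call is just a cache lookup of the first:
score(tuple(sorted(('A', 'A', 'B')))),
score(tuple(sorted(('A', 'B', 'A'))))
```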
Quite counter-intuitive, but good to think about.
Most websites use Gzip to compress the data to send it to you.
The idea is that internet speeds are still much slower than compute, so spending some compute to decompress a file that was sent to you reduces load times overall, compared to an uncompressed site.
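To get a feel for the trade-off, here's a tiny sketch in Python (the HTML string is just a made-up, repetitive stand-in for a real page):

```python
import gzip

# Repetitive markup, like a list of blog posts, compresses very well.
html = ("<li class='post'>Today I learned something new!</li>\n" * 500).encode()

compressed = gzip.compress(html)
print(f"{len(html)} bytes uncompressed, {len(compressed)} bytes gzipped")
```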
So naturally I enabled it on this website.
This website is generated with a static website generator called Nikola.
It's written in Python, so it's easy for me to add changes, compared to Hugo, Jekyll, or Atom.
Why static sites?
I use Notion to plan a lot of high- and low-level goals in my life.
One example is that I sit down every Saturday morning to review my last week and plan the next.
I try to journal and sometimes I even accomplish a nice streak of journaling, which is always great for my mental health.
Now I learned that you can add a special version of @Today to templates. This version will change to the "today" of the moment the template is duplicated!
I definitely want more of those dynamic items in Notion, but I'm just happy this exists!
It's Advent of Code! 🎄
Every year I add to my utils.py and try to make the coding experience a bit nicer. This year, I discovered that once again I didn't update the Readme after last year. I also didn't finish last year; must've been a stressful time...
On this website and in places like the profile Readme on Github, I figured out that you can use HTML comments to insert text at certain locations, basically like a placeholder. So I could do this for the Advent of Code readme, to have a place for the stars to go once my pytest test cases pass, right?
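Something like this rough sketch could do it (the marker names and the helper are just illustrative, not taken from my actual repo):

```python
import re
from pathlib import Path

# Illustrative marker names; the readme just needs a matching pair of HTML comments.
START, END = "<!-- stars:start -->", "<!-- stars:end -->"

def update_readme(stars: str, path: str = "README.md") -> None:
    """Replace everything between the two HTML comments with the new star count."""
    readme = Path(path).read_text()
    pattern = re.compile(f"{re.escape(START)}.*?{re.escape(END)}", re.DOTALL)
    Path(path).write_text(pattern.sub(f"{START}\n{stars}\n{END}", readme))

update_readme("⭐ 42/50 stars")
```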
I was reading through the Mastodon API documentation to figure out how to get my posts.
On this website, I have a small widget that shows my last "Tweets", but unfortunately I locked myself out of my Twitter account after accidentally setting my age to under 13 while deleting personal information. That means my Twitter timeline isn't available anymore, so I made the jump and figured out how to use the Mastodon API to pull a user's timeline.
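The gist of it is pleasantly simple; here's a rough sketch with the requests library (the instance URL and username are placeholders, not my actual setup):

```python
import requests

INSTANCE = "https://mastodon.social"  # placeholder instance
USERNAME = "some_user"                # placeholder account name

# Resolve the username to an account id, then fetch that account's public posts.
account = requests.get(
    f"{INSTANCE}/api/v1/accounts/lookup", params={"acct": USERNAME}, timeout=10
).json()

statuses = requests.get(
    f"{INSTANCE}/api/v1/accounts/{account['id']}/statuses",
    params={"limit": 5, "exclude_replies": "true"},
    timeout=10,
).json()

for status in statuses:
    print(status["created_at"], status["url"])
```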
For Twitter this is an entire thing:
Gotta love a bit of self-referential writing.
I read about TIL (Today I Learned) posts by Simon Willison, as an easy way to document small snippets of learning and get in the habit of writing and publishing.
Then I saw multiple people do this inspired by J Branchaud: