I was working on the Advent of Code 2022, day 16 puzzle.

It's one of those "which path has the highest value" problems, where recursion comes into play.

It's one of those "this problem is huge and you want to cache your function" problems.

Usually, one would use lru_cache(maxsize=None) from the functools module for this, or, if you're on a newer version of Python (3.9+), you could use the alias for that call, which is just cache().
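
For illustration, here's a minimal sketch with a toy Fibonacci function (a stand-in, not the puzzle code); both decorators give you the same unbounded cache:

```python
from functools import lru_cache, cache


@lru_cache(maxsize=None)  # unbounded cache, available on older Pythons
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


@cache  # shorthand for lru_cache(maxsize=None), Python 3.9+
def fib_alias(n: int) -> int:
    return n if n < 2 else fib_alias(n - 1) + fib_alias(n - 2)
```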

This cache works fairly simply:

Take all the inputs to a function, use them as keys into a dictionary, and save the output as the value.

That gives you a look-up table for your function calls, which significantly speeds up redundant paths.
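
Roughly, the mechanism looks like this hand-rolled sketch (simplified; the real functools implementation also handles keyword arguments, thread safety, and eviction):

```python
def memoise(func):
    """A simplified stand-in for what lru_cache does internally."""
    table = {}

    def wrapper(*args):
        if args not in table:         # cache miss: compute and remember
            table[args] = func(*args)
        return table[args]            # cache hit: constant-time look-up

    return wrapper
```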

When you traverse a path and save the visited nodes of a graph, you have to build some sort of hashable function input (because the cache needs hashable keys). This can be a tuple or a frozenset, for example.

But frozensets lose a lot of the functionality that I like about sets, like adding and removing items. So I chose to build up a tuple.
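
As a small sketch (the node names here are made up), "adding" to a tuple just means building a new one, which stays hashable and therefore usable as a cache key; a frozenset would force you through a union instead of .add():

```python
visited = ('AA', 'BB')
visited = visited + ('CC',)      # new tuple: ('AA', 'BB', 'CC')

# The frozenset equivalent has no .add() or .remove();
# you build a new set via union instead:
visited_fs = frozenset(('AA', 'BB')) | {'CC'}
```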

Was that the best decision?

Probably not.

But regardless, I learned something that will be useful in the future: what to do when the items in an input variable aren't all unique.

Sometimes sorting values is faster, despite the cost of the sorting algorithm.

Normally, my intuition is that sorting should be avoided, as it's a relatively expensive operation. There's a ton of research into sorting algorithms for a reason.

But when we have the tuples ('A', 'A', 'B') and ('A', 'B', 'A') and the order doesn't matter, they should be treated as "the same" input to a function.

Sorting normalises the input, giving us a cache entry for the first call and a look-up for the second. Otherwise, we would treat them as disparate inputs and re-calculate the entire function.
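
Here's a sketch of the trick, with a hypothetical cached function standing in for the real puzzle logic: sorting the tuple before the call collapses both permutations onto one cache key.

```python
from functools import cache


@cache
def score(items: tuple) -> int:
    # Hypothetical placeholder for an expensive computation.
    return len(items)


score(tuple(sorted(('A', 'A', 'B'))))  # miss: computes and stores
score(tuple(sorted(('A', 'B', 'A'))))  # hit: same normalised key

print(score.cache_info())              # CacheInfo(hits=1, misses=1, ...)
```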

Quite counter-intuitive, but good to think about.