Jesper Dramsch , , read in 5 minutes

Data Classes? In this economy?!

What is Advent of Code?

In this little series, I want to write about the thought processes I have behind solving the puzzles presented in Advent of Code.

During the Advent of Code small coding puzzles are presented with varying levels of difficulty. They're always Christmas-themed text puzzles where you help elves out of some mess to save the holidays. The puzzles are deliberately ambiguous to not bias puzzlers into a certain solution, they always have an example input in the text to explain some details and an input file for each day that comes in the form of a text file you can download. The solution is a simple number or a string you enter, which is checked by the website automatically. After you finish part 1, part 2 unlocks, which is a variation on part 2. This can be a little twist on part 1 or a large twist that changes the problem formulation into a problem that would take brute force solutions to compute longer than the heat death of the universe would take.

Find my solutions here: github.com/JesperDramsch/advent-of-code

These descriptions won't be a perfect narrative, but rather how I think about these problems and where I see different solutions. I apologise for the possibility of rambling!

The example input

Checking that example input, it got awfully familiar. That's a Linux terminal!

$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k

I guess we're building a bot or some virtual environment.

The implementation

I am wondering if I should build some custom classes for this one. Might be interesting to work with!

I decided to built a class for Directory and File so I can build up that structure in a small-scale way.

I first used lists to get the children and files, but searching those lists was weird, I couldn't find children by name, so I decided to use dictionaries instead. That means I have to iterate ".values()" but it solves all other problems and provides O(1) constant complexity access.

The I can simply recursively traverse the directory tree:

def find_small_dirs(tree, size):
    small_dirs = []
    for child in tree.children.values():
        if child.size() <= size:
            small_dirs.append(child.size())
        small_dirs += find_small_dirs(child, size)
    return small_dirs

And return the sum()!

Part 2

First I figured, I should do this with proper comparisons, but it turns out I just need to turn around the comparison and then take the minimum min().

def find_dirs(tree, size, over=False):
    small_dirs = []
    comp = ge if over else le
    for child in tree.children.values():
        if comp(child.size(), size):
            small_dirs.append(child.size())
        small_dirs += find_dirs(child, size, over)
    return small_dirs

I import great equals ge and and lesser equal le from the operator module to make the comparison nice and short.

It was nice using data classes for this task and playing around with the operator module.

Refactor

I looked at the Reddit solution Megathread and realised I should have used match for the command matching in the parse, so I adjusted it to this pretty thing:

def parse_directory(data):
    tree = Directory("/")
    dir = tree
    for line in data:
        match line.split(" "):
            case "$", "cd", "..":
                dir = dir.parent
            case "$", "cd", "/":
                dir = tree
            case "$", "cd", target_dir:
                dir = dir.children[target_dir]
            case "$", "ls":
                pass
            case "dir", name:
                dir.add_child(name)
            case start, name:
                dir.add_file(name, int(start))
    return tree

Much nicer than what I did with if statements and splitting.

def parse_directory(data):
    tree = Directory("/")
    dir = tree
    for line in data:
        if line.startswith("$"):
            # Command
            if "cd" in line:
                # Change directory
                target_dir = line.split(" ")[2].strip()
                if ".." == target_dir:
                    dir = dir.parent
                elif "/" == target_dir:
                    dir = tree
                else:
                    dir = dir.children[target_dir]
            elif "ls" in line:
                # List directory
                pass
        else:
            start, name = line.split(" ")
            # File or directory
            if line.startswith("dir"):
                # directory
                dir.add_child(name)
            else:
                # File
                dir.add_file(name, int(start))
    return tree

Error Log

I coded a lot, before testing piece-wise so I generated a few errors along the way:

  • elif "ls" in line: SyntaxError: invalid syntax forgot a closing bracket
  • dir = dir.children[dir.children.index(line.split(" ")[1])] ValueError: 'cd' is not in list forgot that commands have 2 spaces
  • ValueError: '/' is not in list follow up to error before where I don't return to root properly, I'll split the command as a result
  • ValueError: 'cwdpn' is not in list – no idea what that even means, I made a logic error somewhere. Gotta go debug!
  • dir.add_file(line.split(" ")) TypeError: Directory.add_file() missing 1 required positional argument: 'name' gotta broadcast the tuple!
  • start, name = line.split(" ")[1] ValueError: too many values to unpack (expected 2) - oops, I should maybe be a bit more careful
  • AttributeError: 'str' object has no attribute 'size' changed to a dictionary representation and didn't adjust the size calculation
  • TypeError: unsupported operand type(s) for +: 'int' and 'str' didn't convert the sizes to an integer in the File class
  • ValueError: invalid literal for int() with base 10: 'grgj' another very nice mistake parsing. Gotta take it slower. The sizes are first in the files.
  • AttributeError: 'NoneType' object has no attribute 'print' forgot to return the tree from the parse_directory() function
  • Used if instead of elif, which caused an unexpected output on the $ cd .. command
  • Part 2: wrong answer: 1186199 too high. So I made some logic error.