Data Classes? In this economy?!
What is Advent of Code?
In this little series, I want to write about the thought processes I have behind solving the puzzles presented in Advent of Code.
During the Advent of Code small coding puzzles are presented with varying levels of difficulty. They're always Christmas-themed text puzzles where you help elves out of some mess to save the holidays. The puzzles are deliberately ambiguous to not bias puzzlers into a certain solution, they always have an example input in the text to explain some details and an input file for each day that comes in the form of a text file you can download. The solution is a simple number or a string you enter, which is checked by the website automatically. After you finish part 1, part 2 unlocks, which is a variation on part 2. This can be a little twist on part 1 or a large twist that changes the problem formulation into a problem that would take brute force solutions to compute longer than the heat death of the universe would take.
Find my solutions here: github.com/JesperDramsch/advent-of-code
These descriptions won't be a perfect narrative, but rather how I think about these problems and where I see different solutions. I apologise for the possibility of rambling!
The example input
Checking that example input, it got awfully familiar. That's a Linux terminal!
$ cd / $ ls dir a 14848514 b.txt 8504156 c.dat dir d $ cd a $ ls dir e 29116 f 2557 g 62596 h.lst $ cd e $ ls 584 i $ cd .. $ cd .. $ cd d $ ls 4060174 j 8033020 d.log 5626152 d.ext 7214296 k
I guess we're building a bot or some virtual environment.
The implementation
I am wondering if I should build some custom classes for this one. Might be interesting to work with!
I decided to built a class for Directory
and File
so I can build up that structure in a small-scale way.
I first used lists to get the children and files, but searching those lists was weird, I couldn't find children by name, so I decided to use dictionaries instead. That means I have to iterate ".values()" but it solves all other problems and provides O(1)
constant complexity access.
The I can simply recursively traverse the directory tree:
def find_small_dirs(tree, size): small_dirs = [] for child in tree.children.values(): if child.size() <= size: small_dirs.append(child.size()) small_dirs += find_small_dirs(child, size) return small_dirs
And return the sum()
!
Part 2
First I figured, I should do this with proper comparisons, but it turns out I just need to turn around the comparison and then take the minimum min()
.
def find_dirs(tree, size, over=False): small_dirs = [] comp = ge if over else le for child in tree.children.values(): if comp(child.size(), size): small_dirs.append(child.size()) small_dirs += find_dirs(child, size, over) return small_dirs
I import great equals ge
and and lesser equal le
from the operator
module to make the comparison nice and short.
It was nice using data classes for this task and playing around with the operator module.
Refactor
I looked at the Reddit solution Megathread and realised I should have used match
for the command matching in the parse, so I adjusted it to this pretty thing:
def parse_directory(data): tree = Directory("/") dir = tree for line in data: match line.split(" "): case "$", "cd", "..": dir = dir.parent case "$", "cd", "/": dir = tree case "$", "cd", target_dir: dir = dir.children[target_dir] case "$", "ls": pass case "dir", name: dir.add_child(name) case start, name: dir.add_file(name, int(start)) return tree
Much nicer than what I did with if statements and splitting.
def parse_directory(data): tree = Directory("/") dir = tree for line in data: if line.startswith("$"): # Command if "cd" in line: # Change directory target_dir = line.split(" ")[2].strip() if ".." == target_dir: dir = dir.parent elif "/" == target_dir: dir = tree else: dir = dir.children[target_dir] elif "ls" in line: # List directory pass else: start, name = line.split(" ") # File or directory if line.startswith("dir"): # directory dir.add_child(name) else: # File dir.add_file(name, int(start)) return tree
Error Log
I coded a lot, before testing piece-wise so I generated a few errors along the way:
elif "ls" in line: SyntaxError: invalid syntax
forgot a closing bracketdir = dir.children[dir.children.index(line.split(" ")[1])] ValueError: 'cd' is not in list
forgot that commands have 2 spacesValueError: '/' is not in list
follow up to error before where I don't return to root properly, I'll split the command as a resultValueError: 'cwdpn' is not in list
– no idea what that even means, I made a logic error somewhere. Gotta go debug!dir.add_file(line.split(" ")) TypeError: Directory.add_file() missing 1 required positional argument: 'name'
gotta broadcast the tuple!start, name = line.split(" ")[1] ValueError: too many values to unpack (expected 2)
- oops, I should maybe be a bit more carefulAttributeError: 'str' object has no attribute 'size'
changed to a dictionary representation and didn't adjust the size calculationTypeError: unsupported operand type(s) for +: 'int' and 'str'
didn't convert the sizes to an integer in the File classValueError: invalid literal for int() with base 10: 'grgj'
another very nice mistake parsing. Gotta take it slower. The sizes are first in the files.AttributeError: 'NoneType' object has no attribute 'print'
forgot to return the tree from theparse_directory()
function- Used
if
instead ofelif
, which caused an unexpected output on the$ cd ..
command - Part 2:
wrong answer: 1186199
too high. So I made some logic error.