I profiled with `cargo flamegraph` and learned that the HashMap
operations were taking forever. So, I replaced the HashMap with a
straight Vec*, which bought a 10x improvement. Further profiling showed
time being wasted re-finding the guard's starting position in every single
blocked walk, so I added a way to pass the position to the
constructor (since it never changes). This brought the total speedup to
15x, from ~1.5s on the release build to ~0.1s.
*Memory usage (checked with `valgrind`, `valgrind --tool=massif`, and
`valgrind --tool=dhat`) seems comparable between the two data structures. I guess
if the walks were significantly sparser on the map the HashMap might
win, but apparently they aren't, so we end up using basically as much
memory in the HashMap anyway.
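
For illustration, here is a minimal sketch of both changes, not the actual code from this project. It assumes the map is a rectangular grid of bytes where `^` marks the guard (facing up) and `#` marks an obstacle, and that the guard turns right when blocked; the names `find_start` and `walk` are hypothetical. The visited set is a flat `Vec<u8>` indexed by `row * width + col`, and the start position is found once and passed in rather than being searched for on every walk.

```rust
/// Scans the grid once for the guard's starting cell.
/// Assumption: the guard is marked with b'^'.
pub fn find_start(grid: &[Vec<u8>]) -> Option<(usize, usize)> {
    grid.iter().enumerate().find_map(|(r, row)| {
        row.iter().position(|&cell| cell == b'^').map(|c| (r, c))
    })
}

/// Walks from `start` (assumed to lie inside a rectangular grid) until the
/// guard leaves the map. Returns Some(number of distinct cells visited),
/// or None if the guard gets stuck in a loop.
pub fn walk(grid: &[Vec<u8>], start: (usize, usize)) -> Option<usize> {
    // Up, right, down, left: turning right is (d + 1) % 4.
    const DIRS: [(isize, isize); 4] = [(-1, 0), (0, 1), (1, 0), (0, -1)];

    let height = grid.len();
    let width = grid.first().map_or(0, |row| row.len());

    // Flat Vec instead of a HashMap: one byte per cell, with one bit per
    // direction the guard has faced while standing on that cell.
    let mut seen = vec![0u8; width * height];

    let (mut r, mut c) = start;
    let mut d = 0usize;
    loop {
        let slot = &mut seen[r * width + c];
        let bit = 1u8 << d;
        if *slot & bit != 0 {
            return None; // same cell, same direction: we are looping
        }
        *slot |= bit;

        let nr = r as isize + DIRS[d].0;
        let nc = c as isize + DIRS[d].1;
        if nr < 0 || nc < 0 || nr as usize >= height || nc as usize >= width {
            // Stepped off the map: count the cells that were ever touched.
            return Some(seen.iter().filter(|&&mask| mask != 0).count());
        }

        let (nr, nc) = (nr as usize, nc as usize);
        if grid[nr][nc] == b'#' {
            d = (d + 1) % 4; // blocked: turn right, stay in place
        } else {
            r = nr;
            c = nc;
        }
    }
}
```

The idea is to call `find_start` once up front and then reuse the result for every `walk` call on a modified grid, since adding an obstacle elsewhere does not move the guard.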