does it deal with cases like "twone", "nineight" and so on? It doesn't appear so to me, because the leading sed statements would commit to an interpretation regardless of what happens next, but perhaps there is something subtle I'm not seeing.
It uses capture groups and puts them back in the replacement (the \1), so a match of "one" is replaced with (the match)1(the match). "eightwo", for example, would have its "two" replaced with "two2two", ending up as "eightwo2two"; the "t" is preserved, so "eight" is still found in a later step, giving "eight8eightwo2two". That can then be solved with the part 1 logic, which only looks for digits.
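A minimal sketch of that sandwich-replacement trick in Python (illustrative only; the thread describes it with sed, and the names here are my own):

```python
import re

WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
         "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def expand(line):
    # Replace every spelled-out digit with word+digit+word (the sed "\1"
    # trick), so the letters stay available for overlapping words.
    for word, digit in WORDS.items():
        line = re.sub(f"({word})", rf"\g<1>{digit}\g<1>", line)
    return line

def calibration(line):
    digits = [c for c in expand(line) if c.isdigit()]
    return int(digits[0] + digits[-1])
```

On "eightwo" this yields "eight8eightwo2two", so the plain digit scan from part 1 gives 82.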
As others have said, part 2 of today's was really difficult. I finally solved it using Python's regex module with `overlapped=True`, but it was very tricky. The irritation of having all of the test cases passing but it failing on my challenge input!
I hope it doesn't scare off newcomers, but I already know a few who have given up on part 2.
> As others have said, part 2 of today's was really difficult.
I suspect it was purpose-built to foil ChatGPT. I solved it on my own first and then went back and tried to help GPT4 write a solution. Even after carefully walking it through a strategy, discussing edge cases, and having it outline a solution in pseudocode, it still failed completely to translate our conversation into working code. (It didn't even work for non-tricky input.) It did anticipate the edge cases though, so that's something.
Did anyone have any better luck with ChatGPT? I wonder if LLM-resistant puzzles and generally greater difficulty (at least for the second star) will be a theme this year.
I got it to work with ChatGPT, took around 5 attempts, but that was mainly me not understanding all the edge cases. Once I came up with a strategy that would work, ChatGPT gave me working code.
My strategy was not efficient, but did work.
I walked the string twice, first LTR and replaced all found strings with numbers, then walked right to left, and replaced all backwards strings to numbers.
Then took the first digit from the left walked string, and the last from the right walked string.
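That two-pass walk might look like this in Python (a sketch of the idea, not the commenter's code):

```python
DIGITS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
          "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def replace_walk(line, words):
    # Walk the string once, replacing each spelled-out word on sight.
    out, i = "", 0
    while i < len(line):
        for word, digit in words.items():
            if line.startswith(word, i):
                out += digit
                i += len(word)
                break
        else:
            out += line[i]
            i += 1
    return out

def calibration(line):
    # The LTR pass gives the correct first digit; the RTL pass (reversed
    # words over the reversed string) gives the correct last digit.
    ltr = replace_walk(line, DIGITS)
    rtl = replace_walk(line[::-1], {w[::-1]: d for w, d in DIGITS.items()})
    first = next(c for c in ltr if c.isdigit())
    last = next(c for c in rtl if c.isdigit())
    return int(first + last)
```

Note that on "eightwo" the LTR pass consumes the "t" and never sees "two", but that only ever costs you digits after the first one, which the RTL pass recovers.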
Another trick is you can put some regex engines in right-to-left mode, so my lazy (and admittedly relatively expensive cpu-wise) solution was to match ltr for first and rtl for last.
People who went with posh tools like regexen were stumped; people with a C64-BASIC-inspired for loop had little difficulty. The problem was optimized for the unga bunga.
Same here. I would've really liked it if the spec had specifically mentioned the possibility of that edge case ahead of time, instead of having to sift through the 1000 lines of input.
No hate on AOC though, I really respect all the hard work that goes into it.
Unless the question has been edited recently, it did. There are multiple lines in the second example input that show the overlap:
> eightwothree
> 4nineeightseven2
> zoneight234
I test my AoC solutions incrementally by printing output, so I found that I was failing to produce the correct list of numbers in a line right away. I suppose if you're taking a faster approach and just trying to extract the first and last numbers that it's easier to miss. It's always a good idea to look at the example input, though.
I used BurntSushi's excellent aho-corasick, which unsurprisingly implements the Aho–Corasick algorithm (overkill, I know). It did take me a while to realize, though, that there are overlaps, which meant my code worked on the example input but not on the real input (fortunately the library has you covered in both cases). I have the code on my GitHub, but solve it yourself first, it's fun.
I first brute-forced the solve (on each line, find and rfind every digit, keep the smallest find and the largest rfind); that worked fine out of the box, as it's not sensitive to overlap.
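A sketch of that find/rfind brute force (names are mine; `str.find`/`str.rfind` look each token up independently, so overlaps don't matter):

```python
# Tokens: the literal digits 1-9 plus their spelled-out names.
TOKENS = {str(d): d for d in range(1, 10)}
for value, word in enumerate(
        ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"], 1):
    TOKENS[word] = value

def calibration(line):
    # Keep the token with the smallest find and the one with the largest
    # rfind; each lookup is independent, so "oneight" yields both 1 and 8.
    first = min((line.find(t), v) for t, v in TOKENS.items() if t in line)[1]
    last = max((line.rfind(t), v) for t, v in TOKENS.items() if t in line)[1]
    return first * 10 + last
```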
I then figured I'd use aho-corasick because that was an opportunity to, and there's no kill like overkill.
I then proceeded to waste half an hour because I didn't read the documentation, so I didn't see that `find_iter` finds non-overlapping occurrences. I assumed it found overlapping occurrences since that's what the algorithm does out of the box.
I also thought to use a Trie for part two, thinking of it similarly to a typeahead problem, but I hadn't heard of Aho-Corasick.
I thought it was over-complicating a day one solution, so I ended up brute-forcing similar to above solutions. Still, it is nice to learn about this algorithm. I may come back to give it a shot and compare runtimes later. Thanks!
I heard the question was difficult before solving it, and then I was surprised when I didn't hit any speed bumps. The overlapping case didn't even occur to me. It turns out I stumbled upon a simple solution: use two regexes. One to match the first digit, and one to match the last. Then the overlapping is a total non-issue.
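Sketched in Python (an assumed rendering of that two-regex idea): the greedy `.*` prefix makes the second pattern grab the last occurrence, so overlap never comes up.

```python
import re

NAMES = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]
ALT = "[1-9]|" + "|".join(NAMES)
FIRST = re.compile(ALT)
# The greedy .* pushes the capture as far right as possible.
LAST = re.compile(r".*(" + ALT + r")")

def value(tok):
    return int(tok) if tok.isdigit() else NAMES.index(tok) + 1

def calibration(line):
    first = FIRST.search(line).group(0)
    last = LAST.match(line).group(1)
    return value(first) * 10 + value(last)
```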
Yeah, that's what I did for part 2 and there were no issues. I did try to solve it with a single regex at first, but for some reason I wasn't able to figure out the right combination of lazy/greedy matches to make it work (so I didn't even get far enough to discover the overlap issue).
I agree, the first day is usually trivial, that wasn't the case here. I also tried using regex but ended up implementing a much simpler solution using index() and rindex().
The inputs are actually not as tough as they could be. It’s possible to beat this puzzle with brittle regex that would fail on something like oneightwone.
Without going into spoilers, it's interesting that people jumped to regex to solve this. For me that was fairly non-intuitive when I first saw the problem (both parts).
What jumped out at me is that the problem statement indicated a finite number of states, and I crafted a solution based on that information. But it's really cool to see how we all jump to different implementations.
While I didn't go with regexes per se, I did go with what I think is a fairly "standard" solution of iterating the string and finding matches, and got caught in the trap that a lot of people are mentioning.
In the end, I figured out a manual way around that, but then afterwards I realised that the iteration isn't really necessary given the problem at hand, and ended up going a completely different route that was a lot faster.
What surprised me is that I'd thought of the alternative at the start while reading the description, and then dismissed it because iteration is typically so convenient for these sorts of things.
I took one look at the overlaps, decided I had things I'd theoretically like to accomplish today that didn't involve tracking overlapping subsets of strings, and hardcoded the nine possible overlaps, i.e. [1]
Not OP, but I too first thought of a state machine. As soon as I started to write it I realized I was over-solving a day-1 problem. So I switched to brute force
Seems to me that people made part 2 harder than it is. Just define an array containing the digits: "one", "two", and so forth. Then check for substring matches, position by position. Maybe not elegant, but effective.
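That position-by-position check is only a few lines; a Python sketch (names are mine):

```python
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]

def digits(line):
    # At each index, emit a real digit or any spelled-out word starting
    # there. Overlaps fall out for free: "oneight" yields 1 (index 0)
    # and 8 (index 2).
    out = []
    for i, ch in enumerate(line):
        if ch.isdigit():
            out.append(int(ch))
        for value, word in enumerate(WORDS, 1):
            if line.startswith(word, i):
                out.append(value)
    return out

def calibration(line):
    d = digits(line)
    return d[0] * 10 + d[-1]
```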
I think the issue is that they tried to separate the input into a list of tokens, like ["5", "nine"], and work from there, which doesn't work on something like "oneight".
I didn't have that issue at all; I just looped through the 20 different tokens and found first and last instances of all of them, and compared the very first and very last of all instances.
I just placed digits in the middle of the words instead of replacing them. This way I didn't "break" any overlapping words, and the order of digits is still the same.
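For example (a guess at the substitution, not the commenter's exact mapping): inserting the digit inside each word keeps the outer letters alive for any overlapping neighbour.

```python
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]

def insert_digits(line):
    # Put the digit in the middle of each word (e.g. "eight" -> "ei8ght")
    # instead of replacing it, so overlapping words like "eightwo" stay
    # intact and the digit order matches the original text.
    for value, word in enumerate(WORDS, 1):
        mid = len(word) // 2
        line = line.replace(word, word[:mid] + str(value) + word[mid:])
    return line

def calibration(line):
    d = [c for c in insert_digits(line) if c.isdigit()]
    return int(d[0] + d[-1])
```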
The main difficulty of part 2 is that there are edge cases not covered by the examples. I've augmented the example list with some edge cases, so use this list instead:
I don't get the amount of effort people put into the replacement strategy; I did perfectly fine without it, and the code is about as complex as the examples I've seen.
Yeah, all the talk of replacement seems like people massively overthinking or over-abstracting a day one problem. My C++ solution was a simple search using the <algorithm> header. It's a little less neatly abstracted than yours, and could be cleaned up a fair bit, as I couldn't be bothered to deduplicate the code after getting it working (I will if this turns out to be useful tomorrow), but the essence is the same.
Ahh, people are trying to do a replacement before finding tokens. I wondered why so many people were saying this was difficult.
My head went straight to token parsing, which, given the limited set of tokens, made it trivial. I thought I was missing something.
Day one part 2 was relatively rough. Things I learned from it: rust regex crate doesn't support look-ahead, rust onig crate is currently broken in many ways and shouldn't be used (the version in crates.io doesn't compile and the version on GitHub is failing tests and look-ahead isn't working). It was a very frustrating time for me. After 2 hours of troubleshooting the above I used the same approach in python and it took 2 minutes to write. So annoying.
> The regex crate doesn't support arbitrary look-around because it isn't known how to implement efficiently.
A bit of a philosophical question:
If an efficient implementation is not yet known to man, i.e. it's not a matter of the library author's time or skill but literally a limit on human knowledge: why not at least provide the functionality with a good-enough implementation (with the caveats mentioned in the documentation)?
IMHO that'd be arguably a good thing for everybody, at a minimum better than just not offering the possibility at all. Which drives users to frustration, or leaves them having to discover a more pragmatic alternative lib that opted to add it.
This is no complaint or feature request... I just want to learn from some insight into the thought process behind "if it's not efficient, better not to have it at all".
PS. Thanks for the link. Now I have a good read for the weekend, for sure!
Ah jeez I totally missed this crate! Thanks!! I had originally gone with Onig because of a SO post you made years ago. Fancy-regex substituted right in and worked immediately. Much appreciation for all you do
It's a Regular Language, in the mathematical sense. All you need is a greedy match-anything sequence; I don't know why so many people jumped to non-regular regex extensions.
Easy mode for this use-case is to match on a reversed version of the regex on a reversed version of the string, but yeah, negative lookahead would be great.
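In Python, that reversed-regex trick might look like this (a sketch; names are mine):

```python
import re

NAMES = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]
FWD = re.compile("[1-9]|" + "|".join(NAMES))
# Reverse every alternative and run it over the reversed string: the
# first match there is the last match in the original.
REV = re.compile("[1-9]|" + "|".join(w[::-1] for w in NAMES))

def value(tok):
    return int(tok) if tok.isdigit() else NAMES.index(tok) + 1

def calibration(line):
    first = FWD.search(line).group(0)
    last = REV.search(line[::-1]).group(0)[::-1]  # un-reverse the token
    return value(first) * 10 + value(last)
```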
Yeah, I mean, I did think of that, but after spending all that time figuring out the regex I decided to commit as fully as I could, until it was 3am and I had to go to bed. There were lots of alternative solutions I thought of but chose to ignore out of stubbornness. I'll admit a good bit of the frustration was my own fault (but also, c'mon, this was way harder than any day 1 we've had before).
Yeah, that's exactly what I was doing. For the left number, do everything as normal while walking through the string. For the reverse part, just walk through the reverse of the string and match the reversed keywords.
Definitely not but it was the most elegant solution that I could think of right away, so I committed to it. In hindsight, it would've been relatively easy to solve with some loops and no dependencies but hey, I wouldn't have learned so much about the rust regex ecosystem that way.
Advent of Code means a lot to me. The problems are fun, sure, but something about it really boosts my winter. I've noticed that I get a lot more productive with my hobbies in the weeks following AoC.
I think maybe it's as simple as living on the west coast and getting used to getting amped and doing some big, adrenaline-filled, social, fun activity at 9 PM every night. It gets me used to being PRODUCTIVE in the evening and not just settling down to watch TV.
Anyway, whatever the cause, I care a lot about AoC. It makes my Decembers a whole lot happier.
For me it's the same, but opposite. It forces me out of bed and ready by the computer at 6 in the morning, haha. Which of course also means that I can end early and suddenly have so much evening time.
Is bash required?
For example, this also works in dash, NetBSD sh, pdksh, tcsh, etc.
[0] https://www.regular-expressions.info/lookaround.html
100% with you on this. Last year was the first year I'd had time to complete it, and I always love the challenge.
https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm
However the 'naive' solution still feels quite nice. Some python, at the risk of sharing spoilers:
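Such a naive pass might look something like this (an illustrative sketch, not the original snippet):

```python
TOKENS = {str(i): i for i in range(1, 10)}
TOKENS.update({w: i for i, w in enumerate(
    ["one", "two", "three", "four", "five",
     "six", "seven", "eight", "nine"], 1)})

def line_value(line):
    # Collect every matching token in left-to-right order, then take the
    # first and last; overlapping words are handled naturally.
    found = [d for i in range(len(line))
             for t, d in TOKENS.items() if line.startswith(t, i)]
    return found[0] * 10 + found[-1]
```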
[1] https://git.sr.ht/~bsprague/advent-of-code/tree/main/item/20...
https://pastebin.com/r1jNCSdm
Once I get a line back from that, it's the same problem as part A.
https://github.com/xdavidliu/advent-of-code/blob/main/2023/d...
Not elegant, but it does the job.
[0] https://hanukkah.bluebird.sh/about/
two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen
eighthree
sevenine
oneight
xtwone3four
three7one7
eightwothree
oooneeone
eight7eight
https://github.com/codr7/swift-interpreter/blob/main/part10/...
https://gist.github.com/joedavis/3d6f2b87bae4809ef8a062caff7...
C++'s .rbegin() / .rend() reverse iterators made the search fairly trivial.
Oh so THAT is what is causing people problems.
fancy-regex is built on top of the regex crate and supports look-around.
The regex crate doesn't support arbitrary look-around because it isn't known how to implement efficiently.
See: https://swtch.com/~rsc/regexp/regexp1.html
> match(es) multiple, possibly overlapping, regexes in a single search.
"necessary" is a strong word. :-)