Vesuvius Challenge 2023 Grand Prize awarded: we can read the first scroll

When I first came across this project on HN (early last year), I was taken aback by how impossible the project looked and how smart the people working on it were. Despite seeing a few intelligent names behind the project, I subconsciously believed that this would take at least 5-10 years before a breakthrough.

Today I sit with the same amazement, taken aback again, appreciating how ridiculously awesome this is. Congratulations to the winners and everyone involved!

qingcharles · 2 years ago

So many things that look insane are becoming a reality. You look at those scrolls burnt to a crisp and the idea of reading them is nonsense.

The fact I have a computer writing flowery alt text descriptions of my photos with unnerving accuracy is something I would not have predicted for another 20 years. But, here we are...

Keyframe · 2 years ago

Right? Imagine trying to explain some of it to one of the ancients - so, you have this quartz sand, see?...

lynx23 · 2 years ago

Kudos for caring about alt tags. Blind user here. While we are at it. I was also thinking how I could make good use of the new vision models. And after a while...

https://github.com/mlang/tracktales

The fact I have a computer generate spoken narration for my MPD playlist with descriptions of album art included just blows my mind. 2023 was indeed a fucking milestone.

p-e-w · 2 years ago

It's all about incentives. $1 million is a lot of money. The vast majority of hard problems don't have much brainpower dedicated to them, because the bang/buck ratio doesn't work out. Machine learning, math, and adjacent fields already have many careers that pay very well, so getting top-notch experts to dedicate their attention to what might be a futile endeavor is difficult.

And this isn't only about the monetary value itself, but also the fact that a large cash prize attached to a challenge boosts the prestige of finding a solution. Nobel Prizes come with about a million bucks on top of them, after all.

I'm quite confident that if someone offered $100 million for deciphering the Voynich manuscript or Linear A, we'd have a solution within 3 years.

CityOfThrowaway · 2 years ago

I'm 90% sure the people that did this project did it because they got nerd sniped by it and got to hang out with nat while earning a reasonable salary

BenFranklin100 · 2 years ago

Not to be argumentative, but $1M isn’t very much money, certainly not for a project of this scope. It’s a testament to the creativity, competence, and dedication of those involved they’ve gotten this far with such little funding. Hopefully their early success will attract more resources to this very worthy project.

chx · 2 years ago

You discount passion as a motivator.

But it's not clear what current tech could help with. Machine learning can't be applied to something you don't have training data for.

This is the coolest thing I've read this year. It reads like science fiction. Who would even imagine it's possible to read text from a 2000 year-old rolled up burnt-crisp paper?

dougmwne · 2 years ago

It’s a 270-year archaeological and technological culmination. The scrolls were dug up in 1752. It took the collective developments of the Industrial Revolution, the sciences, and all our engineering and manufacturing prowess to discover, preserve and scan the scrolls. Then the final cherry on top: the current AI revolution, which can create inferences and connections that are beyond the human mind to even understand. And out pops 2000-year-old wisdom of the ancients.

jfengel · 2 years ago

One other thing that was required: the patience not to ruin it.

So much has been lost to well-meaning archaeologists who dug up and threw away things that they didn't think were important. They tried cleaning and preservation techniques on artifacts without testing, sometimes ruining them in the process. They ripped things out of context, and "restored" them based on guesses that were sometimes flagrantly wrong.

Of course they couldn't be expected to know everything that would come in the future, so blame can sometimes perhaps be muted. But it's especially positive that they extricated these particular objects very carefully and just waited for a way to extract information that they could hardly have hoped for.

bee_rider · 2 years ago

It is a bit fitting that it turns out the scroll is about the relationship between enjoyment and abundance.

It looks like, from what we can gather, the author decides that should something be hard to get, that doesn’t lead to greater enjoyment. But, it seems that the archaeologists have found an awful lot of joy in how “rare” access to these scrolls is!

ssnistfajen · 2 years ago

I remember first reading about the Herculaneum Papyri more than a decade ago, and wondering whether they would be read one day. After all, research into virtually unwrapping these scrolls had been ongoing since 2007 (https://en.wikipedia.org/wiki/Herculaneum_papyri#Virtual_unr...), but I certainly did not expect it to happen so soon. Exponential technology acceleration once again proves itself true.

linsomniac · 2 years ago

Speaking of "reading like sci-fi", what's that book where they scan an entire library of books, destructively, by feeding them into a "book chipper"-like device that chops the books up into little pieces, vacuums those pieces up and scans the pieces as they flow through, reconstructing the original text by putting the scanned results together like so many jigsaw puzzles? It was a subplot of the book, but I can't for the life of me remember what book it was.

saled · 2 years ago

FYI it seems ChatGPT could have answered this for you.

> The book you're describing sounds like "Rainbows End" by Vernor Vinge. In this near-future sci-fi novel, set in 2025, one of the subplots involves a project called the "Library Project," where the UCSD (University of California, San Diego) library decides to digitize its entire collection. The process is somewhat as you described: books are destructively scanned by being shredded into tiny pieces, which are then scanned and digitized, with the text being reconstructed from the scans. This process is a part of the broader themes of the book, which include the effects of technology on society and the concept of "wearable computing" and augmented reality. Vernor Vinge, a retired San Diego State University professor of mathematics, computer scientist, and Hugo Award-winning author, is well-known for his works in the science fiction genre, especially for exploring the concept of the technological singularity.

prezjordan · 2 years ago

http://www.technovelgy.com/ct/content.asp?Bnum=1109

Rainbows End by Vernor Vinge (ChatGPT helped with the search)

michaelt · 2 years ago

Vacuuming up an entire library of books, chopping them into little pieces, reconstructing the original text by putting together the little pieces like so many jigsaw puzzles?

That's not a sci-fi novel, that's OpenAI's business model!

yourapostasy · 2 years ago

I’m now wavering a bit on my earlier dismissal of people freezing their bodies for an indeterminate future revival. I could probably get into a science fiction story with this premise:

Instead of relying upon machinery, some zillionaire has their body dry frozen and stashed in a lunar south pole crater, with a foundation funding interstellar propulsion research to move the body to the coldest stable points discovered along the way towards the Boomerang Nebula (1 K) and research to revive back from burnt-crisp state.

The foundation incites all sorts of advancements along the way like working out practical fusion and ever more exotic energy generation, AGI, gravity manipulation, Drexlerian nanotech, Dyson swarm, star wisps, self-modifying bodies and so on, in its quixotic quest to fulfill its mandate.

namaria · 2 years ago

I kinda wanna write a horror story about people freezing their bodies or heads only to be revived in the future bat shit insane from the excruciating experience of existing for several decades in a sort of limbo...

kretaceous · 2 years ago

ImageXav · 2 years ago

I was ridiculously excited when I first read about this in October (if I remember correctly) last year, when a few of the first results were beginning to pop out. I found the methodology fascinating. First of all the digital unwrapping of the scrolls, then the recognition that crackling in the paper was the sign of ink, and finally putting together a model to detect it, piece by piece. I need to look into the final repository to understand what exactly they did, but they seem to have used a TimeSFormer. I'm confused by this choice as I thought it was for video. How did they apply this to images? In the end though, what a wonderful day for archeology. These young minds deserve a huge round of applause for what they have achieved.

marcyb5st · 2 years ago

my understanding is that the scan they did on the scrolls returned the layers themselves. Like so:

```
xxxxxxxxxx <- The surface of the scroll
xxxxxxxxxx
...
xxxxxxxxxx <- The bottom of the scroll
```

So, by tiling the image on the surface you get data that is size_x * size_y * n_layers. So, it can be seen as a video stream with size_x * size_y * 1 channel * n_layers where the layers replace the temporal dimension.
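A minimal sketch of that reinterpretation, assuming the scan is already available as a stack of 2D layers. The names `tile_to_clip` and `shape` and the toy volume are hypothetical illustrations, not taken from the winning repository, and the real preprocessing may well differ:

```python
def tile_to_clip(volume, y0, x0, size):
    """Cut a size x size tile out of every layer and return it as a 'clip'.

    volume is a list of layers, each layer a list of rows (depth, height, width).
    The result is nested lists shaped (n_layers, 1, size, size), i.e.
    (frames, channels, height, width), with depth standing in for time.
    """
    clip = []
    for layer in volume:
        tile = [row[x0:x0 + size] for row in layer[y0:y0 + size]]
        clip.append([tile])  # wrap the tile in a single-channel axis
    return clip


def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Toy volume: 5 layers of 8 x 8 voxels; each value encodes its (layer, y, x) position.
n_layers, height, width = 5, 8, 8
volume = [
    [[z * 100 + y * 10 + x for x in range(width)] for y in range(height)]
    for z in range(n_layers)
]

clip = tile_to_clip(volume, y0=2, x0=3, size=4)
print(shape(clip))  # (5, 1, 4, 4): 5 "frames", 1 channel, 4 x 4 pixels
```

A video model would then see each layer as one frame, so whatever it normally learns about change over time it would here learn about change through depth.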

dan_mctree · 2 years ago

The scans used for the grand prize look like this : https://scrollprize.org/img/grandprize/scroll1.mp4

It's a cut through the scroll, with the time dimension in this video representing the location of the cut along the scroll lengthwise.

As you can see from the mess, it's far from trivial to find the surface of any of the sheets in the scroll; often the layers are blended-together messes.

You may be thinking of the scans they used of an unwrapped sheet, those were as you describe and were used to help figure out methods for the real challenge.

They explain it in the methodology sections. The scans result in a stack of TIFF images that can be rendered as videos of the scan or as 3D models.

peter_retief · 2 years ago

ditto about being excited!

jtchang · 2 years ago

“Any sufficiently advanced technology is indistinguishable from magic.”

Absolutely insane the level of wizardry being applied here to turn a lump of blackened, charred scrolls into readable text.

I have only a cursory knowledge of machine learning: are some of the techniques used in the article only recently discovered, or have they been around for a while?

Is it due to us having reached an inflection point with these types of algorithms that they have become more popular and thus we are seeing new ways to apply them to old problems?

kortex · 2 years ago

There has definitely been a virtuous cycle between GP-GPU processing capability, algorithms, libraries and software that use that hardware, and researchers working with those tools.

echelon · 2 years ago

> Absolutely insane the level of wizardry being applied here to turn a lump of blackened, charred scrolls into readable text.

Imagine what we'll be able to do to brains, dead or alive, in 100 years.

And in 10,000, maybe we'll be reconstructing the light cone. Maybe that's what we are right now. (Not serious, but it's a fun thought experiment.)

xvector · 2 years ago

This is why I am going for cryopreservation if I ever have the luxury of choosing the way I die.

jdminhbg · 2 years ago

Here is the link to their "master plan" to read all of the excavated scrolls: https://scrollprize.org/master_plan

It looks like there are two main bottlenecks to reading more: the need for manual intervention in segmenting the scanned scrolls, and the cost in scanning new scrolls.

ok_dad · 2 years ago

As for scanning: $30mm doesn't seem like a ton of money to scan 800 scrolls with untold history and other works, compared to other uses of that amount of money I could name now. Maybe someone will donate that cost and perhaps all the scrolls can be transported at one time or in a few bigger groups to be closer to the particle accelerator. Another million bucks and I bet you could build a climate-controlled container to take them all at once, or something. If I had $30mm I would definitely donate to this cause, it seems like one of the best uses of that kind of money I can think of. That would bypass the need to research and develop a bench top scanner or another solution. You could even crowdfund this!

As for segmentation: get some sort of collective solution going, like SETI@home did, but for people who are bored as hell, instead of them scrolling Reddit or Twitter all day. Maybe do it like a CAPTCHA so you get it done for free? I'd segment for a few hours a month if I had the ability to do so.

This is a cool project that has taken a community to build to this point, why not try and open and expand the collective of humans working to understand the scrolls? Get millions of people involved and you don't need to rely on technological crutches and development, though that is not the worst way to go either.

BurningFrog · 2 years ago

At $30mm, they'll have billionaire philanthropists lining up around the block to get their name on this!

tysam_and · 2 years ago

Funding is a huge one as well. Funding is the wheel that drives the project (source, have been hanging around the project people for a little while).

If you know anyone that would help chip in for Phase 2 of the project (scaling up), please let Nat know! (Not directly affiliated with the project management team, just pointing to him as a great contact for that... <3 :')))) )

riffraff · 2 years ago

It seems "weird" none of the mega rich has committed a few million dollars for this, it looks like a very good way to build a legacy while benefiting humanity, and e.g. Bezos would probably find a million dollars behind the couch pillows.

kilroy123 · 2 years ago

What a refreshingly clear and thought-out plan. This project honestly gives me a lot of hope.

Animats · 2 years ago

Yes. Now it can be done, but costs too much. Once they get a scanning unit near the scrolls, it will be much cheaper. The data reduction will probably get cheaper, too.

ryneandal · 2 years ago

Herculaneum was one of the highlights of my trip to Italy with the wife. I didn't realize the scope of just how much ash and soil had to be removed for excavation. It was dozens of meters [1]. It's an absolute shame that the site is given a fraction of the attention that Pompeii receives, I thought it was vastly better preserved and truly awe-inspiring [2].

I highly recommend spending a few hours wandering the site, it is an absolute wonder.

1: https://www.icloud.com/photos/#08dJAA5eM9jpbhlEa3fzkl5ng

2: https://www.icloud.com/photos/#076Pof4FziA7WgcI8hZrGZmzg

beautron · 2 years ago

I enjoyed the attention given to Herculaneum in a computer game called Rome: Pathway to Power (released in 1992). You start the game as a slave who has to escape Herculaneum before Vesuvius erupts. I loved the game as a kid. It's sort of like an isometric immersive sim (with a clunky interface). It got me interested in ancient Rome.

I hope to visit Herculaneum some day.

inglor_cz · 2 years ago

The modern Italian town of Ercolano lies just over Herculaneum, so excavations of the rest of the ancient town are a bit tricky. Only about a quarter has been excavated so far, in contrast to Pompeii, which is about two-thirds excavated.

alach11 · 2 years ago

bglazer · 2 years ago

One aspect of archaeology that I really find fascinating is the practice of leaving certain artifacts unexplored. The original discoverers of the scrolls tried to unroll a few, apparently found it was impossible without completely destroying the scroll, and then just left the rest undisturbed. Rather than pushing forward and destroying everything, they left these as a mystery for a future age. Two centuries (!!) later we can finally begin to understand these, with the aid of technology that would be utterly unthinkable to those people who very thoughtfully restrained themselves.

unsupp0rted · 2 years ago

> Rather than pushing forward and destroying everything

In the early days they wouldn't have accomplished anything by pushing forward, so it doesn't take all that much restraint.

I'm more impressed by people in, say, the 1990s or early 2000s. They might've had a shot but there was still too much risk, so they restrained themselves until it was a safer bet.

gwern · 2 years ago

Yeah, I can't give the King's men much credit here. They destroyed a lot of scrolls, and it was only because they weren't getting much of anything that they stopped and abandoned excavations or focused on digging out sculptures they could show off (many now in the Getty Museum - great museum, but I did feel a bit melancholy thinking about the scrolls while I was there in 2019).

klyrs · 2 years ago

On the other hand, we ground up mummies for paint to the point that we ran out and used fresher corpses to meet demand.

It is a bit of a miracle that they were preserved, and not just discarded.

topper-123 · 2 years ago

One aspect of that time period is that they absolutely idolized the Romans. A lot of education at the time consisted of learning Latin, and at the same time people were well aware that only a fraction of the classical texts had been preserved. I find it very believable that they understood the significance of preserving and potentially unlocking these scrolls.

dmurray · 2 years ago

An example of the same thing at a macro level:

https://www.smithsonianmag.com/smart-news/archaeologists-reb...

_a_a_a_ · 2 years ago

Bigger yet by far https://en.wikipedia.org/wiki/Mausoleum_of_the_First_Qin_Emp...

He of the terracotta army. Not excavated yet for fear of damage, but I would so love to know...

tobinfricke · 2 years ago

Similarly, there are large sections of Pompeii that remain unexcavated -- left for the future.

Tronno · 2 years ago

Herculaneum, where these scrolls are from, is 75% unexcavated! And it will likely remain this way for some time, as the modern town of Ercolano, part of greater Naples, sits right on top of it.

sekai · 2 years ago

> We estimate that the scrolls we have in Naples contain more than 16 megabytes of text. Some members of our papyrology team say that revealing this text will be the greatest revolution in the classics since the Renaissance

Amazing achievement, let's hope the Italian government allows for additional excavation of the villa.

They likely would; Pompeii and Herculaneum are _still_ being excavated after two centuries, so it's not as if things are standing still.

But we have only read 5% of this scroll and there are a ton more already excavated, it will probably take years before we manage to process what we already have.

seydor · 2 years ago

> it will probably take years

In the direction things are going ... maybe a few months :)

Digory · 2 years ago

If you can automate the input, you can probably automate much of the basic analysis (things that would be "revolutionary" to undergrads).

"ChatGPT, give me the highlights of these ancient Greek scrolls ..."

wl · 2 years ago

The big problem is that the Villa of the Papyri is underneath modern buildings. That doesn't mean that excavation without demolition is impossible (see the Scavi underneath the Basilica of Saint Peter), but it makes things far more difficult.

BlueTemplar · 2 years ago

If there is a strong prospect of multiplying the total of surviving classical works several times over, I doubt that the money will be particularly hard to find.