This is an elaboration of my submission for the letter detection prize as part of the Vesuvius Challenge in June. It gets technical and esoteric but the first few paragraphs are of general interest.
In AD 79, Mt Vesuvius, a volcano in Italy, erupted, burying several nearby Roman towns, including Pompeii and Herculaneum. Forgotten until their rediscovery in the 1700s, these extremely well-preserved ruins have taught us much about the ancient Roman world that did not survive transmission in ancient texts, either because contemporary authors did not write on the subject or because their works were part of the ~99% of ancient writing that did not come down to us.
Near Herculaneum was a luxurious villa on the shore of the Bay of Naples that once belonged to Calpurnius Piso, the father-in-law of Julius Caesar. The Getty Villa is a modern replica of this impressive structure.
Beginning in the 1700s, after its accidental discovery beneath 6 meters of ancient volcanic mud flow, numerous sculptures were removed through an extensive network of tunnels – the structure itself remains mostly unexcavated.
In addition to bronze and marble, early explorers also found numerous lumps of charcoal which were eventually understood to be pyrolyzed scrolls, the only ancient library to survive to the present day. Complicating study, the scrolls are incredibly fragile and, now blackened by ancient heat, extremely difficult to read. For two centuries, classicists have mostly resisted the temptation to irreversibly crumble the scrolls through manual unrolling, awaiting a future age where unimaginable technology may enable non-destructive analysis.
That age is now.
What might be in the collection? People have speculated forever! Richard Carrier has one list. Maybe we will find texts saved from the destruction of the library of Alexandria? Maybe additional excavations will unearth thousands of additional scrolls?
The Vesuvius Challenge is a collaboration between private benefactors and academics to bring the latest technology to bear beneath a banner prize of $1,000,000 for successful decoding of the data.
Two scrolls were scanned using X-ray CT at a resolution of 8 µm! Each file is roughly 5 TB of data, while each scroll might contain just 10,000 words. We have a compression problem here.
It is non-trivial to read the data.
- The scrolls are in bad shape. They were probably, on average, really old and damaged even before they were buried in mud, cooked at 600C, then left for 2000 years.
- The ink was optically black on optically pale papyrus, but both are equivalently transparent to X-rays.
- Tomographic reconstruction is rife with spurious artifacts.
- Clean separation of foliations in software is difficult where layers are very close.
Nevertheless, progress has occurred on both segmentation – the extrication of sections of scroll and flattening in software – and ink detection.
The rest of this post will include italicized quotes from the prize criteria, followed by my explanations. As a disclaimer, the competition organizers have asked me to emphasize that they do not (yet) endorse my method or putative discoveries.
An image, showing text on a sheet of papyrus, clearly showing at least 10 legible letters in a continuous area from Scroll 1 or 2, no larger than 4 cm². Words and lines have to be clearly visible.
A submission should include the following:
Image. Submissions must be in the form of an image of the virtually unrolled papyrus, showing visible and legible text. Specify which scroll the image came from and where in the scroll it was found.
The text was found via persistent direct visual inspection of a large segment (created by fellow competitor “Hari Seldon”) which is part of Scroll 1. I did not create the segment, but I have directly observed similar ink characters in less-beautifully unrolled segments I have generated using my own code and algorithms.
Images here are cropped from full resolution images available in this album: https://photos.app.goo.gl/61cWRicfY8Qj92cEA
The orientation should be clear from these images, but it is towards the “bottom” in the default orientation. The text is visible between layers 30 and 45 of the monster segment, which has 65 layers in total. Layer 33 or 34 is the single layer with the most text visible, corresponding to the inwards-facing surface of the segment’s papyrus.
Above, image generated from X-ray tomographic data that has been digitally flattened. This segment is roughly 3cm x 12cm.
Above, hand annotated identification of characters from visible ink fragments. Not every identification is certain.
Above, photo of whiteboard hand annotation enabling adaptation of filters and changing of layers.
Dimensions. In your images, please include a scale showing the size of 1cm. Please also include pixel and real dimensions (width and height in both pixels and millimeters) for all letters that you have found (either within the image or in a separate document).
This image is 3001 x 1701 pixels. A lossless png can be found here: https://photos.app.goo.gl/SM1ukfRe6FGPx8g2A
The pixel size is 0.008 mm (8 microns), so the total field of view is 24.008 x 13.608 mm.
This image gives a 1 cm scale bar.
The characters are typically 3-4 mm in height, and 2-5 mm in width.
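The arithmetic behind these dimensions can be sketched in a few lines. This is a minimal illustration, assuming the 8 micron scan resolution quoted earlier and assuming that adjacent layers of the surface volume are spaced by the same 8 microns as the pixels; the function names are my own and not part of any competition tooling.

```python
# Assumption: 8 micron (0.008 mm) voxels, as quoted for the scroll scans.
PIXEL_SIZE_MM = 0.008


def pixels_to_mm(n_pixels, pixel_size_mm=PIXEL_SIZE_MM):
    """Convert a length in pixels to millimeters."""
    return n_pixels * pixel_size_mm


def layer_offset_um(n_layers, pixel_size_mm=PIXEL_SIZE_MM):
    """Depth difference in microns between two layers n_layers apart.

    Assumes the layer spacing equals the pixel size.
    """
    return n_layers * pixel_size_mm * 1000.0


# Field of view of the 3001 x 1701 pixel image.
width_mm = pixels_to_mm(3001)
height_mm = pixels_to_mm(1701)
print(f"{width_mm:.3f} x {height_mm:.3f} mm")

# Depth difference between two layers that are 4 slices apart.
print(f"{layer_offset_um(4):.0f} microns")
```

The same helpers apply to the crops discussed later, for example a 750 pixel wide crop is 6 mm across.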
3D orientation. Please include a 3D image that makes clear where your area is located in the scroll, how it is oriented, and what orientation the letters are facing. Please make it clear if you found the letters facing the inside or the outside of the scroll, and which side is “up” in the writing. This doesn’t have to be an elaborate rendering, even a crude hand-drawn picture would suffice, as long as you communicate the position and orientation.
Answered already. The letters are on the upper surface of the segment, which is the inwards-facing part of the scroll surface. They are visible within 24 microns of the surface, right where “bright” papyrus fades to “black” empty space. In some places, 3D texture of the ink fragments is clearly visible, like cracked and peeling mud.
Include a mesh with texture, if you have it.
Rows of text. Papyrologists tell us that in all 800 scrolls which have been partially or fully read so far, text appears in regular rows and columns. You can also see this in the fragments and on Chartes. Please include where you are seeing rows, e.g. by annotating your picture with horizontal lines.
This hand-annotated image of layer 34 of the monster segment shows where characters are visible. This shows two rows of text, oriented parallel to the edge of the scroll (rolling axis is up-down in the view), aligned with papyrus fibers.
Images should still include fibers. You can make a binary ink image but you should then overlay it over an image where the fibers are visible. We expect to see letters parallel to the papyrus fibers (as that is how it appears in >99% of all samples read to date).
Public fragments with known ground truth. Please run your methods on the three public fragments and include your results.
I have never directly studied the fragments. The ongoing apparent failure of deep-learning-based ink detection trained on the fragments indicated to me that direct inspection of the actual scroll data would be more fruitful, as it has been here. There is more than enough “data” in the given scroll files to extract signal.
I have 8 years of experience working with various kinds of deep learning and image processing, including on astrophysical data and on Mars mapping projects at NASA JPL, so I have a good idea of its strengths and limitations. With rare exceptions, functional feature detection filters using ML are bootstrapped off human detection. There is a good reason for that – our own visual cortex is extremely adept at identifying subtle patterns and textures, especially with training, especially on X-ray data.
I have confidence that the ink I’ve detected here is obvious enough that it will be relatively easy to write classical (non ML) detection filters for it, and have made some progress in that direction, but this approach is much more time consuming and much noisier than just using my eyes!
Fellow contestant Luke Farritor has made some progress with ink detection AI fine-tuned on the area where I identified characters.
If your method involves training on the public fragments, please include k-fold results (where you exclude inference areas from training, and repeat that until you have run inference for all segments). Or at least 1-3 such examples, if your training is particularly slow or expensive.
Note that the segments are 4µm per pixel instead of 8µm, so you might need to downsample to the same resolution.
It does not involve ML or fragments.
The surface volumes can be found e.g. here for Fragment 1, and similarly for the other two.
Reproducible instructions. Please include clear instructions on how to run your method on an area of a scroll, ideally on a surface volume of a segment (such as the aforementioned surface volumes of fragments or a scroll surface volume such as this one).
Ah, the interesting part.
Let’s quickly develop an ontology of texture features.
Feature: Horizontal and vertical striations
Interpretation: Papyrus fibers
Feature: Dark areas
Interpretation: Low X-ray cross section – empty space
Feature: Bright speckles
Interpretation: Mineral grit, usually part of papyrus cellular matrix
Feature: Bright flakes. Appears to be attached to surface of papyrus. May have odd shapes, up to 3-4 mm in size. Typically isolated.
Interpretation: Could be exceptionally thick ink, squashed insect, or special characters.
Feature: Triangular “shark bite” texture.
Interpretation: Meshing artefacts from segment unrolling algorithm
Feature: Bright thin curved “hair” lines.
Interpretation: Unknown – possibly evaporative residue from water with minerals, such as sea spray or sweat.
Feature: Vertical linear bright/dark features.
Interpretation: Creases/papyrus delamination/damage from scroll rolling that aren’t fully flattened by unrolling algorithm.
Feature: Cracks – typically parallel to the grain of the papyrus, extended (3 cm and up).
Interpretation: The papyrus has cracks.
Feature: “Cracked mud” or “crackle” texture with lighter convex patches 0.1-0.5 mm in size separated by narrow, dark, high contrast channels typically at angles of 60-90 degrees from each other, and otherwise aligned at random with respect to the underlying papyrus fibers.
The cracked mud textured areas have discrete (sharp) boundaries, they do not “fade” away, but end abruptly. The textured areas form linear features that are often straight for 2-4 mm and 0.5-1 mm wide.
The cracked mud texture often appears to sit “proud” or slightly above the background papyrus, as though it is adhered to the surface.
Interpretation: Ink residue.
Once the reader develops an intuition for the cracked mud texture, it becomes much more salient and can be readily picked out against the background noise, even if the ink in that area was extremely thin and thus almost invisible. See, e.g., this partial character from another part of the scroll (not the large segment):
Time to read an actual character.
We’ll start with the Pi character, because it’s more directly visible within a single layer – other characters are on bumps in the papyrus and only completely visible while sliding through different layers.
First, we are looking for characters that look like this (if Greek):
Orthography (lettering) can vary slightly, but importantly, scribed ancient Greek differs substantially from the modern cursive Greek characters used in mathematics and physics, which initially confused me.
Here are some Pis for reference:
Let’s take a close look at the Pi character in the image above.
This image is at native resolution (0.008 mm, i.e. 8 micron, pixel size). It is 750×708 pixels, which is exactly 6 mm wide and ~5.66 mm high.
The ink residue is clear throughout this letter, in this region:
(This is the same image with a contrast filter applied). The ink residue is
- Of exactly the expected dimensions (length, width, orientation w.r.t. papyrus, line width, and position at the surface of the sheet)
- Of the natural and expected shape, including variations in line width particularly on the lower right corner when the stylus left the page, and at the upper two corners where the stylus slowed to make the turn, consistent with the papyrus reference images.
- Well-defined edges
Immediately to the right of the Pi there is another character.
This character is written over a section of papyrus that dips into the page a bit so we’ll take two samples, one 40 microns closer to the surface.
This sample is at the same level as the Pi image earlier, and the bottom right corner of the Pi is visible on the left edge. It shows a rounded blob of ink residue texture towards the lower end of the case, and another section is visible on a raised ridge of papyrus towards the top.
This crop is 40 microns deeper on the same field of view, illuminating ink that fell on gaps between papyrus ridges, and showing the upper extent of the letter. The field of view is 4.5 mm high, so the character is between 2.5 and 3 mm tall. It could be an iota, or potentially a gamma or tau if a horizontal bar can be inferred against the background papyrus.
Once again, the ink-residue textured section is morphologically unique, distinct, in the right location, the right size, sharp edged, etc.
Next on the right is:
This character is harder to make out, until the reader realizes that it is curved. It is first visible when looking for the dark narrow cracks in the “cracked mud” texture. It is a handwritten lunate sigma, which looks like a ‘c’. The field of view is 3.35 mm high. The character is aligned directly to the right of the iota and pi characters, consistent in size, orthography, line width, alignment, ink texture, ink position relative to the papyrus, etc. Like the Pi, slight stroke width variations are recognizably derived from the motion of hand writing.
With three characters (pi, iota, sigma) we can check if this is part of a word – of course it could be two words since ancient writing generally did not include spaces between words. Using this handy list (https://kyle-p-johnson.com/assets/most-common-greek-words.txt) we find 69 instances of πισ, and none of πγσ or πτσ.
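A lookup like this can be sketched as follows. This is a minimal illustration rather than the exact procedure used above: it assumes the word list has already been loaded into a Python list of strings (the format of the linked file is not assumed here), and it strips accents and breathing marks so that a bare letter sequence such as πισ can match accented dictionary forms. The four sample words are only a stand-in for the full list.

```python
import unicodedata


def normalize_greek(text):
    """Lowercase, strip accents/breathings, and unify final sigma."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.replace("ς", "σ")


def words_containing(words, fragment):
    """Return the words whose normalized form contains the fragment."""
    target = normalize_greek(fragment)
    return [w for w in words if target in normalize_greek(w)]


# Illustrative stand-in for the full word list.
sample_words = ["ὄπισθεν", "ὀπίσω", "πίστις", "λόγος"]

for fragment in ["πισ", "πγσ", "πτσ"]:
    hits = words_containing(sample_words, fragment)
    print(fragment, len(hits), hits)
```

Counting candidates for each reading of an ambiguous character (iota, gamma, or tau here) gives a quick plausibility check before any closer philological work.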
Let’s return to the π and run to the left.
This is 32 microns further from the page than the image used for the Pi, which is partially visible on the right edge of this image. As a result, the Pi ink residue is most visible as bright, upcurled edges consistent with the “cracked mud” texture interpretation.
Immediately to its left is a small circular feature, clearly visible despite a few nearby instances of “shark bite” texture from virtual unrolling.
Without going into all the detail, this ink feature is most likely an omicron, which resembles the “degrees” symbol in modern English, and is written above the line.
Οπισ as a letter sequence only appears in 2 of the 10000 most common Greek words: ὄπισθεν (“behind”) and ὀπίσω (“backwards”). See the modern “opposite”, likely derived via Latin from the same Indo-European root.
To its left:
This image (cropped from Monster1_7.png, the same as the Pi image) shows distinct cracked mud texture on the upper half of the character, along with a visible outline possibly caused by the stylus permanently deforming the papyrus inwards. Following this outline down we see it surround a small area with a seemingly inverted cracked mud texture, with darker convex shapes separated by bright lines. These features, which are at least 40 microns thick (spanning 6 adjacent slices) represent either a region where ink minerals propagated to crack boundaries during evaporation, perhaps due to different quantities of water solvent in the ink, or perhaps they are an imaging artifact due to the next layer of papyrus being in direct contact with this part of the character.
This character is most likely another iota.
Next on the left:
This character has two areas of obvious ink residue: lower right, and lower left. There is also the edge of the iota visible on the top right.
In addition to these two “cracked mud” texture areas, there is also an area of apparent deformation of a papyrus fiber in the upper center of the image and, collinear with it to the top left, a single flake residue approximately 0.08 mm long and 0.02 mm wide, a sharp-edged feature that sits proud of the background papyrus surface by 0.06 mm. Combining these features we almost certainly have a lambda, similar to:
Next on the left is:
This one is difficult to identify, as the papyrus is crinkled and damaged in this area. However, there is definite “cracked mud” texture in the center of this frame, as well as the tail of the previous character visible in the lower right.
If we follow the linear features that are not parallel to the underlying papyrus (mostly left-right) and overlying papyrus peeking through (up down), we see another stroke trending from top left to bottom right, along with a subsidiary stroke trending down and left from the center of this frame. Could be another lambda, or potentially an alpha similar to:
If it is another lambda, then we have λλιοπ, a letter combination that occurs only in the name “Καλλιόπη” (Calliope), the muse of epic poetry. I studied Latin, not Greek, so I don’t know classical Greek inflections and I am unsure whether the character following the π is possible.
Either of these word interpretations is captivating but it is too soon, in the view of the competition organizers, to officially endorse this conjecture. More work is required!
Further left we come to the edge of this segment.
While there are four more characters visible to the right of the sigma, I will instead direct attention to the second line.
This wide crop shows the pi and iota characters above, and a number of clear ink residues below, which we will now focus on.
This is cropped from Monster1_7.png, showing a dominant ink residue stretching from top left to bottom right, as well as an additional area in the lower left.
This is cropped in the same area, from Monster1_11.png, and is thus 4 x 0.008 mm = 32 microns further from the page, where some of the vertical papyrus strands from the overlying page are coming into view. The ink residue on the lower left of the image is now the curly edges of the “cracked mud” texture, trending away both right and diagonally upwards. My initial ID of this character as a capital nu is less likely than a capital delta.
As before, the character is 4 mm across, consistently positioned, oriented, composed of linear strokes, just off the inner surface of the segment papyrus sheet, etc etc.
Here are a couple for reference from a surviving papyrus:
Immediately to the left of the delta is a fragmentary character. Two patches of “cracked mud” ink residue are visible, one in the top left and a larger one in the lower center. In addition, there is a diagonal artifact running from the upper left ink residue down towards the lower right, where it nearly touches the left-most corner of the aforementioned delta character. If we interpret this feature as local deformation of the papyrus from writing (which it appears to be through several slices), then the combination of these strokes spells either lambda or perhaps alpha.
Here are an alpha and lambda next to each other for comparison.
To the left of the alpha/lambda is another fragmentary character.
This crop shows a patch of ink residue in the lower right corner filling a “hole”, a section of papyrus outlined in the dark rim that is, perhaps, 0.06 mm lower than the rest. This ink residue stretches upwards towards another at the top, center right.
This crop, taken from Monster1_12.png shows another section of relatively elevated ink residue forming a horizontal band along the bottom of the frame that appears to taper off towards the lower left corner.
This character could be an upper case Omega, a Delta, a Sigma, a Beta, or Phi, or any other character with a line across the base.
When this process is sufficiently automated, a Greek-trained LLM can generate a probability distribution for the next character in a sequence, then apply a Bayesian update against any visible ink fragments to revise the probability of each candidate character. Because written alphabetic languages like Ancient Greek have an information entropy of about one bit per character, significant error correction is possible!
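The update step itself is short. The sketch below is only an illustration of the idea: the prior stands in for a language model's next-character distribution and the likelihood stands in for how well the visible ink fragments match each candidate shape. Both sets of numbers are invented for the example, and no real model or ink detector is involved.

```python
def bayes_update(prior, likelihood):
    """Combine a prior over characters with ink-evidence likelihoods.

    posterior(ch) is proportional to prior(ch) * likelihood(ch).
    """
    unnormalized = {ch: prior[ch] * likelihood.get(ch, 0.0) for ch in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence rules out every candidate character")
    return {ch: p / total for ch, p in unnormalized.items()}


# Hypothetical language-model prior for the character after a pi.
prior = {"ι": 0.5, "γ": 0.2, "τ": 0.3}

# Hypothetical likelihood of the observed ink fragments under each shape:
# a clear vertical stroke, with no convincing horizontal bar.
likelihood = {"ι": 0.6, "γ": 0.2, "τ": 0.2}

posterior = bayes_update(prior, likelihood)
for ch, p in sorted(posterior.items(), key=lambda item: -item[1]):
    print(ch, round(p, 3))
```

In practice the hard part is estimating the likelihood term from noisy ink fragments; the normalization shown here is the easy part.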
There are at least 9 other characters with varying levels of legibility on this same end of the monster segment, arrayed across two rows with regular spacing, aligned with the horizontal papyrus strands, etc etc. But these 10 I have discussed sit within a space of about 3 cm^2, which meets the requirement of the First Letters Prize goal.
Together, these 19 characters represent roughly 130 mm of drawn ink, a great starting sample for localizing ink-detection algorithms to this scroll, ink type, and orthography.
Alternatively, running it on a segment mesh file, is also acceptable. More esoteric methods are also acceptable as long as you very clearly explain how to use it.
We will use these instructions to run your method on the secret 4th fragment, for which we have known ground truth.
I’m not sure how this will work but good luck!
Discussion of your method. Please tell us how your method works, why you believe your method works, and relevant details on how to interpret the results (e.g. what do different colors in your output mean). Any details that you can share to convince us (and yourself!) that you actually found letters will be helpful.
I think by this point the method is clear and the evidence >5 sigma that what I have identified is actually written characters.
Just as with the Grand Prize, please do not make your discovery public until winning the prize. We will work with you to announce your findings.
I am keen to share my discovery with the discord community as I believe it will supercharge ambition and expand discovery.
By submitting you agree to make your method open source. It does not have to be open source at the time of submission, but you have to make it open source to accept the prize.
The first team to make a valid submission by 11:59pm Pacific, December 31st, 2023 wins the $40,000.
Note that if your text does not meet our expectations (e.g. they are not neatly organized in rows, or aligned with papyrus fibers, or on the inside of the papyrus) you might still have found valid text, so this doesn’t immediately disqualify your submission. It does make it less likely that you found valid text.
I think the text I’ve found meets these expectations.
If you do submit this information and we still cannot get high confidence, we will not immediately reject your submission; it will remain open until we do have more evidence, though this could potentially take months. For example, if we discover 4 months from now that your method was right all along, and your method was the first such valid method, then you will then win the First Letters Prize (even if, for example, the Grand Prize is already claimed).
Bonus round – inference with training
I used the “Classify” function in Mathematica. Yes, I could use Python (and have in the past) but life is short. The purpose of this demo is to show that a computer can see what my eyes see!
Super basic home-brew image classifier showing λιοπις.
Second bonus round – looking at the fragments
On the left – ML ink detection. On the right – ground truth from IR imaging.
I thought Scroll 1 was in bad shape, but these fragments are really messed up. Lots of cracks, dislocations, and all kinds of weird textures, even on the flattened samples.
Only one side of the fragment was visible in the 65 layers (4 microns each), so I hope this is the side with ink on it.
I found similar “cracked mud” and “flake” textures corresponding to known character ink, but only for perhaps 10% of the known characters. It’s been a long day, I can probably find more on closer inspection, but that does make one wonder about automated ink detection and what that is seeing.
Let’s look at some screenshots. This is from Fragment 3
This screenshot shows the “YK” section in the middle pretty clearly. All we’re seeing here is relatively unfractured ink residue “standing proud”, embossed from the surface. Around the edges of the cracks, we see stronger signals (e.g. the middle of the “K”) because the peeled ink is standing up towards the field of view.
A closer look at the “cracked mud” texture on the right arm of the “Y” and the center of the “K”.
Another area of fairly obvious ink residue, this to the right of the “K” character.
The center of the “K”, showing raised ink (right) and peeled ink (center).
It is possible that there are other, more subtle ink residues than the ones I was easily able to see in Hari’s segment and have shown here above in a few cases. Perhaps ML can find them too.
Why is it that automated ink detection continues to draw blanks on the scrolls? Does it work on the secret fourth fragment?
There has been much discussion in the Discord of whether the resolution affects it (I doubt it) or some other factor. That said, the fragments have different CT reconstruction artifacts than Scroll 1, which could throw off texture detection algorithms.
My hypothesis: the ink detection ML algorithms are at least partly memorizing which parts of the papyrus have ink on them, which accounts for their spotty reconstruction seen above. Why is ink detection detecting ink in places that have no ink, never had ink, and are far from places that have ink? My guess is that the combination of papyrus and mineral fragments is a latent-space match for places that do have ink. If the networks have more than ~100 weights this sort of pattern matching is fairly trivial. We asked deep learning to find the inked parts but our loss function was more easily solved by memorizing which set of local landmarks corresponds to the special area. This could be hard to fix if we don’t have a definite idea in mind of what the ink texture will look like, and have so little data to work with.
How to test? Retrain the algorithm on fragments with half the ink mask blanked out, see what happens.
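The masking step of that test could look something like the sketch below. It is only an illustration on a tiny nested-list mask: the function name is hypothetical, the split down the middle column is one arbitrary choice of “half”, and real fragment labels would be large images handled with an array library rather than Python lists.

```python
def blank_half_of_mask(ink_mask):
    """Zero out the right half of a binary ink mask.

    Returns (train_mask, held_out), where held_out marks the pixels
    whose labels were hidden from training.
    """
    n_cols = len(ink_mask[0])
    mid = n_cols // 2
    # Keep the left half of every row, replace the right half with zeros.
    train_mask = [row[:mid] + [0] * (n_cols - mid) for row in ink_mask]
    # Record which pixels had their labels blanked.
    held_out = [[col >= mid for col in range(n_cols)] for _ in ink_mask]
    return train_mask, held_out


# Toy 4 x 6 label image that is inked everywhere.
labels = [[1] * 6 for _ in range(4)]
train_labels, held_out = blank_half_of_mask(labels)

for row in train_labels:
    print(row)
```

If a network trained on the blanked labels still predicts ink across the held-out half, it is responding to texture; if its predictions follow the blanked labels instead, that would be consistent with the memorization hypothesis.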
I hope this discussion spurs more interest in recovering ancient texts. The entirety of the ancient corpus today can fit on a small bookshelf. It would be amazing to double the number of texts we have.
The obvious next step is to go out, find, and classify the crackle texture. We may not get words frequently but we will greatly increase the quantity of real-world data we can train ink detection networks on.
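As one crude starting point for that search, a local-contrast map highlights regions where bright patches sit next to narrow dark channels. To be clear, this is not the method used in this post (the characters above were found by eye), and local standard deviation is only a rough proxy that will also fire on cracks and meshing artifacts. The toy images below are synthetic, not scroll data.

```python
import statistics


def local_contrast(image, radius=1):
    """Standard deviation of pixel values in a square window around each pixel."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window, clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = statistics.pstdev(window)
    return out


# Synthetic patches: a featureless one and a high-contrast, finely broken one.
flat = [[10] * 5 for _ in range(5)]
crackled = [[10 if (r + c) % 2 else 200 for c in range(5)] for r in range(5)]

print(local_contrast(flat)[2][2])
print(round(local_contrast(crackled)[2][2], 1))
```

A usable detector would need more than this, for instance the sharp region boundaries and the 0.1-0.5 mm patch scale described in the texture ontology, but a contrast map is a cheap way to shortlist areas for visual inspection.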