Well, this was interesting. As someone who was actually building similar websites in the late '90s, I threw this into Opus 4.5. Note that the original author is wrong about the original site, however:
"The Space Jam website is simple: a single HTML page, absolute positioning for every element, and a tiling starfield GIF background."
This is not true: the site is built using tables, with no positioning at all. CSS wasn't a thing back then...
Here was its one-shot attempt at building the same type of layout (table based) with a screenshot and assets as input: https://i.imgur.com/fhdOLwP.png
You bet. Fun post and writeup; it took me a bit down memory lane. I built several sites with nested table-based layouts, 1x1 transparent GIF files set to strange widths to force layouts to certain sizes, little tricks with repeating gradient backgrounds for fancy 'beveled' effects. Under construction GIFs, page counters, GUESTBOOKS!, Photoshop drop-shadows on everything. All the things, fond times. One or two I haven't touched in 20 years, but I keep them online as my own time-capsule memory :)
The failure mode here (Claude trying to satisfy rather than saying 'this is impossible with the constraints') shows up everywhere. We use it for security research - it'll keep trying to find exploits even when none exist rather than admit defeat. The key is building external validation (does the POC actually work?) rather than trusting the LLM's confidence.
Ah! I see the problem now! AI can't see shit; it's a statistical model, not some form of human. It uses words, so like humans, it can say whatever shit it wants, and it's true until you find out.
The number one rule of the internet is: don't believe anything you read. Unfortunately, this rule was lost to history.
I remember building really complex layouts with nested tables, and learning the hard way that going beyond 6 levels of nesting caused serious rendering performance problems in Netscape.
It was relatively OK to deal with when the pages were created by coders themselves.
But then Dreamweaver came out, where you basically drew the entire page in 2D and it spat out HTML tables that stitched it all back together again. The freedom it gave our artists to draw in 2D without worrying about the output meant they went completely overboard with it, and you'd get lots of tiny little slices everywhere.
Definitely glad those days are well behind us now!
Off topic, but you have used imgur as your image hosting site, which cannot be viewed in the UK. If you want all readers to be able to see and understand your points, please could you use a more universally reachable host?
Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things. There are a lot of problems that are easy to get right for a junior web dev but impossible for an LLM.
On the other hand, I was able to write a C program that added gamma color profile support to Linux compositors that don't support it (in my case Hyprland) within a few minutes! A seemingly hard task, for me at least, which would have taken me a day or more if I hadn't let Claude write the code.
With one prompt Claude generated C code that compiled on first try that:
- Read an .icc file from disk
- Parsed the file and extracted the VCGT (video card gamma table)
- Wrote the VCGT to the video card for a specified display via amdgpu driver APIs
The only thing I had to fix was the ICC parsing, where it would parse header strings in the wrong byte-order (they are big-endian).
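That bug class is easy to reproduce: ICC fields are big-endian per the spec, so reading them with the host's (usually little-endian) byte order scrambles every multi-byte value. A quick illustration in Python, using just the 'vcgt' tag signature rather than a real profile:

```python
import struct

# The 'vcgt' tag signature as it appears on disk: big-endian bytes.
raw = b"vcgt"

be = struct.unpack(">I", raw)[0]  # correct: ICC fields are big-endian
le = struct.unpack("<I", raw)[0]  # the bug: reading in host/little-endian order

print(hex(be))  # 0x76636774
print(hex(le))  # 0x74676376 -- same bytes, scrambled value
```

In C the equivalent fix is to assemble the value byte-by-byte (or use `be32toh`) instead of casting the buffer to a `uint32_t`.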
Claude didn't write that code. Someone else did and Claude took that code without credit to the original author(s), adapted it to your use case and then presented it as its own creation to you and you accepted this. If a human did this we probably would have a word for them.
Certainly if a human wrote code that solved this problem, and a second human copied and tweaked it slightly for their use case, we would have a word for them.
Would we use the same word if two different humans wrote code that solved two different problems, but one part of each problem was somewhat analogous to a different aspect of a third human's problem, and the third human took inspiration from those parts of both solutions to create code that solved a third problem?
What if it were ten different humans writing ten different-but-related pieces of code, and an eleventh human piecing them together? What if it were 1,000 different humans?
I think "plagiarism", "inspiration", and just "learning from" fall on some continuous spectrum. There are clear differences when you zoom out, but they are in degree, and it's hard to set a hard boundary. The key is just to make sure we have laws and norms that provide sufficient incentive for new ideas to continue to be created.
That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.
Do you have papers to back this up? That was also my reaction when I saw some really crazy accurate comments on some vibe-coded piece of code, but I couldn't prove it, and thinking about it now I think my intuition was wrong (i.e., LLMs do produce original complex code).
Student? Good learner? Pretty much what everyone does can be boiled down to reading lots of other code that's been written and adapting it to a use case. Sure, to some extent models are regurgitating memorized information, but for many tasks they're regurgitating a learned method of doing something and backfilling the specifics as needed; the memorization has been generalized.
Programmers are willingly blind to this, at least until it's their code being stolen or they lose their job.
_LLMs are lossily compressed archives of stolen code_.
Trying to achieve AI through compression is nothing new.[0] The key innovation[1] is that the model[2] does not output only the first order input data but also the higher order patterns from the input data.
That is certainly one component of intelligence, but we need to recognize that the tech companies didn't build AI; they built a compression algorithm which, combined with the stolen input text, can reproduce the input data and its patterns in an intelligent-looking way.
[0]: http://prize.hutter1.net/
[1]: Oh, god, this phrase is already triggering my generated-by-LLM senses.
[2]: Model of what? Of the stolen text. If 99.9999% of the work to achieve AI wasn't done by people whose work was stolen, they wouldn't be called models.
Are you saying that every piece of code you have ever written contains a full source list of every piece of code you previously read to learn specific languages, patterns, etc?
Or are you saying that every piece of code you ever wrote was 100% original and not adapted from any previous codebase you ever worked in or any book / reference you ever read?
I've been struggling with this throughout the entire LLM-generated-code arc we're currently living -- I agree that it is wack in theory to take existing code and adapt it to your use-case without proper accreditation, but I've also been writing code since Pulp Fiction was in theaters and a lot of it is taking existing code and adapting it to my use-case, sometimes without a fully-documented paper trail.
Not to mention the moral vagaries of "if you use a library, is the complete articulation of your thing actually 100% your code?"
Is there a difference between loading and using a function from ImageMagick, and a standalone copycat function that mimics a function from ImageMagick?
What if you need it transliterated from one language to another?
Is it really that different from those 1,200-page books from the '90s that walk you through implementing a 3D engine from scratch (or whatever the topic might be)? If you make a game on top of that book's engine, is your game truly yours?
If you learn an algorithm in some university class and then just write it again later, is that code yours? What if your code is 1-for-1 a copy of the code you were taught?
It gets very murky very quick!
Obviously I would encourage proper citation, but I also recognize the reality of this stuff -- what if you're fully rewriting something you learned decades ago and don't know who to cite? What if you have some code snippet from a website long forgotten that you saved and used? What if you use a library that also uses a library that you're not aware of because you didn't bother to check, and you either cite the wrapper lib or cite nothing at all?
I don't have some grand theory or wise thoughts about this shit, and I enjoy the anthropological studies trying to ascertain provenance / assign moral authority to remarkable edge cases, but at the end of the day I also find it exhausting to litigate the use of a tool that exploited the fact that your code got hoovered up by a giant robot because it was public, and might get regurgitated elsewhere.
To me, this is the unfortunate and unfair story of Gregory Coleman [0] -- drummer for The Winstons, who recorded "Amen, Brother" in 1969 (which gave us the most-sampled drum break in the world, spawned multiple genres of music, and changed human history) -- the man never made a dime from it, never even knew, and died completely destitute, despite his monumental contribution to culture. It's hard to reconcile the unjustness of it all, yet not that hard to appreciate the countless positive things that came out of it.
I don't know. I guess at the end of the day, does the end justify the means? Feels pretty subjective!
[0] https://en.wikipedia.org/wiki/Amen_break
> Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things
Because the rendered output (pixels, not HTML/CSS) is not fed in as training data. You will find tons of UI snippets and questions, but they rarely include screenshots. And when they do, the screenshots are not scraped.
Interesting thought. I wonder if Anthropic et al could include some sort of render-html-to-screenshot as part of the training routine, such that the rendered output would get included as training data.
Why is this something a Wayland compositor (a glorified window manager) needs to worry about? Apple figured this out back in the 1990s with ColorSync and they did it once for the Mac OS and any application that wanted colour management could use the ColorSync APIs.
Color management infrastructure is intricate. To grossly simplify: somehow you need to connect together the profile and LUT for each display, upload the LUTs to the display controller, and provide appropriate profile data for each window to their respective processes. During compositing then convert buffers that don't already match the output (unmanaged applications will probably be treated as sRGB, color managed graphics apps will opt out of conversion and do whatever is correct for their purpose).
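To make the LUT half of that concrete: a VCGT is essentially one gamma ramp per channel, and applying it is a table lookup with interpolation. A minimal sketch in Python (the 4-entry ramp is invented for illustration; real tables have 256 or 1024 entries per channel):

```python
def apply_lut(value, lut):
    """Map a normalized channel value (0..1) through a 1D gamma LUT,
    linearly interpolating between the two nearest table entries."""
    pos = value * (len(lut) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    frac = pos - lo
    return lut[lo] * (1 - frac) + lut[hi] * frac

# Invented 4-entry ramp; a real one would come from the profile's vcgt tag.
ramp = [0.0, 0.25, 0.5, 1.0]
print(apply_lut(0.5, ramp))  # 0.375: halfway between entries 1 and 2
print(apply_lut(1.0, ramp))  # 1.0: top of the ramp
```

In practice the display controller does this in hardware; the compositor's job is only to upload the per-channel tables.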
Ok, so here is an interesting case where Claude was almost good enough, but not quite. But I’ve been amusing myself by taking abandoned Mac OS programs from 20 years ago that I find on GitHub and bringing them up to date to work on Apple silicon. For example, jpegview, which was a very fast and simple slideshow viewer. It took about three iterations with Claude code before I had it working. Then it was time to fix some problems, add some features like playing videos, a new layout, and so on. I may be the only person in the world left who wants this app, but well, that was fine for a day long project that cooked in a window with some prompts from me while I did other stuff. I’ll probably tackle scantailor advanced next to clean up some terrible book scans. Again, I have real things to do with my time, but each of these mini projects just requires me to have a browser window open to a Claude code instance while I work on more attention demanding tasks.
Interesting. I switched to the Mac in 2005, and what I missed the most was the fact that in windows you could double click an image and then tap the left and right keys to browse other photos in the same folder. I learned objective c and made an app for it back then, but never published. I guess the jpegview fulfilled a similar purpose.
I switched to Mac in 2008. I forget if the feature existed back then, but today on macOS if you press spacebar on an image in Finder to preview it, you can use the arrow keys to browse other photos.
Hum, table cells provide the max-width and images a min-width; heights are absolute (with table cells spilling over, as with CSS "overflow-y: visible"); aligns and maybe HSPACE and VSPACE attributes do the rest. As long as image heights exceed the effective line-height and there's no visible text, this should render pixel perfect on any browser then in use. In this case, there's also an absolute width set for the entire table, adding further constraints. Table layouts can be elastic, with constraints or without, but this one should be pretty stable.
(Fun fact, the most amazing layout foot-guns, then: effective font sizes and line-heights are subject to platform and configuration (e.g., Win vs Mac); Netscape does paragraph spacing at 1.2em, IE at 1em (if this matters, prefer `<br>` over paragraphs); frame dimensions in Netscape are always calculated as integer percentages of window dimensions, even if you provide absolute dimensions in pixels, while IE does what it says on the tin (a rare example), so they will be the same only by chance and effective rounding errors. And, of course, screen gamma is different on Win and Mac, so your colors will always be messed up – aim for a happy medium.)
- "First, calculate the orbital radius. To do this accurately, measure the average diameter of each planet, p, and the average distance from the center of the image to the outer edge of the planets, x, and calculate the orbital radius r = x - p"
- "Next, write a unit test script that we will run that reads the rendered page and confirms that each planet is on the orbital radius. If a planet is not, output the difference you must shift it by to make the test pass. Use this feedback until all planets are perfectly aligned."
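For what it's worth, the checker that second prompt asks for is only a few lines once you have planet centers. A sketch (the center, radius, and coordinates here are invented for illustration):

```python
import math

def off_orbit(centers, cx, cy, r, tol=1.0):
    """Return (index, delta) for each planet center whose distance from the
    orbit center (cx, cy) differs from the target radius r by more than tol."""
    failures = []
    for i, (x, y) in enumerate(centers):
        delta = math.hypot(x - cx, y - cy) - r
        if abs(delta) > tol:
            failures.append((i, delta))  # positive: too far out; negative: too far in
    return failures

# Invented example: orbit centered at (400, 300), radius 250 px.
centers = [(650, 300), (400, 50), (150, 300), (400, 553)]
print(off_orbit(centers, 400, 300, 250))  # [(3, 3.0)] -- last planet 3 px too far out
```

The output deltas are exactly the "shift it by" feedback the prompt asks the model to iterate on.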
This is my experience with using LLMs for complex tasks: If you're lucky they'll figure it out from a simple description, but to get most things done the way you expect requires a lot of explicit direction, test creation, iteration, and tokens.
One of the keys to being productive with LLMs is learning how to recognize when it's going to take much more effort to babysit the LLM into getting the right result as opposed to simply doing the work yourself.
Re: tokens, there is a point where you have to decide what's worth it to you. I'd been unimpressed with what I could get out of chat apps, but when I wanted to do a Rails app that would have cost me thousands in developer time and some weeks of communication, Zoom meetings, and iteration... I bit the bullet, kept topping up the Claude API, and spent about $500 on Opus over the course of a weekend, but the site is done and works great.
Hm, I didn't try exactly this, but I probably should!
Wrt the unit test script, let's take Claude out of the equation: how would you design the unit test? I kept running into either Claude or some library not being able to consistently identify planet vs. non-planet, which was hindering Claude's ability to make decisions based on fine detail or "pixel coordinates", if that makes sense.
Do you give Claude the screenshot as a file? If so I’d just ask it to write a tool to diff each asset to every possible location in the source image to find the most likely position of each asset. You don’t really need recognition if you can brute force the search. As a human this is roughly what I would do if you told me I needed to recreate something like that with pixel perfect precision.
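That brute-force search can be sketched in plain Python over grayscale pixel grids (real code would use numpy or PIL for speed; nothing here is from the actual posts):

```python
def find_asset(image, asset):
    """Slide `asset` over `image` and return the (row, col) offset with the
    smallest sum of absolute pixel differences (brute-force template match)."""
    ah, aw = len(asset), len(asset[0])
    best_score, best_pos = None, None
    for r in range(len(image) - ah + 1):
        for c in range(len(image[0]) - aw + 1):
            score = sum(abs(image[r + i][c + j] - asset[i][j])
                        for i in range(ah) for j in range(aw))
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Toy 4x4 "screenshot" with a 2x2 "asset" pasted at offset (1, 2).
img = [[0, 0, 0, 0],
       [0, 0, 9, 8],
       [0, 0, 7, 9],
       [0, 0, 0, 0]]
asset = [[9, 8],
         [7, 9]]
print(find_asset(img, asset))  # (1, 2)
```

Because the original GIFs are pasted into the screenshot unmodified, an exact or near-exact match exists and no "recognition" is needed, just the minimum-difference offset.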
If I were to do this (and I might give it a try, this is quite an interesting case), I would try to run a detection model on the image, to find bounding boxes for the planets and their associated text. Even a small model running on CPU should be able to do this relatively quickly.
Congratulations, we finally created 'plain English' programming languages. It only took 1/10th of the world's electricity and 40% of the semiconductor production.
Yes, this is a key step when working with an agent: if it's able to check its work, it can iterate pretty quickly. If you're in the loop, something is wrong.
I'm trying to understand why this comment got downvoted. My best guess is that "if you're in the loop, something is wrong" is interpreted as there should be no human involvement at all.
The loop here, imo, refers to the feedback loop. And it's true that ideally there should be no human involvement there. A tight feedback loop is as important for LLMs as it is for humans. The more automated you make it, the better.
This has been my experience with almost everything I've tried to create with generative AI, from apps and websites, to photos and videos, to text and even simple sentences. At first glance, it looks impressive, but as soon as you look closer, you start to notice that everything is actually just sloppy copy.
That being said, sloppy copy can make doing actual work a lot faster if you treat it with the right amount of skepticism and hand-holding.
Its first attempt at the Space Jam site was close enough that it probably could have been manually fixed by an experienced developer in less time than it takes to write the next prompt.
But my experience has also been that with every new model they require less hand-holding and the code is less sloppy. If I'm careful with my prompts, GPT Codex 5.1 has recently been making a lot of TypeScript for me that is basically production-ready in a way it couldn't even 2 months ago.
The sentence immediately after that would imply sarcasm.
> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot. Believe me, I'm an engineering manager with a computer science degree. Please please please help (sad emoji)
There was a response to this post on the front page earlier this morning that was able to get Claude to succeed simply by giving it access to Playwright so it could see what it was doing and telling it in the prompt that it needed to be pixel perfect: https://news.ycombinator.com/item?id=46193412
As of right now, it seems to have been flagged into oblivion by the anti-AI crowd. I found both posts to be interesting, and it's unfortunate that one of them is missing from the conversation.
I'm keeping it in for now because people have made some good jokes about the mistake in the comments and I want to keep that context.
It still works; it's just that Claude can't understand what those tables mean.
But was Space Jam using multiple images, or just one large image with an image map for links?
I think it's easier if you adapt and get a VPN or a new government.
What do you mean? The programmer's work is literally combining existing patterns into solutions for problems.
I don’t think it’s fair to call someone who used Stack Overflow to find a similar answer with samples of code to copy to their project an asshole.
> took that code without credit to the original author(s), adapted it to your use case
Aka software engineering.
Humans do this all the time.
Who?
You say that as if that’s uncommon.
Absolute positioning wasn't available until CSS2 in 1998. This is just a table with crafty use of align, valign, colspan, and rowspan.
Like the web was meant to be. An interpreted hypertext format, not a pixel-perfect brochure for marketing execs.
what?
That said, I love this project. haha
There are other ways, such as downloading an archive and preserving the files in one or more cloud storage services.
https://archive.is/download/cXI46.zip