It's getting tiring seeing 3D model generation papers throw around "high quality" to describe their output while glossing over nearly all of the qualities that actually make a 3D model high quality in production contexts. Have they figured out how to produce usable topology yet? They don't talk about that, so probably not.
3D artists are begging for AI tools which automate specific tedious but necessary tasks like retopo and UV unwrapping, but tools like the OP do the opposite, skipping over those details to produce a poorly executed "final" result and leaving the user to reverse engineer the model in an attempt to salvage the mess it made.
If gen3D is going to be a thing then they need to listen to the people actually doing 3D work, not just chase benchmarks invented by other gen3D researchers. Some commentary on a similar paper about how they are trying to solve the wrong problems: https://x.com/rms80/status/1801362145600254211
> Have they figured out how to produce usable topology yet?
One recent attempt at it is Mesh Anything [1], which tries to generate triangle meshes with sensible topology [2] for a given point cloud, but it's still in the early stages: it can fail somewhat dramatically on meshes with smooth and concave parts [3], and it has a hard cap on the number of triangles it can generate.
[1] https://buaacyw.github.io/mesh-anything/ (submitted previously as https://news.ycombinator.com/item?id=40701992 )
[2] although they use the somewhat awkward term "artist-created mesh"
[3] e.g. https://i.imgur.com/z52QeiQ.png
https://github.com/huxingyi/autoremesher
> If gen3D is going to be a thing then they need to listen to the people actually doing 3D work, not just chasing benchmarks invented by other gen3D researchers.
This has been an issue for much longer than generative AI. In graphics, Siggraph has been publishing fully automated methods for decades that claim “high quality” yet produce lower quality than human-created assets or human-in-the-loop tools. The same is true of lots of other things in computer science; academia is just prone to over-automating because it’s fun and clever and because we can.
This is probably the natural course of events. People publishing papers and projects are going for maximum wow, not maximum quality. Often these are smart grad students who have no access to 3D professionals and need to get something published ASAP to meet funding/graduation goals.
We shouldn’t necessarily complain about it or get tired, but instead work on tech transfer. Borrow these ideas and bring them to 3D tooling. It takes time, and we have to vet them and figure out which will actually help people. Sometimes the ones that get lots of attention turn out to have no staying power.
> if gen3D is going to be a thing then they need to listen to the people actually doing 3D work, not just chasing benchmarks invented by other gen3D researchers.
Completely agree. I used to take an interest in research into non-photorealistic rendering. The same thing happened there: paper after paper detailing 'novel' approaches to yet another cross-hatching algorithm that no artist was ever going to use.
Yes, but a fair chunk only produce meshes as an afterthought. They often use a neural representation (or gsplats or density fields) and bolt on mesh generation for portability.
(This particular project does seem to be specifically about meshes but I'm addressing your broader point)
Not every application of 3D generation necessarily needs meshes. Or rather, it might not need meshes in the not-so-distant future.
https://cascadeur.com/blog/general/ai-in-animation
Are there any good projects or papers that do this? I'm just starting to read up on the space.
We'll see, but we've heard this story before with other proposed alternatives to mesh-based pipelines. Voxels and point clouds were all the rage in research for a while, poised to take over as soon as the last few pieces came together, but they never did, and we still use meshes for nearly everything, either by creating meshes directly or by turning some other type of data into a mesh. We still need to figure out how you're even supposed to edit static neural assets at a fine-grained level, never mind animate them with precision, never mind do all that efficiently.
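To make the "bolt on mesh generation" step concrete: the usual move is to sample whatever implicit or neural representation you have on a regular grid and run marching cubes over it. Here's a minimal sketch using scikit-image, with a placeholder analytic density function standing in for the learned representation (the library choice and all values are mine, purely illustrative):

    # Minimal sketch of "turning some other type of data into a mesh":
    # sample a density/SDF-style field on a grid, then run marching cubes.
    # `density` is a stand-in for whatever learned representation you have.
    import numpy as np
    from skimage import measure

    def density(x, y, z):
        # placeholder field: a sphere of radius 0.8 (negative inside, positive outside)
        return np.sqrt(x**2 + y**2 + z**2) - 0.8

    n = 64
    coords = np.linspace(-1.0, 1.0, n)
    x, y, z = np.meshgrid(coords, coords, coords, indexing='ij')
    volume = density(x, y, z)

    # extract the zero level set as a triangle mesh
    verts, faces, normals, values = measure.marching_cubes(volume, level=0.0)
    verts = verts / (n - 1) * 2.0 - 1.0  # map voxel indices back to [-1, 1]

    # the result is raw triangle soup: fine for viewing, nothing like the
    # clean quad topology people are asking for elsewhere in this thread
    print(verts.shape, faces.shape)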
I'd argue that even 2D generated art isn't very usable beyond dropping it into your blog every few paragraphs in the hope of keeping readers hooked, and even then it's immediately recognisable as having been prompted into existence.
Good luck using it for logos, serious graphic design for the web, UX work, or generating game assets that all feel like they fit together.
As someone who teaches 3D, I'd say a 'high quality' model needs clean topology: all quads, flowing around the form in a predictable and rational manner. From that, I would also expect a clean texture map. I am fairly certain that current technology is not up to this.
I have seen a few of these papers, and (in my limited experience) very rarely is the 3D model available for review.
You can download the models from Hugging Face.
Retopology tools are discussed on a Blender forum:
https://blenderartists.org/t/best-auto-retopology-tool-ever/...
Have you tried them? What are your thoughts? What tool would you consider state-of-the-art?
I ran Instant Meshes over a model of Mount Chimborazo and it did an okay job. Although the mesh was fairly uniform with its quads, the topology looked weird, like the typographic equivalent of "rivers" running through the mesh. I decided to keep the imperfect topology, with some amount of decimation, because it looks better.
https://i.ibb.co/2nX8WfC/chimborazo.png
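For anyone wanting to reproduce the "some amount of decimation" part, here's a minimal Blender Python sketch (the 0.25 ratio is just an illustrative value, not what was used on the Chimborazo mesh):

    # Apply a Decimate modifier to the active object, then bake it in.
    # Assumes the imported mesh is the active object and we're in object mode.
    import bpy

    obj = bpy.context.active_object

    dec = obj.modifiers.new(name="Decimate", type='DECIMATE')
    dec.decimate_type = 'COLLAPSE'  # edge-collapse; 'UNSUBDIV' suits grid-like meshes
    dec.ratio = 0.25                # keep roughly a quarter of the faces

    bpy.ops.object.modifier_apply(modifier=dec.name)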
Instant Meshes is pretty good. We have it installed in all our labs, and the results you share seem fine for the purpose. In our experience, though, Quad Remesher is the best. It certainly helps that QR sits inside Blender as an add-on (unlike Instant Meshes). It is also pretty much industry standard, being used in Max, Maya, and ZBrush. As for its results... I find that it has a deeper 'awareness' of the source topology: the meshes are dense where they need to be and not dense where they don't.
A good test of any re-mesher is to apply it to some 3D text. 3D text geometry is famously horrible. Another test is to apply it to a face. Face topology is very prescribed: there is a (mostly) single way of doing it. Any non-standard approach will create issues, especially if you wish to animate it. I would not have expected any automatic approach to work very well, yet have been surprised by the quality of QR.
For human topology, wouldn't you just transfer it with keypoints to a standard topology using something advanced like R3DS Wrap? That's what's used to convert human scan data for film, to get a realistic, animatable double for VFX shots.
I'm ashamed to say that I have not heard of R3DS Wrap. It does not seem to re-mesh as such. Rather it seems to 'wrap' an existing mesh around your raw mesh. It also manages the baking, which is sweet.
This seems like a valid approach. After all, every human figure in a Pixar movie has been sourced from the same base mesh.
Right. Outside of creating something to show off in a web browser, this is currently at best a quick way to generate disposable background assets for VFX. Definitely not usable for AM or 3D printing, and more work to clean up and stylize for any given animation.
Once we get artists to start producing LoRAs trained on UV unwraps of their own assets, I think it'll get interesting.
In my experience, depending on the model, it can be anywhere between hard and very hard. Blender's remesh will produce something reasonable, but with a gazillion polygons. Guided approaches are a lot better but labour-intensive, as the user is effectively tracing over the source model.
Retopology will destroy the old UV map, so a new map will need to be made. Following that, the old textures will need to be baked onto this map.
I have spent easily as much as a day cleaning up (i.e. re-making) 3D scans.
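For the curious, the whole remesh → re-unwrap → re-bake loop described above looks roughly like this in Blender's Python API. This is a sketch, not a drop-in script: it assumes Cycles, that the remeshed copy is the active object with the original textured scan also selected, and that the copy's material already has an image texture node chosen as the bake target; the voxel size and cage extrusion are illustrative values.

    import bpy

    scene = bpy.context.scene
    new_obj = bpy.context.active_object  # the copy we are about to remesh

    # 1. Remesh -- this is the step that throws away the old UV map
    rm = new_obj.modifiers.new(name="Remesh", type='REMESH')
    rm.mode = 'VOXEL'
    rm.voxel_size = 0.01  # smaller = denser ("a gazillion polygons")
    bpy.ops.object.modifier_apply(modifier=rm.name)

    # 2. Give the remeshed object a fresh UV map
    bpy.ops.object.mode_set(mode='EDIT')
    bpy.ops.mesh.select_all(action='SELECT')
    bpy.ops.uv.smart_project()
    bpy.ops.object.mode_set(mode='OBJECT')

    # 3. Bake the original scan's textures onto the new map
    #    (original must also be selected; the new object stays active)
    scene.render.engine = 'CYCLES'
    scene.render.bake.use_selected_to_active = True
    scene.render.bake.cage_extrusion = 0.02    # reach out a little to catch the old surface
    scene.render.bake.use_pass_direct = False  # colour only, no baked-in lighting
    scene.render.bake.use_pass_indirect = False
    bpy.ops.object.bake(type='DIFFUSE')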
Really good. This is just geometric analysis though. Geometric in the sense that the model likely doesn't understand what it's rendering. All it sees is some shape.
The next step is geometry with organized contours that make sense, meaning the model needs to cohesively understand the picture and not just the geometry. For example, if a person in the picture is wearing armor, the model should generate two separate meshes overlaid on one another: the armor and the body underneath.
Great to see these getting better and better. This might actually be usable for geometry generation if it's possible to increase the resolution. It seems that a simple super-resolution pass could help with this. For now, using this mesh as a reference model would help a lot in a typical 3D modeling process.
Said no actual, working 3D artist ever.
Those textures are completely useless, because they have all the light and view-dependency baked in. It's not really possible to extract a diffuse texture from this. There has been some work on generating material BRDFs [0], but I've not seen great results yet.
[0] for example, https://sheldontsui.github.io/projects/Matlaber
The demo page has demo images, but the results are not cached. While I'm probably not an interesting customer, I got bored waiting. Not something worth spending CPU cycles on.