Actually we have the same problems and challenges in 3D as well. In the end, 3D is just a 2D image rendered through a projection matrix and other transforms. Stroking is difficult there too, for example: you need to assemble little triangles to construct a line with miters in 3D, and so on.
In my opinion, 3D graphics are harder than 2D for sure. The WebGL API is messy: there's WebGL, WebGL 2, and we're going to have another one, WebGPU, which will hopefully make 3D easier. In 2D we just have the 2D drawing context.
Line miters have no valid interpretation other than a view-dependent one -- the PostScript spec says that the miter limit is based on the angle between the joined segments, something which only makes sense in a flat, 2D setting. Freya Holmér came up with a set of heuristics that work well enough for Shapes (I helped debug the math a bit there), but you first have to figure out what a miter even means in a 3-dimensional world, and it's not easy.
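In the flat 2D case the miter rule really is just trigonometry: the spike at a join of width-w lines has length w / sin(θ/2), where θ is the angle between the two segments. A quick sketch in plain JavaScript (not tied to any particular API):

```javascript
// Miter length ratio for a join between two 2D line segments.
// PostScript's miterlimit compares this ratio against a threshold;
// if exceeded, the join is rendered as a bevel instead.
// theta is the angle between the two segments at the joint, in radians.
function miterRatio(theta) {
  // miterLength / lineWidth = 1 / sin(theta / 2)
  return 1 / Math.sin(theta / 2);
}

// A 90-degree joint has a miter ratio of sqrt(2), well under the
// PostScript default miterlimit of 10.
console.log(miterRatio(Math.PI / 2));  // ≈ 1.414
// As the joint gets sharper, the spike grows without bound:
console.log(miterRatio(Math.PI / 18)); // ≈ 11.47 -> beveled at the default limit
```

It's this θ that has no canonical definition once the two segments live in 3D and get projected to the screen.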
But yes, you are generally correct. Maybe if I was writing this post today, I'd give it the title "why shapes are so much harder than textures".
I also disagree that WebGPU is a worse spec than WebGL 1 and 2. I basically implement a WebGPU-like wrapper for WebGL today in my own WebGL projects, and it's made my life much easier, not harder. Though, full disclaimer, I am a contributor to the WebGPU spec.
>It has bothered me for a long time that for some reason, html5 canvas is better at drawing primitives than Unity.
I know, right?!?
I was so frustrated by the fact that it was so much easier to draw 2D stuff with HTML5 canvas than with Unity that I integrated a web browser with Unity and used it to do 2D data-driven drawing (graphing, data visualization, text, and UI widgets) with JavaScript / canvas / d3 / etc., and then blitted the textures into Unity to display in UI overlays and 3D textures.
One advantage to that approach is that you can use any of the zillions of excellent well-supported JavaScript libraries like d3 to do the 2d drawing.
And I wanted to write as much of my Unity application in JavaScript as possible anyway, so it was easier to debug and (orders of magnitude) quicker to iterate changes (and possible to live-code).
>I also disagree that WebGPU is a worse spec than WebGL 1 and 2. I basically implement a WebGPU-like wrapper for WebGL today in my own WebGL projects, and it's made my life much easier, not harder. Though, full disclaimer, I am a contributor to the WebGPU spec.
He hopes WebGPU would make things easier. He didn't say WebGPU is worse than WebGL.
That is explicitly not a goal of the WebGPU working group. They have acknowledged that WebGPU will make 3D graphics programming in the browser significantly harder. But it will make supporting the 3D graphics API in the browser easier for the browser developers.
So basically, because Apple took over a decade to finally unsqueeze their tight fingers enough to dedicate any developers to WebGL2 in Safari, instead of improving the site developer experience, the W3C has opted to take huge swaths of responsibility away from browser developers and foist it on site devs.
The "hope" is that middle-tier libraries like Three.js and Babylon.js will hide the complexity away. I'm sure they will, and by all my experiences with them, will do a fine job of it. But it's certainly not the direction I hoped things would go.
WebGPU is more explicit, but I wouldn't say it's significantly harder. Sure, if you just want to get a triangle on the screen, it's going to take more lines of code. But everyone builds their own abstractions over the base API, and once you get to that point it's about the same ease-of-use.
And that knowledge from the WebGPU API can extend to the desktop with something like Rust's WGPU, where you can author a 3D app once and get a Vulkan, DirectX, Metal, _or_ OpenGL targeted version.
In my opinion, if we're going to get a stable and consistent API, I'm totally OK with it being significantly harder. I think we shouldn't need a middle-tier library for creating 3D graphics. Most of the time we needed three.js or twgl just for wiring the shader inputs (uniforms etc.) to JavaScript. The rest of it is just pure matrix math, which we do in 2D graphics as well.
It's worth mentioning that they're also going to create another shading language: https://www.w3.org/TR/WGSL/
The API may get harder, but that's not a huge deal. Not saying that's a good thing, but APIs are far from the hardest thing in 3D graphics.
In my experience using WebGL, the hardest thing is all the missing parts compared to modern APIs and the fact that, whenever you want to implement something, you can't use the newest techniques and have to go hunting around to find out how the game dev community used to do things 20 years ago. I would take a jump in API complexity in exchange for access to a modern graphics pipeline and I'm excited by the new opportunities WebGPU will afford my work.
It is. There's a terminology problem in play here. Throughout this article 2D does not mean 2D; it means "arbitrary-complexity paths, e.g. Bezier curves". That is only a small subset of the 2D that makes up a UI. It'd be like saying 3D exclusively means infinitely detailed tessellated shapes with path-traced rendering. That's definitely an area of 3D that exists, but it is of course not at all the entirety of 3D in practice in e.g. games or movies. Rather, it's more like the holy grail.
Same thing here with e.g. SVGs. GPU-accelerating SVGs is stupidly complicated because generic path rendering is an inherently serial algorithm, and GPUs are poop at that. But how much of your 2D UI is made up of that? Text is in the same category, but how much else? Typically very little. Maybe a few icons, but that tends to be about it. Instead you have higher-level shapes, like rounded rectangles, and those you can do on a GPU quite easily. Similarly, images are usually just a textured quad, again trivial for a GPU. You could describe them as paths if you had a fully generic, fully accelerated path rendering system. But nobody has that, so nobody actually describes them like that.
So very nearly all 2D/UI systems are GPU accelerated. It'd be perhaps more accurate to call them hybrid renderers. Things like fonts are just CPU rendered because CPUs are better at it, but the GPU is doing all the fills, gradients, "simple" shapes, texturing, etc...
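As a sketch of why rounded rectangles are so GPU-friendly: they reduce to a closed-form signed distance function that a fragment shader can evaluate independently per pixel, with no serial path walking at all. A CPU-side version of the standard formula (names are mine):

```javascript
// Signed distance from point (px, py) to a rounded rectangle centered at
// the origin, with half-extents (hx, hy) and corner radius r.
// Negative inside, positive outside; a fragment shader evaluates this per
// pixel and uses the distance to produce an antialiased edge.
function sdRoundedRect(px, py, hx, hy, r) {
  const qx = Math.abs(px) - hx + r;
  const qy = Math.abs(py) - hy + r;
  return Math.min(Math.max(qx, qy), 0)
       + Math.hypot(Math.max(qx, 0), Math.max(qy, 0)) - r;
}

console.log(sdRoundedRect(0, 0, 50, 30, 10));  // -30: well inside
console.log(sdRoundedRect(60, 0, 50, 30, 10)); // 10: ten pixels outside the right edge
```

Every pixel is independent, which is exactly the shape of work GPUs are built for -- the opposite of stroking an arbitrary path.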
Programming graphics, especially 2D stuff, is far more ergonomic and convenient in your native language already executing on the CPU. If you can get away with it performance wise, there's really no incentive to incur the myriad obnoxious bullshit inherent in GPU programming.
But with the advent of high dpi displays it's become problematic to do even simple 2D/UI rendering on the CPU just because of the enormous quantity of pixels.
When you pull in GPU support, now you're stuck having to pick a backend (gl/vulkan/d3d/metal) or some compatibility layer to make some/all of them work. You have to write shaders, you have to constantly move state in/out of the GPU across this GPU:CPU API boundary. It's just a total clusterfuck best avoided if possible.
Browsers largely make use of the GPU for UI rendering. Direct2D, Cocoa, Qt and GTK(4) are all hardware accelerated as well. So I'm not really sure what you mean?
The linked article covers the major challenges that makes it difficult to adapt the GPU to 2D vector graphics rendering.
tl;dr: 2D cares about shapes, curves, and properly antialiased coverage of very small shapes; the traditional GPU rasterization pipeline is very good at triangles and textures but has limited coverage options.
Direct2D, Qt and GTK+ still do a good portion of the graphics work on the CPU, and only use the GPU for composition. Some limited drawing can be done on the GPU, usually with quality tradeoffs. Font rasterization is still done on the CPU, and the result uploaded to the GPU as a texture.
Newer libraries like Pathfinder, Spinel, piet-gpu all work by not using the triangle rasterization parts but instead treating the GPU as a general-purpose parallel processor with compute shaders.
I think it is. I see all kinds of 2D projects claiming to be "GPU accelerated" - for example GTK, KDE, web browsers. I'm not sure how much of the actual processing is done on the GPU, but it's enough to call it "accelerated"!
GPUs want to draw triangles, and in fact only know how to draw triangles[0]. Pretty much all graphics API innovation has been around either feeding more triangles to the GPU faster, letting the GPU create more triangles after they've been sent, or finding cool new ways to draw things on the surface of those triangles.
2D/UI breaks down into drawing curves, either as filled shapes or strokes. The preferred representation is a Bezier spline, a series of degree-three[1] polynomial curves that GPUs have zero support for rasterizing. Furthermore, the offset curves that strokes require are not polynomials, but an even more bizarre curve type: algebraic curves. You cannot just offset the control points to derive a stroke curve; you either have to approximate the stroke outline itself with Beziers, or actually draw the line sequentially in a way that GPUs are really not capable of doing.
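The control-point-offsetting trap is easy to show numerically. Below is a sketch in plain JavaScript; the naive scheme here (translating each control point by the stroke distance along a nearby curve normal) is one of several equally doomed variants:

```javascript
// Demonstration: translating a quadratic Bezier's control points along
// curve normals does NOT produce a curve at constant distance d from the
// original (the true offset is a higher-degree algebraic curve).
const bez = (q, t) => {
  const u = 1 - t;
  return [u*u*q[0][0] + 2*u*t*q[1][0] + t*t*q[2][0],
          u*u*q[0][1] + 2*u*t*q[1][1] + t*t*q[2][1]];
};
const normal = (q, t) => {
  // derivative of the quadratic, rotated 90 degrees and normalized
  const dx = 2*((1-t)*(q[1][0]-q[0][0]) + t*(q[2][0]-q[1][0]));
  const dy = 2*((1-t)*(q[1][1]-q[0][1]) + t*(q[2][1]-q[1][1]));
  const len = Math.hypot(dx, dy);
  return [-dy/len, dx/len];
};

const Q = [[0, 0], [1, 2], [2, 0]]; // a symmetric parabolic arc
const d = 0.25;                     // requested stroke offset

// Naive "offset": shift each control point by d along a nearby normal.
const shift = (p, n) => [p[0] + d*n[0], p[1] + d*n[1]];
const naive = [shift(Q[0], normal(Q, 0)),
               shift(Q[1], normal(Q, 0.5)),
               shift(Q[2], normal(Q, 1))];

// If this worked, every point of the naive curve would sit at distance d
// from the original curve. Measure the worst deviation by dense sampling.
const samples = [];
for (let i = 0; i <= 2000; i++) samples.push(bez(Q, i / 2000));
let worst = 0;
for (let i = 0; i <= 200; i++) {
  const p = bez(naive, i / 200);
  let best = Infinity;
  for (const s of samples) best = Math.min(best, Math.hypot(p[0]-s[0], p[1]-s[1]));
  worst = Math.max(worst, Math.abs(best - d));
}
console.log(worst); // roughly 0.07 here -- nearly 30% off the requested 0.25
```

The error is worst where curvature is high (the apex), which is exactly where strokes are most visible.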
The four things you can do to render 2D/UI on a GPU are:
- Tessellate the Bezier spline into a series of triangles. Lyon does this. Bezier curves make this rather cheap to do, but it requires foreknowledge of what scale the Bezier will be rendered at, and you cannot adjust stroke sizes at all without retessellating.
- Send the control points to the GPU and use hardware tessellation to do the above per-frame. No clue if anyone does this.
- Don't tessellate at all, but send the control points to the GPU as a polygonal mesh, and draw the actual Beziers in the fragment shaders for each polygon. For degree-two/quadratics there are a series of coordinate transforms that you can do which conveniently map all curves to one UV coordinate space; degree-three/cubics require a lot more attention in order to render correctly. If I remember correctly Mozilla Pathfinder does this[2].
- Send a signed distance field and have the GPU march it to render curves. I don't know much about this but I remember hearing about this a while back.
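The quadratic case of the third approach in this list can be sketched on the CPU. In the spirit of the Loop-Blinn technique, the triangle's vertices P0, P1, P2 carry UVs (0,0), (1/2,0), (1,1); after barycentric interpolation, f = u² - v is zero exactly on the curve and its sign picks a side, so the fragment shader is one multiply and a subtract:

```javascript
// Loop-Blinn-style inside/outside test for a quadratic Bezier drawn as a
// single triangle. Vertices carry UVs (0,0), (0.5,0), (1,1); hardware
// interpolates them, and the fragment evaluates f = u^2 - v.
function barycentric(p, a, b, c) {
  const det = (b[0]-a[0])*(c[1]-a[1]) - (c[0]-a[0])*(b[1]-a[1]);
  const wb = ((p[0]-a[0])*(c[1]-a[1]) - (c[0]-a[0])*(p[1]-a[1])) / det;
  const wc = ((b[0]-a[0])*(p[1]-a[1]) - (p[0]-a[0])*(b[1]-a[1])) / det;
  return [1 - wb - wc, wb, wc];
}

function curveSide(p, p0, p1, p2) {
  const [w0, w1, w2] = barycentric(p, p0, p1, p2);
  const u = 0.5 * w1 + w2; // interpolate the UVs (0,0), (0.5,0), (1,1)
  const v = w2;
  return u * u - v;        // 0 on the curve, >0 on the control-point side
}

const P0 = [0, 0], P1 = [1, 2], P2 = [2, 0];
console.log(curveSide([1, 1], P0, P1, P2));       // 0: (1,1) is on the curve
console.log(curveSide([1, 1.5], P0, P1, P2) > 0); // true: control-point side
console.log(curveSide([1, 0.5], P0, P1, P2) < 0); // true: chord side
```

The whole trick is that this one UV assignment works for every quadratic, so no per-curve shader work is needed; cubics need a classification step first, which is where the extra attention mentioned above comes in.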
All of these approaches have downsides. Tessellation is the approach I'm most familiar with, because it's used heavily in Ruffle, so I'll just explain its downsides to give you a good idea of why this is a huge problem:
- We can't support some of Flash's weirder rendering hacks, like hairline strokes. Once we have a tessellated stroke, it will always be that width regardless of how we scale the shape. But hairlines require that the stroke get proportionally bigger as the shape gets smaller. In Flash, they were rendering on CPU, so it was just a matter of saying "strokes are always at least 1px".
- We have to sort of guess what scale we want to render at and hope we have enough detail that the curves look like curves. There's one particular Flash optimization trick that consistently breaks our detail estimation and causes us to generate really lo-fi polygons.
- Tessellation requires the curve shape to actually make sense as a sealed hull. We've exposed numerous underlying bugs in lyon purely by throwing really complicated or badly-specified Flash art at it.
- All of this is expensive, especially for complicated shapes. For example, pretty much any Homestuck SWF will lock up your browser for multiple minutes as lyon tries to make sense of all of Hussie's art. This also precludes varying strokes by retessellating per-frame, which would otherwise fix the hairline stroke problem I mentioned above.
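The scale-guessing problem falls out of the flattening step itself: a tessellator subdivides each curve until it deviates from a straight segment by less than some pixel tolerance, so the segment count is baked in for one particular zoom level. A sketch of that step for quadratics (recursive de Casteljau subdivision; details are mine, not lyon's exact algorithm):

```javascript
// Flatten a quadratic Bezier into line segments by recursive subdivision.
// The tolerance is in output pixels, which is why the result only looks
// right at the scale you guessed: zoom in 10x and yesterday's segments
// are visibly polygonal.
function flatten(p0, p1, p2, tol, out) {
  // Distance from the control point to the chord bounds how far the
  // curve can deviate from the straight segment p0 -> p2.
  const dx = p2[0] - p0[0], dy = p2[1] - p0[1];
  const dev = Math.abs((p1[0]-p0[0])*dy - (p1[1]-p0[1])*dx) / Math.hypot(dx, dy);
  if (dev <= tol) { out.push(p2); return out; }
  // de Casteljau split at t = 0.5
  const a = [(p0[0]+p1[0])/2, (p0[1]+p1[1])/2];
  const b = [(p1[0]+p2[0])/2, (p1[1]+p2[1])/2];
  const m = [(a[0]+b[0])/2, (a[1]+b[1])/2];
  flatten(p0, a, m, tol, out);
  flatten(m, b, p2, tol, out);
  return out;
}

// Same curve at 1x and at 10x scale, same half-pixel tolerance:
const segsAt1x  = flatten([0,0], [100,200],  [200,0],  0.5, [[0,0]]).length - 1;
const segsAt10x = flatten([0,0], [1000,2000],[2000,0], 0.5, [[0,0]]).length - 1;
console.log(segsAt1x, segsAt10x); // the 10x version needs more segments to stay smooth
```

Run it both ways and the zoomed version needs more segments; a renderer that tessellated at 1x and then scaled up would show the difference as faceting.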
[0] AFAIK quads are emulated on most modern GPUs, but they are just as useless for 2D/UI as triangles are.
[1] Some earlier 2D systems used degree-two Bezier splines, including most of Adobe Flash.
[2] We have half a PR to use this in Ruffle, but it was abandoned a while back.
I disagree with the fundamental assertion that 2d is harder than 3d. I think a more accurate title would be "Why are 2D vector graphics so much harder than 3D when using a 3D-oriented raster graphics pipeline?"
If we remove the existing constraints and say you have to build these things in pure software, I think the equation would look a little different. I don't know of many developers who can accurately describe what the GPU does these days. Triangle rasterization is not an easy problem if you have to solve it yourself.
It's that you're usually rasterizing much more complicated shapes than triangles in 2D, like polygons and curves and fonts with hinting (which is often actually implemented as Turing-complete byte code, not stuff that's easy to run in parallel entirely in the GPU).
Writing a triangle rasterizer is not that hard. What APIs like OpenGL give you for free (other than a performance boost) is walking all the pixels that are covered by each triangle, and computing the barycentric coordinates for each pixel (and then using these to lerp the vertex data). So that's what you have to replace by a CPU program.
I find the much harder part is how to set up the architecture in such a way that the data flows through your shader pipelines without an unbearable amount of boilerplate. 3D APIs don't help with that -- if anything they make it harder.
Certainly. One can write a trivial version in maybe 30 lines of code. Writing a triangle rasterizer that you would want to use in a product that is consumed by another human is hard.
Also, it is my experience that none of these things can truly be built in isolation. Depth buffers and acceleration structures crosscut all aspects of a rendering engine.
I do agree regarding the 3d APIs though. Writing it yourself in software mode can be easier than learning someone else's mousetrap. This is the path I prefer, even if it is slower at first.
Is it even that? People tolerate a lot more artifacts in 3D than 2D. If you wrote a 2D graphics engine that used triangles as primitives people probably wouldn't like it (and it would probably render text very slowly.)
Easy: 2D is harder because it can have hard edges. Look at any 3D game; the edges of the polygons are obscured by antialiasing methods and texture wrapping. In 2D, you need to draw a line at some arbitrary angle (or worse, a spline!) where on one side there's black and on the other side white. You have to fake this with subpixel rendering. You also need to detect when the subpixels would interfere with the actual line width. And you need to be able to snap two objects together so that they fit seamlessly in the mathematical sense, without a sliver in between popping in and out of existence.
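A sketch of the usual fake: turn signed distance from the edge into fractional coverage over a one-pixel-wide transition (plain JavaScript; in practice this lives in a fragment shader):

```javascript
// Approximate antialiased coverage of a pixel against a straight edge:
// signed distance from the pixel center to the edge, mapped to [0, 1]
// coverage across a one-pixel transition band.
function edgeCoverage(px, py, ax, ay, nx, ny) {
  // (nx, ny) is the unit normal of the edge through (ax, ay);
  // positive distance means the pixel center is on the filled side.
  const d = (px - ax) * nx + (py - ay) * ny;
  return Math.min(1, Math.max(0, d + 0.5));
}

// A vertical edge at x = 10, filled to the right:
console.log(edgeCoverage(10.5, 0, 10, 0, 1, 0)); // 1    (fully inside)
console.log(edgeCoverage(10.0, 0, 10, 0, 1, 0)); // 0.5  (on the edge)
console.log(edgeCoverage(9.5, 0, 10, 0, 1, 0));  // 0    (fully outside)
```

This also shows the seam problem above: if two shapes share this edge and are blended independently, each contributes about 50% coverage at the seam, so roughly a quarter of the background bleeds through as a sliver unless the renderer resolves coverage jointly.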
Anyone who has tried to fake 2D using 3D rendering (such as inside a game engine) has likely run into the above issues.
from what i've read, they're pretty different abstractions, so i'm concerned that learning WebGL2 will give me a mental model that will be largely useless and just add to the confusion when switching to WebGPU. even the shader language is different (WGSL vs GLSL) [1]. how much knowledge will be transferable?
It's a strange take to define 2D == vector graphics and 3D == rasterization, and then write a blog post that's really about vector vs raster under the auspices of 2D vs 3D.
Both the 2D and 3D here are vectors -- both the 2D lines and the 3D triangles are represented as a sequence of mathematical points which have to be "rasterized". But, put simply, triangle rasterization is just a much easier problem than curve rasterization.
Paul Haeberli, when he was a computer graphics researcher at SGI, wrote a paper in 1993 called Texture Mapping as a Fundamental Drawing Primitive, which was about how to use texture mapping for drawing anti-aliased lines, air-brushes, anti-aliased text, volume rendering, environment mapping, color interpolation, contouring, and many other applications.
The tl;dr is that 2D vector graphics uses implicit geometry while 3D is explicitly defined with vertices and triangles, and implicit means "more maths".
And as the article says, the reason we don't use implicit geometry in 3D is that it is simply too hard except for a few specific cases, at least for now.
To answer the question "why is 2D harder?": because we are not talking about the same thing. From easiest to hardest we have: 2D triangle-based, 3D triangle-based, 2D implicit, 3D implicit.
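A tiny illustration of the implicit/explicit split, using a circle (plain JavaScript sketch):

```javascript
// The same circle, implicitly and explicitly. The implicit form answers
// "is this point inside?" exactly with one formula; the explicit triangle
// mesh only approximates the shape, with error controlled by vertex count.
const r = 1;
const insideImplicit = (x, y) => x * x + y * y <= r * r;

// Explicit: area of a regular n-gon inscribed in the circle, i.e. a
// triangle fan -- the way a GPU would actually draw it.
const ngonArea = (n) => 0.5 * n * r * r * Math.sin(2 * Math.PI / n);

console.log(insideImplicit(0.6, 0.6));  // true (0.72 <= 1)
console.log(Math.PI - ngonArea(16));    // noticeable area missing at 16 vertices
console.log(Math.PI - ngonArea(1024));  // nearly exact with enough vertices
```

The implicit side is exact but demands per-pixel math; the explicit side is trivially rasterizable but only ever an approximation, which is the trade the article is really about.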
Shapes: A real-time vector graphics library for Unity by Freya Holmér:
https://acegikmo.com/shapes
UnityJS description and discussion:
https://news.ycombinator.com/item?id=22689040
Shapes looks really wonderful, and would have been perfect for some of my Unity projects, so I'll check it out and consider using it in the future!
https://en.wikipedia.org/wiki/PostScript_fonts#Type_2
https://en.wikipedia.org/wiki/Font_hinting
https://www.typotheque.com/articles/hinting
https://surma.dev/things/webgpu/ (https://news.ycombinator.com/item?id=30600525)
https://mattdesl.svbtle.com/drawing-lines-is-hard (https://news.ycombinator.com/item?id=13671016)
https://blog.mapbox.com/drawing-antialiased-lines-with-openg...
i'd like to try implementing this paper [1] in WebGPU, but i think it's over my head currently, having mainly worked with Canvas2D.
[1] https://jcgt.org/published/0002/02/08/paper.pdf
[1] https://dmnsgn.me/blog/from-glsl-to-wgsl-the-future-of-shade...
Previous discussion: https://news.ycombinator.com/item?id=19871207
> Revision of history, wrong, and insultingly so. This post is a rewrite of serious graphics history. Read Foley, van Dam; forget this tripe.
http://en.wikipedia.org/wiki/Paul_Haeberli
http://www.graficaobscura.com/texmap/index.html