Creator here. I built ChartGPU because I kept hitting the same wall: charting libraries that claim to be "fast" but choke past 100K data points.
The core insight: Canvas2D is fundamentally CPU-bound. Even WebGL chart libraries still do most computation on the CPU. So I moved everything to the GPU via WebGPU:
- LTTB downsampling runs as a compute shader
- Hit-testing for tooltips/hover is GPU-accelerated
- Rendering uses instanced draws (one draw call per series; see the sketch below)
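To make the last point concrete, here is a minimal, illustrative sketch of what one instanced draw per series can look like in WebGPU. This is not ChartGPU's actual render code; `pipeline` and `pointBuffer` are assumed to be created elsewhere.

```ts
// Illustrative sketch only: one instanced draw call per series.
// Per-point data lives in a GPU buffer with stepMode: "instance";
// the vertex shader expands each point instance into a quad.
function drawSeries(
  pass: GPURenderPassEncoder,
  pipeline: GPURenderPipeline,
  pointBuffer: GPUBuffer, // columnar x/y point data
  pointCount: number
): void {
  pass.setPipeline(pipeline);
  pass.setVertexBuffer(0, pointBuffer);
  pass.draw(6, pointCount); // 6 vertices per quad, one instance per point
}
```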
The result: 1M points at 60fps with smooth zoom/pan.
Live demo: https://chartgpu.github.io/ChartGPU/examples/million-points/
Currently supports line, area, bar, scatter, pie, and candlestick charts. MIT licensed, available on npm: `npm install chartgpu`
Happy to answer questions about WebGPU internals or architecture decisions.
some notes from a very brief look at the 1M demo:
- sampling has a risk of eliminating important peaks; uPlot does not do it, so for an apples-to-apples perf comparison you have to turn that off. see https://github.com/leeoniya/uPlot/pull/1025 for more details on the drawbacks of LTTB
- when doing nothing / idle, there is significant cpu being used, while canvas-based solutions will use zero cpu when the chart is not actively being updated (with new data or scale limits). i think this can probably be resolved in the WebGPU case with some additional code that pauses the updates.
- creating multiple charts on the same page with GL (e.g. dashboard) has historically been limited by the fact that Chrome is capped at 16 active GL contexts that can be acquired simultaneously. Plotly finally worked around this by using https://github.com/greggman/virtual-webgl
> data: [[0, 1], [1, 3], [2, 2]]
this data format, unfortunately, necessitates the allocation of millions of tiny arrays. i would suggest switching to a columnar data layout.
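A hedged sketch of what that columnar layout could look like (field names are illustrative, not ChartGPU's actual API):

```ts
// Row-oriented: one small array allocated per point.
const rowOriented = [[0, 1], [1, 3], [2, 2]];

// Columnar: one typed array per column.
const columnar = {
  x: new Float64Array([0, 1, 2]),
  y: new Float64Array([1, 3, 2]),
};
// Typed arrays are contiguous and GC-friendly, and can be copied straight
// into a GPU buffer, e.g. device.queue.writeBuffer(buffer, 0, columnar.y).
```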
uPlot has a 2M datapoint demo here, if interested: https://leeoniya.github.io/uPlot/bench/uPlot-10M.html
Both points are fair:
1. LTTB peak elimination - you're right, and that PR is a great reference. For the 1M demo specifically, sampling is on by default to show the "it doesn't choke" story. Users can set sampling: 'none' for apples-to-apples comparison. I should probably add a toggle in the demo UI to make that clearer.
2. Idle CPU - good catch. Right now the render loop is probably ticking even when static. That's fixable - should be straightforward to only render on data change or interaction. Will look into it.
Would love your deeper dive feedback when you get to it. Always more to learn from someone who's thought about this problem as much as you have.
And column-oriented data is a must. Look at R's data frames, pandas, polars, numpy, SQL, and even Fortran's matrix layout.
Also need specialized, explicitly targetable support for Float32Array and Float64Array. Both API and ABI are necessary if you want to displace incumbents.
There is huge demand for a good web implementation. This is what it takes.
Am interested in collaborating.
I once had to deal with many million data points for an application. I ended up mip-mapping them client-side.
But regarding sampling, if it's a line chart, you can sample adaptively by checking whether the next point makes a meaningfully visible difference measured in pixels compared to its neighbours. When you tune it correctly, you can drop most points without the difference being noticeable.
I didn't find anyone else doing that at the time, and some people seemed to have trouble accepting it as a viable solution, but if you think about it, it doesn't actually make sense to plot, say, 1 million points in a line chart 1000 pixels wide. On average that would make 1000 points per pixel.
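One simple variant of the adaptive idea described above (a hedged sketch, not the poster's actual code): keep a point only if it moves the line by at least `epsilonPx` pixels relative to the last kept point.

```ts
// Adaptive sampling by visible pixel difference. Assumes x is monotonic;
// toPxX/toPxY map data coordinates to screen space.
function adaptiveSample(
  xs: Float64Array,
  ys: Float64Array,
  toPxX: (x: number) => number,
  toPxY: (y: number) => number,
  epsilonPx = 0.5
): number[] {
  const kept = [0];
  let lastX = toPxX(xs[0]);
  let lastY = toPxY(ys[0]);
  for (let i = 1; i < xs.length - 1; i++) {
    const px = toPxX(xs[i]);
    const py = toPxY(ys[i]);
    // Skip points visually indistinguishable from the last kept point.
    if (Math.abs(px - lastX) >= epsilonPx || Math.abs(py - lastY) >= epsilonPx) {
      kept.push(i);
      lastX = px;
      lastY = py;
    }
  }
  kept.push(xs.length - 1); // always keep the endpoint
  return kept;
}
```

Note this naive variant can still drop a sharp spike that falls between two kept points; pairing it with a min/max pass per pixel column (as in the sibling comment about Ardour) preserves peaks.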
Bresenham's is one algorithm historically used to downsample the data, but a lot of contemporary audio software doesn't use that. In Ardour (a cross-platform, libre, open source DAW), we actually compute and store min/max-per-N-samples and use that for plotting (and as the basis for further downsampling).
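A minimal sketch of that min/max-per-N-samples idea (illustrative only, not Ardour's actual code); keeping both extremes per block is what preserves peaks that plain subsampling would drop:

```ts
function minMaxDecimate(ys: Float64Array, blockSize: number): Float64Array {
  const blocks = Math.ceil(ys.length / blockSize);
  const out = new Float64Array(blocks * 2); // [min0, max0, min1, max1, ...]
  for (let b = 0; b < blocks; b++) {
    let lo = Infinity;
    let hi = -Infinity;
    const end = Math.min((b + 1) * blockSize, ys.length);
    for (let i = b * blockSize; i < end; i++) {
      if (ys[i] < lo) lo = ys[i];
      if (ys[i] > hi) hi = ys[i];
    }
    out[2 * b] = lo;
    out[2 * b + 1] = hi;
  }
  return out;
}
```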
I discovered flot during my academic research career circa 2008 and it saved my ass more times than I can count. I just wanted to say thank you for that. I wouldn't be where I am today without your help :)
> But regarding sampling, if it's a line chart, you can sample adaptively by checking whether the next point makes a meaningfully visible difference measured in pixels compared to its neighbours.
uPlot basically does this (see sibling comment), so hopefully that's some validation for you :)
My concern would be computational cost for real-time/streaming use cases. LTTB is O(n) and pretty cache-friendly. Wavelet transforms are more expensive, though maybe a GPU compute shader could make it viable.
The other question is whether it's "visually correct" for charting specifically. LTTB optimizes for preserving the visual shape of the line at a given resolution. Wavelet decomposition optimizes for signal reconstruction - not quite the same goal.
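For concreteness, here is a CPU-side sketch of the canonical LTTB formulation over columnar arrays (as suggested upthread). Illustrative only, not ChartGPU's compute shader; it returns the indices of the points to keep.

```ts
function lttb(xs: Float64Array, ys: Float64Array, threshold: number): number[] {
  const n = xs.length;
  if (threshold >= n || threshold < 3) {
    return Array.from({ length: n }, (_, i) => i); // nothing to drop
  }
  const kept: number[] = [0]; // always keep the first point
  const every = (n - 2) / (threshold - 2); // bucket width
  let a = 0; // index of the previously kept point
  for (let i = 0; i < threshold - 2; i++) {
    // The *next* bucket's average is the third vertex of the triangle.
    const avgStart = Math.floor((i + 1) * every) + 1;
    const avgEnd = Math.min(Math.floor((i + 2) * every) + 1, n);
    let avgX = 0;
    let avgY = 0;
    for (let j = avgStart; j < avgEnd; j++) {
      avgX += xs[j];
      avgY += ys[j];
    }
    avgX /= avgEnd - avgStart;
    avgY /= avgEnd - avgStart;
    // In the current bucket, keep the point forming the largest triangle
    // with the previously kept point and the next bucket's average.
    const rangeStart = Math.floor(i * every) + 1;
    const rangeEnd = Math.floor((i + 1) * every) + 1;
    let maxArea = -1;
    let maxIdx = rangeStart;
    for (let j = rangeStart; j < rangeEnd; j++) {
      const area = Math.abs(
        (xs[a] - avgX) * (ys[j] - ys[a]) - (xs[a] - xs[j]) * (avgY - ys[a])
      );
      if (area > maxArea) {
        maxArea = area;
        maxIdx = j;
      }
    }
    kept.push(maxIdx);
    a = maxIdx;
  }
  kept.push(n - 1); // always keep the last point
  return kept;
}
```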
That said, I'd be curious to experiment. Do you have any papers or implementations in mind? Would make for an interesting alternative sampling mode.
Sometimes I like to ponder on the immense amount of engineering effort expended on working around browser limitations.
In the electronics world this is what "digital phosphor" etc. does in oscilloscopes, which started out as just emulating analog scopes. Some examples are visible here: https://www.hit.bme.hu/~papay/edu/DSOdisp/gradient.htm
Add Lab color space for this though, like the color theme solarized-light.
Also add options to side-step red-green blindness and blue-yellow blindness.
We’ve been working on a browser-based Link Graph (OSINT) analysis tool for months now (https://webvetted.com/workbench). The graph charting tools on the market are pretty basic for the kind of charting we are looking to do (think 1000s of connected/disconnected nodes/edges). Being able to handle 1M points is a dream.
This will come in very handy.
Is graph visualization something you'd want as part of ChartGPU, or would a separate "GraphGPU" type library make more sense? Curious how you're thinking about it.
More directly relevant, I haven't looked at the D3 internals for a decade, but I wonder if it might be tractable to use your library as a GPU rendering engine. I guess the big question for the future of your project is whether you want to focus on the performance side of certain primitives or expand the library to encompass all the various types of charts/customization that users might want. Probably that would just be a different project entirely/a nightmare, but if feasible even for a subset of D3 you would get infinitely customizable charts "for free." https://github.com/d3/d3-shape might be a place to look.
In my past life, the most tedious aspect of building such a tool was how different graph standards and expectations are across different communities (data science, finance, economics, natural sciences, etc). Don't get me started about finance's love for double y-axis charts... You're probably familiar with it, but https://www.amazon.com/Grammar-Graphics-Statistics-Computing... is fantastic if you continue on your own path chart-wise and you're looking for inspiration.
Most recently added to the family is our open source GFQL graph language & engine layer (Cypher on GPUs, including various dataframe & binary format support for fast & easy large data loading), and under the louie.ai umbrella, we're piloting genAI extensions.
One thing to note: I added a "Benchmark mode" toggle to the 1M benchmark example - this preserves the benchmark capability while demonstrating efficient idle behavior.
Another thing to note: Do not be alarmed when you see the FPS counter display 0 (lol), that is by design :) Frames are rendered efficiently. If there's nothing to render (no dirty frames) nothing is rendered. The chart will still render at full speed when needed, it just doesn't waste cycles rendering the same static image 60 times per second.
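A minimal sketch of the render-on-demand idea (hypothetical names, not ChartGPU's internals): a frame is scheduled only when something marks the chart dirty, so a static chart costs zero CPU between interactions.

```ts
class OnDemandRenderer {
  private dirty = false;
  private rafId: number | null = null;

  constructor(private renderFrame: () => void) {}

  // Call on data changes, scale changes, or user interaction.
  invalidate(): void {
    this.dirty = true;
    if (this.rafId !== null) return; // a frame is already scheduled
    this.rafId = requestAnimationFrame(() => {
      this.rafId = null;
      if (this.dirty) {
        this.dirty = false;
        this.renderFrame();
      }
    });
  }
}
```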
Blown away by all of you amazing people and your support today :)
You can now render up to 5 million candles. Just tested it: 104 FPS with 5M candles streaming at 20 ticks/second.
Demo: https://chartgpu.github.io/ChartGPU/examples/candlestick-str...
Also fixed from earlier suggestions and feedback as noted before:
- Data zoom slider bug has been fixed (no longer snapping to the left or right)
- Idle CPU usage bug (added user controls along with more clarity to the 1M point benchmark)
13 hours on the front page, 140+ comments and we're incorporating feedback as it comes in.
This is why HN is the best place to launch. Thanks everyone :)
https://chartgpu.github.io/ChartGPU/examples/million-points/...
While dragging, the slider does not stay under the cursor, but instead moves by unexpected distances.
Looks like the data zoom slider has a momentum/coordinate mapping issue. Bumping this up the priority list since multiple people are hitting it.
After the initial setup and learning curve, it was actually very easy. All in all, way less complicated than all the performance hacks I had to do to get 0.01% of the data to render half as smoothly using d3.
Although this looks next level. I make sure all the computation happens in a single O(n) loop, but the main loop still takes place on the CPU. Very well done.
To anyone on the fence, GPU charting seemed crazy to me beforehand (classic overengineering) but it ends up being much simpler (and much much much smoother) than traditional charts!
[1] https://phrasing.app
[0]: https://chartgpu.github.io/ChartGPU/examples/live-streaming/...
[1]: https://crisislab-timeline.pages.dev/examples/live-with-plug...
[1]: https://github.com/ChartGPU/ChartGPU/blob/main/.cursor/agent...
[2]: https://github.com/ChartGPU/ChartGPU/blob/main/.claude/agent...
- new account
- spamming the project to HN, Reddit, etc. the moment the demo half works
- single contributor repo
- Huge commits minutes apart
- repo is less than a week old (sometimes literally hours)
- half the commits start with "Enhance"
- flashy demo that hides issues immediately obvious to experts in the field
- author has slop AI project(s)
OP uses more than one branch so he's more sophisticated than most.