Show HN: CSVFiddle – Query CSV files with DuckDB in the browser

shbhrsaha · 4 years ago

Hey HN,

I made CSVFiddle because I wanted a quick way to query CSV files with SQL and share the results with other people.

The app runs 100% in-browser, so the data you import and the queries you write are never sent to a web server. When you share the URL to a workspace, all of its queries and references to CSV files are just encoded in the URL fragment.
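
To illustrate the fragment idea, here is a minimal sketch and not CSVFiddle's actual code. The `Workspace` shape and both function names are assumptions made up for this example; the post does not describe the real encoding. It relies on the fact that browsers do not include the part of a URL after `#` in HTTP requests, so state stored there stays on the client.

```typescript
// Hypothetical workspace shape; the real CSVFiddle format is not described in the post.
interface Workspace {
  csvUrls: string[]; // references to CSV files
  queries: string[]; // SQL text of each saved query
}

// Serialize the workspace to JSON and place it after '#'.
// encodeURIComponent also escapes any '#' inside the JSON,
// so the first '#' in the result is always the separator
// (assuming baseUrl itself contains no '#').
function encodeWorkspace(baseUrl: string, ws: Workspace): string {
  return `${baseUrl}#${encodeURIComponent(JSON.stringify(ws))}`;
}

// Reverse the step above: take everything after the first '#', unescape, parse.
function decodeWorkspace(shareUrl: string): Workspace {
  const hashIndex = shareUrl.indexOf("#");
  if (hashIndex === -1) {
    throw new Error("URL has no fragment");
  }
  const fragment = shareUrl.slice(hashIndex + 1);
  return JSON.parse(decodeURIComponent(fragment)) as Workspace;
}
```

A real implementation would probably compress the JSON before encoding it, since long queries make for long URLs.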

In-browser querying is made possible by DuckDB-Wasm, which has been an awesome project to work with:

https://duckdb.org/2021/10/29/duckdb-wasm.html

There are definitely limitations with CSVFiddle (e.g. sometimes the auto-parsing feature doesn't accurately interpret the imported files), but so far it's been useful for a range of data tasks.

Some demo workspaces you can check out:

University Students by State https://tinyurl.com/6k35anth

Uber Pickups in NYC https://tinyurl.com/5n8av39h

kjksf · 4 years ago

Do you have an option to load a .csv from a URL, e.g.: https://csvfiddle.io/?url=https://foo.com/bar.csv

If not, could you implement it?

I ask because I'm working on a web-based file manager (https://filerion.com/)

One of the feature ideas I have is letting people view CSV files.

I don't want to implement my own csv viewer and would rather integrate with tools like csvfiddle.

I.e. the user would right-click on a .csv file in my file manager, and one of the options would be "View in CSVFiddle".

When chosen, I would create a publicly visible, CORS-compatible URL for the .csv file (so that you can fetch() it) and launch csvfiddle.io/?url=<url> in a new window.
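
A rough sketch of the launch step, assuming CSVFiddle accepted the `url` query parameter proposed above (it is only a suggestion in this comment, not a confirmed feature). `buildViewerUrl` is a hypothetical helper name. The point is that the CSV link must be percent-encoded, otherwise its own `?` and `&` would be read as part of the outer URL.

```typescript
// Build the hypothetical "View in CSVFiddle" link.
// The `url` parameter is the one proposed in this comment, not a documented API.
function buildViewerUrl(csvUrl: string): string {
  // encodeURIComponent escapes ':', '/', '?', '&' and '=' so the whole
  // CSV address survives as a single query-parameter value.
  return `https://csvfiddle.io/?url=${encodeURIComponent(csvUrl)}`;
}

// In the file manager, the result could then be opened in a new window,
// e.g. window.open(buildViewerUrl(publicCsvUrl), "_blank").
```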

shubhamjain · 4 years ago

Does it work with really large files? Like, >100 MB or so. I was considering making something similar with sql.js [1], but the problem is that it loads everything into memory, so I wasn't sure how it would deal with larger workloads.

[1]: https://sql.js.org/#/

throwamon · 4 years ago

This sounds like a workaround to your problem:

https://news.ycombinator.com/item?id=27016630

danso · 4 years ago

By no means am I crapping on what you've created — it looks great, and I've always wanted to try DuckDB and now you've made a frictionless entrypoint — just wanted to point out in general that querying CSV with SQL is more accessible than some people might have assumed. e.g. here's a recent TIL blogpost from Simon Willison about him discovering how to do sqlite queries against CSV from the command line: https://til.simonwillison.net/sqlite/one-line-csv-operations

One suggestion I would make: the Uber trips data is interesting, but might be too big for this demo? I was getting a few loading errors when trying it (didn't investigate where in the process the bottleneck was though)

simonw · 4 years ago

A more appropriate comparison here might be to my Datasette Lite project, which runs SQLite in the browser using WASM and lets you join multiple CSV files by URL: https://simonwillison.net/2022/Jun/20/datasette-lite-csvs/

I think CSVFiddle is a fantastic addition to the ecosystem: making DuckDB more accessible - especially in a browser - is a very useful thing.

westurner · 4 years ago

On HN: "One-liner for running queries against CSV files with SQLite" https://news.ycombinator.com/item?id=31824030

hk1337 · 4 years ago

I have no problem with sqlite, in fact I really like it, but that seems like it could be quite a hefty "one-liner".

throwamon · 4 years ago

Not to discourage you or anything, but Observable seems to cover this without too much hassle:

https://observablehq.com/@cmudig/introducing-sql-with-duckdb

tlarkworthy · 4 years ago

I recently streamed Firebase into DuckDB for realtime exploratory analytics in the browser (on Observable).

https://observablehq.com/@tomlarkworthy/firebase-to-duckdb

ijidak · 4 years ago

This is great.

Will definitely be using this.

I wouldn't worry too much about people focused on running from the command line.

I love the command line. But not as a query interface for quick investigation of CSV data.

I've been wanting something like this for a long time!

Excited to give it a try.

kristianp · 4 years ago

I guess if it uses duckdb it can query parquet (and duckdb) files too?

swuecho · 4 years ago

Very useful tool!