I built a rules engine around SQLite queries for the last product I was working on. This article was the reason.
You can run an unbelievable number of SELECT statements per unit time against a SQLite database. I think of it like a direct method invocation on members of some List&lt;T&gt; within the application itself.
Developers who still protest SQLite are really sleeping on the latency advantages. 2-3 orders of magnitude are really hard to fight against. This opens up entirely new use cases that hosted solutions cannot consider.
Databases that are built into the application by default are the future. Our computers are definitely big enough. There are many options for log replication (sync & async), as well as the obvious hypervisor snapshot approach (WAL can make this much less dangerous).
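The "method invocation" intuition is easy to sanity-check. Here's a rough sketch (table size and query count are arbitrary, not from the article) timing point SELECTs against an in-memory database:

```python
import sqlite3
import time

# In-memory database; a file-backed one with a warm page cache behaves similarly.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(10_000)],
)

# Time many small indexed lookups -- the pattern the article calls cheap.
n_queries = 100_000
start = time.perf_counter()
for i in range(n_queries):
    con.execute("SELECT name FROM items WHERE id = ?", (i % 10_000,)).fetchone()
elapsed = time.perf_counter() - start

print(f"{n_queries / elapsed:,.0f} point SELECTs/second")
```

On typical hardware this reports hundreds of thousands of lookups per second, even with Python driver overhead; from C the number is higher still.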
> Databases that are built into the application by default are the future. Our computers are definitely big enough.
Interesting point. It's easy to forget that, in the past, local storage was small and expensive enough that a separate database machine was necessary.
It wasn't just storage. In the 90s, most servers were single core. Databases already had network interfaces so they could be shared by non-web applications running on desktops, and so were the natural place to split a website over multiple machines. This turned into the received wisdom of always putting your database on a separate machine, even though processing power and storage latency have improved so much since the 486 and Pentium days that network overhead now dominates response times.
I'm still not understanding this push toward using SQLite as a production backend database. It's great for what it is: a tiny embeddable client-side application database, like an address book on your phone. But even the developers themselves have steadfastly refused to let it expand beyond that scope. For instance, they won't add native types for useful things like dates/times or UUIDs, because that would bloat the code and the size of the embedded object. So you're stuck with "everything is a string". Referential integrity can be enabled, but even those constraint options are very limited.
Not sure why people are still trying to shoe-horn it into a role that it's not meant to be in, and not even really supported to be.
Maybe it's not for you, but the "everything is a string" thing is just the default. SQLite has had the STRICT table option since 2021, which people really should be using if possible: https://www.sqlite.org/stricttables.html
This brings strict types that people expect from the other server-based databases.
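A minimal sketch of the difference, assuming a SQLite build of 3.37 or newer (the version STRICT landed in):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Ordinary table: type names are only affinities, so a non-numeric string
# goes into an INTEGER column unchanged (stored with typeof 'text').
con.execute("CREATE TABLE loose (n INTEGER)")
con.execute("INSERT INTO loose VALUES ('not a number')")
print(con.execute("SELECT typeof(n) FROM loose").fetchone())  # ('text',)

# STRICT table: the same insert is rejected as a constraint violation.
if sqlite3.sqlite_version_info >= (3, 37, 0):
    con.execute("CREATE TABLE strict_t (n INTEGER) STRICT")
    try:
        con.execute("INSERT INTO strict_t VALUES ('not a number')")
    except sqlite3.IntegrityError as e:
        print("rejected:", e)
```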
Just to clear up the error in the parent post: SQLite has native blobs, floats, and integers, not just strings. It doesn't have a bunch of other types for things like dates and JSON - you just represent those things using the native types of integer, float, string or blob. But it is not limited to only strings. This has been true for 20 years.
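So a date, for instance, is just stored as an ISO-8601 string or a Unix epoch integer, and SQLite's built-in date functions understand both. A quick sketch (table and values made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (name TEXT, iso TEXT, epoch INTEGER)")
con.execute(
    "INSERT INTO events VALUES (?, ?, ?)",
    ("launch", "2024-03-01 12:00:00", 1709294400),  # same instant, two encodings
)

# date() parses the ISO string; datetime(..., 'unixepoch') parses the integer.
row = con.execute(
    "SELECT date(iso), datetime(epoch, 'unixepoch') FROM events"
).fetchone()
print(row)  # ('2024-03-01', '2024-03-01 12:00:00')
```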
It's absolutely supported for that use case if you can get away with a single-writer, multi-reader architecture, which IMHO most medium-sized applications can. [1]
> It's great for what it is, a tiny embeddable client-side application database. Like an address book on your phone
Size is no real concern: if the user of a client-side application has many gigabytes of data, a SQLite database is still well suited to the role. There's no shoehorning, it just works.
> Databases that are built into the application by default are the future.
No they're not, because for web servers, you have many (tens? hundreds? thousands?) of web servers that all need to talk to a single database. And you sure don't want to replicate and sync a gigantic database across each web server -- that would be a disaster.
While for local apps on your phone or computer, usage of SQLite is already widespread -- it's not the future, it's here. And cloud-connected apps that work offline already do some sort of sync for that.
Most people don't have tens or thousands of webservers and could be running with a more efficient in-process database that syncs to another networked read-only replica for a very long time. I'm very surprised that MySQL or Postgres still don't have an in-process mode, because it's such an obvious win, and at a technical coding level it should (naively) be very easy to switch from network calls to direct calls.
> for web servers, you have many (tens? hundreds? thousands?) of web servers that all need to talk to a single database
Maybe if you're in the top 1000 or so largest websites.
Back when the Alexa 10k was a thing, $work was on it - and we were serving that level of traffic with a Rails app running on 30 CPU cores. It would fit _easily_ onto a single machine.
We have a giant Postgres DB, but instead of having every machine connect directly to it, we have a job that creates a smaller SQLite cache of relevant data and that’s pushed out to the machines who then reload that on the fly.
All this depends on your data being somewhat shardable of course.
> Databases that are built into the application by default are the future. Our computers are definitely big enough.
That assumes local applications themselves are the future, and that assumption has grown ever weaker and weaker with everyone and their dog going cloud-only (or starting as a SaaS in the first place) to grab all the sweet sweet recurring subscription revenue.
> Databases that are built into the application by default are the future.
I'm not sold on client side, but a lot of great work is being done on putting the db and application on the same server, between SQLite replication and other approaches like SpacetimeDB. I'm interested to see where it goes.
SQLite underpins nearly every on-disk storage mechanism at Apple and Google at the OS level. You may not be sold, but systems programmers elsewhere are.
I often find myself thinking about what a database like SQLite might look like if it had a native API instead of a query language. I guess it wouldn't be so different from a dataframe API, but with persistence, locking, and relational features, and I guess an mmap under the hood.
> Databases that are built into the application by default are the future.
That’s a bit overzealous. SQLite is great, but it’s not a replacement for a hosted database. Not all data can live on the client. Use SQLite when you have the right use case for it (offline desktop app, etc.)
If you have a database server and an application server, then move your application onto the same machine as your database. That's basically the use case of SQLite server-side. You can easily fit 250 TB in a rack (or 64 TB on an EC2 instance). That's a lot for non-blob storage: about 750 KB per US citizen.
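The back-of-envelope arithmetic holds up (the population figure is an assumed ~335 million):

```python
# Rough check: rack-scale storage divided across the US population.
rack_bytes = 250e12          # 250 TB
us_population = 335e6        # assumed, roughly the 2023 estimate
per_person_kb = rack_bytes / us_population / 1e3
print(f"{per_person_kb:.0f} KB per person")  # ~746 KB
```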
Wouldn't a hosted database really only "make sense" if you don't want sharding? It's a lot easier to shard at the front end before it makes it to any business logic than to do that and also shard the database. It's when you don't want to deal with database sharding at all that you'd want all application instances to hit the same hosted database.
An interesting result of this is that SQLite is a really good database for implementing GraphQL.
The biggest problem with GraphQL is how easy it becomes to accidentally trigger N+1 queries. As this article explains, if you're using SQLite you don't need to worry about pages that accidentally run 100s of queries, provided those queries are fast - and it's not too hard to build a GraphQL API where all of the edge resolutions are indexed lookups with a limit to the number of rows they'll consider.
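A rough sketch of that shape (the schema, resolver name, and row cap are all made up for illustration; this isn't any particular GraphQL library): each edge resolution is one indexed lookup with a hard cap on rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id),
        title TEXT
    );
    CREATE INDEX posts_by_author ON posts(author_id);
    INSERT INTO authors VALUES (1, 'ada'), (2, 'grace');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'third');
""")

MAX_EDGE_ROWS = 100  # hard limit on each edge resolution

def resolve_posts(author_id):
    """One indexed lookup per author -- the 'n' in n+1, but each is cheap."""
    return con.execute(
        "SELECT title FROM posts WHERE author_id = ? LIMIT ?",
        (author_id, MAX_EDGE_ROWS),
    ).fetchall()

# Resolving all authors plus their posts issues 1 + n small queries.
result = {
    name: [title for (title,) in resolve_posts(author_id)]
    for author_id, name in con.execute("SELECT id, name FROM authors")
}
print(result)  # {'ada': ['first', 'second'], 'grace': ['third']}
```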
It's true that RPC latency doesn't exist, and that opens up other possibilities that are worth considering.
But if you do a ton of small queries instead of one big one, you could be depriving the database of the opportunity to choose an efficient execution plan.
For example, if you do a join by querying one table for a bunch of IDs and then looking up each key with an individual select, you're forcing the database into doing a nested loop join. Maybe a merge join or hash join would have been faster.
Or maybe not. Sometimes the way you write the queries corresponds to what the database would have done anyway. Just not necessarily, so it's something to keep in mind.
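Concretely, the two shapes look something like this (toy schema, made up for illustration). Both return the same rows, but only the second lets the planner choose the join strategy:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES (1, 'ada'), (2, 'grace');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
""")

# Shape 1: the application does the join -- a hand-written nested loop.
ids = [cid for (cid,) in con.execute("SELECT customer_id FROM orders")]
looped = [
    con.execute("SELECT name FROM customers WHERE id = ?", (cid,)).fetchone()[0]
    for cid in ids
]

# Shape 2: one query; the planner is free to pick nested-loop or another plan.
joined = [
    name
    for (name,) in con.execute(
        "SELECT c.name FROM orders o JOIN customers c ON c.id = o.customer_id"
    )
]

print(sorted(looped) == sorted(joined))  # True: same rows either way
```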
I swear "n+1" only popped up in the past 5 or so years with this meaning. I learned it over a decade ago as "1+n".
Also "n+1" looks more like a reference to concurrency/synchronization throughput - like, n concurrent threads or requests where the results are collected and used together. I was really confused the first few times I saw "n+1" because of that.
I have thought of that as 'n' being the manageable threshold and the (uncontrolled) '+1' being the overflow that creates the problems, typically in terms of additional layers or iterations. But I like your point, and perhaps 'n+1' and '1+n' mean different problem shapes.
I could see that making sense in some contexts, like the "straw that breaks the camel's back". However that's not what's usually being referred to in this database query problem.
Usually it's doing one query that returns n results, then doing one more query for each result. Therefore, you end up having done 1+n queries. If you'd used a join you could potentially have done only 1 query.
It's interesting to consider what this would look like if you put Postgres, rather than SQLite, in process. With PGlite we can actually look at it.
I'm hesitant to post a link to benchmarks after the last 24 hours on Twitter... but I have some basic micro-benchmarks comparing WASM SQLite to PGlite (WASM Postgres that runs in-process): https://pglite.dev/benchmarks#round-trip-time-benchmarks
It's very much in the same ballpark. Postgres is a heavier database and is understandably slower, but not by much. There is a lot of nuance to these benchmarks though, as the underlying VFSs are a little different from each other, and PGlite uses a WAL whereas the SQLite here is in its rollback journal mode (I believe this is why PGlite is faster for some inserts/updates).
But essentially I think Postgres, when in-process, would be able to perform similarly to SQLite with many small queries, embracing n+1. Having said that, I think the other comments here about query planning are important to consider: if you can minimise your queries, you minimise the scans of indexes and tables, which is surely better.
What this does show, particularly when you look at the "native" comparison at the end, is that removing the network (or at least a local socket) from the stack brings the two closer together.
The article doesn't explain why this is a bad idea for database servers hosted on a remote machine. The first reason is obvious... Network connections take memory and processing power on both client and server. Each additional query causes more resource usage. It is unnecessary overhead which is why things like multiple active result sets were created for SQL Server.
The network round trip time can also add up if you run into resource constraints doing this.
On a remote database, you also have to contend with multiple users and so complicated locking techniques can come into play depending upon the complexity of all database activity.
Many databases have options to return multiple result sets from one connection which helps control the overhead caused by this usage pattern.
EDIT: This also brings back horrible memories where developers would do this in a db client server architecture. Then they would often not close the DB connections when done. So you could have thousands of active connections basically doing nothing. Luckily, this problem was solved with better database connection handling.
These days there are other tricks you can use to turn several SQL queries into a single round-trip too, with things like JSON aggregates.
Here's an example PostgreSQL query that returns 10 rows from one table and 20 rows from another table in a single network round-trip, using JSON serialization to return the different shaped rows in one go: https://simonwillison.net/dashboard/union-json-demo/
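The same general trick works in SQLite too, assuming a build with the JSON functions compiled in (they're on by default in modern builds): two differently-shaped result sets come back as a single row of JSON. A sketch with a made-up schema:

```python
import sqlite3
import json

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO tags VALUES (1, 'sqlite'), (2, 'json');
    INSERT INTO users VALUES (1, 'a@example.com');
""")

# One statement, one "round trip": each branch aggregates a table into JSON.
(payload,) = con.execute("""
    SELECT json_object(
        'tags',  (SELECT json_group_array(json_object('id', id, 'name', name))
                  FROM tags),
        'users', (SELECT json_group_array(json_object('id', id, 'email', email))
                  FROM users)
    )
""").fetchone()

data = json.loads(payload)
print(data["tags"], data["users"])
```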
The problem is that it only works with Postgres, not MySQL or SQLite or pretty much anything else (at least not as conveniently), and the bigger problem is that the queries become more complex.
Yes, you get some performance improvements, but I think that comes at the price of security isolation. Think about it, your application probably requires tons of libraries and evolves really quickly compared to the database itself. Additionally, having Admin/root permissions on the server hosting the DB is generally a much bigger deal than granting such permissions on an application server that talks to that DB.
If none of this makes sense, don't worry, that just means you don't work in enterprise...
Use a single SqliteConnection instance for all access - most builds of the SQLite provider serialize threads internally. Opening multiple connections will incur unnecessary filesystem operations.
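The equivalent pattern with Python's stdlib driver looks something like this sketch: you opt out of the per-thread check and serialize access with your own lock, since whether the underlying library serializes internally depends on how it was built.

```python
import sqlite3
import threading

# One shared connection; an explicit lock serializes access rather than
# relying on the build-time threading mode of the SQLite library.
con = sqlite3.connect(":memory:", check_same_thread=False)
con.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)")
con.execute("INSERT INTO counters VALUES ('hits', 0)")
lock = threading.Lock()

def bump():
    for _ in range(1000):
        with lock:  # all threads share the one connection safely
            con.execute(
                "UPDATE counters SET value = value + 1 WHERE name = 'hits'"
            )

threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(con.execute("SELECT value FROM counters").fetchone())  # (4000,)
```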
> How do they manage a durable data store? Since SQLite doesn't have a replication mechanism, how do people handle losing transactions?
I've never run one of my SQLite-embedded services on a machine that wasn't virtualized and also known as "production". The simplest recovery strategy is to snapshot the VM. There are other libraries that augment SQLite and provide [a]sync replication/clustering/etc.
#1. They are pretty much my default settings now. But how do you serialize concurrent writes with multiple processes? Say you run your API service in a separate stack from your webapp service: you can't really do that then. I know some folks just make sure the transactions are so fast that collisions will be rare. But why take the chance?
#2. A snapshot is a backup strategy. AFAIK, from reading some Reddit comments by Ben (the author of Litestream), it's not a replication strategy.
I've used SQLite in ETL pipelines very successfully, as well as for read-only caching. I just can't figure out how people use it server-side without a whole bunch of hacks to deal with its limitations.
One thing I don’t see mentioned yet is that efficiency isn’t the only reason multiple queries can be problematic. There’s also consistency. This might not matter for many SQLite use cases, or in general for blocking single-reader calls. But if your database might be handling concurrent writes that could occur between any two reads, reducing the overhead to zero still might not be worth the tradeoff.
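When that consistency does matter, the fix is cheap: wrap the related reads in one read transaction, and in WAL mode they all see a single snapshot even while writes commit in between. A sketch (schema and values made up):

```python
import sqlite3
import tempfile
import os

db = os.path.join(tempfile.mkdtemp(), "snap.db")

writer = sqlite3.connect(db, isolation_level=None)  # autocommit mode
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE balance (acct TEXT PRIMARY KEY, amount INTEGER)")
writer.execute("INSERT INTO balance VALUES ('a', 100), ('b', 100)")

reader = sqlite3.connect(db, isolation_level=None)

# Start a read transaction: the first SELECT pins a snapshot of the database.
reader.execute("BEGIN")
first = reader.execute("SELECT amount FROM balance WHERE acct='a'").fetchone()

# A write commits between the reader's two queries...
writer.execute("UPDATE balance SET amount = 0 WHERE acct='a'")

# ...but inside the transaction the reader still sees the pinned snapshot.
second = reader.execute("SELECT amount FROM balance WHERE acct='a'").fetchone()
reader.execute("COMMIT")

print(first, second)  # (100,) (100,)
```

After the COMMIT, the reader's next query sees the updated value; only the reads grouped inside the transaction share a snapshot.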
[1] https://www.sqlite.org/whentouse.html (see Server Side Database)
When you say "client" are you referring to the end user's machine, or the server hosting the application they are talking to?
I personally haven't used it.
What I am curious about is if "Many Small Queries Are Efficient in SQLite" when using the various hosted flavors of SQLite.
Knowing this, I wonder if there will be another wave of object databases at some point.
It still depends a lot. If your system is high-load enough, you may want sharding, for example.
I had a lot of fun building a Datasette GraphQL plugin a while ago: https://datasette.io/plugins/datasette-graphql
As a pedant, I've been referring to it as a "1+n" problem, but haven't managed to make it catch on yet!
It looks like this page in the SQLite docs has had the "n+1" terminology for as long as it's been on the internet archive (2016): https://web.archive.org/web/20161112021608/https://www.sqlit...
Related trick: https://til.simonwillison.net/sqlite/related-rows-single-que...
1. How are they managing concurrent writes? I know about locks or queues, but if my stack needs multiple servers, then what?
2. How do they manage a durable data store? Since SQLite doesn't have a replication mechanism, how do people handle losing transactions?
See:
https://www.sqlite.org/pragma.html#pragma_synchronous
https://www.sqlite.org/pragma.html#pragma_journal_mode
https://www.sqlite.org/threadsafe.html
https://rqlite.io/docs/faq/
https://litestream.io
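For reference, the pragmas linked above usually combine into a few lines of setup (the values here are common choices, not universal advice):

```python
import sqlite3
import tempfile
import os

db = os.path.join(tempfile.mkdtemp(), "svc.db")
con = sqlite3.connect(db)

# WAL lets readers run concurrently with a single writer; NORMAL syncs less
# often than FULL but still survives application crashes; the busy timeout
# makes writers wait rather than error when the write lock is held.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
con.execute("PRAGMA synchronous=NORMAL")
con.execute("PRAGMA busy_timeout=5000")

print(mode)  # wal
```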