atombender (u/atombender)

atombender commented on The Architecture of Open Source Applications (Volume 1) Berkeley DB aosabook.org/en/v1/bdb.ht... · Posted by u/grep_it

bborud · 15 hours ago

Berkeley DB is one of those things everyone respected, for some reason, but that didn't actually work if you threw a bit of data at it. And not just for us. I remember talking to companies that paid them lots of money to work on reliability, and it never got better.

But I do remember reading much of the source (trying to figure out why it didn't work) and thinking "this is pretty nice code".

atombender · 14 hours ago

Well, it worked for Amazon. Berkeley DB was used extensively there as the main database, right from the beginning. I remember talking to an ex-Amazon engineer in 2006 who said BDB was still the main database used for inventory, and who complained that everything was a mess, with different teams using different tech for everything. Around that time Amazon built Dynamo to solve some of that mess, and it sat on top of BDB.

An old thread about this: https://news.ycombinator.com/item?id=29290095.

atombender commented on TIL: Apple Broke Time Machine Again on Tahoe taoofmac.com/space/til/20... · Posted by u/rcarmo

ezfe · 7 days ago

When backing up to a local system it is extremely useable and reliable. It creates separate snapshot volumes for each backup and can be navigated in the Finder interface or using the fancy space interface.

Also, backups over the network are possible and have worked well for me for a few years.

atombender · 7 days ago

It's reliable except when it's not. I'm using Mojave, and currently fighting a bug where a local snapshot gets stuck. When I list the local snapshots, I see the old one, then a gap of several days, and then additional snapshots.
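
A minimal sketch of how one might spot that kind of gap programmatically. It assumes snapshot names look like `com.apple.TimeMachine.YYYY-MM-DD-HHMMSS` (optionally followed by `.local`), which is how `tmutil listlocalsnapshots /` has printed them on the macOS versions I've seen; the helper names `parse_snapshot_times` and `find_gaps` and the one-day threshold are made up for illustration.

```python
import re
import subprocess
from datetime import datetime, timedelta

# Assumption: names embed a timestamp as YYYY-MM-DD-HHMMSS.
SNAPSHOT_RE = re.compile(r"com\.apple\.TimeMachine\.(\d{4}-\d{2}-\d{2}-\d{6})")


def list_local_snapshots(volume="/"):
    """Return the raw text printed by `tmutil listlocalsnapshots` (macOS only)."""
    result = subprocess.run(
        ["tmutil", "listlocalsnapshots", volume],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_snapshot_times(listing):
    """Extract snapshot timestamps from the listing, oldest first."""
    times = [
        datetime.strptime(m.group(1), "%Y-%m-%d-%H%M%S")
        for m in SNAPSHOT_RE.finditer(listing)
    ]
    return sorted(times)


def find_gaps(times, threshold=timedelta(days=1)):
    """Return (before, after) pairs where consecutive snapshots are far apart."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > threshold]
```

On a Mac, `find_gaps(parse_snapshot_times(list_local_snapshots()))` would return the snapshots on either side of each gap; the older one in the first pair is the likely stuck snapshot.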

From what I can tell, this snapshot is preventing space reclamation. For the last month or so, I've constantly run out of disk space even when not doing anything special. As in actually run out of disk space: apps start to become unresponsive or crash, and I get warning boxes about low disk space. When you run low, the OS is supposed to reclaim the space used by snapshots, but I guess that doesn't happen.

The stuck snapshot can't be deleted with tmutil. I get a generic "failed to delete" error. The snapshot is actually mounted by the backup daemon, but unmount also fails. The only solution I've found is to reboot. Then I get 200-300GB back and the cycle starts again, with snapshots getting stuck again.

I'm considering updating to Tahoe just because there's a chance they fixed it in that release.

atombender commented on Efficient String Compression for Modern Database Systems cedardb.com/blog/string_c... · Posted by u/jandrewrogers

cmrdporcupine · 7 days ago

It's a startup founded by -- and built with tech coming out of research by -- some well-known people in the DB research community.

Successor to Umbra, I believe.

I know somebody (quite talented) working there. It's likely to kick ass in terms of performance.

But it's hard to get people to pay for a DB these days.

atombender · 7 days ago

It's probably going to be acquired. The last effort to commercialize the TUM (Technical University of Munich) database group's work was acquired by Tableau and disappeared into that stack.

CedarDB is the commercialization of Umbra, the TUM group's in-memory database led by professor Thomas Neumann. Umbra is a successor to HyPer, so this is the third generation of the system Neumann came up with.

Umbra/CedarDB isn't a completely new way of doing database stuff, but basically a combination of several things that rearchitect the query engine from the ground up for modern systems: a query compiler that generates native code, a buffer pool manager optimized for multi-core, push-based DAG execution that divides work into batches ("morsels"), and in-memory Adaptive Radix Tries (never used in a database before, I think).
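
To make the "morsels" idea concrete, here is a toy sketch, not Umbra's or CedarDB's actual code: the input is cut into small fixed-size batches, and worker threads keep pulling the next batch from a shared queue until none are left, each pushing its batch through a filter-then-sum pipeline. The names `split_into_morsels` and `filtered_sum` and the tiny morsel size are invented for illustration, and Python threads won't give real parallel speedup here; this only shows the scheduling pattern.

```python
import queue
import threading

MORSEL_SIZE = 4  # tiny for illustration; real engines use far larger batches


def split_into_morsels(rows, size=MORSEL_SIZE):
    """Cut the input into fixed-size batches (the last one may be shorter)."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def filtered_sum(rows, predicate, workers=3):
    """Sum the rows matching `predicate`, scheduling work one morsel at a time."""
    work = queue.Queue()
    for morsel in split_into_morsels(rows):
        work.put(morsel)

    # One partial result per worker, so no locking is needed on the hot path.
    partials = [0] * workers

    def run(worker_id):
        while True:
            try:
                morsel = work.get_nowait()
            except queue.Empty:
                return  # no morsels left, this worker is done
            # The "pipeline": filter, then aggregate, within a single morsel.
            partials[worker_id] += sum(x for x in morsel if predicate(x))

    threads = [threading.Thread(target=run, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the per-worker partial aggregates into the final answer.
    return sum(partials)
```

Because workers grab morsels dynamically instead of being handed a fixed slice up front, a slow worker simply processes fewer morsels, which is the load-balancing property the batch-at-a-time design is after.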

It also has an advanced query planner that embraces the latest theoretical advances in query optimization, in particular techniques for unnesting complex query plans, which matter most for queries that have a ton of joins. The TUM group has published some great papers on this.
