Lovely read. Condensing some: there are three node types in the system: writers, compactors, and readers.
> Writers read from Kafka, (briefly) buffer events in memory, upload events to blob storage in our custom file format, and then commit the presence of these new files to our metadata store.... Compactors scan the metadata store for small files generated by the Writers and previous compactions, and compact them into larger files.... The Reader (leaf) nodes run queries over individual files in blob storage and return partial aggregates, which are re-aggregated by the distributed query engine.
And then the metadata supporting the system:
> Husky's metadata store has multiple responsibilities, but its most important one is to serve as the strongly consistent source of truth for the set of files currently visible to each customer. We’ll delve into the details of our metadata store more in future blog posts, but it is a thin abstraction around FoundationDB, which we selected because it was one of the few open source OLTP database systems that met our requirements
There's some nice scalability/isolation benefits in this all. Having reader nodes reading from network storage has created a lot of flexibility & ability to shift work around on demand.
Keeping all the metadata in FoundationDB is exciting, and it sounds like a great use case for its safe transactional updates!
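To make the transactional angle concrete, here is a toy sketch of a writer committing the presence of newly uploaded files to a metadata store. It uses sqlite3 purely as a stand-in for FoundationDB's transactional guarantees; the table, paths, and function names are all illustrative, not Datadog's actual schema or API.

```python
import sqlite3

# In-memory stand-in for the metadata store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (customer TEXT, path TEXT, size INTEGER)")

def commit_files(conn, customer, uploaded):
    # One transaction: either every new file becomes visible, or none do.
    with conn:
        conn.executemany(
            "INSERT INTO files (customer, path, size) VALUES (?, ?, ?)",
            [(customer, p, s) for p, s in uploaded],
        )

def visible_files(conn, customer):
    # Readers consult the metadata store for the currently visible set.
    rows = conn.execute(
        "SELECT path FROM files WHERE customer = ?", (customer,)
    ).fetchall()
    return [r[0] for r in rows]

commit_files(db, "acme", [("blob/a1.husky", 1024), ("blob/a2.husky", 2048)])
print(visible_files(db, "acme"))  # ['blob/a1.husky', 'blob/a2.husky']
```

The point is that the blob upload and the metadata commit are separate steps, and only the atomic metadata commit makes files queryable, which is exactly where strong consistency matters.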
It's remarkable how the data pipeline in almost all companies converge to the same architecture:
* You have services emit data into streams.
* You dump the streams into your storage at high frequency so you get near real-time results; this process creates many small files.
* Because small files are inefficient, you have compactors that run over the small files, merge them into bigger files, and/or delete obsolete records.
* You run a query engine that reads over both the small files and the large files to get the final result.
* To speed up steps 2, 3, and 4, you store the metadata of the files in memory or in a database.
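The steps above can be sketched as a toy pipeline; all names, file formats, and thresholds here are illustrative:

```python
metadata = []  # step 5: file metadata kept in memory / in a database

def flush(buffer, file_id):
    # step 2: dump a buffer of streamed events as one small file
    path = f"store/small-{file_id}.json"
    metadata.append({"path": path, "rows": len(buffer), "small": True})
    return path

def compact():
    # step 3: merge small files into one larger file
    small = [f for f in metadata if f["small"]]
    if len(small) < 2:
        return
    merged = {"path": "store/big-0.json",
              "rows": sum(f["rows"] for f in small), "small": False}
    for f in small:
        metadata.remove(f)
    metadata.append(merged)

def query():
    # step 4: the query engine scans small and large files via the metadata
    return sum(f["rows"] for f in metadata)

flush([{"e": 1}, {"e": 2}], 0)
flush([{"e": 3}], 1)
compact()
print(query())  # 3
```

Real systems track far more per file (time ranges, schemas, tombstones), but the metadata-driven flush/compact/query loop is the common shape.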
This is a great read, thanks for sharing the architecture. I am glad to see the increasing adoption of FoundationDB. It is a great piece of technology, which is also why we are using it as a core component for Tigris: https://docs.tigrisdata.com/overview/key-concepts
Has Datadog come up with a new generation of sales approaches? I (and many others, according to the discussion when the topic comes up) have had bad experiences.
Had bad experiences as well.
- Pushy
- Trying to sell you stuff even if you explicitly mention you're only interested in one specific service multiple times
- Don't tailor the sales process at all to your needs
It's a shame, because the product is quite nice. But this is 100% off-putting.
One mistake in my logs, and my account was due more than $10k USD. Nothing happened until a manager contacted me after a month. It appears to be a method to force a "sales" call.
A simple daily indicator of how much you owe would solve this kind of problem. (Google/Reddit show that this kind of problem has come up regularly over the last two years.)
We have instances that spin up and down quickly. AWS bills by the second; Datadog billed at that time (unsure if it's changed) by the minute. This mismatch led to huge bills, such that monitoring was more expensive than the resource being monitored. It's probably fair to respond to that with RTFM. However, par for the course in the industry seems to be to adjust the bill when the customer's mistake was made in good faith. Their response was to give us a small adjustment in exchange for signing up for additional services. More than what happened, it was how it felt. It felt sleazy, and didn't jibe with the way the company was presented in the community.
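The billing mismatch is easy to quantify. Here is rough arithmetic for per-second vs. per-minute billing of a short-lived instance; the rate is made up, since only the ratio matters:

```python
import math

def cost(runtime_s, rate_per_hour, granularity_s):
    # Round the runtime up to the billing granularity, then price it.
    billed_s = math.ceil(runtime_s / granularity_s) * granularity_s
    return billed_s / 3600 * rate_per_hour

runtime = 5                      # instance lives for 5 seconds
aws = cost(runtime, 0.10, 1)     # billed by the second
dd = cost(runtime, 0.10, 60)     # billed by the minute
print(f"{dd / aws:.0f}x")        # 12x
```

For a fleet of instances that each live a few seconds, minute-granularity monitoring charges can easily exceed the cost of the compute being monitored, which is the situation described above.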
As for the tech, it seemed like a quality product.
Nice read, but I was hoping they’d say that it led to a big improvement in their log searching syntax/ui. It seems impossible to just full text search for a string and find log lines that have a value containing that text. Drilling down through the “details” pane and clicking filter/match/exclude works well, but general searching is too confusing for me to figure out, if it even works at all.
https://twitter.com/fulmicoton/status/1526776987553263616
https://github.com/quickwit-oss/quickwit