I love influx but damn do they like moving (too?) fast and quickly changing stuff. In a way, it's pretty cool since it means that they don't get stuck with bad decisions for backwards compatibility reasons, but it's a bit of a roller coaster for users.
Not sure what's the best solution though. Having a "stable" but fundamentally limited product (I guess influxdb v1) or breaking stuff in hopes of ending up with a way better technical foundation.
We're migrating off of InfluxDB due to that rollercoaster, honestly. It's hard enough to find time to maintain the monitoring stack at work. Casually dropping "Oh, and now you get to rebuild the entire grafana to change the query language" on that doesn't help. And apparently, version 3 does the same thing, except backwards.
Sorry, but at that point, we've decided to rebuild the entire metric visualization once on TimescaleDB, since we're running postgres a lot anyhow.
Fair warning, I had serious scaling issues with Timescale.
Solutions like Grafana Mimir, VictoriaMetrics, ClickHouse, or yes, the new Influx implementation, are much more scalable and will give you far fewer headaches.
ClickHouse is really brilliant, btw, it's a powerhouse.
Especially with the fairly recent additions that enable hybrid local + S3 option, pushing older metrics to S3 for cheap long-term storage.
Same here. I joined my current company 3 years ago, when Influx v2 was coming out, and I was supposed to build some analytics on top of it. It was very painful. The Flux compiler often gave internal errors, the docs were unclear, and it was hard to write anything even slightly complicated. The built-in dashboards are subpar compared to Grafana, but Grafana had only raw query support. There was no query builder for Flux, so I tried building dashboards in Influx v2, but the whole experience was excruciating. I still have an issue open where an internal function is incorrectly written in their own Flux code; I explained the problem and provided the fix, but it was never addressed. Oftentimes I had the feeling that I was finding bugs in situations so basic that it felt like I was the only person on the planet writing Flux code.
We are influxdb enterprise customers and looking to do the same thing. They've kept their enterprise offering on 1.x, which has kept us mostly happy, but seeing what's going on in their OSS stuff is horrifying and we're looking to avoid the crash and burn at the end of the tunnel.
Are you running the "OLAP" TimescaleDB on the same instance as your regular OLTP Postgres? This is the only reason I would entertain TimescaleDB, if I had a strict "1 server" requirement. I briefly deployed and looked into it and there were a lot of footguns like with compression.
If not, I would suggest looking at a proper OLAP DB. VictoriaMetrics has been great and was easy to set up.
It’s funny how for the longest time, I was upset with how slowly the web moved. At times I wished they wouldn’t care as much about backwards compatibility.
But now with these VC-funded tech products that have spawned over the last 5-7 years, who have a move-fast-and-break-things attitude, I’m seeing the benefits of the old approach.
I suppose it’s all a matter of trade offs, as with all things, and there’s no silver bullet.
We just left it. Too many changes, new query language is incomprehensible to drive-by-graphing, and rest of the industry seems to be building around PromQL/Prometheus.
Serious question on behalf of the uninformed: why? I feel that society encourages people to double down and be consistent even if they are armed with better information. We could be better if we didn't have to stick to the one true path, no?
Been using both 1.x and 2.x for telemetry (both OSS and paid). I am pretty excited about 3.x's interoperability. Archiving to standard data formats makes the data science team's job easier, and with a more standard ANSI SQL query engine with JDBC support and high-cardinality tags, it will greatly speed up front-end development and analysis use cases.
As well, I am one of those folks that happens to find the Flux query language powerful, but it's not easy enough for folks to just make that jump from SQL. Flux is much closer to Splunk's search language. It is good at what it does. FluxQL doesn't even have date parsing (which is really odd for a time series query language), but FlightSQL in 3.x seems to be more complete.
Yes I think v3 is pretty solid, and it's nice that they are still supporting v1 and v2. But I think the "migration" from v1 to v2 was the "painful" part. Not because it was too hard to migrate (I guess you don't even have to, since it's still supported), but because it introduced a very different approach, that was supposed to be the future of influx, that was just basically dropped in the next release. I think some commitment towards v3 might help in that regard. As you said, flux is powerful and took some time to get used to but it's now basically useless if you took the time to get into it.
I like that they are converging towards SQL, but at the same time it's a bit like going back to square one.
They seem more convinced about going full SQL this time though, but yeah
Just searching for this, I stumbled on this documentation page that illustrates the point very well:
On the same page (about the original InfluxQL in v1), there is a deprecation notice for v1 stating that v2 is the stable version, implying that InfluxQL is not recommended. There is also a pop-up notice stating that v2 (Flux) is basically deprecated and just in maintenance mode, and that you should use InfluxQL. But as I said in my earlier comment, I guess in some ways that's better than being too rigid and sticking with bad or less-than-ideal technical decisions.
I've always had a soft spot for influxdb after using it for a self hosted datadog/newrelic etc solution many (6+) years ago with great success. Still use it in conjunction with telegraf and grafana for personal project monitoring, but I've not brought myself to upgrade from the 1.x series.
Hopefully it's improved, but last time I tried upgrading I found the UX in Grafana to be subpar on the newer versions; as I recall, you lost the autocomplete/UI to build your queries. Obviously Grafana is its own project, but it feels like they (Influx) should invest more resources in areas like this to encourage people to upgrade. If you're going to do major upgrades, make sure they have feature parity.
Like you, I've stuck with Influx v1, Telegraf and Grafana. My policy is to upgrade only when there are significant reasons to. When I evaluated InfluxDB 2, there were no major reasons for me to switch. Of course, the data ingested in my case is relatively small. YMMV.
I looked at TimescaleDB but at the time there was no easy way to get data from Telegraf to TimescaleDB. Telegraf finally merged code that allows writes to Postgres databases, but it took like 3 years to do that.
Ultimately, I still stuck with InfluxDB v1 because sending data to it via the InfluxDB line protocol is so simple. I have a couple of bash scripts that use awk to transform command output to Influx line protocol and send it to InfluxDB. It's just so simple. I love it.
I love learning about new things, but the InfluxDB v1 keeps working fine so I may not switch from it until something forces me to do it.
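For anyone wondering why line protocol makes this so easy, here is a rough Python sketch of the kind of transformation those awk scripts perform. It is only an illustration, not the commenter's actual scripts: the function name `to_line_protocol`, the helper names, and the sample measurement, tags, and fields are all made up, and the escaping covers only the common cases (commas, spaces, and equals signs in keys and tags; quotes and backslashes in string fields).

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape every character from `chars` found in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value) -> str:
    """Render a field value using line protocol type conventions."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; backslashes are escaped first, then quotes.
    return '"' + _escape(str(value), '\\"') + '"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement[,tag=value...] field=value[,...] [timestamp]."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys keep the output deterministic
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(val)
        for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"  # nanoseconds since the Unix epoch
    return line


if __name__ == "__main__":
    # Hypothetical sample point; the names are illustrative only.
    print(to_line_protocol(
        "disk",
        {"host": "web 1", "path": "/"},
        {"used_percent": 42.5, "inodes": 1000},
        1700000000000000000,
    ))
    # prints: disk,host=web\ 1,path=/ used_percent=42.5,inodes=1000i 1700000000000000000
```

Each point is a single line of text, which is why a shell pipeline can produce it just as easily; sending it on is then a matter of posting those lines to the database's write endpoint.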
We moved from InfluxDB to Prometheus for this reason. InfluxDB is far more powerful, but ain't nobody got time to fix all the graphs in Grafana or learn the very math-like QL.
If we had dedicated personnel to manage our monitoring we might have stuck with it.
We're trying to make the transition from v1 to v3 easier by bringing the write and query APIs from that version forward. We wanted to do the same for Flux, but found it was too difficult in the near term. We might be able to do something in the future, but for now we're focused on making core improvements to the v3 engine.
We'll have data migration tools for v1 and v2 into v3 later this year/early next.
Thanks for your reply! Dumb question (I couldn't find a definitive answer) but will v3 InfluxQL be compatible with v1? Is there an article about the changes between v1 and v3?
I guess I'd rather have that than ossifying on a completely flawed architecture. Apparently Flux was kind of a dead end, and while it's super risky and illustrates issues in decision making, it's still better than just doubling down on something that their own team considers to be futureless or too flawed.
The backstory here is they were doing a rewrite anyways, for reasons that had not much to do with languages; they expected to write some C++ for the new version. Rust was the right call for them.
Rust is one of the few languages that have a chance to climb out of the hobbyist/academic/ultra-niche range, so it's interesting for me to hear about developments towards the direction of reaching mainstream status. I'd say the same thing about Zig but with less strength.
I started using HN because I'm a rust fanboy and there was a lot of rust content (same with lobsters). I'm glad to say there is a lot of other HN content that interests me, I might never have known.
Funny enough, in contrast to when I joined, the pendulum seems to have swung, and comments disparaging rust seem to be en vogue.
The issue is that InfluxDB is an infrastructure product. Changing the core impacts the way users interact with the product. If Figma decided to change their backend, it could be transparent to users.
Opinions could be different if first they implemented a complete compatibility layer, Flux included, prior to making the migration.
Would be great to see an in-depth blog post by Andrew and team about Rust, the bad and the good. They didn't just build a system but one that was optimized for performance. What were the major challenges during the rewrite? Have you optimized CI build times?
This is intriguing. Interesting, how does this new Influx engine compete in terms of performance with VictoriaMetrics (which is written in Go and really fast)?
They moved their entire stack from Go to Rust, rewrote the system from the ground up, and spent a lot of time on it. I guess this is a big cost.
If I'm reading this [0] right, there will be no standalone OS InfluxDB 3.0 version, so there's no point in comparing. I also wonder if third parties would be allowed to publish benchmarks of the ENT version.
VictoriaMetrics so far works very well.
Just searching for this, I stumbled on this documentation page that illustrates the point very well:
https://docs.influxdata.com/influxdb/v1/query_language/
Next time around I'm going to give TimescaleDB a look.
They are always… in flux *sunglasses on*
https://news.ycombinator.com/item?id=25049253
At some point HN is going to have to decide if it's the Rust subreddit or a news site.
After that nobody wants to hear about Java.
I ask because ClickHouse is quite hot at the moment from my experience in consulting and that seems to be reflected in Google Trends [1].
And there are some startups relying on ClickHouse for their log/monitoring products like https://signoz.io and https://hyperdx.io.
[1] https://trends.google.com/trends/explore?date=all&q=ClickHou...
The non-reddit link target
They moved their entire stack from Go to Rust, rewrote the system from the ground up, and spent a lot of time on it. I guess this is a big cost.
Is it worth it?
[0] https://www.influxdata.com/blog/the-plan-for-influxdb-3-0-op...