Not much I agree with in this article. It seems to be based on little operational experience with the product, particularly indicated by a couple of major mistakes and assumptions (compacting does happen; the manual on deployment configurations apparently wasn't read closely).
Loki has its idiosyncrasies, but they are there for a good reason. Anyone who has sat there waiting hours for a Kibana or Splunk query to run to get some information out will know what I'm referring to. You don't dragnet your entire log stream unless your logs are terrible, which needs to be fixed, or you don't know when something happened, which also needs fixing. I regularly watch people run queries that scan terabytes of data with gay abandon on older platforms and still never get what they need out.
The structured metadata distinction is important because when you query against it you are not using an index, just parsed-out data. That means you're explicitly not filtering, you're scanning, and that is expensive.
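To make that concrete, here is a hedged LogQL sketch (all label and field names are made up):

```
# Indexed: the stream selector narrows the search via the label index
# before any chunk is opened.
{app="api", env="prod"} |= "timeout"

# Not indexed: trace_id lives in structured metadata, so Loki still has
# to open and scan every chunk in the selected streams to evaluate it.
{app="api", env="prod"} | trace_id="0af7651916cd43dd8448eb211c80319c"
```

The second query is still cheaper than a full parse, but it is a scan, not an index lookup.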
If you have a problem with finding things, then it's not the logging engine, it's the logs!
Has anyone used both Grafana Loki and Kibana? Does Loki have any advantages over Kibana? I am mostly interested in resource usage and the versatility of filtering.
In Kibana, if something is there I will find it with ease, and it doesn't take a lot of time to investigate issues in a microservice-based application. It is also quite fast.
Compared to Kibana, we experience:
- 3x reduced costs
- no more index corruption because a key changed type
- slower performance for queries over 1 day, especially unoptimized queries without any filtering
- non-intuitive UI/UX
So good, but not perfect! When we have the time we'll look for alternatives.
> 2. It has a convenient and simple query language
IMHO, Loki's query language is the most inconvenient language for logs I've seen (see the sketch after this list):
- It doesn't support calculating multiple stats in a single query. For example, it cannot calculate the number of logs and the average request duration in one query.
- Its syntax for aggregate functions is unintuitive and hard to use, especially if you aren't familiar with PromQL.
- It requires putting an annoying "|=" separator between words and phrases you are searching for in logs.
- You need a JSON-parsing hack whenever you want to filter on log fields or calculate stats over them.
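A hedged sketch of what those points look like in practice (all label and field names invented):

```
# Matching three words means chaining three separate |= filters.
{app="api"} |= "user" |= "login" |= "failed"

# Filtering on a log field first requires the | json "hack", and stats
# are expressed as PromQL-style wrappers over a range vector.
sum by (path) (
  count_over_time({app="api"} | json | status_code >= 500 [5m])
)
```

And getting the average request duration on top of that count means issuing a second query, e.g. with `avg_over_time` and `unwrap`.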
Kibana + ElasticSearch was a mess for us. I was glad to get rid of it. It cost a fortune to run and was time-consuming. Loki, conversely, doesn't even show up on our costs report (other than the S3 bucket) and requires very little, if any, maintenance!
Also, the out-of-the-box configuration sinks 1TB/hr quite happily in microservices mode.
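For the unfamiliar, a rough sketch of what microservices mode means (the `-target` flag is real Loki CLI; exactly which components you split out is up to you):

```
# Same binary and config file; each component runs as its own deployment.
loki -config.file=loki.yaml -target=distributor
loki -config.file=loki.yaml -target=ingester
loki -config.file=loki.yaml -target=querier
loki -config.file=loki.yaml -target=compactor
```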
ELK could never deal with my logs which are sometimes-JSON. Loki can ingest and query it just fine. Also the query/extraction language makes a lot more sense to me.
Elasticsearch can store arbitrary text in log fields, including JSON-encoded strings. It can also tokenize a JSON-encoded string and provide fast full-text search over it, in the same way as it does for regular plaintext.
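For instance, a minimal Kibana Dev Tools sketch (index and field names assumed); a `match` query against a text field runs the same analyzed full-text search whether the content is plain text or a JSON-encoded string:

```
GET logs-2025.01/_search
{
  "query": {
    "match": { "message": "connection refused" }
  }
}
```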
Why do you need to store a JSON-encoded string inside a log field? It is much better to parse the JSON into separate fields at the log shipper and store the parsed fields in Elasticsearch. This gives better query performance and may also reduce disk space usage, since the values for every parsed field are stored separately (this usually improves the compression ratio and reduces disk read IO during queries if column-oriented storage is used for per-field data).
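For example, with Logstash as the shipper, a single `json` filter does the parsing (the source field name is assumed):

```
filter {
  # Parse the JSON payload stored in the "message" field into
  # top-level fields before the event is sent to Elasticsearch.
  json {
    source => "message"
  }
}
```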
If your source emits logs in OpenTelemetry format, you could use an OTel Collector in between to do the sometimes-JSON parsing of log content before it reaches the backend.
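A hedged sketch using the Collector's transform processor (OTTL function names are taken from upstream docs; verify against your Collector version):

```
processors:
  transform/sometimes_json:
    error_mode: ignore  # plain-text bodies simply fail the parse and pass through
    log_statements:
      - context: log
        statements:
          # Only attempt JSON parsing when the body looks like a JSON object.
          - merge_maps(attributes, ParseJSON(body), "upsert") where IsMatch(body, "^\\s*\\{")
```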
Yes, we switched metrics and logs from an Elastic stack to Prometheus/Thanos/Loki/Grafana about two years ago. On the logs side specifically, resource usage is WAY lower (300eps is like 1.5 cores and 4gb of memory), not to mention going from persistent volumes (disks) to blob storage / S3 is far cheaper and doesn't require any maintenance. Queries are slower, however, because Elastic pre-indexes while Loki searches on-demand, so it really comes down to query volume and your need for query performance (does it matter if your search takes 300ms vs 3s?). I've also found running Elastic yourself requires constant maintenance, while Loki has been very hands-off. Strongly recommend.
From the enterprise perspective, at least for my use cases (fine-grained permissions using an extra id), Elasticsearch with Kibana always had a solution available.
For Grafana Cloud and Loki you can get close to good usability with LBAC (label-based access control), but you still need to have many data sources to map onto each "team view" to make it user-friendly.
What is missing for me is, like in Elastic, a single data source for all logs that every team member across all teams can see, with the visibility level scoped via LBAC.
@valyala, as others have noted, you are the CEO of VictoriaMetrics and have written (most of?) VictoriaLogs. How is VictoriaLogs coming along? This is an older blog post.
I switched our team over to VictoriaLogs from ELK when VL 1.0 was released a few months back, and we've been very happy with it. Nowhere near as much finicky performance tuning, no more logs failing to ingest because a string looked a bit too numeric, and the query language has fewer weird gotchas.
At the end of the day ELK was throwing us a bunch of roadblocks in order to solve problems we didn't need solved. Maybe if we were trying to build some big analysis layer on top of our logs that would've been nice. VL has worked great for our use case of needing to centralize and view logs.
VictoriaLogs is free from the issues mentioned in the referenced article. It supports log fields with a big number of unique values (such as user_id, trace_id, ip, etc.) out of the box, and it doesn't need any configuration to work with such fields. It automatically indexes all the log fields and provides fast full-text search over all of them.
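For example, a LogsQL sketch (field names assumed) combining a time filter, a word filter, and a field filter, all served by the automatic per-field indexes without any schema setup:

```
_time:1h error user_id:12345
```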
It's also not ideal to have a different query language for each Grafana datastore (LogQL, PromQL, TraceQL). Are there any plans for a unified Grafana query language?
There is an effort in OpenTelemetry to create a standard query language for observability. There have been a lot of discussions with a lot of opinions; there were even several talks about it during KubeCon EU:
https://sched.co/1tcyx
https://sched.co/1txI1
Why not just use SQL? With LLMs evolving to do sophisticated text-to-SQL, the case for a custom language for the sake of simplicity is diminishing.
I think that expressiveness, performance, and the level of fluency of base language models (i.e., the number of examples in the training set) are the key differentiators for query languages going forward. SQL ticks all those boxes.
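It also sidesteps the single-stat limitation mentioned upthread; a sketch against a hypothetical logs table:

```
-- One query returns both stats that LogQL would need two queries for.
SELECT
  count(*)         AS requests,
  avg(duration_ms) AS avg_duration_ms
FROM logs
WHERE timestamp >= now() - INTERVAL 1 HOUR
  AND message LIKE '%checkout%';
```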
I think I’m probably not interested in this. PromQL is already relatively dense to learn, but it is reasonably well fit to the domain model and internally consistent, unlike most other metric-querying tools I’ve tried over the years.
Maybe that would work as well with traces and logs, but IMO the problem spaces are quite different, and I'm not sure how much value we'd get from a unified language where some subsets only apply to parts (i.e. traces vs. logs vs. metrics), as opposed to spiritually similar but distinct languages.
No, the OP of the HN thread is from VictoriaMetrics (open source), he’s not Chris Siebenmann, unix systems administrator at the University of Toronto’s Computer Science Labs.
danluu.com mentioned this approach (or just 'big data' systems) for traces and metrics, IIRC. Not sure if for logs too.
Aren’t all of those relatively tabular? What would you be looking for from those tools to help with logs?
2. It has a convenient and simple query language.
3. It works very well with traces and metrics.
The pain part:
1. It struggles to query logs over a wide time range.
2. Its indexing (or labeling) capabilities are very limited, similar to Prometheus.
3. Due to 1 and 2, it is difficult to configure and use correctly without hitting errors related to usage limits (e.g., maximum series limits); see the sketch below.
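The sketch mentioned in 3 (invented labels):

```
# Fine: a handful of low-cardinality labels per stream.
{app="checkout", env="prod"} |= "timeout"

# Risky: a per-user label mints a new stream for every user and quickly
# trips max-streams / max-series limits, much like cardinality in Prometheus.
{app="checkout", user_id="12345"}
```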
Modern columnar SQL engines such as ClickHouse are 10+ times more efficient in real-world use cases.
I'm the CEO and founder of Quesma, which lets you use Kibana with ClickHouse: https://quesma.com/
Forever free, source-available license.
> Also, the out-of-the-box configuration sinks 1TB/hr quite happily in microservices mode.
Could you share a Loki config that can deal with a 1TB/hr volume of logs?
> Why do you need to store a JSON-encoded string inside a log field?
I tried explaining this at https://itnext.io/why-victorialogs-is-a-better-alternative-t...
Elastic was kind of a resource hog and much more expensive for the same amount of data.
That might be dependent on your use case though.
> At the end of the day ELK was throwing us a bunch of roadblocks in order to solve problems we didn't need solved.
This is explained in more detail at https://itnext.io/why-victorialogs-is-a-better-alternative-t...
important because the title includes _new_
We are still waiting for a compelling implementation that will show the way.