Readit News logoReadit News
angry__panda commented on When should you use the IN instead of the OR operator in Postgres?   ottertune.com/blog/query-... · Posted by u/angry__panda
consoomer · 2 years ago
Seems like an obvious optimization to use a hash instead of scanning the array with comparisons... but I bet it's hard because on small arrays sequentially scanning is faster than building a hash first. Does anyone know at what size the optimizer switches from scanning the list to a hash?
angry__panda · 2 years ago
Based on my experiments, Postgres switches from scanning the list to a hash when the array size > 8.
angry__panda commented on When should you use the IN instead of the OR operator in Postgres?   ottertune.com/blog/query-... · Posted by u/angry__panda
aforty · 2 years ago
When should you use IN instead of the OR operator?

Tl;dr;

Always.

angry__panda · 2 years ago
This is true for Postgres. But might not be true for other databases. For example, for MySQL, in some cases OR is faster than IN operator based on our experiments.
angry__panda commented on When should you use the IN instead of the OR operator in Postgres?   ottertune.com/blog/query-... · Posted by u/angry__panda
pdw · 2 years ago
It's frustrating when articles like this don't mention the version of the software tested. Because that does matter.

From the PostgreSQL 14 release notes: "Allow hash lookup for IN clauses with many constants (James Coleman, David Rowley). Previously the code always sequentially scanned the list of values."

And in PostgreSQL 15: "Allow hash lookup for NOT IN clauses with many constants (David Rowley, James Coleman). Previously the code always sequentially scanned the list of values."

angry__panda · 2 years ago
The article mentioned that it's tested on Postgres 14:

"We used a PostgreSQL v14 database on a db.t3.medium Amazon RDS instance equipped with 2 CPUs and 4GB RAM. The storage capacity is 200GB (gp2 storage)."

u/angry__panda

KarmaCake day77April 26, 2023View Original