MongoDB Responds to PostgreSQL Benchmarks

As others have pointed out in another HN discussion (1), MongoDB's reply is definitely questionable, if only for its tone.

I already replied to metheus on Twitter (2) in a thread where we asked for a way to repro their claims. I found their reply and comments very inappropriate, similar to the comment here: arrogant and derogatory to OnGres.

Anyway, I was writing this to note that OnGres has replied to Mongo's reply, setting an example of how tech discussions should happen: without derogatory and arrogant comments, open to valid criticism (i.e. with something more than words and numbers that cannot be reproduced), and with transparency.

Check it out: https://ongres.com/blog/benchmarking-do-it-with-transparency...

In there you'll see how Mongo consistently misinterpreted (or misrepresented?) the results. They kept mixing up the benchmarks and constantly talked about an experimental driver and missing connection pooling. In fact, OnGres used the official Mongo Lua driver and the official Java driver for different benchmarks, ran some of the benchmarks both with and without connection pooling, and published both sets of results.

It's really sad to see Mongo reply this way to a thorough benchmark. It probably has its flaws, but instead of correcting them or publishing a better benchmark, like the one they ran (to magically get 240x...), they chose to mischaracterize the work of others, spreading FUD and accusing them of cheating and being dishonest.

Hopefully they'll turn around and fix it. All it takes is to publish how they got their amazing numbers so that others can comment on, repro or dispute the benchmark.

(1) https://news.ycombinator.com/item?id=20479670

(2) https://twitter.com/javiermaestro/status/1151849279226556417

I was at the presentation last Thursday. They (OnGres) have fully open sourced both their methodology and their results, and had a pretty strict divide between teams designing the benchmarks and teams running the benchmarks.

MongoDB could create a Pull Request/Merge Request against that repository so we can all judge those results ourselves; their current response is only words and a single table showing unlikely results.

I do think the criticism of not tuning MongoDB is valid; however, their response is dishonest:

> with their own heavily tuned PostgreSQL.

This was explicitly not the case according to OnGres other than the established norms of taking 25% memory for `shared_buffers` etc. No other tuning that is normally done for big clusters was done.
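For readers unfamiliar with that rule of thumb, here is a minimal sketch of the arithmetic behind "25% of memory for `shared_buffers`". The function names and the 64 GB example machine are hypothetical and chosen only for illustration; they are not taken from the OnGres benchmark setup.

```python
def shared_buffers_mb(total_ram_mb: int, fraction: float = 0.25) -> int:
    """Return a starting shared_buffers size in MB as a fraction of total RAM.

    The 25% default mirrors the rule of thumb mentioned above; it is a
    starting point, not a tuned value.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_ram_mb * fraction)


def config_line(total_ram_mb: int) -> str:
    """Format the result as a postgresql.conf style setting line."""
    return f"shared_buffers = {shared_buffers_mb(total_ram_mb)}MB"


# Hypothetical machine with 64 GB of RAM (an assumption, not from the benchmark).
print(config_line(64 * 1024))
```

Anything beyond this kind of baseline (work memory, checkpoint settings, and so on) is the sort of tuning that, according to OnGres, was not done.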

https://gitlab.com/ongresinc/benchplatform/

https://gitlab.com/ongresinc/txbenchmark

metheus · 6 years ago

Hi, I work at MongoDB, and I'm here to elaborate in answer to your comment.

> I was at the presentation last Thursday. They (OnGres) have fully open sourced both their methodology and their results, and had a pretty strict divide between teams designing the benchmarks and teams running the benchmarks.

> MongoDB could create a Pull Request/Merge Request against that repository so we can all judge those results ourselves

The existing, unaltered content of the OnGres repo is all the testimony one needs to know that the OnGres team is incapable of or unwilling to produce a valid test of MongoDB. Open source garbage is still garbage.

I understand the allure of asking for a pull request from our testing team to demonstrate how we obtained the measurements we cited in our retort. It is tempting to see this as a case of well-intentioned scientists, doing their best, honestly asking for peer review. But that view relies on two things that we cannot take for granted: 1) that the OnGres team is acting in good faith and will work to correct their errors, fairly declaring MongoDB more performant if they concur with our results; and 2) that such an open back-and-forth will be illuminating to bystanders.

1) We cannot assume that OnGres is acting in good faith when their report so clearly demonstrates that they biased the test against MongoDB. This conversation should start and end with the fact that OnGres used an experimental MongoDB driver to compare against PostgreSQL with a production driver and a dedicated connection pooler in front of it. (What kind of pull request could MongoDB submit to address the use of sysbench, which requires a Lua driver?) They are simply not credible.

2) What would a MongoDB-submitted patch prove? It would certainly print out different numbers, but that alone proves nothing. For those numbers to mean anything, you have to read and understand the code. Anyone capable of understanding why our patch is valid is equally capable of seeing the deep flaws in the code as published, no patch required.

Consider this: if a research group funded by the fossil fuel industry published a report, littered with false statements and methodological errors, claiming that climate change isn't happening, NASA and NOAA wouldn't be obligated to issue a full correction of that report along with their response calling shenanigans.

No, we're not going to get mired in a patch war with demonstrably biased authors over a fundamentally flawed comparison methodology. We have published our own benchmarks demonstrating how to test MongoDB performance, and in a few months, one of our engineers will present her work adapting the industry-standard TPC-C at the VLDB conference.

> their current response is only words and a single table showing unlikely results.

There is nothing unlikely about our obtaining speedups to queries by using indexes that OnGres ignored.

> I do think the criticism of not tuning MongoDB is valid; however, their response is dishonest:

>> with their own heavily tuned PostgreSQL.

> This was explicitly not the case according to OnGres other than the established norms of taking 25% memory for `shared_buffers` etc. No other tuning that is normally done for big clusters was done.

I'm very comfortable using the phrase "heavily tuned" when OnGres used "established norms" for PostgreSQL and ignored the existence of those (clearly documented) norms for MongoDB, while falsely claiming in their report that MongoDB does not require tuning.