croo · 6 years ago
Three out of five answers so far just question the legitimacy of using NoSQL.

There are definitely use cases for NoSQL, and I clicked on this thread hoping for information and war stories about Cockroach, Mongo, Redis, CouchDB, the current state of NoSQL in Postgres, and a few names I may never have heard of.

Let me derail the conversation away from berating this technology. I happen to have a requirement that needs a dynamically changing DB structure (saving lots of JSON data from a dynamically changeable form). Which NoSQL DB would you recommend for me? Any pitfalls? I'm primarily looking for self-hosted solutions.

frompdx · 6 years ago
Honestly, Postgres is probably my first choice for storing some dynamic JSON, and I have used it for exactly that in the past. In particular, I used it to build a simple time series DB for reporting on data from JSON payloads that had the potential to hold arbitrary data.

The OP has harped quite a bit on connection limit issues, and this is a valid concern but also something that you can mitigate by using connection pooling. Geographic replication is an issue, and it's one of those things where I'm not really convinced any SQL or NoSQL DB offers a really good solution. For example, if you run MongoDB as a replica set, you must take extra steps to ensure your replicas do not suffer from split brain during a network partition. And as another commenter has pointed out, MongoDB's transactions are flawed. I would not trust it for anything transactional beyond single-document transactions in a non-replicated configuration.
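On the connection pooling point, here is a minimal sketch of the idea in Python. It is not tied to any particular driver: `ConnectionPool` and `fake_connect` are hypothetical names, and `connect` stands in for whatever call your Postgres driver uses to open a connection. The pool opens a fixed number of connections up front and hands them out one at a time, so the database never sees more than `size` connections from this process.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hands out a fixed number of connections and reuses them."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that opens a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free (or raises queue.Empty
        # once `timeout` seconds have passed).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection so the next caller can reuse it.
            self._idle.put(conn)


opened = []


def fake_connect():
    # Stand-in for a real driver call; it only records that a
    # connection was opened.
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run queries with `conn` here

print(len(opened))  # 2: ten units of work shared two connections
```

In practice you would more likely use the pool that ships with your driver or an external pooler in front of Postgres, but the trade-off is the same: callers wait for a free connection instead of opening a new one.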

DynamoDB is pretty decent for a document DB, but I personally dislike working with it. It requires you to give up many things that RDBMSs like Postgres offer out of the box. For example, if you want to fetch all records you have to do it in a loop, because there are limits on how many records can be fetched in a single request. But of course you can't self-host it.
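The fetch-in-a-loop pattern looks roughly like this. It is a sketch, assuming a boto3-style client whose `scan` call returns `Items` plus a `LastEvaluatedKey` whenever more pages remain. `scan_all` and `FakeClient` are hypothetical names, and the fake uses a simple offset as its key only so the example runs without AWS (a real `LastEvaluatedKey` is the primary key of the last item read).

```python
def scan_all(client, table_name):
    """Yield every item in a table by following scan pagination."""
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        # LastEvaluatedKey is present only when more pages remain.
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        # Resume the next request where this page stopped.
        kwargs["ExclusiveStartKey"] = last_key


class FakeClient:
    """Hypothetical stand-in that serves a list a few items per page."""

    def __init__(self, items, page_size=2):
        self._items = items
        self._page_size = page_size

    def scan(self, TableName, ExclusiveStartKey=None):
        start = ExclusiveStartKey["offset"] if ExclusiveStartKey else 0
        end = start + self._page_size
        page = {"Items": self._items[start:end]}
        if end < len(self._items):
            page["LastEvaluatedKey"] = {"offset": end}
        return page


records = list(scan_all(FakeClient(["a", "b", "c", "d", "e"]), "events"))
print(records)  # ['a', 'b', 'c', 'd', 'e'], fetched in three requests
```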

Another hosted option is Datomic. I've heard great things but have never used it, so I can't really comment.

pier25 · 6 years ago
> you can mitigate by using connection pooling

Can you do connection pooling to Postgres from cloud functions?

I know there is a Node driver that does it for MySQL [1] but I've never seen one for Postgres.

[1] https://github.com/jeremydaly/serverless-mysql

pier25 · 6 years ago
Maybe Mongo if you don't need ACID transactions [1].

[1] https://twitter.com/jepsen_io/status/1261276984681754625

rishav_sharan · 6 years ago
HN can be fairly rubbish at times. People here can get too tied to their tools to even consider that other tools may be useful to others. NoSQL has its uses and I personally love using one.

OP, try https://www.arangodb.com. It is the best NoSQL DB IMHO. It's multi-model, extremely performant, has fantastic distributed/replication capabilities, and has good documentation. They even have a hosted offering of it.

pier25 · 6 years ago
Thanks for the suggestion.

It looks good but their cloud offering seems quite expensive starting at $0.20 per hour or about $150 per month.

rishav_sharan · 6 years ago
Yep. The hosted option is fairly new and they are still finding their feet. ArangoDB has so many amazing things about it: AQL, which is a query language that looks like code; Foxx, which is an integrated web server; and so on.
lgl · 6 years ago
This probably won't be a very popular opinion since it's not really a nosql database but if you have a relatively small dataset and not very complex querying needs you may be able to use Redis as a pretty decent datasource. Another solution may be Firebase Realtime Database although that will limit you vendor wise.
pier25 · 6 years ago
Redis only runs in memory and Firebase RTDB has many limitations and only works for the most simplistic use cases (I've been using it since 2016).

Firestore is better than the RTDB but still very limited compared to say Mongo or Fauna.

lgl · 6 years ago
While Redis needs to keep the dataset in memory (which is why I added that it kind of depends on the size of your stuff) it does have quite robust persistence features [0] so it's very unlikely you'll ever lose your data even across reboots or crashes if it's configured correctly.

That being said, it's still not really a database engine in itself, and it would also require a slight paradigm change in how you think about your data and how you create your schema, so YMMV. But I've personally used it across a few non-data-heavy projects as the primary datasource and have been quite happy with it. It was also famously used as the primary datasource for a well-known adult website generating 200M pageviews/day even back in 2012 [1] [2], although I don't know if that is still the case.

[0] https://redis.io/topics/persistence

[1] http://highscalability.com/blog/2012/4/2/youporn-targeting-2...

[2] https://news.ycombinator.com/item?id=3597891

hodder · 6 years ago
First, determine whether NoSQL is really the solution you want. Next, once you think NoSQL is the solution you want, have your experienced old hands slap you a few times.

If that still doesn’t convince you, then you may actually need NoSQL: go for Mongo.

pier25 · 6 years ago
Mongo was analyzed recently by Jepsen again and it didn't turn out well...

https://twitter.com/jepsen_io/status/1261276984681754625

badpun · 6 years ago
What is your use case? SQL RDBMS are generally a sensible default and you should use NoSQL only in places where they cannot be used (this is mostly related to scale requirements that are too much for RDBMS to handle).
pier25 · 6 years ago
I'm looking for the best DB for a serverless backend (cloud functions).

Typically the problem with RDBMS is that it's very expensive to handle thousands of concurrent connections. NoSQL doesn't have that issue. For example FaunaDB is designed for serverless and has no practical connection limits, Mongo Atlas gives you 500 concurrent connections on the free tier [1], etc.

In comparison Postgres on Heroku only gives you 500 connections on the most expensive plans. Even the $50 per month Postgres plan only gives you 50 concurrent connections.

[1] https://docs.atlas.mongodb.com/reference/atlas-limits/

kvz · 6 years ago
Most of the time when people think they need NoSQL, they don’t. But if you really do, and you value your data: FoundationDB.
nunez · 6 years ago
DynamoDB is pretty good, honestly
thecodemonkey · 6 years ago
Why NoSQL?
pier25 · 6 years ago
Because generally speaking SQL databases don't work well with serverless and are a pain to distribute geographically.
gregjor · 6 years ago
Huh. A lot of us use serverless relational databases now, if by “serverless” you mean “runs on a remote server someone else manages.” AWS RDS, for example.

A pain to distribute geographically? What do you think big enterprises and banks use? Oracle or Mongo? If by “a pain” you mean “not free” then you’re right. Depends on how valuable your data is and how much you care about integrity.