Posted by u/spennant 9 months ago
Show HN: Cloud-Ready Postgres MCP Server (github.com/stuzero/pg-mcp...)
Hey HN,

I built pg-mcp, a Model Context Protocol (MCP) server for PostgreSQL that provides structured schema inspection and query execution for LLMs and agents. It's multi-tenant and runs over HTTP/SSE (not stdio).

Features:

- Multi-Tenancy: Supports multiple database connections from multiple agents

- Schema Introspection: Returns table structures, types, indexes and constraints, enriched with descriptions from pg_catalog (for well-documented databases)

- Read-Only Queries: Controlled execution of queries via MCP.

- EXPLAIN Tool: Helps smart agents optimize queries before execution.

- Extension Plugins: YAML-based plugin system for Postgres extensions (supports pgvector and postgis out of the box).

- Server Mode: Spin up the container and it's ready to accept connections at http://localhost:8000/sse

saberience · 9 months ago
Just for everyone here: the code for "building an MCP server" is importing the standard MCP package for TypeScript, Python, etc., then writing as little as 10 lines of code to define something as an MCP tool.

Basically, it's not rocket science. I also built MCP servers for MySQL, Twilio, Polars, etc.
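To make that concrete, here is a stdlib-only sketch of the pattern those SDK decorators implement: register a function as a tool and derive its schema from the signature and docstring. All names here are illustrative, not the real `mcp` package API.

```python
import inspect

TOOLS = {}  # tool name -> {"fn": callable, "schema": dict}

def tool(fn):
    """Register fn as a tool, deriving a schema from its signature
    and docstring -- roughly what the MCP SDK decorators do for you."""
    params = {
        name: p.annotation.__name__
        for name, p in inspect.signature(fn).parameters.items()
    }
    TOOLS[fn.__name__] = {
        "fn": fn,
        "schema": {
            "name": fn.__name__,
            "description": (fn.__doc__ or "").strip(),
            "parameters": params,
        },
    }
    return fn

@tool
def top_customers(limit: int) -> list:
    """Return the top customers by total sales (stub)."""
    return [f"customer-{i}" for i in range(1, limit + 1)]
```

The real SDKs (e.g. FastMCP's `@mcp.tool()`) additionally emit JSON Schema for the parameters and handle the JSON-RPC transport, which is where the non-trivial work in a server like pg-mcp lives.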

spennant · 9 months ago
I built this to fill a specific need. It's not really made to talk to Claude Desktop (although it can). It's built to take multiple connections from agents who bring their own LLM (via API key or whatever) and provide context to the LLM model. Yes, it does import the standard Python MCP package, however it does quite a few non-standard things in order to achieve the multi-tenancy. I encourage you to look at the code.
koakuma-chan · 9 months ago
What is multi-tenancy?
runako · 9 months ago
From HN guidelines:

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

We are hackers here. Building is good. Sharing is good. All this is true even if you personally know how to do what is being shared, and it is easy for you. I promise you there are people who encounter every sharing post here and do not think what is posted is easy.

brulard · 9 months ago
I think this is exactly what we need to hear. This is what I was wondering: why is every MCP project such big news? Isn't it just a few lines of code?
asdev · 9 months ago
5% hackers, 95% FANG employees who think this is meaningless
esafak · 9 months ago
You do have to implement every functionality that you want to expose.
1zael · 9 months ago
This is wild. Our company has like 10 data scientists writing SQL queries on our DB for business questions. I can deploy pg-mcp for my organization so everyone can use Claude to answer whatever is on their mind? (e.g., "show me the top 5 customers by total sales")

sidenote: I'm scared of what's going to happen to those roles!

clusterfook · 9 months ago
Yep gonna be easy

Q: show me the top 5 customers by total sales

A: System.Data.Odbc.OdbcException (0x80131937): ERROR [57014] ERROR: canceling statement due to statement timeout;

Q: Why do I get this error

A: Looks like it needs an index, let me create that for you. Done. Rerunning query.

could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device

Q: Why this error

A: 429 Too Many Requests

Rub hands... great next 10 years to be a backend dev.

curious_cat_163 · 9 months ago
That’s a good example of a worst-case scenario. This is why we would still need humans loitering about.

The question is do they still need 10? Or 2 would suffice? How about 5?

This does not need to be a debate about the absolutes.

brulard · 9 months ago
I have to say I've had very good results creating and optimizing quite complex queries with Sonnet. But letting LLMs run them on their own in production is quite a different beast.
fullstackchris · 9 months ago
and the next 10 after that, and the next 10 after that, and...
otabdeveloper4 · 9 months ago
Probably nothing. "Expose the database to the pointy-haired boss directly, as a service" is an idea as old as computing itself. Even SQL was originally an iteration of that idea. Every BI system (including PowerBI and Tableau) was supposed to be that.

It doesn't work because the PHB doesn't have the domain knowledge and doesn't know which questions to ask. (No, it's never as simple as group-by and top-5.)

jaccola · 9 months ago
I would say SQL still is that! My wife had to learn some SQL to pull reports in some non-tech finance job 10 years ago. (I think she still believes this is what I do all day…)

I suppose this could be useful in that it saves everyone in the company from having to learn even the basics of SQL, which is some barrier, however minimal.

Also the LLM will presumably be able to see all the tables/fields and ‘understand’ them (with the big assumption that they are even remotely reasonably named) so English language queries will be much more feasible now. Basically what LLMs have over all those older attempts is REALLY good fuzziness.

I see this being useful for some subset of questions.

pclmulqdq · 9 months ago
A family friend maintains a SQL database of her knitting projects that she does as a hobby. The PHB can easily learn SQL if they want.
spennant · 9 months ago
It won’t be that easy. First off, most databases in the wild are not well documented. LLMs benefit from context, and if your tables/columns have non-intuitive or non-descriptive names, the SQL may not even work. Second, you might benefit from an LLM fine-tuned on writing code and/or an intelligent Agent that checks for relevancy and ambiguity in user input prior to attempting to answer the question. It would also help if the agent executed the query to see how it answered the user’s question. In other words “reasoning”… pg-mcp simply exposes the required context for Agents to do that kind of reasoning.
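That agent-side "reasoning" loop can be sketched roughly as follows. Everything here is hypothetical: `llm` and `execute_sql` are caller-supplied stubs, and the control flow is invented for illustration, not taken from pg-mcp.

```python
def answer(question, llm, execute_sql, max_retries=2):
    """Generate SQL for a question, run it, and let the model judge
    whether the rows actually answer the question -- retrying on
    errors or irrelevant results. Purely illustrative control flow."""
    sql = llm(f"Write one read-only SQL query for: {question}")
    for _ in range(max_retries + 1):
        try:
            rows = execute_sql(sql)
        except Exception as err:
            # Feed the database error back to the model and retry
            sql = llm(f"Query failed ({err}); fix it: {sql}")
            continue
        verdict = llm(f"Do these rows answer {question!r}? {rows}")
        if verdict == "yes":
            return rows
        sql = llm(f"Rows did not answer the question; revise: {sql}")
    raise RuntimeError("gave up after retries")
```

The EXPLAIN tool mentioned above would slot in as an extra check before `execute_sql`, letting the agent reject pathological plans without running them.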
nickdichev · 9 months ago
The COMMENT command will finally be useful :)
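For reference, comments added this way land in `pg_description`, which is exactly where schema introspection like pg-mcp's picks descriptions up. A minimal example (table and column names invented):

```sql
COMMENT ON TABLE orders IS 'One row per customer order, including cancellations';
COMMENT ON COLUMN orders.total_cents IS 'Order total in cents, pre-tax';

-- Read them back (the 2 is the column's attnum):
SELECT obj_description('orders'::regclass, 'pg_class');
SELECT col_description('orders'::regclass, 2);
```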
dinfinity · 9 months ago
Then let the AI first complete the documentation by looking at the existing documentation, querying the DB (with pg-mcp), etc.

Do human reviewing and correcting of the updated documentation. Then ensure that the AI knows that the documentation might still contain errors and ask it to do the 'actual' work.

moltar · 9 months ago
There are LLM SQL benchmarks. [1] And the state-of-the-art solution is still only at 77% accuracy. Would you trust that?

[1] https://bird-bench.github.io/

flappyeagle · 9 months ago
Yes. Ask it to do it 10 times and pick the right answer
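That is essentially self-consistency voting. Under the generous assumptions that attempts are independent and no single wrong answer can win the vote, the best case follows from the binomial distribution; a small sketch:

```python
from math import comb

def majority_correct(p, n):
    """P(more than half of n independent attempts are correct),
    assuming errors are independent and wrong answers disagree
    with each other -- both optimistic for real LLM sampling,
    where errors tend to correlate."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))
```

With p = 0.77 and n = 10 this comes out near 0.94 in the idealized case; in practice correlated errors (and paying for 10x the queries) eat much of that gain, and "pick the right answer" still assumes you can recognize it.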
risyachka · 9 months ago
So you will ask "What is our churn?", get a random result, and then turn your whole marketing strategy around a wrong number?

That's cute.

Kiro · 9 months ago
There are hundreds of text-to-SQL companies and integrations already. What's different about this that makes you react like that?
romanovcode · 9 months ago
Those companies will be dead once this goes mainstream. Why pay a third-party company when you can ask an LLM to create graphs and analyses of whatever you want? Pair it with scheduled tasks and I really don't see any value in those SaaS products.
a-dub · 9 months ago
is that true? i'd like that, but i get the sense that this mcp stuff is more oriented around programming assistant and agent applications.

i suppose the desktop app can use it, but how good is it for this general-purpose "chat with the database for lightweight analytics" use case? is it worth the trouble of dealing with some electron app to make it work?

sshine · 9 months ago
> i get the sense that this mcp stuff is more oriented around programming assistant and agent applications

Agents will become a ubiquitous part of the user interface that is currently the chat.

So if you bother with a website or an electron app now, MCP will just add more capabilities to what you can control using agents.

slt2021 · 9 months ago
didn't Tableau (and some other BI solutions) have this feature out of the box?
fulafel · 9 months ago
From the docker-compose file:

    ports:
      - "8000:8000"
This will cause Docker to expose this to the internet and even helpfully configure an allow rule to the host firewall, at least on Linux.
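If the server is only meant to be reached locally (or sit behind a reverse proxy), binding the published port to loopback avoids this. A possible docker-compose change:

```yaml
ports:
  - "127.0.0.1:8000:8000"   # loopback only; plain "8000:8000" publishes on 0.0.0.0
```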

rubslopes · 9 months ago
Good catch.

OP, exposing your application without authentication is a serious security risk!

Quick anecdote: Last week, I ran a Redis container on a VPS with an exposed port and no password (rookie mistake). Within 24 hours, the logs revealed someone attempting to make my Redis instance a slave to theirs! The IP traced back to Tencent, the Chinese tech giant... Really weird. Fortunately, there was nothing valuable stored in it.

_ncyj · 9 months ago
> The IP traced back to Tencent, the Chinese tech giant... Really weird.

They're a large cloud provider in Asia like Amazon AWS or Microsoft Azure. I doubt such a tech company would make it that obvious when breaking the law.

spennant · 9 months ago
I made a few assumptions about the actual deployer and their environment that I shouldn’t have… I’ll need to address this. Thanks!
tudorg · 9 months ago
This is great, I like in particular that there are extensions plugins. I’ll be looking at integrating this in the Xata Agent (https://github.com/xataio/agent) as custom tooling.
spennant · 9 months ago
Xata.io looks very interesting!!! I was thinking about building an intelligent agent for pg-mcp as my next project, but it looks like you did a lot of the hard work already. When thinking about the "AI Stack" I usually separate concerns like this:

  UI <--> Agent(s) <--> MCP Server(s) <--> Tools/Resources
             |
           LLM(s)

tudorg · 9 months ago
That's very similar to what we are thinking as well, and we'd like to separate the Agent tools into an MCP server as well as use MCP for custom tools.
scottpersinger · 9 months ago
Where's the pagination? How does a large query here not blow up my context:

https://github.com/stuzero/pg-mcp/blob/main/server/tools/que...
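The usual fix is keyset pagination: cap each tool response and hand back a cursor for the next page, so no single MCP response dumps the whole result set into the model's context. An illustrative sketch (SQLite here for self-containment; none of these names are pg-mcp's actual API):

```python
import sqlite3

def paged_query(conn, table, page_size, last_id=None):
    """Return at most page_size rows with id greater than last_id,
    plus a cursor for the next call (None when exhausted).
    Assumes positive integer primary keys."""
    # table must come from a trusted allowlist -- it is interpolated,
    # not bound as a parameter
    cur = conn.execute(
        f"SELECT id, name FROM {table} WHERE id > ? ORDER BY id LIMIT ?",
        (last_id or 0, page_size),
    )
    rows = cur.fetchall()
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor
```

The client (or agent) then loops, passing the returned cursor back in, and can stop early once it has enough rows to answer the question.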

spennant · 9 months ago
It's coming...
jillesvangurp · 9 months ago
Is there more to MCP than being a simple Remote Procedure Call framework that allows AI interactions to include function calls driven by the AI model? The various documentation pages are a bit hand-wavy about what the protocol actually is, but it sounds to me like RPC describes most of it.
doug_durham · 9 months ago
The biggest contribution is the LLM-compatible metadata that describes the tool and its arguments. It is trivial to adopt: in Python you can use FastMCP to add a decorator to a function, and as long as that function returns a JSON string you are in business. The decorator extracts the arguments and doc strings and presents them to the LLM.
jillesvangurp · 9 months ago
What makes a spec LLM-compatible? I've thrown a lot of different things at GPT o1 and it generally understands them better than I do: OpenAPI specifications, unstructured text, log output, etc.
spennant · 9 months ago
Indeed. Anything you do with MCP can be done in more traditional ways.
oulipo · 9 months ago
Nice!

What I'd be looking for is a MCP server where I can run in "biz/R&D exploration-mode", eg:

- assume I'm working on a replica (shared among all R&D engineers)
- they can connect and make read-only queries to the replica for the company data
- they have a temporary read-write schema just for their current connection, so they can have temporary tables and caches
- temporary data is deleted when they close the session

How could you make a setup like that so that when using your MCP server, I'm not worried about the model / users modifying the data, but only doing their own private queries/tables?
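Postgres itself gets most of the way there: temporary tables live in a per-session pg_temp schema and are dropped at session end, and read-only transactions still permit writes to temp tables. A sketch of the role setup (role, schema, and database names are made up):

```sql
CREATE ROLE rnd_explore LOGIN;
GRANT USAGE ON SCHEMA public TO rnd_explore;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO rnd_explore;
GRANT TEMPORARY ON DATABASE analytics TO rnd_explore;  -- allows CREATE TEMP TABLE
ALTER ROLE rnd_explore SET default_transaction_read_only = on;
```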

ahamilton454 · 9 months ago
I don't understand the advantage of having the transport protocol be HTTP/SSE rather than stdio, especially in this case when it's literally running locally.
spennant · 9 months ago
The use case for pg-mcp is server deployment - local running is just for dev purposes. HTTP/SSE enables multiple concurrent connections and network access, which stdio can't provide.