So I've been building a product, and my clients really hate the idea that their code is stored in my database (unencrypted). The problem is that I often need to process the data in the background, so I can't store it end-to-end encrypted. Is there any service that lets you deploy some sort of database that only the client can access, while still letting me process the data somehow, maybe via APIs?
Something to keep in mind is that some clients are not operating in good faith: their goal isn't to work together to find a solution but to present roadblocks. The reasoning can be complicated. Perhaps there's internal politics around which solution to use, or perhaps your solution is receiving pushback because it's not the preferred solution of one stakeholder. You'll probably never know the true motivations, so it's important not to get caught up in engineering a solution to a problem that doesn't really exist.
You've mentioned that the data you need access to is code: GitHub is a perfect comparable. GitHub's cloud service is used by the majority of companies with code. In fact, I'd guess even your clients are using GitHub's hosted services. If the problem is that your company doesn't have the reputation necessary to give these clients confidence that you can securely manage their code, that may just be a sign that, right now, these clients aren't the right fit for you, and you should work with less antsy clients until you have built up the credibility.
Or as simple as “the less I appear to value this solution, the lower the supplier will estimate my maximum price for it”
Self-hosting seems like the most reliable option for the time being (or executing functions on the encrypted data without decrypting it). However, is it standard practice to use Kubernetes to give them a preconfigured database that they can deploy on my own cloud? I wouldn't access the code except temporarily, through a little script that talks to my cloud and comes along with the database in the pod that they "self host." Would that be considered standard practice?
Providing your customers with their own database in your environment is a method for segregating their data and ensuring that there's no unintentional co-mingling of their data with other customers (which is a common problem in a multi-tenant environment) but it does not protect the customer data from being accessed by you: if code you are executing can access the data, then you can access the data.
Reading between the lines ("a large portion of my possible clients seemed to be happy with the idea of the solution I provided") it sounds like my initial understanding of the situation was incorrect: I thought that you had been asked to build this specific architecture by your clients but it sounds like it's the opposite: you've had an idea, come up with an architecture and then validated that idea with potential clients by describing the architecture? Is that correct?
If that's the actual situation, I think this is a much simpler problem to solve. Architecture is architecture, it isn't a part of the solution, it's a means to an end. There are a very small number of clients who may have strict security/compliance requirements that do necessitate this sort of complexity (which is where self-hosting comes in) but for the majority of clients, how the product works is immaterial, they care only about the results.
Realising that you've made a terrible mistake when building a system using the architecture you designed 6 months ago is a rite of passage, it is the process: every vision you have today for how your system will work is probably going to be wrong 6 months from now. That's completely normal, you will learn more about how your system should work in 1 month of building than you would in 6 months of planning.
Try to take a step back from thinking about architecture. One of the biggest dangers when working on an early stage technology product is committing yourself to a technical direction that then dictates the product direction. If, for example, you decide today to build a system in which clients self-host the database that your code accesses, and then you decide you want to build a feature that requires 10x as many queries to the database, oops, you can't build that, because it would require your clients to upgrade their self-hosted database resources, and getting them to do that will be all but impossible.
If you want to share more about your idea, I can outline some ideas about how I might approach building it in a cheap way that allows for validating the idea. There are exceptions, but nowadays, given the maturity of the software development space, most ideas can be built and launched to validate with real customers in 1 month. If your vision for how you'll build something requires 3, 6 or 12 months to get customers using it, it's probably overcomplicated.
Customer Managed Keys - You have everything encrypted in your database via a key the customer has. You request (likely automated) that key every time you process the data. They can revoke at any point, and have an audit log of every access.
Self Hosting - Let the customer host your solution themselves or automate spinning up a cloud environment for them that they have full control over.
Both are kind of a pain to implement, but that lets you charge more for these enterprise features.
The visualization about halfway down https://www.anjuna.io/solution/secure-ai (my employer) is an example of the self-hosted flavor of this. Happy to discuss deeper, my contact info is in my bio.
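To make the Customer Managed Keys option above a bit more concrete, here is a minimal Python sketch of the flow: the vendor asks for the key on every processing run, the customer side logs each request, and the customer can revoke at any time. Every name here (`CustomerKeyService`, `process_in_background`, the requester and purpose strings) is hypothetical, and the cipher is a throwaway placeholder built from the standard library so the example runs on its own. A real system would presumably use a managed key service and an authenticated cipher such as AES-256-GCM.

```python
import hashlib
import secrets
from datetime import datetime, timezone


class KeyRevoked(Exception):
    """Raised when the customer has revoked access to their key."""


class CustomerKeyService:
    """Hypothetical stand-in for a key service the customer runs and controls."""

    def __init__(self):
        self._key = secrets.token_bytes(32)
        self._revoked = False
        self.audit_log = []  # entries: (timestamp, requester, purpose, granted)

    def request_key(self, requester, purpose):
        # Every request is logged, whether or not it is granted.
        granted = not self._revoked
        self.audit_log.append((datetime.now(timezone.utc), requester, purpose, granted))
        if not granted:
            raise KeyRevoked("customer has revoked access")
        return self._key

    def revoke(self):
        self._revoked = True


def _keystream_xor(key, nonce, data):
    # Placeholder cipher for illustration only; NOT secure for real use.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def decrypt(key, blob):
    nonce, ciphertext = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, ciphertext)


def process_in_background(service, stored_blob):
    # The vendor requests the key for each run instead of keeping a copy.
    key = service.request_key(requester="vendor-worker", purpose="nightly-analysis")
    source = decrypt(key, stored_blob)
    return len(source.splitlines())  # example "processing": count lines of code
```

Note that the vendor's worker still sees plaintext while it is processing, so this gives the customer revocation and an audit trail rather than protection from the vendor's own code.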
Isn't that O(n)? Is there a typo or am I missing something?
(for my business, anyway) I've found this wording to be enough for bigger customers:
Data is stored on AWS RDS, encrypted at rest by an industry standard AES-256 encryption algorithm (more on that here: https://aws.amazon.com/rds/features/security/)
The data accessed by the app is not encrypted, you can still work on the data as you would usually do. It's mostly a compliance thing. Not sure what level of security it _actually_ brings to the data itself, but most companies are okay with "encryption at rest".
I’ve enjoyed building on Nitro myself, and most things should run in it just fine. You just need to build the networking vsock proxy into the Nitro image for anything that needs networking (such as the DB, where you store the encrypted-at-rest data).
Because enterprise clients are going to want their own database, which has its own licensing and operating costs that you should be building into your price. And since they will have their own database, it can be encrypted with a key that is unique to them.
For small business customers, a shared database is the only way to stay profitable.
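As a rough sketch of that split, assuming two hypothetical plan names (`enterprise` and `small_business`) and made-up database identifiers: enterprise tenants each get a dedicated database and a unique key, while small business tenants share one database and one key. Actually creating the databases is out of scope here, so this only tracks the placement decision.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    plan: str  # "enterprise" or "small_business" (hypothetical plan names)


@dataclass
class Placement:
    database: str  # identifier of the database holding this tenant's data
    key: bytes     # encryption key used for this tenant's data


class TenantRegistry:
    """Hypothetical registry deciding where each tenant's data lives."""

    def __init__(self):
        self.shared_database = "shared-db"  # made-up identifier
        self._shared_key = secrets.token_bytes(32)
        self._placements = {}

    def provision(self, tenant):
        if tenant.plan == "enterprise":
            # Dedicated database plus a key no other tenant shares.
            placement = Placement(f"db-{tenant.name}", secrets.token_bytes(32))
        else:
            # Shared database and shared key keep per-customer cost low.
            placement = Placement(self.shared_database, self._shared_key)
        self._placements[tenant.name] = placement
        return placement
```

The dedicated path is where the extra licensing and operating cost shows up, which is why it belongs in the enterprise price.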
This idea (customer owns the data, code is deployed next to the data, data never leaves customer perimeter) is the exact use case for the native application framework:
https://docs.snowflake.com/en/developer-guide/native-apps/na...
It’s not clear what the core problem is. Are they obligated by contract or by law to comply with security/privacy requirements? Are they afraid you’ll misuse their data (steal their business, etc.)?
If you can be explicit about what “hate” means, you can find a solution, or decide this is not a potential customer.