eric-burel (u/eric-burel)

eric-burel commented on Google's Gemini API Free Tier Fiasco: Developers Hit by Silent Rate Limit Purge quasa.io/media/google-s-g... · Posted by u/eric-burel

eric-burel · 8 days ago

Related: https://news.ycombinator.com/item?id=46223311 Switching to paid version was already problematic as it uses GCP account system which doesn't have a spending limit, and API keys do not have an expiration date. So the free offer was great for freelancers and SMEs and yet the paid version was the worst possible scenario you can imagine for same freelancers and SMEs. OpenRouter + free models and the increased rate limit you get after buying 10 credits (10€) is my current favourite choice for learning/teaching.

eric-burel commented on Microsoft has a problem: lack of demand for its AI products windowscentral.com/artifi... · Posted by u/mohi-kalantari

eric-burel · 11 days ago

Microsoft is using the deep penetration of SharePoint in companies to sell Copilot license. At least in France it's well and alive and I see much more Copilot licenses than actual OpenAI uses.

eric-burel commented on Program-of-Thought Prompting Outperforms Chain-of-Thought by 15% (2022) arxiv.org/abs/2211.12588... · Posted by u/mkagenius

samus · 18 days ago

If the generated code uses a suitable programming language, like the safe subset of Haskell, then the risk is significantly lower. Anyway it makes sense to execute this code in the user's browser instead of on the server.

eric-burel · 13 days ago

Yeah I mean you can replace sandboxing buy other safe alternatives but the idea is the same, the generated code has to be considered as 100% untrusted. Supply chain attacks are especially nasty.

eric-burel commented on Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files alexschapiro.com/security... · Posted by u/bearsyankees

windexh8er · 15 days ago

> To add onto this, it is a characteristic of their design to statistically pick things that would be bad choices, because humans do too.

Spot on. If we look at, historically, "AI" (pre-LLM) the data sets were much more curated, cleaned and labeled. Look at CV, for example. Computer Vision is a prime example of how AI can easily go off the rails with respect to 1) garbage input data 2) biased input data. LLMs have these two as inputs in spades and in vast quantities. Has everyone forgotten about Google's classification of African American people in images [0]? Or, more hilariously - the fix [1]? Most people I talk to who are using LLMs think that the data being strung into these models has been fine tuned, hand picked, etc. In some cases for small models that were explicitly curated, sure. But in the context (no pun) of all the popular frontier models: no way in hell.

The one thing I'm really surprised nobody is talking about is the system prompt. Not in the manner of jailbreaking it or even extracting it. But I can't imagine that these system prompts aren't collecting mass tech debt at this point. I'm sure there's band aid after band aid of simple fixes to nudge the model in ever so different directions based on things that are, ultimately, out of the control of such a large culmination of random data. I can't wait to see how these long term issues crop and and duct taped for the quick fixes these tech behemoths are becoming known for.

[0] https://www.bbc.com/news/technology-33347866 [1] https://www.theguardian.com/technology/2018/jan/12/google-ra...

eric-burel · 13 days ago

Talking about the debt of a system prompt feels really weird. A system prompt tied to an LLM is the equivalent of crafting a new model in the pre-LLM era. You measure their success using various quality metrics. And you improve the system prompt progressively to raise these metrics. So it feels like bandaid but that's actually how it's supposed to work and totally equivalent to "fixing" a machine learning model by improving the dataset.

eric-burel commented on Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files alexschapiro.com/security... · Posted by u/bearsyankees

joshribakoff · 15 days ago

In one instance it could not even describe why a test is bad unit test (asserting true is equal to true), which doesn’t even require context or multi file reasoning.

Its almost as if it has additional problems beyond the context limits :)

eric-burel · 13 days ago

In an agentic setup you are still dependent on having relatively smart models that's true.

eric-burel commented on Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files alexschapiro.com/security... · Posted by u/bearsyankees

input_sh · 15 days ago

There's about a dozen workarounds around context limits, agents being one of them, MCP servers being another one, AGENTS.md being the third one, but none of them actually solve the issue of a context window being so small that it's useless for anything even remotely complex.

Let's imagine a codebase that can fit onto a revolutionary piece of technology known as a floppy drive. As we all know, a floppy drive can store <2 megabytes of storage. But a 100k tokens is only about 400 kilobytes. So, to process the whole codebase that can fit onto a floppy drive, you need 5 agents plus the sixth "parent process" that those 5 agents will report to.

Those five agents can report "no security issues found" in their own little chunk of the codebase to the parent process, and that parent process will still be none the wiser about how those different chunks interact with each other.

eric-burel · 13 days ago

You can have an agent that focuses on studying the interactions. What you're saying is that an AI cannot find every security issue but neither do humans otherwise we wouldn't have security breaches in the first place. You are describing a relatively basic agentic setup mostly using your AI-assisted text editor but a commercial security bot is a much more complex beast hopefully. You replace context by memory and synthesis for instance, the same way our brain works.

eric-burel commented on National Security Strategy of the United States of America [pdf] whitehouse.gov/wp-content... · Posted by u/TechTechTech

eric-burel · 13 days ago

Sounds a lot like "nationalism in a single country"

eric-burel commented on Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files alexschapiro.com/security... · Posted by u/bearsyankees

input_sh · 15 days ago

Prediction: it won't.

You can't fit every security consideration into the context window.

eric-burel · 15 days ago

You may want to read about agentic AI, you can for instance call an LLM multiple times with different security consideration everytime.