Some insider knowledge: Lilli was, at least a year ago, internal only: VPN access, SSO, all the bells and whistles required. Not sure when that changed.
McKinsey requires hiring an external pen-testing company to launch even to a small group of coworkers.
I can forgive this kind of mistake on the part of the Lilli devs. A lot of things have to fail for an "agentic" security company to even find a public endpoint, much less start exploiting it.
That being said, the mistakes in here are brutal. Seems like close to 0 authz. Based on very outdated knowledge, my guess is a Sr. Partner pulled some strings to get Lilli to be publicly available. By that time, much/most/all of the original Lilli team had "rolled off" (gone to client projects) as McKinsey HEAVILY punishes working on internal projects.
So Lilli likely was staffed by people who couldn't get staffed elsewhere, didn't know the code, and didn't care. Internal work, for better or worse, is basically a half day.
This is a failure of McKinsey's culture around technology.
McKinsey has a weird structure where there are too many cooks in the kitchen.
Everybody there is reviewed on client impact, meaning it ends up being an everybody-for-themselves situation.
So as a developer you have little guidance (in fact, you're still being reviewed on client impact, even if you have 0 client exposure).
Then a (Senior) Partner comes in with this idea (that will get them a good review), and you jump on that. After all, it's all you can do to get a good review.
You work on it, and then the (Senior) Partner moves on. But it's not done. It's enough for the review, but continuing to work on it doesn't bring you anything; in fact, it will actually pull you down, since finishing the project doesn't produce immediate client results.
So what does this mean? Most products of McKinsey are a grab-bag of raw ideas of leadership, implemented as a one-off, without a cohesive vision or even a long-term vision at all. It's all about the review cycle.
McKinsey is trying to do software like they do their other engagements. It doesn't work. You can't just do something for 6 months and then let it go. Software rots.
The fact that they laid off a good amount of (very good) software engineers in 2024 is a reflection on how they see software development.
And McKinsey's people, who go to other companies, take those ideas with them. Result: The UI of your project changes all the time, because everybody is looking at the short-term impact they have that gets them a good review, not what is best for the project in the long term.
McKinsey was on a spree to become the best tech consulting company and brought in a lot of great tech talent, but the 2023 crisis made leadership do a 180 and simply ditch/ignore all the tech experts they had brought to the firm.
All the expertise has left the firm and now they are more and more becoming another BS tech consulting firm, with strategy folks that don't even know that ML is AI advising clients on Enterprise AI transformation.
The tech initiative was a failure and Lilli's problem is just a symptom of it.
> One of those unprotected endpoints wrote user search queries to the database. The values were safely parameterised, but the JSON keys — the field names — were concatenated directly into SQL.
I was expecting prompt injection, but in this case it was just good ol' fashioned SQL injection, possible only due to the naivety of the LLM which wrote McKinsey's AI platform.
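For illustration (hypothetical field names, not Lilli's actual schema), the pattern the article describes, parameterised values but string-concatenated JSON keys, looks roughly like this, alongside the allow-list fix:

```python
import sqlite3

# Hypothetical schema for illustration; not McKinsey's actual code.
ALLOWED_FIELDS = {"query", "user_id"}

def log_search_unsafe(db, payload):
    """Anti-pattern from the article: values are parameterised, but the
    JSON keys (column names) are concatenated into the SQL text."""
    cols = ", ".join(payload)                      # attacker-controlled
    marks = ", ".join("?" for _ in payload)
    db.execute(f"INSERT INTO searches ({cols}) VALUES ({marks})",
               tuple(payload.values()))

def log_search_safe(db, payload):
    """Identifiers can't be bound as parameters, so the robust fix is to
    allow-list them before any SQL string is built."""
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    cols = ", ".join(payload)
    marks = ", ".join("?" for _ in payload)
    db.execute(f"INSERT INTO searches ({cols}) VALUES ({marks})",
               tuple(payload.values()))
```

A key like `query) VALUES ('x'); --` lands verbatim in the SQL text in the first version; the second rejects it before any statement is built.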
I just wonder how much professional-grade code written by LLMs, "reviewed" by devs, and committed has made similar or worse mistakes. A funny consequence of the AI boom, especially in coding, is the eventual rise in demand for security researchers.
The tacit knowledge to put oauth2-proxy in front of anything deployed on the Internet will nonetheless earn me $0 this year, while Anthropic will make billions.
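For what it's worth, that tacit knowledge fits in a few lines. A minimal oauth2-proxy invocation looks roughly like the following; the issuer URL, client ID, and paths are placeholders for whatever your identity provider hands you:

```shell
# Sketch only: put oauth2-proxy between the Internet and the app.
# All hostnames, IDs, and secrets below are placeholders.
oauth2-proxy \
  --provider=oidc \
  --oidc-issuer-url=https://idp.example.com \
  --client-id=my-app-proxy \
  --client-secret-file=/run/secrets/oauth2_client_secret \
  --cookie-secret="$(openssl rand -base64 32 | head -c 32)" \
  --email-domain=example.com \
  --http-address=0.0.0.0:4180 \
  --upstream=http://127.0.0.1:8000
```

Point DNS/TLS at port 4180, keep port 8000 bound to localhost, and every request has to clear SSO before it reaches the app.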
I don’t love the title here. Maybe this is a “me” problem, but when I see “AI agent does X,” the idea that it might be one of those molt-y agents with obfuscated ownership pops into my head.
In this case, a group of pentesters used an AI agent to select McKinsey and then used the AI agent to do the pentesting.
While it is conventional to attribute actions to inanimate objects ("car hits pedestrian"), IMO we should be more explicit these days, now that some folks unfortunately attribute agency to these agentic systems.
> No human in the loop

If true, it's quite irresponsible. They are admitting to allowing an agent to autonomously execute code on the network and autonomously perform hacking activities.
> This was McKinsey & Company — a firm with world-class technology teams [...]
Not exactly the word on the street in my experience. Is McKinsey more respected for software than I thought? Otherwise I'm curious why TFA didn't just politely leave this bit out.
No, they don't have world-class technology teams; they hire contractors to do all the tech stuff. Their expertise is in management, and yes, that's world class.
> Not exactly the word on the street in my experience.
Depends on the street you're on. Are you on Main Street or Wall Street?
If you're hiring them to help with software for solving a business problem that will help you deliver value to your customers, they're probably just like anyone else.
If you're hiring them to help with software for figuring out how to break down your company for scrap, or which South African officials to bribe, well, that's a different matter.
I've got no idea who codewall is. Is there any acknowledgment from McKinsey that they actually patched the referenced issue? I don't see any reference to "codewall ai" in any news article before yesterday, and there are no names on the site.
The scariest part isn't the SQL injection - it's that the system prompts could be changed through the same flaw. A single UPDATE statement could have quietly altered how Lilli guided 43,000 McKinsey consultants on strategy, M&A, and risk assessments. There was no deployment, no code changes, and no audit trail.
This is what happens when AI platforms skip the controls that have been standard in enterprise systems for decades. In any proper ERP deployment, you would have clear separation of duties. The system that serves user queries should never have write access to its own configuration. System prompts that control AI behavior should be treated like master data in SAP: they should be versioned, controlled for access, and auditable. They shouldn’t be in the same database as user content, writable by anyone who finds an open endpoint.
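One concrete version of that separation of duties, sketched with SQLite standing in for a real database with separate roles and credentials: the serving path gets a read-only handle to the prompt store, so even a successful injection on that path cannot UPDATE system prompts.

```python
import sqlite3

# Illustrative sketch, not any vendor's actual design: writes to the
# prompt store are only possible through a separate, audited admin path.

def open_prompt_store(path, admin=False):
    if admin:
        # Deploy/admin path: the only place writes are possible.
        return sqlite3.connect(path)
    # Serving path: mode=ro makes every write fail at the database layer,
    # regardless of what SQL an attacker manages to inject upstream.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

With real infrastructure you'd do the same thing with database roles (the app's credentials simply lack UPDATE on the prompts table), plus versioning and an audit log on the admin path.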
McKinsey patched the issue quickly, which is a positive step. However, the decision to store writable prompts alongside user data shows that no one with a background in enterprise controls was involved in the design.
- "The agent mapped the attack surface and found the API documentation publicly exposed — over 200 endpoints, fully documented. Most required authentication. Twenty-two didn't."
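Twenty-two unauthenticated endpoints out of 200 is the classic opt-in-auth failure mode: each handler has to remember to ask for auth. A deny-by-default dispatcher (a hand-rolled sketch, not any particular framework) inverts that, so a forgotten declaration fails closed:

```python
# Sketch: authentication is the default for every route; being public
# is the explicit, reviewable exception.
ROUTES = {}
PUBLIC = set()

def route(path, public=False):
    def register(handler):
        ROUTES[path] = handler
        if public:
            PUBLIC.add(path)   # opting *out* of auth is the explicit act
        return handler
    return register

def dispatch(path, authenticated=False):
    if path not in ROUTES:
        return 404, None
    if path not in PUBLIC and not authenticated:
        return 401, None       # default: no auth, no service
    return 200, ROUTES[path]()
```

Under this scheme an endpoint that nobody thought about returns 401 instead of quietly serving the Internet.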
I wonder what the experience was like at Bain and BCG.
McKinsey challenges graduates to use AI chatbot in recruitment overhaul: https://www.ft.com/content/de7855f0-f586-4708-a8ed-f0458eb25...
They look to package something up and sell it for as long as they can.
AI solutions won't have enough shelf life, and thinking around AI is evolving too quickly.
Very happy to be wrong and learn from any information folks have otherwise.
I thought we might finally have a high profile prompt injection attack against a name-brand company we could point people to.
You're doing that by calling them "agentic systems".
- understanding existing systems
- what the pain points are
- making suggestions on how to improve those systems given the pain points
- that includes a mix of tech changes, process updates and/or new systems etc
Now, when it comes to implementing this, in my experience it usually ends up being the already in place dev teams.
Source: worked at a large investment bank that hired McKinsey and I knew one of the consultants from McK prior to working at the bank.
https://www.google.com/search?q=codewall+ai
Edit: Apparently, this is the CEO https://github.com/eth0izzle
I assume that means McKinsey would need to disclose it, or at least alert the former employees of the breach?
Well, there you go.
They’ve long been all hype no substance on AI and looks like not much has changed.
They might be good at other things, but I would run for the hills if McKinsey folks wanted to talk AI.