I Went to SQL Injection Court

Hi everyone, I'm the plaintiff in this lawsuit. I'm still working on my companion post for tptacek's post! I'll have it ready Soon TM, but feel free to ask me any questions here in the meantime.

While you're waiting, check out this older post: https://mchap.io/that-time-the-city-of-seattle-accidentally-...

qingcharles · 10 months ago

Matt, you do the Lord's work.

Bear in mind that Matt technically lost this, even with the backing of some of the absolute best civil rights lawyers in the country, Loevy and Loevy, fighting on his behalf. This shows you the absurd difficulty in fighting city hall, especially if you're crazy enough to do it without representation.

The one thing working in our favor is what is proposed in TFA: change the law. Once the state Supreme Court has ruled you're hosed unless you can get an amendment. Illinois has a very strong history of amending its FOIA statute, although a proportion of those changes are to further protect information from disclosure, not always on the side of sunshine.

Another change that needs to happen is strong punishment for bodies who lose these fights. In Illinois this is limited to a "$5000 civil penalty" against the body. What is a civil penalty? It's vaguely defined. They used to throw the money to the plaintiff, but in the later cases I fought they simply awarded the money to the county. As one State's Attorney said to me "I don't care if I lose every case, I just write a check out to myself."

(One final note: be careful what you wish for when you litigate; you can end up with an appellate decision like this one that solidifies in law the exact thing you were fighting. It's nobody's fault, but it happens. I ended up with one absurd decision that removed prisoners' rights rather than enhanced them.)

tptacek · 10 months ago

A losing public body is also generally on the hook for attorney's fees, which can be considerable. But the general problem here is that the public bodies are all spending someone else's money, so the real deterrent you have is how much of their time you can credibly threaten to eat up with legal actions.

dataflow · 10 months ago

I don't understand the argument that knowing the column names doesn't help an attacker? Especially in a database that doesn't allow wildcards, doesn't it make things much easier if you know you can do '); SELECT col FROM logins, as opposed to having to guess the column name?

And I don't think I disagree with the court on schema vs. file layouts either. It's not the file layout, but it's analogous: it tells you how the "files" (records) are laid out on the "file system" (database tables). For example, denormalization is very analogous to inlining of data in a file record. The notion that filesystems are effectively databases itself is a well known one too. How do you argue they aren't analogous?

tczMUFlmoNk · 10 months ago

You can always `SELECT table_name, column_name, data_type FROM information_schema.columns`, which is part of the SQL standard. https://www.postgresql.org/docs/current/infoschema-columns.h...

Plus, generally if you have SQL injection, you have multiple tries. You're not going to be locked out after one shot. And there are only so many combinations of `SELECT {id,userid,user_id,uid} FROM {user,users,login,logins,customer,customers}` before you find something useful.
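To put a number on "only so many combinations", here is a minimal sketch that tries a candidate list like the one above against a throwaway in-memory SQLite database. The `logins` table and its columns are invented for illustration, and SQLite stands in for whatever engine a real system uses.

```python
import itertools
import sqlite3

# Hypothetical target: a throwaway in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, password_hash TEXT)")

# Candidate names, adapted from the comment above.
columns = ["id", "userid", "user_id", "uid"]
tables = ["user", "users", "login", "logins", "customer", "customers"]


def guess_schema(conn, tables, columns):
    """Return the (table, column) pairs for which a SELECT succeeds."""
    hits = []
    for table, column in itertools.product(tables, columns):
        try:
            conn.execute(f"SELECT {column} FROM {table} LIMIT 1")
        except sqlite3.OperationalError:
            # "no such table" or "no such column": a wrong guess.
            continue
        hits.append((table, column))
    return hits


candidates = len(tables) * len(columns)  # 6 tables x 4 columns
hits = guess_schema(conn, tables, columns)
print(candidates, hits)
```

With these lists the whole search space is 24 queries, which is the point being made: a guessable name costs an attacker a handful of attempts, not a meaningful amount of work.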

chaps · 10 months ago

The Department of Justice disagrees and voluntarily releases column and table names: https://www.justice.gov/afp/media/1186431/dl?inline=

gwd · 10 months ago

> I don't understand the argument that knowing the column names doesn't help an attacker?

So Kevin Mitnick supposedly did most of his hacking using "social engineering". He'd call up some person, pretend to be in some other department within their organization, and ask them for some specific bit of information he needed to further his attack (or ask them to change some specific thing that would allow him to further his attack).

Would knowing the structure of Illinois governmental organizations help someone perform social engineering attacks against them? Yes, absolutely.

Should Illinois therefore keep the internal structures of their organizations -- the department names and the officials who run them -- secret? No, absolutely not.

First of all, if an attacker doesn't know them, they'll just use other social engineering attacks to figure them out; i.e., hiding the structure doesn't stop social engineering attacks, it just slows them down. Secondly, the value to the public of being able to navigate governmental structures far outweighs the cost of potential attacks.

This seems to me to be a direct analog: The "organizational structure" is the "database schema", and the "willingness to help a random person on the phone who seems to know what they're talking about" is the "SQL injection vulnerability". If an attacker knows the schema, their job is faster; but if they don't know the schema, they'll just use attacks to figure out the schema; so keeping it private doesn't stop an attack, only slow it down. And the benefit to the public of being able to issue FOIA requests far outweighs the cost of potential attacks.

AdamJacobMuller · 10 months ago

> And I don't think I disagree with the court on schema vs. file layouts either.

I disagree that the law should prohibit disclosing "file layouts" but it's pretty clear that the law does block that, and I fundamentally agree with you that schemas are directly analogous to file layouts and thus restricted.

dmurray · 10 months ago

And this part seems self-defeating:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates.

If it's the product of an attack, but not the end goal, surely it's of value to the attacker?

It seems clear to me that the statute does, as worded, in principle allow the city not to disclose the database schema - it would compromise the security of the system, or at the very least, it would for some systems, so each request needs to be litigated individually.

The proposed amendment sounds like a good way to fix this - is it likely that will pass?

econ · 10 months ago

If you have an injection friendly application then that is the security problem.

Say someone hacks the db, is the problem easy-to-guess table names? The column should never have been called "passwords"?

Perhaps 30 years ago that would sound good.

Obscurity should hardly ever be a line of defense. If it is the only defense the problem isn't that it wasn't obscure enough.

Edit:

I'll do you one better. If you so much as suggest that obscurity is good security, you openly invite people to fool around with your applications. The odds that holes are to be found there are much better than elsewhere.

ic4l · 10 months ago

I agree with you. Knowing the exact column names can speed up an attack and, in some cases, make it more feasible.

Why don’t they just request disclosure of what’s actually stored and allow renaming of the columns? It seems odd that knowing the exact column names would be necessary if the goal is simply to understand what data is being stored and its intended purpose.

fsckboy · 10 months ago

>It's not the file layout, but it's analogous...How do you argue they aren't analogous?

laws don't get to be analogous

foia request: "I'd like the report the committee prepared about the costs for the new bridge"

response: "denied. the report contains costs laid out in tables with headings, which while not being schemas are analogous, with schemas not being files but being analogous"

mcv · 10 months ago

Yeah, I think it's still useful info for an attacker. But only if the system was actually developed by amateurs who never heard of parameterized queries.

I find it a bit bizarre that the city uses "our system was developed with no consideration for security" as a valid defense.
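For readers who haven't seen the difference, a minimal sketch of a concatenated query next to a parameterized one, using Python's built-in `sqlite3` module. The `tickets` table and its rows are made up for the example; nothing here is taken from CANVAS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (plate TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [("ABC123", 50), ("XYZ789", 75)],
)


def find_unsafe(conn, plate):
    # String concatenation: the input becomes part of the SQL text.
    sql = "SELECT plate, amount FROM tickets WHERE plate = '" + plate + "'"
    return conn.execute(sql).fetchall()


def find_safe(conn, plate):
    # Placeholder: the input is bound as a value and never parsed as SQL.
    sql = "SELECT plate, amount FROM tickets WHERE plate = ?"
    return conn.execute(sql, (plate,)).fetchall()


payload = "' OR '1'='1"
unsafe_rows = find_unsafe(conn, payload)  # condition is always true: every row
safe_rows = find_safe(conn, payload)      # no plate has that literal value
print(len(unsafe_rows), safe_rows)
```

In the parameterized version it makes no difference whether the caller knows the column names, because the input never reaches the SQL parser.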

IshKebab · 10 months ago

'); SELECT * FROM logins --

foota · 10 months ago

Out of curiosity, could you ask for something like "one row of data from every table in the CANVAS database"?

mbreese · 10 months ago

This is a technical solution to a people problem. My reading is that the city doesn’t want to give up this information. If that’s the case, a technical solution wouldn’t work, no matter how easy it is. And given that this has already gone to the Illinois Supreme Court (and lost), the only solution is what is discussed at the end: updating the law.

hathawsh · 10 months ago

Kudos to you for enduring through this fight! We can only achieve transparency when people choose not to be complacent. Thank you.

What do you think are the next steps?

chaps · 10 months ago

My first step is to actually finish my post :)

But after that, getting a reasonable law passed to fix this now-broken nonsense.

doctorpangloss · 10 months ago

What are the administrators of CANVAS hiding?

chaps · 10 months ago

Hard to say. One of my personal drivers for this lawsuit is a tip I received that said that Chicago has a list of vendors whose tickets are dropped in the back-end. When I requested that info, the city said they had no such list. I trust my source, so having schema information could help figure out the extent and if they were lying.

butlike · 10 months ago

'ethnicity' header, 'net_income' header... wouldn't doubt chicago could be cave man enough to do this

maCDzP · 10 months ago

Have you tried looking for information from the developer about CANVAS? With any luck the developer has support documentation online that describes CANVAS and maybe you'll be able to narrow down your FOIA request.

manquer · 10 months ago

I think the point of the lawsuit is less about CANVAS schema itself and more about the ability of the government to hide this kind of information from FOIA requests.

notjulianjaynes · 10 months ago

Damn, this is impressive. I've been fighting with a state agency since December for 17,000 emails. I don't think I've ever tried to request emails and received zero push-back, but a $33 million estimate just, chef's kiss

gwerbret · 10 months ago

Very interesting case! Just one question: to what extent do changes in database schemata fall under FOIA in Illinois? That is, if they should change the database schema to conceal whatever it is they're fighting tooth and nail to hide, are they compelled to retain detailed information about that change? Or can they later present you (should the legislation pass) with a cleaned-up, nothing-to-see-here updated version?

mcnichol · 10 months ago

I don't want to take away any steam from your sails but giving bad information in regards to case law shouldn't be taken lightly. Your "expert witness" did you a disservice.

Schema is very much a critical field in terms of AuthZ privileges. Just knowing the structure is not far off from knowing the max entropy a password may hold. In regards to InfoSec, table structure is the recon phase which limits effort and minimizes time. Someone with that much time in security knows DBs will be hacked, not if but when. Time is an incredibly important tool which is why we have expirations on so many authN and authZ windows of attack.

I'm glad that you are challenging them but I believe a credible engineer would have made mince meat of your expert and hurt the rest of us who want to see you successful.

It's possible rewriting certain statutes can help us but there is no company worth its salt that would share DB schema.

thayne · 10 months ago

> Just knowing the structure is not far off from knowing the max entropy a password may hold

Not if the password is hashed, as it should be. Unless the schema somehow indicates that it uses a hash algorithm such as bcrypt that has a maximum password length. And even then, if they pre-hash the password, the password itself could have more entropy than that. And if there is a maximum password length, then you can probably figure that out via other means, like it being listed in the requirements when you set your password. It does tell you the size of the hash of the password, but if the maximum entropy is sufficiently high, as it should be, then it doesn't really matter; it would still be impractical to brute force.

> there is no company worth its salt that would share DB schema

So you are saying that every company with a self-hosted or open source product that uses a database isn't worth their salt? If your DB is running on a customer's infrastructure, that customer will by necessity have access to the schema. And likewise if the source code for a product is publicly available it is trivial to determine the schema.
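A small sketch of the hashing point, assuming PBKDF2 from Python's standard library as a stand-in for whatever scheme a real system uses: the stored value has the same size whatever the password length, so the column width describes the hash and not the password. The `prehash_for_bcrypt` helper is hypothetical and only illustrates the pre-hash step; it does not call bcrypt itself.

```python
import base64
import hashlib
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a fixed-size 32-byte hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )


def prehash_for_bcrypt(password: str) -> bytes:
    """Hypothetical pre-hash step: squeeze any password into 44 bytes,
    which fits under bcrypt's 72-byte input limit."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return base64.b64encode(digest)


salt = os.urandom(16)
short_hash = hash_password("hunter2", salt)
long_hash = hash_password("correct horse battery staple " * 10, salt)
print(len(short_hash), len(long_hash), len(prehash_for_bcrypt("x" * 500)))
```

Both hashes come out at 32 bytes, so a schema entry like a fixed-width hash column says nothing about how long or strong the underlying passwords are.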

ra · 10 months ago

They can produce a report using English-language labels instead of the db column names. Their argument isn't fact; it's vexatious obstinance.

hennell · 10 months ago

As mentioned in the post, FOIA tends to only include existing records/information; it doesn't extend to producing new work. So producing a new report would be considered too much work. (But fighting a lawsuit to not reveal the schema is fine.)

waitwhatwhoa · 10 months ago

When can we submit witness slips? Is there a mailing list for updates we can join? Good luck!

hn_user82179 · 10 months ago

This older post was such a fantastic read, thanks for sharing your story!

layoric · 10 months ago

It's dated from ~2 weeks ago... is there other date information I am missing?

monksy · 10 months ago

What I want to know: How much malort does the city expense a year?

foota · 10 months ago

> Normally, a flustered public records officer would just reject a giant request for being “unduly burdensome”… but this sort of estimate is practically unheard of. So much so that other FOIA nerds have told me that this is the second biggest request they've ever seen. The passive aggression is thick. Needless to say, it's not something I'm willing to pay for!

Welcome to Seattle :-)

geoduck14 · 10 months ago

> that's the second biggest FOIA request I've ever seen!

-Guybrush, from The Secret of Monkey Island

mmaunder · 10 months ago

Thanks for fighting the good fight for us all!

rnewme · 10 months ago

The footer links to dead x account.

While I believe that the city should share the schema, and that the city is effectively arguing for security through obscurity, I disagree with the main premise of the article: that knowing the SQL schema doesn't help the attacker.

If I understand the argument of the author here:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates

The author appears to imply that once the vulnerability is found, the schema can be recovered anyway. That is not always the case. It is perfectly viable to find a SQL injection that would allow fetching some data from the table that is being queried, but not from any other table, including `information_schema` or similar. If, on top of that, the only signal you get from the vulnerability is "query failed" or "query succeeded, here's the data", knowing the schema makes it much easier to exploit.

> the problem is that every computer system connected to the Internet is being attacked every minute of every day

If you specifically log failed DB queries, then you have already patched all the possible injections that such 24/7 attacks would find. The log would then not be deafening until someone stumbles on an actual injection (one that, for example, only exists for logged-in users, and thus is not found by bots), in which case you have time to see it and patch it before the attacker finds a way to actually utilize it.

Knowing the schema not only expedites their ability to take advantage of the vulnerability, but also increases their chances of probing the injection without triggering a query failure to begin with.
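A rough sketch of the "log failed DB queries" idea, assuming a thin wrapper around `sqlite3`. The `execute_logged` function and the `db.failures` logger name are invented for illustration. It shows the asymmetry described above: a wrong column guess leaves a log line, while a correct one does not.

```python
import logging
import sqlite3

log = logging.getLogger("db.failures")  # hypothetical logger name


def execute_logged(conn, sql, params=()):
    """Run a query and log any database error before re-raising it."""
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        log.warning("query failed: %s | sql=%r", exc, sql)
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER)")

# A query using the right column name succeeds silently.
execute_logged(conn, "SELECT user_id FROM logins")

# A wrong guess raises and is logged.
try:
    execute_logged(conn, "SELECT uid FROM logins")
except sqlite3.OperationalError:
    pass
```

Whether that asymmetry matters in practice is exactly what the replies below argue about.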

florbnit · 10 months ago

> that knowing SQL schema doesn't help the attacker.

Knowing the name of the service helps the attacker, knowing the name of government officials working at city hall helps attackers, knowing the legal description of what a parking ticket is helps attackers. If you are sued and decide you want to hack the government knowing the details of the suit against you helps you in your attack.

The barrier is not “any helpful information must be censored” the barrier is “don’t disclose passwords or code that would divulge backdoors” a schema cannot be that.

Volundr · 10 months ago

I'm not an attacker, just a boring old software dev. If there's an SQL Injection I'd say all bets are off re: schema.

That said, I've definitely worked on applications where knowing the schema could help you exfiltrate data in the absence of a full injection. The most obvious being a query that's constructed based on URL parameters, where the parameters aren't whitelisted.

So I actually do agree that the schema could potentially be of marginal benefit to the attacker.
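A minimal sketch of the fix for the URL-parameter case, assuming a hypothetical `tickets` table and a `sort` parameter supplied by the request. Column names can't be bound as query parameters, so the usual approach is to check them against a fixed allowlist before they reach the SQL text.

```python
import sqlite3

# Only these names may ever appear in the ORDER BY clause.
ALLOWED_SORT_COLUMNS = {"updated_at", "amount"}
ALLOWED_DIRECTIONS = {"asc", "desc"}


def build_order_by(column: str, direction: str = "asc") -> str:
    """Return an ORDER BY clause, rejecting anything not allowlisted."""
    if column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"sorting by {column!r} is not allowed")
    direction = direction.lower()
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unknown sort direction {direction!r}")
    return f"ORDER BY {column} {direction.upper()}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (amount INTEGER, updated_at TEXT, password_hash TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(50, "2024-01-02", "x"), (75, "2024-01-01", "y")],
)


def list_amounts(column: str, direction: str = "asc"):
    sql = "SELECT amount FROM tickets " + build_order_by(column, direction)
    return [row[0] for row in conn.execute(sql)]


print(list_amounts("amount", "desc"))
```

With the allowlist in place, a request for `sort=password_hash` is rejected before any SQL runs, whether or not the caller knows that column exists.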

butlike · 10 months ago

Wouldn't admitting this in court pin you with some sort of negligence? (if you knew having a schema revealed would compromise your app in some way).

pockmarked19 · 10 months ago

Reminds me that the recently discovered “leak emails using YouTube” exploit kicked off from reading what is essentially, a schema.

https://brutecat.com/articles/leaking-youtube-emails

robocat · 10 months ago

> kicked off from reading what is essentially, a schema.

I wouldn't call json a schema.

In the HN discussion tptacek replied that "$10,000 feels extraordinarily high for a server-side web bug": https://news.ycombinator.com/item?id=43025038

However his comment assumes monetisation is selling the bug; (tptacek deeply understands the market for bugs). However I would have thought monetisation could be by scanning as many YouTube users as possible for their email addresses: and then selling that limited database to a threat actor. You'd start the scan with estimated high value anonymous users. Only Google can guess how many emails would have been captured before some telemetry kicked off a successful security audit. The value of that list could possibly well exceed $10000. Kinda depends on who is doxxed and who wants to pay for the dox.

It's hard to know what the reputational cost to Google would be for doxxing popular anonymous accounts. I'm guessing video is not so often anonymous so influencers are generally not unknown?

I'm guessing trying to blackmail Google wouldn't work (once you show Google an account that is doxxed, they would look at telemetry logs or perhaps increase telemetry). I wonder if you could introduce enough noise and time delay to avoid Google reverse-engineering the vulnerability? Or how long before a security audit of code would find the vulnerability?

Certainly I can see some governments paying good money to dox anonymous videos that those governments dislike. The Saudis have money! You could likely get different government security departments to bid against each other... Thousands seems doable per dox? The value would likely decrease as you dox more.

tptacek · 10 months ago

If you specifically log failed database queries, where "failure" means "indicative of SQL injection", then nothing you can do with the schema is going to reduce the signal in that feed --- even a single SQL syntax error would be worth following up on. No, I don't think your logic holds.

kmoser · 10 months ago

I don't understand your logic. Knowledge of the schema can give an attacker an edge because they now know the exact column names to probe. Whether these probes get logged is irrelevant; even if it makes the system more vulnerable for an instant, it's still more vulnerable.

Even if logging failed queries is your metric, then knowledge of column names would make it more likely for an attacker to craft correct queries, which would not get logged, thus making your logs less useful than if the attacker had to guess at column names and, in so doing, incur failed queries.

lucb1e · 10 months ago

> nothing you can do with the schema is going to reduce the signal in that feed --- even a single SQL syntax error would be worth following up on

Syntax errors coming from your web application mean there is a page somewhere with a bugged feature, or perhaps the whole page is broken. Of course that's worth following up on?

Edit: maybe I should add a concrete example. I semi-regularly look at the apache error logs for some of my hobby projects (mainly I check when I'm working on it anyway and notice another preexisting bug). I've found broken pages based on that and either fixed them or at least silenced the issue if it was an outdated script or page anyway. Professionals might handle this more professionally, or less because it's about money and not just making good software, idk

wglb · 10 months ago

> "query failed" or "query succeeded, here's the data"

Blind SQL injection is a type where no error is produced, but some subtle signal can indicate success or failure. The most interesting one that I know about is where the presence of a successful injection was a normal looking response that was one byte longer than an unsuccessful injection. This was used to not only figure out the schema, but to fully exfiltrate the entire database.

There is nothing in the log on the server that indicates an error.

Most of the relatively introductory SQL injection exercises that I taught proceed without any knowledge of the schema.

This is why SQL injection is so insidious.
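A toy illustration of that kind of signal, run entirely against an in-memory SQLite database. The page, tables, and probe are all invented for the example, and SQLite's `sqlite_master` catalog stands in for whatever catalog a real engine exposes. No query here ever fails, yet the response length answers a yes/no question about the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("INSERT INTO items VALUES ('widget')")
conn.execute("CREATE TABLE logins (user_id INTEGER)")


def vulnerable_page(term: str) -> str:
    # Deliberately unsafe: the search term is concatenated into the SQL.
    sql = "SELECT name FROM items WHERE name = '" + term + "'"
    rows = conn.execute(sql).fetchall()
    return "<ul>" + "".join(f"<li>{name}</li>" for (name,) in rows) + "</ul>"


def condition_holds(condition: str) -> bool:
    """Infer whether an injected condition is true from response size alone."""
    probe = vulnerable_page("widget' AND (" + condition + ") AND '1'='1")
    baseline = vulnerable_page("widget' AND 1=2 AND '1'='1")
    return len(probe) > len(baseline)


table_exists = condition_holds(
    "(SELECT count(*) FROM sqlite_master"
    " WHERE type='table' AND name='logins') = 1"
)
print(table_exists)
```

Every query above is syntactically valid and returns normally, which is why a log that only records failed queries would show nothing.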

berkes · 10 months ago

Not just with SQLi, but I've managed to statistically prove "information" with timing attacks.

Where if you join another table (by e.g. requesting extra info in a graphql query) the response goes from ms to s or even m. Indicating the size of the joined table.

Or where I could change a "?sort[updated_at]=desc" to a "?sort[password_hash]" through trial-and-error and suddenly see the response time drop from ms to seconds (in this case finding columns that exist but aren't indexed).

Even if the response content is exactly the same, we know things exist, are big, not indexed, or simply present, by timing the attack.

A famous one is obviously the timing trick to find out that an email is in the system because "user = user.find(email) && user.password_matches(password)" short-circuits if the email does not exist but spends significant time on hashing the password for matching it. A whole lot of backends and apps make this mistake.
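A sketch of that login mistake and one common mitigation, assuming an in-memory user store and PBKDF2 from the standard library. All names here are hypothetical. The idea is to do the same hashing work whether or not the email exists, and to compare in constant time.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # illustrative work factor


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"alice@example.com": (_salt, hash_password("s3cret", _salt))}

# Dummy record so unknown emails cost the same hashing work.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("dummy-password", _DUMMY_SALT)


def login_short_circuit(email: str, password: str) -> bool:
    # Returns immediately for unknown emails: the timing leaks existence.
    record = USERS.get(email)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(hash_password(password, salt), stored)


def login_uniform(email: str, password: str) -> bool:
    # Always hashes once, then compares in constant time.
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    matches = hmac.compare_digest(hash_password(password, salt), stored)
    return matches and email in USERS


print(login_uniform("alice@example.com", "s3cret"))
```

This only evens out the hashing cost; other differences, such as the database lookup itself, can still leak timing in a real system.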

gerdesj · 10 months ago

That's where the court's technical distinction between the words: "could" and "would", is important. It appears they have reduced the distinction to a risk assessment which is more objective than opining wildly!

For example: I've just re-wired a three gang light switch. I verified power on with my multimeter (test the meter), cut the power and then retested all the circuits to make sure I had got it right.

It turns out that switch three is on a separate ring main. Cool I didn't get to test my body's ability to take a whopper of a shock. In the UK it is common to have upstairs and downstairs rings for light circuits. Our kitchen has quite a few lights in it so it got a separate ring as well. Anyway there are quite a lot of wires in there because all of them are two way switches. Oh and I am allowed to work on them because of the switch location - not kitchen and not bathroom, ie a low risk location

I noted down the connections, and took them all out. I put Wagos over the flying ends to make them safe, turned the power back on and got on with the job in hand.

I then cut the power (both circuits) checked again with my Fluke. Oh bollocks ... enable power, test the Fluke and then cut power again and recheck the circuits.

Now I re-terminated all the connections. There was plenty of additional wire so I decided to cut and re-strip the conductors, to make sure that I avoided potential failures due to "work hardening" from the inevitable pushing and pulling and "gentle" forcing into position. Once all the conductors were screwed down I pulled on them fairly forcefully to make sure they won't fall out.

I screwed down the switch face plate and restored power. It's a brushed metal finish switch so I did test it was not live, because I'm careful. I tested the functionality, i.e. all three switch circuits (three) from all the switches (six).

So, given that description is it possible that the connectors might fall out in the future and short on say, the metal back box. Of course it is possible. It could happen but would it happen?

You could postulate all sorts of scenarios. Perhaps I may be careful but I might be cack-handed and forgetful and got something wrong anyway and a wire might still drop out. Now we are at the point of whataboutery, and that won't wash.

The would/could distinction is a powerful one and it is analogous to how we do risk assessments.

I'm certainly not saying you are wrong in your assessment but I think you are fiddling with details to conjure up a "could" and not a "would". I agree that knowing the schema would assist a hacking attempt but would it make a successful crack more likely - no I don't think so. It is a classic case of obscurity despite security but a rather more complicated one than putting the ssh daemon on port 2222.

Cripes - I need to get out more!