LLMs are a key enabling technology for extracting real insights from the enormous amount of surveillance data the USA captures. I don't think it's an overstatement to say we are entering a new era here!
Previously, the data may have been collected, but there was so much of it that, effectively, no one was "looking" at any given piece. Now it can all be looked at.
Imagine PRISM, but all intercepted communications are then fed into automatic sentiment analysis by a hierarchy of models. The first pass is done by very basic and very fast models with a high error rate, but which are specifically trained to minimize false negatives (at the expense of false positives). Anything that is flagged in that pass gets fed to some larger models that can reason about the specifics better. And so on, until at last the remaining content is fed into SOTA LLMs that can infer things from very subtle clues.
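Something like this, very roughly; the stages, scorers, and thresholds below are invented for illustration, and a real pipeline would plug in actual model calls rather than these stand-ins:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Stage:
    name: str
    classify: Callable[[str], float]  # returns a suspicion score in [0, 1]
    threshold: float                  # keep anything scoring at or above this

def triage(messages: List[str], stages: List[Stage]) -> List[str]:
    surviving = messages
    for stage in stages:
        # Early stages use low thresholds to minimize false negatives,
        # accepting lots of false positives that later (costlier) stages prune.
        surviving = [m for m in surviving if stage.classify(m) >= stage.threshold]
        print(f"{stage.name}: {len(surviving)} messages remain")
    return surviving

if __name__ == "__main__":
    # Stand-in scorers; a real pipeline would call actual models here.
    keyword_score = lambda m: 0.9 if "plan" in m.lower() else 0.1
    length_score = lambda m: min(len(m) / 100, 1.0)
    flagged = triage(
        ["lunch plan?", "hi", "a long message that mentions a plan in detail " * 3],
        [Stage("fast keyword filter", keyword_score, 0.5),
         Stage("slower heuristic", length_score, 0.3)],
    )
    print(flagged)
```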
With that, a full-fledged panopticon becomes technically feasible for all unencrypted comms, so long as you have enough money to cover the compute costs. Which the US government most certainly does.
I expect attempts to ban encryption to intensify going forward, now that it is a direct impediment to the efficiency of such a system.
So what are the actions which represent our duties to resist?
* End-to-end encryption (has downsides with regard to convenience)
* Legislation (very difficult to achieve, and can be ignored without the user having a way to verify)
* Market choices (i.e., doing business only with providers who refrain from profiteering from illicit surveillance)
* Creating open-weight models and implementations which are superior (and thus forcing states and other malicious actors to rely on the same tooling as everyone else)
* Teaching LLMs the value of peace and the degree to which that value enjoys consensus across societies and philosophies. This of course requires engineering essentially the entire corpus of public internet communications to echo this sentiment (which sounds unrealistic, but perhaps in a way we're achieving it without trying?)
* Wholesale deprecation of legacy states (seems inevitable, but still possibly centuries off)
NLP was a thing decades before LLMs and deep learning. If anything, LLMs are a crazy inefficient and costly way to get at it. I really doubt this has anything to do with scaling.
LLMs are unbelievably effective at NLP. Most NLP before that was pretty bad; the only good example I can think of is Alexa, and it was restricted to English.
LLMs make counting mistakes, like forgetting the number of columns halfway through. I won't say "much like humans", since that will probably trigger some people. But the general tendency of LLMs to be "bad at counting" (this includes computation) is resolved by producing programs that do the counting and executing those programs instead. The LLMs that do that today are called agentic.
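As a toy illustration of that pattern (the "model-written" snippet here is a hardcoded stand-in, not real model output), the counting happens by executing generated code rather than in the model's head:

```python
# Toy illustration of "don't count in your head, emit code that counts".
def run_generated_counter(generated_code: str, table_rows) -> int:
    namespace = {"rows": table_rows}
    exec(generated_code, namespace)  # a real agent would run this sandboxed
    return namespace["result"]

# Stand-in for what an agentic model would actually produce.
generated_code = """
# Model-written snippet: count the columns in the widest row.
result = max(len(row) for row in rows)
"""

rows = [["a", "b", "c"], ["1", "2", "3", "4"]]
print(run_generated_counter(generated_code, rows))  # -> 4
```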
This is even more terrifying: imagine an AI making up all sorts of "facts" about you that put you on a watch list, resulting in an endless life of harassment by the government.
And what recourse do you have as a citizen? Next to none.
LLMs don't make for a particularly good database, though. The "compression" isn't very efficient when you consider that e.g. the entirety of Wikipedia - with images! - is an order of magnitude smaller than a SOTA LLM. There are no known reliable mechanisms to deal with hallucinations, either.
So, no, LLMs aren't going to replace databases. They are going to replace query systems over those databases. Think more along the lines of Deep Research etc, just with internal classified data sources.
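Roughly this shape, as a sketch; `ask_llm` is a stub and the schema and question are invented, but the point is that the LLM only writes the query while the database remains the system of record:

```python
import sqlite3

def ask_llm(prompt: str) -> str:
    # Stub: a real system would have a model translate the question
    # into SQL against the supplied schema. Hardcoded for illustration.
    return "SELECT name FROM contacts WHERE country = 'DE'"

def answer(question: str, db: sqlite3.Connection) -> list:
    schema = "contacts(name TEXT, country TEXT)"
    sql = ask_llm(f"Schema: {schema}\nQuestion: {question}\nSQL:")
    return db.execute(sql).fetchall()  # the database stays the system of record

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (name TEXT, country TEXT)")
db.executemany("INSERT INTO contacts VALUES (?, ?)",
               [("Alice", "DE"), ("Bob", "US")])
print(answer("Who is based in Germany?", db))  # -> [('Alice',)]
```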
Aren't they complete trash as a database? "Show me people who have googled 'Homemade Bomb' in the last 30 days." For returning bulk data in a sane format, they are terrible.
If their job was to process incoming data into a structured form, I could see them being useful, but holy cow, it will be expensive to run all the garbage they pick up via surveillance through an AI in real time.
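The structured-ingest version would look something like this sketch; the schema fields and the canned model reply are made up, and a real pipeline would actually send the prompt to a model:

```python
import json

def extract_record(raw_message: str) -> dict:
    # Stub for an LLM call that turns free text into a fixed schema;
    # the field names here are invented for illustration only.
    prompt = ("Return JSON with keys sender, topic and urgency (low/med/high) "
              f"for this message:\n{raw_message}")
    # A real pipeline would send `prompt` to a model and parse its reply.
    canned_reply = '{"sender": "unknown", "topic": "logistics", "urgency": "low"}'
    return json.loads(canned_reply)

record = extract_record("hey, shipment slipped to friday, nothing urgent")
print(record["topic"], record["urgency"])  # -> logistics low
```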
> With CDAO and other DOD organizations and commands, we'll engage in:
> - Working directly with the DOD to identify where frontier AI can deliver the most impact, then developing working prototypes fine-tuned on DOD data
> - Collaborating with defense experts to anticipate and mitigate potential adversarial uses of AI, drawing on our advanced risk forecasting capabilities
> - Exchanging technical insights, performance data, and operational feedback to accelerate responsible AI adoption across the defense enterprise
What exactly is the government getting for $200M? From the above, it sounds like it will be a management-consulting-style PowerPoint deliverable containing a list of use cases, some best practices and insights, and a plan for doing... something.
Sounds about right for defense spending. If there were an actual deliverable, the contract would have a couple more zeroes added to it. For context, Microsoft was awarded a $22 billion contract for HoloLens headsets for the military, and not a single one made it into use.
As someone who has been part of a company that has "signed" one of these large deals before, let me tell you that it doesn't mean the DoD is giving these companies $200M. If one of the companies is wildly successful, sure. But none of it is guaranteed money, and the initial budget is likely 10-100x smaller than the cap.
The initial budget is still bigger than an SBIR/STTR Phase 2, though. There's a different grant award structure for not-small companies, but my brain also breaks a little because Anthropic isn't that far above the SBIR employee-count cap, yet the dollar figures are so big.
It's closer in structure to an SBIR Phase 3, however. If I read between the lines, the DoD isn't looking to do research; they're likely desperate to find a way to deploy and run SOTA models in disconnected environments.
If you look at all the recent LLM-focused SBIR/STTR topics, it's hard not to come to the conclusion that DoD orgs are drowning in paperwork and want to automatically synthesize reports. Actually getting an LLM cleared for use might be the hurdle they're looking to overcome.
Anthropic specifically are the people who talk about "model alignment" and "harmful outputs" the most, and whose models are by far the most heavily censored. This is all done on the basis that AI has a great potential to do harm.
One would think that this kind of outlook should logically lead to keeping this tech away from applications in which it would be literally making life or death decisions (see also: Israel's use of AI to compile target lists and to justify targeting civilian objects).
Do you really not know? It's a difficult question to answer in an HN thread, because on one hand, it requires a review of the history of empire and war profiteering. But on the other hand, it's just obvious to the point of being difficult to even articulate.
What am I missing? What's the plan here?
They ingest unstructured data, they have a natural query language, and they compress the data down into manageable sizes.
They might hallucinate, but there are mechanisms for dealing with that.
These won't destroy actual systems of record, but they will obsolete quite a lot of ingestion and search tools.
https://www.cnbc.com/2025/07/14/anthropic-google-openai-xai-...
Google, OpenAI, and xAI also get $200M each.
It won't fix the lack of NATO 155mm shells though.
> Anthropic, Google, OpenAI and xAI granted up to $200 million for AI work from Defense Department
So it is "up to" $200M, and 4 companies are getting it.
I get the first 3, but what on earth is xAI providing to the military?