Readit News
paradite commented on Google will allow only apps from verified developers to be installed on Android   9to5google.com/2025/08/25... · Posted by u/kotaKat
rvnx · 10 hours ago
If this is a thing then the solution they offer is incorrect. A big giant red screen: “warning the identity of this application developer has not been verified and this could be an application stealing your data, etc” would have worked.

What they want is to get rid of apps like YouTube Vanced that are making them lose money (and other Play Store apps)

paradite · 8 hours ago
It won't work because of too many false positives. People are already trained to ignore warnings, like how they blindly accept T&C without reading.
paradite commented on DeepSeek-v3.1   api-docs.deepseek.com/new... · Posted by u/wertyk
segmondy · 5 days ago
garbage benchmark, inconsistent mix of "agent tools" and models. if you wanted to present a meaningful benchmark, the agent tools would stay the same and then we could really compare the models.

there are plenty of other benchmarks that disagree with these, with that said. from my experience most of these benchmarks are trash. use the model yourself, apply your own set of problems and see how well it fares.

paradite · 4 days ago
Hey. I like your roast of benchmarks.

I also publish my own evals on new models (using coding tasks that I curated myself, without tools, rated by a human using rubrics). Would love for you to check them out and share your thoughts:

Example recent one on GPT-5:

https://eval.16x.engineer/blog/gpt-5-coding-evaluation-under...

All results:

https://eval.16x.engineer/evals/coding

paradite commented on Mark Zuckerberg freezes AI hiring amid bubble fears   telegraph.co.uk/business/... · Posted by u/pera
Macha · 5 days ago
This link is not paywalled, unlike the WSJ link.
paradite · 5 days ago
It's paywalled for me.
paradite commented on Node.js is able to execute TypeScript files without additional configuration   nodejs.org/en/blog/releas... · Posted by u/steren
rovingeye · 9 days ago
I can understand the argument, since npm has no solution for TypeScript packages, unlike JSR:

"You publish TypeScript source, and JSR handles generating API docs, .d.ts files, and transpiling your code for cross-runtime compatibility."

Still would have been nice to have this for private packages.

This makes Deno/Bun much more attractive alternatives

paradite · 8 days ago
JSR does that? Now that might be a good reason to move my packages over to get rid of tsup.
paradite commented on     · Posted by u/NoScopeNinja
paradite · 17 days ago
Is this from Moonshot AI (the company behind Kimi K2), or a 3rd party?

Judging from the design, I assume it's not officially related to the model.

paradite commented on Crush: Glamourous AI coding agent for your favourite terminal   github.com/charmbracelet/... · Posted by u/nateb2022
cristea · a month ago
I would love a comparison between all these new tools, like this with Claude Code, opencode, aider and cortex.

I just can’t get an easy overview of how each tool works and is different

paradite · a month ago
Performance depends not only on the tool, but also on the model, the codebase you are working on (context), and the task given (prompt).

And all these factors are not independent. Some combinations work better than others. For example:

- Claude Sonnet 4 might work well for feature implementation on backend Python code using Claude Code.

- Gemini 2.5 Pro works better for bug fixes on frontend React codebases.

...

So you can't just test the tools alone and keep everything else constant. Instead you get a combinatorial explosion of tool * model * context * prompt to test.
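To get a feel for how quickly that grows, here is a minimal Python sketch that enumerates the full test matrix. The tool and model names are just the ones mentioned in this thread, the context and prompt labels are illustrative, and `build_test_matrix` is a hypothetical helper, not part of any real eval product.

```python
from itertools import product

# Illustrative values only: names are taken from this thread,
# not from any real benchmark configuration.
tools = ["Claude Code", "opencode", "aider"]
models = ["Claude Sonnet 4", "Gemini 2.5 Pro"]
contexts = ["backend Python", "frontend React"]
prompts = ["feature implementation", "bug fix"]


def build_test_matrix(tools, models, contexts, prompts):
    """Return every (tool, model, context, prompt) combination (hypothetical helper)."""
    return list(product(tools, models, contexts, prompts))


matrix = build_test_matrix(tools, models, contexts, prompts)

# The number of runs is the product of the sizes of each factor.
print(len(matrix))  # 3 * 2 * 2 * 2 = 24
```

Even with this tiny set of options you already need 24 runs, and adding one more value to any factor multiplies the total rather than adding to it.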

16x Eval can tackle parts of the problem, but it doesn't cover factors like tools yet.

https://eval.16x.engineer/

paradite commented on Study mode   openai.com/index/chatgpt-... · Posted by u/meetpateltech
kobenni · a month ago
In my experience asking questions to Claude, the amount of incorrect information it gives is on a completely different scale in comparison to traditional sources. And the information often sounds completely plausible too. When using a textbook, I would usually not Google every single piece of new information to verify it independently, but with Claude, doing that is absolutely necessary. At this point I only use Claude as a stepping stone to get ideas on what to Google because it is giving me false information so often. That is the only "effective" usage I have found for it, which is obviously much less useful than a good old-fashioned textbook or online course.

Admittedly I have less experience with ChatGPT, but those experiences were equally bad.

paradite · a month ago
What kind of questions / domains were you encountering false information on?

u/paradite

Karma: 2555 · Cake day: March 23, 2014
About
Building products:

16x Eval - Effortlessly evaluate prompts and models

https://eval.16x.engineer/

16x Prompt - Streamline AI Coding Workflow

https://prompt.16x.engineer/

16x Tracker - Track, Filter, and Organize Reddit Keyword Hits

https://tracker.16x.engineer/

--

Older projects

https://ai-simulator.com/

https://16x.engineer/

[ my public key: https://keybase.io/paradite; my proof: https://keybase.io/paradite/sigs/KmrMtMWIIJSc-46410nPNevLQ4ICFUNP-F2RTCKTVhc ]
