hustwindmaple1 commented on TPU Deep Dive   henryhmko.github.io/posts... · Posted by u/transpute
cheptsov · 6 months ago
It’s so ridiculous to see TPUs being compared to NVIDIA GPUs. IMO proprietary chips such as TPUs have no future due to the monopoly on cloud access: there is no competition across cloud service providers, and the only way to access TPUs is through GCP. As a result, nobody wants to use them regardless of the technology. This is GCP's biggest fault. Further down the road, the gap between NVIDIA GPUs and Google TPUs (call it the „moat“ or CUDA) is only going to grow.

The opposite is true of AMD, which is avoiding Google's mistakes.

My hope though is that AMD doesn’t start to compete with cloud service providers, e.g. by introducing their own cloud.

hustwindmaple1 · 6 months ago
Every major cloud vendor is trying to develop its own custom AI ASIC. Putting Google aside, Amazon has Trainium/Inferentia, which Anthropic uses quite extensively. Microsoft is doing something similar, although it is quite far behind. OpenAI is doing it. Meta is doing it. That's why the stock prices of Broadcom and Marvell have soared.
hustwindmaple1 commented on KumoRFM: A Foundation Model for In-Context Learning on Relational Data   kumo.ai/company/news/kumo... · Posted by u/cliffly
simplesort · 7 months ago
Jure Leskovec was my professor at Stanford a few years back, cool to see he's behind this.

He seemed like a good guy, and I got the sense that he was destined to do something big.

hustwindmaple1 · 7 months ago
I remember Kumo was focused on GNNs when it was founded (Jure's strength back then). Looks like they are pivoting, or have already pivoted.
hustwindmaple1 commented on Linear Programming for Fun and Profit   modal.com/blog/resource-s... · Posted by u/hmac1282
cweld510 · 7 months ago
You're right -- we do relax the integrality constraint, gaining performance at the expense of some precision, and we're generally able to paper over the difference at scheduling time. We've investigated integer linear programming for some use cases, but for solves to run quickly, we have to constrain the inputs significantly.
hustwindmaple1 · 7 months ago
You are basically using a heuristic, so your solutions are not guaranteed to be optimal. Integer programming is the way to go.
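To make the difference concrete, here is a minimal sketch of the two approaches using scipy as a stand-in solver (toy numbers for illustration only, nothing to do with Modal's actual scheduler):

    # LP relaxation vs. exact integer solve on a toy scheduling problem.
    import numpy as np
    from scipy.optimize import Bounds, LinearConstraint, linprog, milp

    values = np.array([4.0, 3.0, 5.0])   # value of scheduling each item
    sizes = np.array([2.0, 1.5, 3.0])    # resource each item consumes
    capacity = 4.0

    # LP relaxation: items may be fractionally scheduled (fast, approximate).
    lp = linprog(c=-values, A_ub=[sizes], b_ub=[capacity], bounds=[(0, 1)] * 3)
    print("LP relaxation:", lp.x)        # e.g. [1. 1. 0.167] -- fractional

    # Integer program: each item fully in or out (optimal, slower in general).
    ilp = milp(c=-values,
               constraints=LinearConstraint(sizes[np.newaxis, :], ub=capacity),
               integrality=np.ones(3), bounds=Bounds(0, 1))
    print("ILP:", ilp.x)                 # [1. 1. 0.] -- guaranteed optimal

The relaxed solve schedules a third of the last item, which then has to be papered over at scheduling time, exactly the trade-off described above.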
hustwindmaple1 commented on Google Gemini has the worst LLM API   venki.dev/notes/google-ge... · Posted by u/indigodaddy
simonw · 8 months ago
I still don't really understand what Vertex AI is.

If you can ignore Vertex, most of the complaints here are solved - the non-Vertex APIs have easy-to-use API keys, a great debugging tool (https://aistudio.google.com), a well-documented HTTP API, and good client libraries too.

I actually use their HTTP API directly (with the ijson streaming JSON parser for Python) and the code is reasonably straightforward: https://github.com/simonw/llm-gemini/blob/61a97766ff0873936a...

You have to be very careful when searching (using Google, haha) that you don't accidentally end up in the Vertex documentation though.

Worth noting that Gemini does now have an OpenAI-compatible API endpoint which makes it very easy to switch apps that use an OpenAI client library over to backing against Gemini instead: https://ai.google.dev/gemini-api/docs/openai

Anthropic have the same feature now as well: https://docs.anthropic.com/en/api/openai-sdk
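For anyone curious what the switch looks like in practice, here's a rough sketch using the official openai Python client; the base URL and model name are taken from the Gemini docs linked above, so verify the current values there:

    # Point the openai client at Gemini's OpenAI-compatible endpoint.
    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_GEMINI_API_KEY",  # an AI Studio key, not an OpenAI key
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    )
    response = client.chat.completions.create(
        model="gemini-1.5-flash",  # any Gemini model name
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(response.choices[0].message.content)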

hustwindmaple1 · 8 months ago
If you are not a paying GCP user, there is really no point in even looking at Vertex AI.

Just stick with AI Studio and the free developer API that comes with it; you will be much, much happier.

hustwindmaple1 commented on Tracing the thoughts of a large language model   anthropic.com/research/tr... · Posted by u/Philpax
colah3 · 9 months ago
Thanks for the feedback! I'm one of the authors.

I just wanted to make sure you noticed that this is linking to an accessible blog post that's trying to communicate a research result to a non-technical audience?

The actual research result is covered in two papers which you can find here:

- Methods paper: https://transformer-circuits.pub/2025/attribution-graphs/met...

- Paper applying this method to case studies in Claude 3.5 Haiku: https://transformer-circuits.pub/2025/attribution-graphs/bio...

These papers are jointly 150 pages and are quite technically dense, so it's very understandable that most commenters here are focusing on the non-technical blog post. But I just wanted to make sure that you were aware of the papers, given your feedback.

hustwindmaple1 · 9 months ago
Really appreciate your team's enormous efforts in this direction, not only the cutting-edge research (which I don't see OAI/DeepMind publishing any papers on) but also making the content more digestible for a non-research audience. Please keep up the great work!
hustwindmaple1 commented on SoftBank Group to Acquire Ampere Computing for 6.5B   group.softbank/en/news/pr... · Posted by u/geerlingguy
hn_throwaway_99 · 9 months ago
> Run by Masa, who many consider an oracle because he placed massive bets years before other people, and those bets have paid off big.

I'm not sure how many folks these days would consider him an "oracle". He clearly did fabulously well with some early Internet investments, but he also was famous for folly after folly of overpriced investments in the 2010s (the Softbank "Vision Funds").

I'd be curious if there is a simple accounting list of Softbank's major investments ranked from biggest winners to biggest losers. I guess it pretty much highlights the dynamics of the VC business model - you only need a few giant winners to offset the boatload of losers. In Softbank's case, I'm guessing their biggest winners are Yahoo Japan (which was the dominate site in Japan for a long time, and long after the US Yahoo fell into irrelevance) and Alibaba, which saw their early $20 million investment balloon into billions.

But did Softbank have any winners from their 2010s spending spree (around the time they shoveled good money after bad into WeWork)?

hustwindmaple1 · 9 months ago
His Alibaba/Arm bets are legendary. Both made $100B+.
hustwindmaple1 commented on China tells its AI leaders to avoid U.S. travel over security concerns   wsj.com/world/china/china... · Posted by u/bookofjoe
klipt · 10 months ago
Great public transit. Affordable housing. Huge variety of delicious food. Low crime. People who care enough about the social good to wear masks in crowded places during a pandemic.
hustwindmaple1 · 10 months ago
Housing is definitely not easily affordable. But the other points are valid.
hustwindmaple1 commented on How to scale your model: A systems view of LLMs on TPUs   jax-ml.github.io/scaling-... · Posted by u/mattjjatgoogle
jdeaton · a year ago
If you're using TPU, why are you using PyTorch?
hustwindmaple1 · 10 months ago
There is limited TPU support in PyTorch via torch_xla.
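Roughly what that looks like in practice (a minimal sketch, assuming a TPU host with torch_xla installed):

    # torch_xla exposes each TPU core as an XLA device; computation is
    # traced lazily until xm.mark_step() materializes the graph on the TPU.
    import torch
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()
    model = torch.nn.Linear(128, 10).to(device)
    x = torch.randn(8, 128, device=device)

    loss = model(x).sum()
    loss.backward()
    xm.mark_step()   # flush the pending XLA graph to the TPU
    print(loss.item())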
hustwindmaple1 commented on Andrej Karpathy: Deep Dive into LLMs Like ChatGPT [video]   youtube.com/watch?v=7xTGN... · Posted by u/leroman
ipsum2 · 10 months ago
He's made more than five videos covering basically the same topic of transformer architecture and training. Wonder what's different about this one?
hustwindmaple1 · 10 months ago
When he drops a vid, you don't ask questions. You watch first and then ask questions :)
