It is a little sad that they gave someone an uber machine and this was the best he could come up with.
Question answering is interesting but not the most interesting thing one can do, especially with a home rig.
The realm of the possible
Video generation: CogVideoX at full resolution, longer clips
Mochi or Hunyuan Video with extended duration
Image generation at scale:
FLUX batch generation — 50 images simultaneously
Fine-tuning:
Actually train something — show LoRA on a 400B model, or full fine-tuning on a 70B
but I suppose "You have it for the weekend" means chatbot go brrrrr and snark
Yeah, that's what I wanted to see too.
The advent of agentic coding is probably punch #2 in the one-two punch against juniors, but it's an extension of a pattern that's been unfolding for probably 5+ years now.
There's also "GPT-4.1 or GPT-5", but that's not what my question implied, which was that it's weird to offer Sonnet but not Opus.
When people start studying theory of mind, someone usually jumps in with this thought. It's more or less a description of Functionalism (although minus the "mental state"). It's not very popular because most people can immediately identify a phenomenon of understanding separate from the function of understanding. People also have immediate understanding of certain sensations, e.g. the feeling of balance when riding a bike, sometimes called qualia. And so on, and so forth. There is plenty of study on what constitutes understanding, and most of it healthily dismisses the "string of words" theory.
Do you think there are components of the cat's brain that calculate forces and trajectories, incorporating the gravitational constant and the cat's static mass?
Probably not.
So, does a cat "understand" the physics of jumping?
The cat's knowledge about jumping comes from trial and error, and its brain builds a neural network that encodes the important details about successful and unsuccessful jumping parameters, even if the cat has no direct cognitive access to those parameters.
So the cat can "understand" jumping without having a "meta-understanding" of its understanding. When a cat "thinks" about jumping and prepares to leap, it isn't rehearsing its understanding of the physics, but repeating the ritual that has historically led it to successful jumps.
I think the theory of mind of an LLM is like that. In my interactions with LLMs, I think "thinking" is a reasonable word to describe what they're doing. And I don't think it will be very long before I'd also use the word "consciousness" to describe the architecture of their thought processes.
I'm super happy about this, since I took a bet that exactly this would happen. I've just been building a consumer TTS app that could only work with significantly cheaper TTS prices per million characters (or self-hosted models).
FYI, speech marks provide a millisecond timestamp for each word in a generated audio file/stream (and a start/end index into your original source string), as a stream of JSONL objects, like this:
{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}
{"time":732,"type":"word","start":7,"end":11,"value":"it's"}
{"time":932,"type":"word","start":12,"end":16,"value":"nice"}
{"time":1193,"type":"word","start":17,"end":19,"value":"to"}
{"time":1280,"type":"word","start":20,"end":23,"value":"see"}
{"time":1473,"type":"word","start":24,"end":27,"value":"you"}
{"time":1577,"type":"word","start":28,"end":33,"value":"today"}
AWS uses these speech marks (with variants for "sentence", "word", "viseme", or "ssml") in their Polly TTS service...
The sentence or word marks are useful for highlighting text as the TTS reads aloud, while the "viseme" marks are useful for doing lip-sync on a facial model.
Former Founder / Principal Engineer of https://shaxpir.com
Location: Portland, OR
Remote: Yes
Willing to Relocate: Yes
Technologies: Java, AWS, TypeScript, Elasticsearch, Realtime Collab, LLM APIs, Claude Code, etc
Résumé/CV: https://www.linkedin.com/in/benjismith/
Email: benji@benjismith.net