ichiwells commented on The unreasonable effectiveness of an LLM agent loop with tool use   sketch.dev/blog/agent-loo... · Posted by u/crawshaw
forgingahead · 7 months ago
Love to see the Ruby implementations! Thanks for sharing.
ichiwells · 7 months ago
Thank you so much for sharing this!

We are using Ruby to build a powerful AI toolset in the construction space. We love how simple all of the SaaS parts are and how little wheel-reinventing is needed, but the Ruby LLM SDK ecosystem is lagging a bit, so we've written a lot of our own low-level tools.

(btw we are also hiring rubyists https://news.ycombinator.com/item?id=43865448)

ichiwells commented on Ask HN: Who is hiring? (May 2025)    · Posted by u/whoishiring
ichiwells · 8 months ago
Ichi | Remote (US) | Full-Time | Senior Software Engineer, Full Stack | LLM-powered construction code compliance

Ichi is building an AI-powered professional toolset to transform construction permitting and code compliance. We're helping municipalities, plan examiners, architects, and construction professionals streamline the permitting process to reduce bottlenecks—ultimately accelerating construction timelines in an $11T global industry.

We're a small team (you == employee #6) backed by Costanoa Ventures looking for a high-ownership Senior Software Engineer (8+ years exp) who can build from scratch and deeply cares about product quality. You'll design and implement product features that integrate intuitive UX and powerful multi-modal (and multi-model) LLM pipelines, working directly with users to refine our approach.

Tech stack: Ruby, Rails, Python, React, TypeScript, GraphQL, Postgres, AWS.

Must have: Full stack experience, product intuition, and interest in LLMs.

CS degree or equivalent preferred; we work on hard problems, so first-principles thinking and attention to detail are essential.

Our name "Ichi" (Japanese for "city", and "one") reflects our mission to improve the physical spaces around us through never-before-possible technology and bridging a fundamental information asymmetry.

Apply: https://www.ichiplan.com/senior-software-engineer

ichiwells commented on Apple's AI isn't a letdown. AI is the letdown   cnn.com/2025/03/27/tech/a... · Posted by u/ndr42
ichiwells · 9 months ago
One of Apple's biggest misses with "AI," in my opinion, is not building a universal search.

For all the hype LLM generation gets, I think the rise of LLM-backed “semantic” embedding search does not get enough attention. It’s used in RAG (which inherits the hallucinatory problems), but seems underutilized elsewhere.

The worst searches I've seen (and, coincidentally/paradoxically, the ones I use the most) are Gmail's and Dropbox's, both of which cannot find emails or files that I know exist, even when I search the exact email subject and file name keywords.

Apple could arguably solve this with a universal search SDK, and I’d value this far more than yet-another-summarize-this-paragraph tool.

ichiwells commented on RubyLLM: A delightful Ruby way to work with AI   github.com/crmne/ruby_llm... · Posted by u/ksec
miki123211 · 9 months ago
What's a good way to learn the modern Ruby ecosystem?

I played with Ruby when I was a teenager (~2015 or so), and I definitely remember enjoying it. I know there's still a vocal group of users who love it, so I would be interested in digging in again.

ichiwells · 9 months ago
I would actually start with the Rails Guides; they're very good, and running the given commands should actually work:

https://guides.rubyonrails.org/getting_started.html

Just have a toy app you want to build in mind.

ichiwells commented on RubyLLM: A delightful Ruby way to work with AI   github.com/crmne/ruby_llm... · Posted by u/ksec
ichiwells · 9 months ago
I run engineering for a venture backed AI-first startup and we use Ruby/Rails.

For us, it made sense to leverage one of the best domain modeling and ORM frameworks out there. Most of our inference is HTTP calls to foundation models, but we can still fine-tune and host models on GPUs using Python.

Inference matters, but a big part of building an effective user platform is the same old SaaS problems we've had before, and Rails just works. Inbound and outbound email done in a day. Turning an OCR'd title from ALL CAPS into Title Case is one method call, not a whole custom algorithm, etc.
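(For instance, the Title Case bit is one call to ActiveSupport's String#titleize; the sample string here is made up:)

```
require "active_support/core_ext/string/inflections"

# ALL CAPS OCR output -> Title Case in one method call
"NOTICE OF COMMENCEMENT".titleize
# => "Notice Of Commencement"
```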

A lot of people seem to think Ruby is slow for some reason, but it's as fast as Python, and with Falcon it's as fast as Node for async behavior. Safe to say that the application language taking 0.03 seconds instead of 0.003 seconds is absolutely not the bottleneck in LLM-heavy workflows where you have to wait 3 seconds for the first token anyway.
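To illustrate the async point, here's a minimal sketch using the async and async-http gems (which Falcon is built on); the URLs are placeholders:

```
require "async"
require "async/http/internet"

Async do
  internet = Async::HTTP::Internet.new
  urls = ["https://example.com/a", "https://example.com/b"]

  # Each request runs in its own fiber, so total wall time is roughly
  # the slowest response, not the sum of all of them.
  bodies = urls.map { |url| Async { internet.get(url).read } }.map(&:wait)
ensure
  internet&.close
end
```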

And yes, metaprogramming is a powerful tool with which you can easily shoot yourself in the foot. Culturally, we just don't write any code that isn't greppable, so we avoid method_missing kinds of things unless they're in a robust gem like Active Record. A pretty trivial problem to solve, really.

PS - We’re hiring, if that philosophy aligns with you!

ichiwells commented on Long Read: Lessons from Building Semantic Search for GitHub and Why I Failed   tzx.notion.site/What-I-Le... · Posted by u/zxt_tzx
whakim · 9 months ago
I was the first employee at a company which uses RAG (Halcyon), and I’ve been working through issues with various vector store providers for almost two years now. We’ve gone from tens of thousands to billions of embeddings in that timeframe - so I feel qualified to at least offer my opinion on the problem.

I agree that starting with pgvector is wise. It’s the thing you already have (postgres), and it works pretty well out of the box. But there are definitely gotchas that don’t usually get mentioned. Although the pgvector filtering story is better than it was a year ago, high-cardinality filters still feel like a bit of an afterthought (low-cardinality filters can be solved with partial indices even at scale). You should also be aware that the workload for ANN is pretty different from normal web-app stuff, so you probably want your embeddings in a separate, differently-optimized database. And if you do lots of updates or deletes, you’ll need to make sure autovacuum is properly tuned or else index performance will suffer. Finally, building HNSW indices in Postgres is still extremely slow (even with parallel index builds), so it is difficult to experiment with index hyperparameters at scale.
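(As a hypothetical Rails-flavored sketch of the partial-index trick mentioned above; the table, column, and predicate are invented:)

```
class AddPartialEmbeddingIndex < ActiveRecord::Migration[7.1]
  def change
    # One partial HNSW index per value of a low-cardinality filter column,
    # so the ANN search never has to post-filter those rows away.
    add_index :documents, :embedding,
              using: :hnsw, opclass: :vector_cosine_ops,
              where: "status = 'published'",
              name: "index_documents_embedding_published"
  end
end
```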

Dedicated vector stores often solve some of these problems but create others. Index builds are often much faster, and you’re working at a higher level (for better or worse) so there’s less time spent on tuning indices or database configurations. But (as mentioned in other comments) keeping your data in sync is a huge issue. Even if updates and deletes aren’t a big part of your workload, figuring out what metadata to index alongside your vectors can be challenging. Adding new pieces of metadata may involve rebuilding the entire index, so you need a robust way to move terabytes of data reasonably quickly. The other challenge I’ve found is that filtering is often the “special sauce” that vector store providers bring to the table, so it’s pretty difficult to reason about the performance and recall of various types of filters.

ichiwells · 9 months ago
> Finally, building HNSW indices in Postgres is still extremely slow (even with parallel index builds), so it is difficult to experiment with index hyperparameters at scale

For anyone coming across this without much experience: when building these indexes in pgvector, it makes a massive difference to increase your maintenance memory above the default. Do that either in a separate DB, as whakim mentioned, or during specific maintenance windows, depending on your use case:

```
SHOW maintenance_work_mem;     -- the default is only 64MB
SET maintenance_work_mem = X;  -- give index builds far more than that
```
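In Rails migration form, that looks something like this (table and column names are hypothetical, and the 8GB figure is just an example; size it to your hardware):

```
class AddHnswIndexToLawSections < ActiveRecord::Migration[7.1]
  disable_ddl_transaction!  # needed for session settings + CONCURRENTLY

  def up
    execute "SET maintenance_work_mem = '8GB'"          # default is only 64MB
    execute "SET max_parallel_maintenance_workers = 7"  # parallelize the build
    add_index :law_sections, :embedding,
              using: :hnsw, opclass: :vector_cosine_ops,
              algorithm: :concurrently
    execute "RESET maintenance_work_mem"
  end

  def down
    remove_index :law_sections, :embedding
  end
end
```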

In one of our semantic search use cases, we control the ingestion of the searchable content (laws, basically) so we can control when and how we choose to index it. And then I've set up classic relational db indexing (in addition to vector indexing) for our quite predictable query patterns.

For us, that means our actual semantic DB query takes about 10ms: starting from tens of millions of entries, filtering down to the ~50k (jurisdictionally, in our case) relevant ones, and then performing vector similarity search with a topK/limit.

Built into our ORM, with zero round-trip latency to Pinecone and no syncing issues.
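Concretely, the shape of that query looks something like this (a sketch using the neighbor gem on top of ActiveRecord; the model, column, and variable names are made up):

```
class LawSection < ApplicationRecord
  has_neighbors :embedding  # from the neighbor gem (pgvector integration)
end

LawSection
  .where(jurisdiction_id: jurisdiction.id)  # relational pre-filter, ~50k rows
  .nearest_neighbors(:embedding, query_embedding, distance: "cosine")
  .limit(20)
```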

EDIT: I imagine whakim has more experience than me and YMMV; just sharing lessons learned. Even with higher maintenance memory, HNSW index builds are still super slow.

u/ichiwells

Karma: 29 · Cake day: March 1, 2025
About
Repeat venture backed founder and CTO.

IchiPlan.com

"The Trusted Source for Code Compliance"

Problem: The Rent is Too Damn High.

Permitting buildings and homes takes months and months across many back-and-forth cycles. The majority of that time, permits are "idle"/queueing. Liability concerns prevent government permit-issuing entities from telling residents and developers how to actually fix problems quickly and efficiently. Applicants instead submit into a black box and a month later get another ambiguous comment.

It’s information asymmetry that both sides actually want solved.

LLMs, Computer Vision and semantic search change what's possible.

Using AI + LLMs to make permitting easier -> make homes and buildings more affordable

Hiring! https://www.ichiplan.com/join_team

LinkedIn: /jeffreyryanwells
