afiodorov commented on 28M Hacker News comments as vector embedding search dataset   clickhouse.com/docs/getti... · Posted by u/walterbell
tim333 · 20 days ago
That's cool - it gave me quite a good answer when I tried it. Does it cost you much to run?

I tried "Who's Gary Marcus" - HN / your thing was considerably more negative about him than Google.

afiodorov · 20 days ago
The running costs are very low. Since posting it today, we've burned 30 cents on DeepSeek inference. The Postgres instance, though, costs me $40 a month on Railway, mostly due to RAM usage during incremental HNSW index updates.
afiodorov commented on 28M Hacker News comments as vector embedding search dataset   clickhouse.com/docs/getti... · Posted by u/walterbell
simlevesque · 20 days ago
I have a question: what hardware did you use, and how long did it take to generate the embeddings?
afiodorov · 20 days ago
I do the daily updates on my M4 MacBook Air: it takes about 5 minutes to process roughly 10k fresh comments. The historic backfill was done on an Nvidia GPU rented on vast.ai for a few dollars. If I recall correctly, it took about an hour. It's mentioned in the README.md on GitHub.
afiodorov commented on 28M Hacker News comments as vector embedding search dataset   clickhouse.com/docs/getti... · Posted by u/walterbell
afiodorov · 20 days ago
I've been embedding all HN comments since 2023 from BigQuery and hosting at https://hn.fiodorov.es

Source is at https://github.com/afiodorov/hn-search
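
For readers new to embedding search, here is a minimal sketch of the underlying idea: rank stored comment vectors by cosine similarity to a query vector. This is not the code from the repository above. The function names, the comment ids, and the three-dimensional toy vectors are all made up for illustration, and a real deployment would use an approximate index rather than this brute-force scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k corpus vectors most similar to the query (brute force)."""
    ranked = sorted(
        corpus,
        key=lambda cid: cosine_similarity(query, corpus[cid]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" (hypothetical): real ones have hundreds of dimensions.
comments = {
    "c1": [1.0, 0.0, 0.0],
    "c2": [0.9, 0.1, 0.0],
    "c3": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.0, 0.0], comments, k=2)
```

A brute-force scan like this is linear in the number of comments, which is why large collections usually sit behind an approximate nearest-neighbour index.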

afiodorov commented on How AI hears accents: An audible visualization of accent clusters   accent-explorer.boldvoice... · Posted by u/ilyausorov
afiodorov · 2 months ago
Apparently Persian and Russian are close, which is surprising to say the least. I know people keep confusing European Portuguese with Russian because they sound alike, but Persian is new to me.
afiodorov commented on Why is everything so scalable?   stavros.io/posts/why-is-e... · Posted by u/kunley
afiodorov · 2 months ago
I've found that building my side projects to be "scalable" is a practical side effect of choosing the most cost-effective hosting.

When a project has little to no traffic, the on-demand pricing of serverless is unbeatable. A static site on S3 or a backend on Lambda with DynamoDB will cost nothing under the AWS free tier. A dedicated server, even a cheap one, is an immediate and fixed $8-10/month liability.

The cost to run a monolith on a VPS only becomes competitive once you have enough users to burn through the very generous free tiers, which for many side projects is a long way off. The primary driver here is minimizing cost and operational overhead from day one.
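
The break-even argument above can be sketched as simple arithmetic. The numbers below are illustrative assumptions, not quoted prices: a hypothetical free tier of 1M requests per month, a hypothetical $0.20 per additional million requests, and an $8/month server. Compute, storage, and bandwidth charges are ignored.

```python
def monthly_cost_serverless(requests: int, free_requests: int, price_per_million: float) -> float:
    """Pay-per-use cost: nothing inside the free tier, linear beyond it."""
    billable = max(0, requests - free_requests)
    return billable / 1_000_000 * price_per_million


def break_even_requests(vps_monthly: float, free_requests: int, price_per_million: float) -> int:
    """Monthly request volume at which pay-per-use cost equals the fixed server fee."""
    return free_requests + round(vps_monthly / price_per_million * 1_000_000)


# Assumed, illustrative figures (not real pricing).
FREE_REQUESTS = 1_000_000
PRICE_PER_MILLION = 0.20
VPS_MONTHLY = 8.0

threshold = break_even_requests(VPS_MONTHLY, FREE_REQUESTS, PRICE_PER_MILLION)
quiet_month = monthly_cost_serverless(50_000, FREE_REQUESTS, PRICE_PER_MILLION)
```

Under these assumptions a side project would need tens of millions of requests a month before the fixed server fee becomes the cheaper option.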

afiodorov commented on Ask HN: Who wants to be hired? (October 2025)    · Posted by u/whoishiring
afiodorov · 3 months ago
Data all-rounder with 10 years building everything from low-latency Go microservices to training ML models to large-scale AWS data pipelines. Looking for a senior, autonomous role at a small company/startup.

  Location: Las Palmas, Spain
  Remote: Yes
  Willing to relocate: No
  Technologies: Go, Python, SQL, Kubernetes, Docker, AWS (S3, EMR, RDS, Aurora, Athena), Apache Spark, Apache Airflow, TypeScript, React, gRPC, REST APIs, PostgreSQL, Google BigQuery, LangChain, LangGraph, RAG, faster-whisper
  Résumé/CV: https://cv.fiodorov.es
  Email: hn@fiodorov.es

afiodorov commented on Ask HN: What are you working on? (September 2025)    · Posted by u/david927
afiodorov · 3 months ago
RAG search that contains all HN comments since 2023

https://hn.fiodorov.es

I treat it more like a homework exercise for a Coursera course but I like the result.

afiodorov commented on Getting AI to work in complex codebases   github.com/humanlayer/adv... · Posted by u/dhorthy
afiodorov · 3 months ago
> It was uncomfortable at first. I had to learn to let go of reading every line of PR code. I still read the tests pretty carefully, but the specs became our source of truth for what was being built and why.

This is exactly right. Our role is shifting from writing implementation details to defining and verifying behavior.

I recently needed to add recursive uploads to a complex S3-to-SFTP Python operator that had a dozen path manipulation flags. My process was:

* Extract the existing behavior into a clear spec (i.e., get the unit tests passing).

* Expand that spec to cover the new recursive functionality.

* Hand the problem and the tests to a coding agent.

I quickly realized I didn't need to understand the old code at all. My entire focus was on whether the new code was faithful to the spec. This is the future: our value will be in demonstrating correctness through verification, while the code itself becomes an implementation detail handled by an agent.
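
The first two steps above can be sketched as characterization tests. Everything here is hypothetical: `build_remote_path` and its flags are stand-ins for the operator's real path logic, which is not shown in this thread.

```python
import posixpath
import unittest


def build_remote_path(key: str, remote_dir: str, strip_prefix: str = "", flatten: bool = False) -> str:
    """Hypothetical stand-in: map an S3 key to a destination path on the SFTP server."""
    if strip_prefix and key.startswith(strip_prefix):
        key = key[len(strip_prefix):]
    key = key.lstrip("/")
    if flatten:
        # Drop any directory structure and keep only the file name.
        key = posixpath.basename(key)
    return posixpath.join(remote_dir, key)


class ExistingBehaviour(unittest.TestCase):
    """Step 1: pin down what the code already does."""

    def test_strip_prefix(self):
        self.assertEqual(
            build_remote_path("in/a.csv", "/upload", strip_prefix="in/"),
            "/upload/a.csv",
        )

    def test_flatten(self):
        self.assertEqual(
            build_remote_path("in/2024/a.csv", "/upload", flatten=True),
            "/upload/a.csv",
        )


class RecursiveUpload(unittest.TestCase):
    """Step 2: extend the spec to cover nested keys before touching the implementation."""

    def test_nested_keys_keep_their_structure(self):
        keys = ["in/a.csv", "in/2024/b.csv"]
        got = [build_remote_path(k, "/upload", strip_prefix="in/") for k in keys]
        self.assertEqual(got, ["/upload/a.csv", "/upload/2024/b.csv"])
```

Running the file with `python -m unittest` executes both classes. The tests, rather than the function body, are what gets reviewed after the agent rewrites the implementation.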

afiodorov commented on Qwen3-Coder: Agentic coding in the world   qwenlm.github.io/blog/qwe... · Posted by u/danielhanchen
libraryofbabel · 5 months ago
Really though? That’s only 2 hours per week writing code.

It’s true to say that time writing code is usually a minority of a developer’s work time, and so an AI that makes coding 20% faster may only translate to a modest dev productivity boost. But 5% time spent coding is a sign of serious organizational disfunction.

afiodorov · 5 months ago

> sign of serious organizational disfunction.

You're not wrong, but it's a "dysfunction" that many successful tech companies have learned to leverage.

The reality is, most engineers spend far less than half their time writing new code. This is where the 80/20 principle comes into play. It's common for 80% of a company's revenue to come from 20% of its features. That core, revenue-generating code is often mature and requires more maintenance than new code. Its stability allows the company to afford what you call "dysfunction": having a large portion of engineers work on speculative features and "big bets" that might never see the light of day.

So, while it looks like a bug from a pure "coding hours" perspective, for many businesses, it's a strategic feature!

afiodorov commented on AI capex is so big that it's affecting economic statistics   paulkedrosky.com/honey-ai... · Posted by u/throw0101c
oytis · 5 months ago
I just hope when (if) the hype is over, we can repurpose the capacities for something useful (e.g. drug discovery etc.)
afiodorov · 5 months ago
I live next to an abandoned building from the Spanish property boom. It's now occupied illegally. Hype's over yet the consequence is staring at me every day. I am sure it'll eventually be knocked down or repurposed yet it'd be better had the misallocation never happened.

u/afiodorov

Karma: 776
Cake day: June 21, 2019
About
Likes Computer Science, Mathematics, AI, ML