On GitHub you show stats saying a "cache hit" takes ~200ms and a miss takes 1-2s (the LLM call).
I don't think I understand how you get a cache hit on a novel tweet. My understanding is that you:
1) get a snake case category from an LLM
2) embed that category
3) check if it's close to something else in the embedding space via cosine similarity
4) if it is, replace the original label with the closest one in embedding space
5) if not, store it
Is that the right sequence? If it is, it looks to me like every path starts with an LLM call, and so seems unlikely to come in under 200ms.
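To make sure we're talking about the same pipeline, here's a rough sketch of that sequence as I read it. The function names, the threshold, and the in-memory store are all my assumptions, not code from your repo:

```python
import numpy as np

SIM_THRESHOLD = 0.85  # assumed similarity cutoff, not the repo's actual value

known_labels: dict[str, np.ndarray] = {}  # hypothetical label -> embedding store

def llm_snake_case_category(tweet: str) -> str:
    """Placeholder for the LLM call that returns a snake_case label (the 1-2s step)."""
    raise NotImplementedError

def embed(label: str) -> np.ndarray:
    """Placeholder for the embedding call."""
    raise NotImplementedError

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def categorize(tweet: str) -> str:
    label = llm_snake_case_category(tweet)  # step 1: every path starts with the LLM
    vec = embed(label)                      # step 2: embed the category label
    # step 3: nearest existing label by cosine similarity
    best = max(known_labels, key=lambda k: cosine(vec, known_labels[k]), default=None)
    if best is not None and cosine(vec, known_labels[best]) >= SIM_THRESHOLD:
        return best                         # step 4: reuse the canonical label
    known_labels[label] = vec               # step 5: store the new label
    return label
```

If that's roughly right, the LLM call at the top of categorize() sits on every path, hit or miss, which is the part I can't square with a ~200ms figure.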
Happy user of the https://reflex.dev framework here.
I was tired of writing backend APIs whose only purpose was to be consumed by the same app's frontend (typically React), which led to boilerplate on both sides: the backend providing the APIs and the frontend consuming them (fetch, cache, propagate, etc.).
Now I run three different apps in production for which I no longer write APIs. I only define states and state updates in Python. The frontend is written in Python too, and auto-transpiled into a React app that keeps its state and views automagically in sync with the backend. I'm only six months into Reflex, but it's been mostly a joy so far. You do have to learn a few small but important details, such as state dependencies and proper state caching, but the upsides are a big win for my team and me. We write less code and ship faster.
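For anyone curious what this looks like in practice, here's a minimal sketch of the pattern: a toy counter, not one of our production apps:

```python
import reflex as rx

class CounterState(rx.State):
    """All mutable app data lives in State subclasses; Reflex keeps the
    browser in sync over a websocket, so there is no endpoint to write."""
    count: int = 0

    def increment(self):
        # Event handlers are plain methods that mutate state fields.
        self.count += 1

def index() -> rx.Component:
    # The UI is plain Python that Reflex compiles into a React component.
    return rx.vstack(
        rx.heading(CounterState.count),
        rx.button("Increment", on_click=CounterState.increment),
    )

app = rx.App()
app.add_page(index)
```

The point is that the on_click wiring, the state round trip, and the re-render all happen without a single hand-written API route.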