majdalsado (u/majdalsado)

majdalsado commented on State of AI: An Empirical 100T Token Study with OpenRouter openrouter.ai/state-of-ai... · Posted by u/anjneymidha

majdalsado · 19 days ago

Very interesting how Singapore ranks 2nd in terms of token volume. I wonder if this is potentially Chinese usage via VPN, or if Singaporean consumers and firms are dominating in AI adoption.

Also interesting how the 'roleplaying' category is so dominant. It makes me wonder whether Google's classifier sees a system prompt with "Act as an X" and classifies that as roleplay rather than as the specific industry the roleplay was intended to serve.

majdalsado commented on Ask HN: Who is hiring? (December 2025) · Posted by u/whoishiring

majdalsado · 22 days ago

We're an early-stage (pre-seed) VC-backed startup automating RFP proposals for the AEC (Architecture, Engineering, Construction) industry.

We are building an agentic AI platform that embeds directly into Microsoft Word, helping firms find and win more work, while serving as their knowledge management hub for all business development.

You would be the first full-time engineering hire working directly with the founders (I'm the technical co-founder). We need someone who can ship production code across the whole stack.

The Stack:
- Frontend: Next.js, React
- Backend: FastAPI (Python), Temporal

Hard problems you will solve:

Deep Word Integration: Building a high-performance, "Cursor-like" experience within the constraints of Office.js.

Agentic Workflows: Orchestrating AI agents that can read complex government requirements, reason about compliance, and generate winning output autonomously.

Evolving Knowledge Graph: Architecting a library system that doesn't just store files, but learns from project history and feedback loops.

If you want a chance to work on a hard problem, in an exciting space, with a strong team that has validated the market and de-risked the business, with major upside, let's talk!

To apply, email me directly at majd [at] bidaya.ai or apply [here](https://app.dover.com/apply/bidaya/d5f29bbb-9c67-4c4e-82bf-5...). Mention HN in the subject.

majdalsado commented on Launch HN: Lucidic (YC W25) – Debug, test, and evaluate AI agents in production · Posted by u/AbhinavX

AbhinavX · 5 months ago

Langfuse and Helicone work well for traditional LLM operations, but AI agents are different. We discovered that AI agents require fundamentally different tooling. Here are some examples.

First, while LLMs simply respond to prompts, agents often get stuck in behavioral loops where they repeat the same actions; to address this, we built a graph visualization that automatically detects when an agent reaches the same state multiple times and groups these occurrences together, making loops immediately visible.
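As a rough illustration of the loop-detection idea described above (not Lucidic's actual implementation), repeated states can be found by fingerprinting each step and grouping the step indices that share a fingerprint. The state shape and the names `state_key` and `find_repeated_states` are hypothetical, and the sketch assumes each state is a JSON-serializable dict.

```python
import hashlib
import json
from collections import defaultdict


def state_key(state: dict) -> str:
    """Return a stable fingerprint for a JSON-serializable agent state."""
    # sort_keys makes the fingerprint independent of dict insertion order.
    canonical = json.dumps(state, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_repeated_states(steps: list[dict]) -> dict[str, list[int]]:
    """Map each fingerprint seen more than once to the step indices where it occurred."""
    seen: dict[str, list[int]] = defaultdict(list)
    for index, state in enumerate(steps):
        seen[state_key(state)].append(index)
    return {key: indices for key, indices in seen.items() if len(indices) > 1}


# Hypothetical session: the agent issues the same search three times.
steps = [
    {"tool": "search", "query": "rfp deadline"},
    {"tool": "open", "url": "a"},
    {"tool": "search", "query": "rfp deadline"},
    {"tool": "search", "query": "rfp deadline"},
]
repeats = find_repeated_states(steps)
print(list(repeats.values()))  # [[0, 2, 3]]
```

Exact-match fingerprints only catch identical states; a real tool would likely need a fuzzier notion of "same state".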

Second, our evaluations are much more tailored to AI agents. LLM ops evaluations usually occur at the per-prompt level (e.g., hallucination, QA correctness), which makes sense for those use cases, but agent evaluations are usually per session or per run. In practice, a single prompt in isolation usually isn’t what causes a failure; a downstream memory issue or a previous action is what causes the current tool call to fail. So we spent a lot of time building a way for you to define a rubric. Then, to evaluate against the rubric without context overload, we created an agentic pipeline with tools such as viewing rubric examples, zooming “in and out” of a session, and referencing previous examples.

Third, time traveling and clustering of similar responses. LLM debugging is straightforward because prompts are stateless and independent of one another, but agents maintain complex state through tools, context, and memory management. We solved this by creating “time travel” functionality that captures the complete agent state at any point, allowing developers to modify variables like context or tool availability and replay from that exact moment. You can then simulate that replay 20-30 times and group similar responses together with our clustering algorithm.

Fourth, agents exhibit far more non-deterministic behavior than LLMs because a single tool call can completely change their trajectory; to handle this complexity, we developed workflow trajectory clustering that groups similar execution paths together, helping developers identify patterns and edge cases that would be impossible to spot in traditional LLM systems.
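One simple way to picture trajectory clustering is sketched below. It is an assumption for illustration only, not the algorithm Lucidic uses: each run is reduced to its sequence of tool names, and a run joins the first cluster whose representative is similar enough according to the standard library's `difflib.SequenceMatcher`. The function names and the 0.7 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: list[str], b: list[str]) -> float:
    """Similarity in [0, 1] between two tool-call sequences."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_trajectories(
    trajectories: list[list[str]], threshold: float = 0.7
) -> list[list[int]]:
    """Greedily group run indices; the first member of each cluster is its representative."""
    clusters: list[list[int]] = []
    for index, trajectory in enumerate(trajectories):
        for cluster in clusters:
            representative = trajectories[cluster[0]]
            if similarity(trajectory, representative) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([index])
    return clusters


# Hypothetical runs, each recorded as the ordered list of tools it called.
runs = [
    ["search", "open", "summarize"],
    ["search", "open", "open", "summarize"],
    ["search", "search", "search", "search"],
    ["search", "open", "summarize"],
]
print(cluster_trajectories(runs))  # [[0, 1, 3], [2]]
```

Greedy assignment depends on run order and on the threshold, so it is only a starting point for spotting outlier paths such as the repeated-search run above.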

majdalsado · 5 months ago

This makes sense. We'll look into this some more and will be making a decision in the next couple of days :)

Good luck!

majdalsado commented on Launch HN: Lucidic (YC W25) – Debug, test, and evaluate AI agents in production · Posted by u/AbhinavX

majdalsado · 5 months ago

I'm looking into a tool like this for my startup. Why should I use this over Langfuse or Helicone?

majdalsado commented on Oldest white wine in the world found in a first-century tomb in Spain doi.org/10.1016/j.jasrep.... · Posted by u/The_suffocated

mandibeet · 2 years ago

Unlikely that the wine would taste good by modern standards but still I wanna try it

majdalsado · 2 years ago

With a lead content of 0.14 mg/L, I'm not sure that you do... (28x the acceptable amount)