Rutledge (u/Rutledge)

Rutledge commented on Gemini CLI: your open-source AI agent blog.google/technology/de... · Posted by u/meetpateltech

Rutledge · 2 months ago

Here's the image from Wayback: https://web.archive.org/web/20250625051706/https://blog.goog...

The biggest diffs from Claude Code (the current champion):

1. Generous free tier (60 RPM!)
2. Open source under Apache (standard after OAI Codex did the same)

Rutledge commented on First MCP Server for Eval twitter.com/scorecardai/s... · Posted by u/Rutledge

Rutledge · 3 months ago

Hi HN, we're excited to launch the first remote MCP server for claude.ai and Cursor for LLM evaluation. Would love your thoughts and feedback :)

Rutledge · 3 months ago

Aannnnndd X is down x) Here's the LI: https://www.linkedin.com/posts/scorecard-ai_introducing-scor...

Posted by u/Rutledge 3 months ago

First MCP Server for Eval twitter.com/scorecardai/s...

Rutledge commented on Cisco Launches Agntcy AI Framework outshift.cisco.com/blog/b... · Posted by u/Rutledge

Rutledge · 6 months ago

Here's the repo: https://github.com/agntcy and docs: https://docs.agntcy.org/pages/abstract.html

Posted by u/Rutledge 6 months ago

Cisco Launches Agntcy AI Framework outshift.cisco.com/blog/b...

Rutledge commented on Agenteval.org: An Open-Source Benchmarking Initiative for AI Agent Evaluation scorecard.io/blog/introdu... · Posted by u/Rutledge

Rutledge · 6 months ago

This initiative is designed to be community-driven, so we're looking forward to your feedback on what agent benchmarking needs exist in your domains. While we're starting with legal AI, we plan to expand across industries where AI agent evaluation benchmarks are needed.

Posted by u/Rutledge 6 months ago

Agenteval.org: An Open-Source Benchmarking Initiative for AI Agent Evaluation scorecard.io/blog/introdu...

Rutledge commented on Vercel Fluid Compute vercel.com/fluid... · Posted by u/spking

schniz · 6 months ago

We built Fluid with noisy neighbors (= requests to the same instance) in mind. So because we are a data-driven team, we:

1. track metrics and have our own dashboards, so we can proactively understand and act whenever something like that happens
2. also use these metrics in our routing to smartly know when to scale up. we have tested a lot of variations of all the metrics we gather, and things are looking good

anyway, the more workload types we host with this system, the more we learn and the more performant it will get. we've been running this for a while now, and it shows great results.

there's no magic, just data coming from a complex system, fed into a fairly complex system!

hope that answers the question, and thanks for trusting us

Rutledge · 6 months ago

Yes, quite helpful. Thanks for explaining, and will try it out!

Rutledge commented on Vercel Fluid Compute vercel.com/fluid... · Posted by u/spking

cramforce · 6 months ago

The big difference is how the microvm is utilized. Lambda reserves the entire VM to handle a request end to end. Fluid can use a VM for multiple concurrent requests. Since most workloads are often idle waiting for IO, this ends up being much more efficient.

Disclaimer: CTO of Vercel here
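To make the idle-on-IO point concrete, here is a rough sketch in Python. It is a toy model only, not how Lambda or Fluid is implemented: `handle_request`, `IO_WAIT_SECONDS`, and the two runner functions are hypothetical names, and `asyncio.sleep` stands in for an API or DB wait.

```python
import asyncio
import time

IO_WAIT_SECONDS = 0.05  # hypothetical time each request spends waiting on IO


async def handle_request(request_id: int) -> int:
    # Stand-in for an IO-bound handler (e.g. waiting on an API or DB call).
    await asyncio.sleep(IO_WAIT_SECONDS)
    return request_id


async def one_request_at_a_time(n: int) -> float:
    # Toy model of an instance reserved for each request until it finishes.
    start = time.perf_counter()
    for i in range(n):
        await handle_request(i)
    return time.perf_counter() - start


async def concurrent_on_one_instance(n: int) -> float:
    # Toy model of one instance overlapping the idle IO waits of n requests.
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(i) for i in range(n)))
    return time.perf_counter() - start


def compare(n: int = 5) -> tuple[float, float]:
    # Returns (sequential_seconds, concurrent_seconds) for n requests.
    sequential = asyncio.run(one_request_at_a_time(n))
    concurrent = asyncio.run(concurrent_on_one_instance(n))
    return sequential, concurrent
```

Calling `compare()` should show the sequential run taking several times longer than the concurrent one. The sketch models only IO waits; it says nothing about CPU-heavy requests sharing an instance.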

Rutledge · 6 months ago

The concurrent request handling seems great for our AI eval workloads, where we're waiting on LLM API calls and DB operations, but I'm curious how Vercel handles potential noisy-neighbor issues when one request consumes excessive CPU/memory.

Disclosure: CEO of Scorecard (AI eval platform), current Vercel customer. Intrigued since most of our serverless time is spent waiting for model responses, but cautious about 'magic' solutions.

u/Rutledge

Karma: 176 · Cake day: March 15, 2012