I kind of am missing the bridge between that, and the fundamental knowledge that everything is token based in and out.
Is it fair to say that the tool abstraction the library provides is essentially some niceties around a prompt, something like "Defined below are certain 'tools' you can use to gather data or perform actions. If you want to use one, please return the tool call you want and its arguments, delimited before and after with '###', and stop. I will invoke the tool call and then reply with the output delimited by '==='".
Basically, telling the model how to use tools, earlier in the context window. I already don't totally understand how a model knows when to stop generating tokens, but presumably those instructions will get it to output the request for a tool call in a certain way and stop. Then the agent harness knows to look for those delimiters, extract the tool call to execute, and append the tool's response to the context so the LLM keeps going.
Is that basically it? Or is there more magic there? Are the tool call instructions in some sort of permanent context, or could the interaction be demonstrated in a fine-tuning step, inferred by the model, and end up just in its weights?
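For what it's worth, the loop you're describing can be sketched in a few lines. Everything here is a made-up illustration (the '###'/'===' delimiters come from your hypothetical prompt, and the JSON call format and function names are my own assumptions, not any particular library's API):

```python
import json

def extract_tool_call(model_output: str):
    """Look for a tool call between '###' delimiters, per the
    hypothetical prompt above. Returns (name, args) or None."""
    start = model_output.find("###")
    if start == -1:
        return None
    end = model_output.find("###", start + 3)
    if end == -1:
        return None
    call = json.loads(model_output[start + 3:end])
    return call["name"], call["args"]

def agent_step(model_output: str, tools: dict):
    """If the model requested a tool, run it and return the text to
    append back into the context; None means the model is done."""
    parsed = extract_tool_call(model_output)
    if parsed is None:
        return None
    name, args = parsed
    result = tools[name](**args)
    return f"==={result}==="

# Toy tool registry and a fake model reply
tools = {"get_weather": lambda city: f"Sunny in {city}"}
reply = 'I will check. ###{"name": "get_weather", "args": {"city": "Oslo"}}###'
print(agent_step(reply, tools))  # → ===Sunny in Oslo===
```

In practice the harness also has to handle the model stopping (via a stop-sequence on the delimiter) and malformed calls, but the shape is the same: scan for delimiters, execute, append, resume generation.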
Structured Output APIs (including the Tool API) take the schema and build a context-free grammar, which is then used during generation to mask which tokens can be output.
I found https://openai.com/index/introducing-structured-outputs-in-t... (have to scroll down a bit to the "under the hood" section) and https://www.leewayhertz.com/structured-outputs-in-llms/#cons... to be pretty good resources
Well they've definitely recovered in NW Wisconsin. They're everywhere, and the males won't even move out of the way of cars.
It's a particularly hard problem to solve - the hobby is usually spread through traditional means (you do it if your parents did it), and going all the way back in certain communities this was the main way to get meat, even before it became regulated. It's difficult to stop something that not only puts food on the table for your family, but has been done that way for generations.
This was one of the main contributors to the decline of the turkey population in the lower 48. In the early 1900s, a lot of folks thought turkeys were extinct because of overhunting and poaching, and the National Wild Turkey Federation led efforts to restore the population for hunting.
If you have plumbing done in different metals (copper, steel, lead, etc.) and any of your pipes touch, you have to perform regular maintenance and apply a dielectric grease (another one of those single-use materials that you have to buy and store away) or galvanic corrosion could eat your pipes and cause a ton of damage.
Thanks for open sourcing and sharing, excited to try this out!!
We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, in a product that can be deployed easily and reliably in customer-managed environments.
A lot of the products today have similar evaluations and metrics, but they either offer only a SaaS solution or require onerous integration into your application stack.
Because we started with the enterprise first, our goal was to get to value as quickly and as easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well in our product.
We based our hallucination detection on "groundedness" on a claim-by-claim basis, which evaluates whether each claim in the LLM response is supported by the provided context (e.g. message history, tool calls, retrieved context from a vector DB, etc.).
We split the response into multiple claims, determine whether each claim needs to be evaluated (e.g. isn't just boilerplate), and then check whether the claim is supported by the context.
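The pipeline described above might be sketched like this. To be clear, this is a deliberately dumbed-down stand-in: the sentence splitter and the word-overlap score are my own toy assumptions, whereas a production groundedness check would use a claim-extraction model and an NLI/LLM judge:

```python
import re

def split_claims(response: str):
    """Naively split a response into sentence-level claims; a real
    pipeline would use a claim-extraction model instead."""
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', response) if s.strip()]

def is_grounded(claim: str, context: str, threshold: float = 0.5) -> bool:
    """Toy groundedness check: fraction of claim words found in the
    context. Real systems use an NLI or LLM judge, not word overlap."""
    words = {w.lower().strip('.,') for w in claim.split()}
    if not words:
        return True
    ctx = context.lower()
    hits = sum(1 for w in words if w in ctx)
    return hits / len(words) >= threshold

# Context = what the LLM actually saw (messages, tool output, retrieval)
context = "The invoice total was $42, due on March 1."
response = "The invoice total is $42. Payment is accepted in Bitcoin."
for claim in split_claims(response):
    label = "grounded" if is_grounded(claim, context) else "unsupported"
    print(f"{claim} -> {label}")
```

Here the first claim overlaps heavily with the context and passes, while the Bitcoin claim has no support and gets flagged, which is the shape of the claim-by-claim evaluation described above.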