Readit News
amitness commented on New tools for building agents   openai.com/index/new-tool... · Posted by u/meetpateltech
swyx · a year ago
> Even their "function calling" abstraction still hallucinates parameters and schema

huh? sample code please? this should not be true since Structured Outputs came out - literally prevented from generating invalid json

(more: https://www.latent.space/p/openai-api-and-o1)

amitness · a year ago
Strict mode is not enabled by default in their function calling API, so hallucination is still possible.

You have to set 'strict' to True manually to use the same grammar-based sampling they use for structured outputs.

https://platform.openai.com/docs/guides/function-calling?api...
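To make the point concrete, here is a minimal sketch of opting into strict mode in a Chat Completions function definition. The `get_weather` tool and its schema are made up for illustration; the key detail is the `"strict": True` flag, which is off unless you set it.

```python
# Sketch: opting into grammar-constrained function calling.
# The "get_weather" tool and its schema are hypothetical.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "strict": True,  # opt in to grammar-based sampling; not on by default
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,  # strict mode requires this
        },
    },
}

# The tool is then passed in the tools list of a chat completion, e.g.:
# client.chat.completions.create(model="gpt-4o", messages=msgs, tools=[tool])
```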

amitness commented on Show HN: I made a website to semantically search ArXiv papers   papermatch.mitanshu.tech/... · Posted by u/Quizzical4230
Quizzical4230 · a year ago
That has one major downside, though. For binary embeddings, the top 10 results are the same as fp32, albeit shuffled. However, after the 10th result, I think quality degrades quite a bit. I was planning to add a reranking strategy for binary embeddings. What do you think?
amitness · a year ago
Try this trick that I learned from Cohere:

- Fetch the top 10*k (i.e. 100) results using the Hamming distance
- Rerank by taking the dot product between the query embedding (full precision) and the binary doc embeddings
- Show the top 10 results after re-ranking
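The steps above can be sketched in NumPy. The dimensions, the 0/1 bit storage, and the bit-to-±1 mapping in the rerank step are assumptions for illustration; real systems would store packed bits and use popcount, but the logic is the same.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_docs, k = 256, 1000, 10

# Binary doc embeddings stored as 0/1 bits; a full-precision query embedding.
docs_bin = rng.integers(0, 2, size=(n_docs, dim)).astype(np.uint8)
query_fp32 = rng.standard_normal(dim).astype(np.float32)
query_bin = (query_fp32 > 0).astype(np.uint8)  # binarize query for the first pass

# 1) Fetch top 10*k candidates by Hamming distance (count of differing bits).
hamming = np.count_nonzero(docs_bin != query_bin, axis=1)
candidates = np.argsort(hamming)[: 10 * k]

# 2) Rerank: dot product between the full-precision query and the binary docs,
#    mapping bits {0, 1} -> {-1, +1} so sign information is used.
scores = (docs_bin[candidates].astype(np.float32) * 2 - 1) @ query_fp32

# 3) Show the top-k results after re-ranking.
top_k = candidates[np.argsort(-scores)[:k]]
```

The cheap Hamming pass does the coarse filtering; only the 100 candidates pay for the float dot product, which is why this recovers most of the fp32 ranking quality.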
amitness commented on Open source inference time compute example from HuggingFace   github.com/huggingface/se... · Posted by u/burningion
dinp · a year ago
Great work! When I use models like o1, they work better than sonnet and 4o for tasks that require some thinking but the output is often very verbose. Is it possible to get the best of both worlds? The thinking takes place resulting in better performance but the output is straightforward to work with like with sonnet and 4o. Did you observe similar behaviour with the 1B and 3B models? How does the model behaviour change when used for normal tasks that don't require thinking?

Also, how well do these models work for extracting structured output? E.g. perform OCR on some handwritten text with math, convert it to HTML, format the formulas correctly, etc. Single-shot prompting doesn't work well for such problems, but splitting the steps into consecutive API calls works well.

amitness · a year ago
OpenAI recommends using o1 to generate the verbose plan and then chain the verbose output to a cheaper model (e.g. gpt-4o-mini) to convert it into structured data / function calls / summary etc. They call it planner-executor pattern. [1]

[1] https://vimeo.com/showcase/11333741/video/1018737829
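A minimal sketch of that planner-executor chaining, assuming the model names from the comment above; `call_model` is a stand-in for a real chat-completion call (e.g. via the openai client), injected so the control flow is clear and testable without network access.

```python
import json

def plan_then_execute(task, call_model):
    """Planner-executor: a reasoning model drafts a verbose plan, then a
    cheaper model condenses it into structured data."""
    # Planner: the strong (verbose) model produces the step-by-step plan.
    plan = call_model(
        model="o1",
        messages=[{"role": "user", "content": f"Make a step-by-step plan: {task}"}],
    )
    # Executor: the cheap model converts the verbose plan into JSON.
    structured = call_model(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Convert this plan into a JSON list of steps:\n{plan}",
        }],
    )
    return json.loads(structured)
```

With a stub in place of `call_model`, the whole chain runs offline, which also makes the pattern easy to unit-test.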

u/amitness

Karma: 530 · Cake day: October 13, 2017
About
AI Engineer. Twitter: https://twitter.com/amitness