michaelgiba commented on Structured outputs create false confidence   boundaryml.com/blog/struc... · Posted by u/gmays
altmanaltman · 4 days ago
> You’re essentially forcing the model to express the actual answer it wants to express in a constrained language.

You surely aren't implying that the model is sentient or has any "desire" to give an answer, right?

And how is that different from prompting in general? Isn't using English already a constraint? And isn't that what it is designed for, to work with prompts that provide limits in which to determine the output text? Like there is no "real" answer that you suppress by changing your prompt.

So I don't think it's a plausible explanation to say this happens because we are "making" the model return its answer in a "constrained language" at all.

michaelgiba · 4 days ago
> You surely aren't implying that the model is sentient or has any "desire" to give an answer, right?

The model is a probabilistic machine that was trained to generate completions and then fine-tuned to generate chat-style interactions. There is an output, given the prompt and weights, that is most likely under the model. That's what one could call the model's "desired" answer if you want to anthropomorphize. When you constrain which tokens can be sampled at a given timestep, you by definition diverge from that.
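The divergence can be shown with a toy sketch (the vocabulary and scores below are made up for illustration, not from any real model):

```python
# Toy next-token scores over a tiny vocabulary (illustrative numbers,
# not from any real model).
logits = {"Paris": 2.0, " the": 1.2, '"': 0.5, "{": 0.1}

def greedy(scores):
    """Pick the highest-scoring token (softmax is monotone, so the
    argmax over logits equals the argmax over probabilities)."""
    return max(scores, key=scores.get)

# Unconstrained decoding: the model's most likely token.
unconstrained = greedy(logits)

# Constrained decoding: suppose a JSON grammar only permits '"' or '{'
# at this position. Masking everything else forces a token the model
# ranked lower, diverging from the unconstrained completion.
allowed = {'"', "{"}
constrained = greedy({k: v for k, v in logits.items() if k in allowed})

print(unconstrained, constrained)  # Paris "
```

Whenever the grammar mask excludes the top-ranked token, the constrained output departs from what the model would have generated freely.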

michaelgiba commented on Structured outputs create false confidence   boundaryml.com/blog/struc... · Posted by u/gmays
michaelgiba · 4 days ago
It’s not surprising that there could be a very slight quality drop off for making the model return its answer in a constrained way. You’re essentially forcing the model to express the actual answer it wants to express in a constrained language.

However, I would say two things:

1. I suspect this quality drop could be mitigated by first letting the model answer in its regular language and then doing a second constrained step to convert that into structured outputs.

2. For the smaller models I have seen instances where the constrained sampling of structured outputs actually HELPS with output quality. If you can sufficiently encode information in the structure of the output, it can help the model: it effectively lets you encode simple branching mechanisms to execute at sample time.
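The two-step idea in point 1 can be sketched as a pipeline. Everything here is hypothetical: `generate_freeform` and `extract_structured` stand in for real model calls and are stubbed with canned logic for illustration.

```python
import json
import re

def generate_freeform(question: str) -> str:
    # Step 1: let the model answer in unconstrained natural language.
    # (Stubbed with a canned answer; a real system would call the model.)
    return "The capital of France is Paris, home to about 2 million people."

def extract_structured(text: str) -> dict:
    # Step 2: a second, constrained pass converts the free-form answer
    # into the target schema. (Stubbed with a regex; a real system would
    # run a grammar-constrained decode with `text` in the prompt.)
    m = re.search(r"capital of France is (\w+)", text)
    return {"capital": m.group(1) if m else None}

answer = extract_structured(generate_freeform("What is the capital of France?"))
print(json.dumps(answer))  # {"capital": "Paris"}
```

The point of the split is that the constrained grammar only touches the cheap conversion step, so the first pass keeps the model's unconstrained reasoning quality.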

michaelgiba commented on 73% of AI startups are just prompt engineering   pub.towardsai.net/i-rever... · Posted by u/kllrnohj
michaelgiba · a month ago
73% of startups are just writing computer programs
michaelgiba commented on Ask HN: Has an LLM Discovered Calculus?    · Posted by u/cobbzilla
michaelgiba · 2 months ago
Interesting idea. Although I wouldn't consider `but restrict the data set to publications from <= year 1600` "easy".

If you did have access to a high-quality pretraining dataset, you could train up to 1600, then up to 1610, 1620, ... 1700, and look at how the presence of calculus was learned over that period, running some tests with the intermediate models to capture the effect.

michaelgiba commented on Llamafile Returns   blog.mozilla.ai/llamafile... · Posted by u/aittalam
michaelgiba · 2 months ago
I’m glad to see llamafile being resurrected. A few things I hope for:

1. Curate a continuously extended inventory of prebuilt llamafiles for models as they are released

2. Create both flexible builds (with dynamic backend loading for CPU and CUDA) and slim minimalist builds

3. Upstream as much as they can into llama.cpp and partner with the project

michaelgiba · 2 months ago
Crazier ideas would be:

- Extend the concept to also have some sort of "agent mode" where the llamafiles can launch with their own minimal file system or isolated context

- Detailed profiling of main supported models to ensure deterministic outputs
michaelgiba commented on OpenMaxIO: Forked UI for MinIO Object Storage   github.com/OpenMaxIO/open... · Posted by u/nimbius
Spivak · 2 months ago
Step 1: I made this thing and am freely giving it away for the benefit of everyone. Come join the party!

Step 2: Wow, because this is a community project I can depend on it continuing to exist freely thanks to a large base of diverse parties invested in its continued growth and availability.

Step 3: Just kidding, I'm taking back the thing I made. Sorry if you were depending on it, migrate to something else or pay me.

Step 4: WTF dude?!

Step 5: Why are you all so entitled?

michaelgiba · 2 months ago
They stopped publishing images; it's not like they changed anything significant about the product itself.

Frankly, the whole thing is not newsworthy.

michaelgiba commented on Sampling and structured outputs in LLMs   parthsareen.com/blog.html... · Posted by u/SamLeBarbare
viralpraxis · 3 months ago
You can specify `minimum` and `maximum` properties for these fields. So this schema

  {
    "$id": "https://example.com/test.schema.json",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Person",
    "type": "object",
    "properties": {
      "hp": {
        "type": "integer",
        "description": "HP",
        "minimum": 1,
        "maximum": 15
      }
    }
  }
is converted to this BNF-like representation:

  hp ::= ([1-9] | "1" [0-5]) space
  hp-kv ::= "\"hp\"" space ":" space hp
  root ::= "{" space  (hp-kv )? "}" space
  space ::= | " " | "\n"{1,2} [ \t]{0,20}
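The `hp` rule can be sanity-checked by translating its alternation into a regex, a quick sketch:

```python
import re

# The hp rule ([1-9] | "1" [0-5]) as a Python regex (the trailing
# `space` rule is ignored here).
HP = re.compile(r"[1-9]|1[0-5]")

# Full-match against a range of integers: exactly 1..15 should pass,
# matching the schema's minimum/maximum.
accepted = [n for n in range(0, 20) if HP.fullmatch(str(n))]
print(accepted)  # [1, 2, 3, ..., 15]
```

`[1-9]` covers the single-digit values and `"1" [0-5]` covers 10 through 15, so the grammar admits exactly the schema's `minimum`/`maximum` range.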

michaelgiba · 3 months ago
For anyone curious, here is an interactive write-up about this: http://michaelgiba.com/grammar-based/index.html
