Fructose
Fructose is a Python package for calling LLMs as strongly typed functions. It uses function type signatures to guide generation and guarantee correctly typed output, in whatever basic or complex Python datatype is requested.
By guaranteeing output structure, we believe this will enable more complex applications to be built, interweaving LLM calls with ordinary code. For now, we've shipped Fructose as a client-only library that simply calls gpt-4 (by default) with JSON mode. That's pretty simple, and not unlike other packages such as marvin and instructor, but we're also working on our own lightweight formatting model that we'll host and/or distribute to the client, to help reduce token burn and increase accuracy.
We figure, no time like the present to show y’all what we’re working on! Questions, compliments, and roasts welcomed.
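To make the typed-function idea concrete, here's a minimal sketch (not Fructose's actual implementation, just an illustration of the mechanism) of using a signature's return annotation to validate an LLM's JSON reply. It handles only simple built-in and list[...] annotations:

```python
import json
from typing import get_type_hints

def validate_output(fn, raw_json: str):
    """Parse an LLM's JSON reply and check it against fn's declared return type."""
    hint = get_type_hints(fn)["return"]
    value = json.loads(raw_json)
    origin = getattr(hint, "__origin__", hint)  # list[str] -> list, str -> str
    if origin is list:
        (item_type,) = hint.__args__
        if not (isinstance(value, list) and all(isinstance(v, item_type) for v in value)):
            raise TypeError(f"expected {hint}, got {value!r}")
    elif not isinstance(value, origin):
        raise TypeError(f"expected {hint}, got {value!r}")
    return value

def describe(animals: list[str]) -> str:
    """Given a list of animals, use one word that'd describe them all."""

# A JSON string satisfies the declared -> str return type.
result = validate_output(describe, '"pets"')
```

A real implementation would also retry the call with an error message when validation fails, which is roughly what the "guarantee" in packages like this amounts to.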
This strikes a happy medium, where machines assist programmers and make them much more productive. Yet the resulting code is understandable, since a human has decomposed everything into functions, and also robust, since it is formally verified.
I am working on an F# proof-of-concept system like this; there are other alternatives around, implemented in Haskell and other languages, with varying levels of automation. It is potentially an interesting niche for a startup.
https://www.microsoft.com/en-us/research/publication/program...
I’ve wanted to see the traditional techniques combined with modern ML to sort of drive the search and generation process. Then, we’d still have the advantages of both formal specifications and classic AI (esp traceability). While looking for a synthesis link, I stumbled onto one paper trying to mix the two approaches:
https://ojs.aaai.org/index.php/AAAI/article/download/5048/49...
This is how I use copilot currently, so I might not be following on what part of this is 'future' facing or relevant to this Fructose project?
Not being contrarian, I thought this was an interesting point but as I thought about it more I realized, "wait, they're describing what I already do".
Being able to sometimes answer a given question is perhaps a first step to writing code that can answer that question reliably, but it's a long way from an LLM that does the former to one that does the latter.
"Type-Driven Program Synthesis" by Nadia Polikarpova https://www.youtube.com/watch?v=HnOix9TFy1A
Links to more projects and papers by Prof. Polikarpova: https://cseweb.ucsd.edu/~npolikarpova/
I think this is one of the main projects she discusses in the talk: https://github.com/nadia-polikarpova/synquid
EDIT: meant to mention this too, which I think has been around a bit longer, not that I've ever used it in production: https://ucsd-progsys.github.io/liquidhaskell/
Looking at the prompt templates (https://github.com/bananaml/fructose/tree/main/src/fructose/... ), they use a LangChain-esque "just try to make the output valid JSON" approach, when APIs such as GPT-4 Turbo (which this package uses by default) now support function calling/structured data natively and do a very good job of it (https://news.ycombinator.com/item?id=38782678), and libraries such as outlines (https://github.com/outlines-dev/outlines), while more complex, can better ensure a dictionary output for local LLMs.
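Native function calling works by sending the API a JSON Schema describing the expected arguments. A sketch of deriving that schema from a Python signature (to_tool_schema is a hypothetical helper, not part of Fructose or any library mentioned; the type mapping is simplified):

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python annotations to JSON Schema types.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_tool_schema(fn) -> dict:
    """Build an OpenAI-style function-calling tool definition from fn's signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only the parameters go into the schema
    properties = {name: {"type": _JSON_TYPES[t]} for name, t in hints.items()}
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
            },
        },
    }

def describe(animal: str, count: int) -> str:
    """Describe a group of animals in one word."""

schema = to_tool_schema(describe)
```

This dict is what you'd pass in the `tools` list of a chat-completions request, letting the API itself enforce the argument structure rather than a prompt asking nicely for JSON.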
The future here really lies in compiling down context-free grammars. They let you model JSON, YAML, CSV, and other programming languages as finite state machines that can force LLM transitions. They end up being pretty magical: you can force value typing, enums, and syntax validation of multivariate payloads. For use in data pipelines they can't be beat.
I did some experiments a few weeks ago on training models to generate these formats explicitly with jsonformers/outlines. Finetuning in the right format is still important to maximize output quality: you can see a 7% lift if you finetune explicitly for your desired format. [^1] At inference time, the CFGs constrain the model to do what it's actually intended to.
[^1]: https://freeman.vc/notes/constraining-llm-outputs
https://platform.openai.com/docs/guides/text-generation/json...
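A toy illustration of the constrained-decoding idea (not any particular library's implementation): a character-level "model" proposes characters, and a finite-state machine built from the grammar `value := "true" | "false"` masks out every illegal choice, so the output is valid by construction:

```python
# The two literals our tiny grammar accepts.
LITERALS = ["true", "false"]

def allowed_next(prefix: str) -> set:
    """Characters the FSM permits after `prefix` (empty set means a literal is complete)."""
    chars = set()
    for lit in LITERALS:
        if lit.startswith(prefix) and len(prefix) < len(lit):
            chars.add(lit[len(prefix)])
    return chars

def generate(propose) -> str:
    """Decode greedily, intersecting the model's preferences with the FSM mask.

    `propose` scores a candidate character; only legal characters are considered,
    so even a badly behaved model cannot produce an invalid token sequence.
    """
    out = ""
    while True:
        mask = allowed_next(out)
        if not mask:  # no legal continuation left: a literal is complete
            return out
        out += max(mask, key=propose)  # the model's favorite *legal* character

# A "model" that loves the letter "f" is funneled into "false".
result = generate(lambda ch: 1.0 if ch == "f" else 0.0)
```

Real systems do the same thing over the tokenizer's vocabulary and with logit masks instead of a character set, but the principle is identical: transitions that would leave the grammar are given zero probability.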
This feels pretty much identical to Marvin? Like the entire API?
From a genuine place of curiosity: I get that your prompts are different, but like why in the name of open source would you just not contribute to these libraries instead of starting your own from scratch?
If you run your own models as a part of it, surely you could hook up your models as a backend to whatever abstractions you’re copying here.
Instead of this:
    @ai()
    def describe(animals: list[str]) -> str:
        """
        Given a list of animals, use one word that'd describe them all.
        """
it would seem a lot more intuitive to do this:
    def describe(animals: list[str]) -> str:
        return ai("""Given a list of animals, use one word that'd describe them all.""", animals)
Of course, this doesn't really matter at all, and I get that it feels strange. I've just been thinking about grammars and syntax lately, and it's been interesting to now have the vocabulary and mental model to understand these unintuitive things :)
For your suggestion, the decorator would still be required to overload the function execution with the remote call; otherwise you'd just be calling the function body. We have considered special wrapper return types to help play better with pyright (and also give programmatic access to debug details of the call), but that would add bloat to the package and subtract from the more native Python feel we're aiming for.
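A minimal sketch of why the decorator is needed (with a stand-in fake_llm instead of a real model call): the decorator replaces the function's execution entirely, using only its declaration, so the empty body is never run:

```python
import functools
import inspect

def fake_llm(prompt: str, call_args) -> str:
    """Stand-in for a remote model call; a real implementation would hit an API."""
    return "pets"

def ai(fn):
    """Overload fn's execution with a 'remote' call built from its signature and docstring."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        prompt = inspect.getdoc(fn) or ""
        # The original body is never executed; only the declaration matters.
        return fake_llm(prompt, (args, kwargs))
    return wrapper

@ai
def describe(animals: list[str]) -> str:
    """Given a list of animals, use one word that'd describe them all."""

result = describe(["dog", "cat", "hamster"])
```

Without the decorator, calling `describe` would just execute its (empty) body and return None, which is the point being made above.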
Python has an existing convention for this (so it's not a "trick"): the use of the special value Ellipsis (literal: ...).
https://mypy.readthedocs.io/en/stable/stubs.html
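For example, a stub body consisting only of `...` is valid Python; the Ellipsis expression is evaluated and discarded, and if the stub is actually called it simply returns None:

```python
def summarize(text: str) -> str:
    """Summarize text in one sentence."""
    ...  # stub body: no real logic, just the Ellipsis expression

# Calling the stub runs nothing meaningful and falls through to None.
stub_result = summarize("hello")
```

This is the same convention mypy uses for .pyi stub files, which is why it reads naturally as "the body is intentionally elsewhere".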
https://jxnl.github.io/instructor/
Another project I'm excited about in this area is GPTScript, which launched last week: http://github.com/gptscript-ai/gptscript.