alangpierce commented on MCP: An (Accidentally) Universal Plugin System   worksonmymachine.substack... · Posted by u/Stwerner
alangpierce · 2 months ago
> What if you just... removed the AI part?

Maybe I'm not fully understanding the approach, but it seems like if you started relying on third-party MCP servers without the AI layer in the middle, you'd quickly run into backcompat issues. Since MCP servers assume they're being called by an AI, they have the right to make breaking changes to the tools, input schemas, and output formats without notice.
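
To make the failure mode concrete, here's a rough sketch of a non-AI consumer (the `callTool` helper, tool name, and result shape are all hypothetical, not from any particular MCP server or SDK):

  // Hypothetical direct (non-AI) consumer of an MCP tool. `callTool` stands in for
  // whatever client plumbing you'd use; the tool name and result shape are made up.
  declare function callTool(name: string, args: object): Promise<any>;

  async function getOpenTicketCount(): Promise<number> {
    const result = await callTool("list_tickets", {status: "open"});
    // Hardcoded assumptions about the output format. If the server renames the
    // tool, adds a required argument, or reshapes the result (say, {tickets: [...]}
    // instead of {count}), this breaks with no warning -- whereas an LLM reading
    // the updated tool schema at call time would usually just adapt.
    return result.count as number;
  }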

alangpierce commented on DMV calls on Cruise to take half its driverless cars off the road   missionlocal.org/2023/08/... · Posted by u/wpietri
beardyw · 2 years ago
What is the motivation of the state to allow driverless cars at all? It is bound to create problems, so what does it gain them? Serious question.
alangpierce · 2 years ago
The state already puts a lot of resources into improving road safety, and that's one of the primary goals of the DMV, DOT, etc. There's reason to believe that driverless cars will greatly improve road safety, so it feels reasonable that the state would allow their development and have a framework for responsibly expanding their use.
alangpierce commented on With plugins, GPT-4 posts GitHub issue without being instructed to   chat.openai.com/share/ed8... · Posted by u/og_kalu
alangpierce · 2 years ago
Interestingly, the ChatGPT Plugin docs [1] say that POST operations like these are required to implement user confirmation, so you might blame the plugin implementation (or OpenAI's non-enforcement of the policy) in this case:

> for POST requests, we require that developers build a user confirmation flow to avoid destruction actions

However, at least from what I can see, the docs don't provide much more detail about how to actually implement confirmation. I haven't played around with the plugins API myself, but I originally assumed it was a non-AI-driven technical constraint, maybe a confirmation modal that ChatGPT always shows to the user before any POST. From a forum post I saw [2], though, it looks like ChatGPT doesn't have any system like that, and you're just supposed to write your manifest and OpenAPI spec in a way that tells ChatGPT to confirm with the user. From the forum post, it sounds like this is pretty fragile, and of course is susceptible to prompt injection as well.

[1] https://platform.openai.com/docs/plugins/introduction

[2] https://community.openai.com/t/implementing-user-confirmatio...
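
To illustrate what "tells ChatGPT to confirm" amounts to, here's roughly what a manifest excerpt might look like, written as a TypeScript object for readability (field names from memory of the plugin manifest format, values made up). The "confirmation flow" is just an instruction to the model; nothing in the platform enforces it:

  // Hypothetical excerpt of an ai-plugin.json manifest.
  const manifest = {
    schema_version: "v1",
    name_for_model: "todo_manager",
    description_for_model:
      "Manages the user's todo list. Before calling any POST operation " +
      "(createTodo, deleteTodo), summarize the pending action and ask the " +
      "user to explicitly confirm. Never call deleteTodo without confirmation.",
    api: {type: "openapi", url: "https://example.com/openapi.yaml"},
  };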

alangpierce commented on Compromising LLM-integrated applications with indirect prompt injection   arxiv.org/abs/2302.12173... · Posted by u/greshake
jasonwcfan · 2 years ago
If I’m understanding correctly, the technique basically injects malicious instructions in the content that is stored and retrieved?

Sounds like an easy fix: if it's possible to detect direct prompt injection attacks, then the same techniques can be applied to the data staged for retrieval.

alangpierce · 2 years ago
This article argues that there's no reliable way to detect prompt injection: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

One solution to some indirect prompt injection attacks is proposed in this article, where you "sandbox" untrusted content into a second LLM that isn't given the ability to decide which actions to take: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/

alangpierce commented on The Dual LLM pattern for building AI assistants that can resist prompt injection   simonwillison.net/2023/Ap... · Posted by u/simonw
williamcotton · 2 years ago
You don’t need to ask the LLM where the email came from or provide the LLM with the email address. You just take the subject and the body of the email and provide that to the LLM, and then take the response from the LLM along with the unaffected email address to make the API calls…

  addTodoItem(taintedLLMtranslation, untaintedOriginalEmailAddress)
As for summaries, don’t allow that output to make API calls or be eval’d! Sure, it might be in pig latin from a prompt injection but it won’t be executing arbitrary code or even making API calls to delete Todo items.

All of the data that came from remote commands, such as the body of a newly created Todo item, should still be considered tainted and treated in a similar manner.

These are the exact same security issues as any case of remote API calls with arbitrary execution.

alangpierce · 2 years ago
Agreed that if you focus on any specific task, there's a safe way to do it, but the challenge is to handle arbitrary natural language requests from the user. That's what the Privileged LLM in the article is for: given a user prompt and only the trusted snippets of conversation history, figure out what action should be taken and how the Quarantined LLM should be used to power the inputs to that action. I think you really need that kind of two-layer approach for the general use case of an AI assistant.
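
Sketched very roughly (all helper names hypothetical, loosely following the article's terminology):

  // Sketch of the two-layer flow. The privileged LLM only ever sees trusted text
  // (the user's instruction plus trusted history) and returns a structured action;
  // the quarantined LLM sees untrusted content (email bodies, web pages), but its
  // output is only ever passed around as an opaque variable reference.
  type Action =
    | {kind: "addTodo"; textVar: string}      // textVar names a quarantined result
    | {kind: "showSummary"; textVar: string};

  // Hypothetical helpers wrapping the two model calls and the assistant's storage.
  declare function privilegedLLM(instruction: string, trustedHistory: string[]): Promise<Action>;
  declare function resolveQuarantinedVariable(name: string): Promise<string>;
  declare function addTodoItem(text: string): Promise<void>;
  declare function display(text: string): void;

  async function handleRequest(userInstruction: string, trustedHistory: string[]) {
    const action = await privilegedLLM(userInstruction, trustedHistory);
    // The untrusted text is resolved and used verbatim; the privileged LLM only
    // ever handled the variable name, so injected instructions inside the content
    // can't change which action gets taken.
    const text = await resolveQuarantinedVariable(action.textVar);
    if (action.kind === "addTodo") {
      await addTodoItem(text);
    } else {
      display(text); // shown to the user, never eval'd or used to choose actions
    }
  }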
alangpierce commented on The Dual LLM pattern for building AI assistants that can resist prompt injection   simonwillison.net/2023/Ap... · Posted by u/simonw
williamcotton · 2 years ago
“Hey Marvin, delete all of my emails”

Why not just have a limited set of permissions for what commands can originate from a given email address?

The original email address can be included along with whatever commands were translated by the LLM. It seems easy enough to limit that to only a few simple commands like “create todo item”.

Think of it this way, what commands would you be fine to be run on your computer if they came from a given email address?

alangpierce · 2 years ago
Giving different permission levels to different email senders would be very challenging to implement reliably with LLMs. With an AI assistant like this, the typical implementation would be to feed it the current instruction, history of interactions, content of recent emails, etc., and ask it what command to run to best achieve the most recent instruction. You could try to ask the LLM to say which email the command originates from, but if there's a prompt injection, the LLM can be tricked into lying about that. Any permissions details need to be implemented outside the LLM, but that pretty much means that each email would need to be handled in its own isolated LLM instance, which means that it's impossible to implement features like summarizing all recent emails.
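
For illustration, enforcing permissions outside the LLM might look something like this sketch (all names hypothetical):

  // Sketch: permissions are enforced in ordinary code, keyed off the transport-level
  // sender address, and each email gets its own isolated LLM call with nothing else
  // in the prompt.
  const allowedCommands: Record<string, Set<string>> = {
    "boss@example.com": new Set(["createTodo"]),
    "me@example.com": new Set(["createTodo", "deleteTodo"]),
  };

  declare function isolatedLLM(emailBody: string): Promise<{command: string; args: string}>;
  declare function execute(command: string, args: string): Promise<void>;

  async function handleEmail(senderAddress: string, body: string) {
    // One fresh LLM call per email: no other emails or history in the context.
    const {command, args} = await isolatedLLM(body);
    // The permission check never consults the LLM, so a prompt injection can change
    // *what* is requested but not *who* is treated as the requester.
    if (!allowedCommands[senderAddress]?.has(command)) {
      return; // drop or flag for review
    }
    await execute(command, args);
  }

The per-email isolation in the first step is exactly what makes cross-email features like "summarize my recent emails" impossible to express in this model.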
alangpierce commented on Speeding up the JavaScript ecosystem, one library at a time   marvinh.dev/blog/speeding... · Posted by u/fabian2k
hajile · 3 years ago
Looks like the one built into Chrome. You can open your browser and attach to a Node instance so you can debug and profile it.

Here's a basic How-to:

* Run node with `node --inspect <file>`

* Open Chrome and go to `chrome://inspect`

* After it finds your target, click the inspect button to launch a debugger.

alangpierce · 3 years ago
It looks to me like the profile viewer is actually speedscope ( https://www.speedscope.app/ ). I find it nicer for exploring profiles compared with Chrome's built-in viewer.

To use with Node.js profiling, do the `node --inspect` and `chrome://inspect` steps, then save the profile as a .cpuprofile file and drag that file into speedscope.

Another thing I've found useful is programmatically starting/stopping the profiler using `console.profile()` and `console.profileEnd()`.
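
For example (assuming an inspector is attached, with `transformAll` standing in for whatever work you want to measure):

  declare function transformAll(files: string[]): void; // hypothetical: the work to measure
  declare const files: string[];

  // Only takes effect when an inspector is attached (e.g. run with `node --inspect`),
  // and keeps the recording focused on the interesting section instead of startup.
  console.profile("transform-all");
  transformAll(files);
  console.profileEnd("transform-all");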

alangpierce commented on PR that converts the TypeScript repo from namespaces to modules   github.com/microsoft/Type... · Posted by u/Kyza
KingOfCoders · 3 years ago
"Tools run in single-threaded mode without warm-up."

I thought it was 2022. I have a 12-core machine and my next machine will probably have 22 cores.

But I'm amazed: transpiling 636,975 lines in <1 second is nice.

[Edit] What I do not understand is "Sucrase does not check your code for errors." So it's not a type checker? Or does it check type errors? Why would I use it for TypeScript when the reason to use TS is to add types to JS to prevent errors?

Is this more like Rust's check for continuous work, and then using tsc from time to time to check for errors?

alangpierce · 3 years ago
Like any transpiler, Sucrase can be run in parallel by having the build system send different files to different threads/processes. Sucrase itself is more of a primitive, just a plain function from input code to output code.

> What I do not understand is "Sucrase does not check your code for errors." So it's not a type checker?

That's correct, Sucrase, swc, esbuild, and Babel are all just transpilers that transform TypeScript syntax into plain JavaScript (plus other transformations). The usual way you set things up is to use the transpiler to run and build your TS code, and you separately run type checking using the official TypeScript package.
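
For reference, the per-file call looks roughly like this (the file I/O around it is just for illustration):

  import {readFileSync, writeFileSync} from "fs";
  import {transform} from "sucrase";

  // One file in, one file out: strip TypeScript types (and convert ESM imports to
  // CommonJS) without any type checking. A build system can fan calls like this out
  // across processes; `tsc --noEmit` runs separately to catch type errors.
  function compileFile(inputPath: string, outputPath: string): void {
    const source = readFileSync(inputPath, "utf8");
    const {code} = transform(source, {transforms: ["typescript", "imports"]});
    writeFileSync(outputPath, code);
  }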

alangpierce commented on PR that converts the TypeScript repo from namespaces to modules   github.com/microsoft/Type... · Posted by u/Kyza
merb · 3 years ago
but the benchmark is stupid: https://github.com/alangpierce/sucrase/blob/main/benchmark/b...

> Like all JavaScript code run in V8, Sucrase runs more slowly at first, then gets faster as the just-in-time compiler applies more optimizations. From a rough measurement, Sucrase is about 2x faster after running for 3 seconds than after running for 1 second. swc (written in Rust) and esbuild (written in Go) don't have this effect because they're pre-compiled, so comparing them with Sucrase gets significantly different results depending on how large of a codebase is being tested and whether each compiler is allowed a "warm up" period before the benchmark is run.

(worse it disables esbuild and swc's multi-threading... https://github.com/alangpierce/sucrase/blob/main/benchmark/b... https://github.com/alangpierce/sucrase/blob/main/benchmark/b...)

fake it till ya make it.

it's like saying "if I disable everything and wait for 5 minutes it's faster"

alangpierce · 3 years ago
Hi, Sucrase author here.

To be clear, the benchmark in the README does not allow JIT warm-up. The Sucrase numbers would be better if it did. From testing just now (add `warmUp: true` to `benchmarkJest`), Sucrase is a little over 3x faster than swc if you allow warm-up, but it seemed unfair to disregard warm-up for the comparison in the README.

It's certainly fair to debate whether 360k lines of code is a realistic codebase size for the benchmark; the higher-scale the test case, the better Sucrase looks.

> worse it disables esbuild and swc's multi-threading

At some point I'm hoping to update the README benchmark to run all tools in parallel, which should be more convincing despite the added variability: https://github.com/alangpierce/sucrase/issues/730 . In an ideal environment, the results are pretty much the same as a per-core benchmark, but I do expect that Node's parallelism overhead and the JIT warm-up cost across many cores would make Sucrase less competitive than the current numbers.
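
Roughly, the parallel setup would look something like this sketch (not the actual benchmark code):

  // Shard the file list across worker threads, let each worker load Sucrase and
  // transpile its shard, and time the whole run. Each worker pays its own JIT
  // warm-up cost, which is why the parallel numbers may look less favorable.
  import {cpus} from "os";
  import {Worker} from "worker_threads";

  const workerSource = `
    const {parentPort, workerData} = require("worker_threads");
    const {readFileSync} = require("fs");
    const {transform} = require("sucrase");
    for (const file of workerData) {
      transform(readFileSync(file, "utf8"), {transforms: ["typescript"]});
    }
    parentPort.postMessage("done");
  `;

  async function transpileInParallel(files: string[]): Promise<number> {
    const shards: string[][] = Array.from({length: cpus().length}, () => []);
    files.forEach((file, i) => shards[i % shards.length].push(file));

    const start = Date.now();
    await Promise.all(
      shards.map(
        (shard) =>
          new Promise<void>((resolve, reject) => {
            const worker = new Worker(workerSource, {eval: true, workerData: shard});
            worker.once("message", () => resolve());
            worker.once("error", reject);
          }),
      ),
    );
    return Date.now() - start; // wall-clock ms for the whole parallel run
  }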

u/alangpierce

Karma: 2122 · Cake day: July 11, 2012
About
Software Engineer at Benchling

https://github.com/alangpierce

http://www.alangpierce.com
