This article has been around for some time, and it still shines. It focuses on building a product rather than just prototyping an LLM wrapper and waiting for the dark magic of GenAI.
Chapters like "The Strive for Faster Latencies" and "Post-Processing" are truly inspiring.
Creating production-level DevTools demands much more effort than merely wrapping around a ChatCompletion endpoint and mindlessly stuffing a context window with everything accessible inside the IDE (so-called "prompt engineering").
I attempt to use LLMs in coding tasks many times a day, and the capability is there: Opus can make and execute a plan, GPT-4 can find and fix the few persistent typographical errors when Opus attempts verbatim output, and Dolphin-8x7 (or a bunch of other consistently candid models) can filter out the static interference from Morality Police over-alignment.
For a long time it’s been about extravagances like the ability to recover from a lost session, or get a link queried and used as context, or do a reset of the slowly but inevitably corrupted state without losing your painstakingly assembled position in some abstract state space.
Basic product building on the core LLM experience is way more important than incremental improvements on LLMs. I’d take robustness, consistency, and revision control over an LLM breakthrough; a chatbot session is just a tech demo if I can lose my work along with my browser tab.
Interesting, what gave you this impression? This article was first published only 4-6 months ago around Oct 2023.
Don't get me wrong, I'm a fan of Sourcegraph and their founder, Quinn, is quite charismatic. I almost went for a job offer with them 5+ years ago. But let's be real, startups are not a winning game for a non-founder; thank goodness I stuck it out with BigCorp. In any case, this isn't that old of a post, cheers.
They seem to have got the correct impression that it's been "around for some time": "some time" does not mean "a very long time", and you just confirmed that it isn't a just-published article but one which, having come out last year, has indeed been around "for some time".
I'm guessing you're just reading into the phrase an implication that isn't there about it being particularly old.
> Congratulations, you just wrote a code completion AI!
> In fact, this is pretty much how we started out with Cody autocomplete back in March!
Am I wrong in thinking that there's only like 3(?) actual AI companies and everything else is just some frontend to ChatGPT/Llama/Claude?
Is this sustainable? I guess the car industry is full of rebadged models with the same engines and chassis. It's just wild that we keep hearing about the AI boom as though there's a vibrant competitive ecosystem and not just Nvidia, a couple of software partners and then a sea of whiteboxers.
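To make the quoted "you just wrote a code completion AI" baseline from the article concrete: it really is just a prompt around the text near the cursor. A minimal sketch (the model name, prompt wording, and stop tokens here are placeholders, not what Cody or anyone else actually ships):

    # Hypothetical minimal "code completion AI": send the text around the
    # cursor to a chat endpoint and splice the reply back in.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def complete(prefix: str, suffix: str = "") -> str:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model choice
            messages=[
                {"role": "system",
                 "content": "Complete the code at <CURSOR>. Reply with code only."},
                {"role": "user", "content": f"{prefix}<CURSOR>{suffix}"},
            ],
            max_tokens=64,
            temperature=0.2,
            stop=["\n\n"],  # crude way to stop after a single block
        )
        return response.choices[0].message.content

    print(complete("def fib(n: int) -> int:\n    "))

Everything the article describes beyond this (context retrieval, latency work, post-processing) is the actual product.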
For those who might not be aware of this, there is also an open source project on GitHub called "Twinny" which is an offline Visual Studio Code plugin equivalent to Copilot: https://github.com/rjmacarthy/twinny
It can be used with a number of local model services. Currently, on an NVIDIA 4090, I'm running both the base and instruct models of deepseek-coder 6.7b, using Q5_K_M-quantized GGUF files (for performance), through the llama.cpp "server", where the base model handles completions and the instruct model handles chat interactions.
llama.cpp: https://github.com/ggerganov/llama.cpp/
deepseek-coder 6.7b base GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GGU...
deepseek-coder 6.7b instruct GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct...
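For anyone wanting to replicate that setup: assuming the stock llama.cpp server binary and TheBloke's usual file naming (the paths, ports, and -ngl value below are my own arbitrary choices), launching the two instances might look like this:

    # base model for completions, instruct model for chat --
    # run each in its own terminal, since the server blocks
    ./server -m models/deepseek-coder-6.7b-base.Q5_K_M.gguf \
        -c 4096 -ngl 99 --port 8080
    ./server -m models/deepseek-coder-6.7b-instruct.Q5_K_M.gguf \
        -c 4096 -ngl 99 --port 8081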
you've got to differentiate between training, inference and hardware. they all benefit from the "AI boom" but at different levels and have varying levels of substitutability (Google tells me that's a real word)
It makes sense: to come up with a base model, you need a lot of quality training data and tons of compute.
The role of an AI startup is to come up with ideas, and thus useful products.
Most existing pre-AI products are also front ends to existing operating systems and existing databases, because creating the whole stack does not make sense.
At least we have state-of-the-art open models that we can use freely.
I mean... the article goes on to explain all their value add... which of course can be replicated, but it's not as if you can just grab an API key and do the same.
But OpenAI could make this themselves in a few weeks, right? If they happen to decide to, then this company is done.
That's what I don't get about all these AI startups. The core value of their business is an API call they don't own. It's like people who were selling ______ apps on iPhone when Apple added their own built-in _____ features to iOS, except the entire industry is like that.
One thing I really miss is a standard way to block any copilot/ai code completion tool from reaching specific files.
That’s particularly important for .env files containing sensitive info. We don’t want secrets leaking outside our machines; imagine the risks if they become part of the next training dataset.
That’d be really easy to standardize; it’s just another .gitignore-like file.
With Cody, we have a relationship w/ both Anthropic and OpenAI to never use any data submitted via Cody users for training and data is not retained either.
> With Cody, we have a relationship w/ both Anthropic and OpenAI to never use any data submitted via Cody users for training and data is not retained either.
Can you say more about this? Is it public? Contractual? Verifiable somehow?
These types of tools should exclude all files and directories in the .gitignore as standard operating procedure, unless those files are specifically included. Not just because of secrets, but also because these files are not considered to be part of the repository source code and it would be unusual to need to access them for most tasks.
We need a standardized .aiignore file that everyone can work with. Aider does this with .aiderignore. They all just need to agree on a common filename.
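As a sketch, such a file could use the same matching rules as .gitignore (the filename and these entries are hypothetical, not an existing standard):

    # hypothetical .aiignore -- .gitignore syntax
    .env
    .env.*
    *.tfstate
    *.tfvars
    secrets/
    **/credentials.json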
I agree, but you really shouldn't keep unencrypted secrets locally to begin with.
Most secret managers allow you to either specify value references in .env files, or provide a way of running programs that specifically gives them access to secrets.
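For example, 1Password's CLI resolves op:// references at run time, so the .env on disk never holds real values (the vault and item names below are made up):

    # .env with references instead of raw secrets (1Password syntax;
    # other secret managers have similar reference formats)
    DATABASE_URL="op://dev-vault/postgres/connection-string"
    STRIPE_KEY="op://dev-vault/stripe/api-key"
    # resolved at launch: op run --env-file=.env -- npm start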
I wouldn’t want Cody interrogating my local Terraform state or tfvars, for example. There might not be unencrypted secrets in there, but there’s configuration data that I don’t want disclosed either.
I often start a new project and work on it using Copilot for quite a while before making it official, running "git init" and creating a .gitignore file.
Very interesting! I wonder to what extent this assumption holds only because completions are tied to traditional code autocomplete.
> One of the biggest constraints on the retrieval implementation is latency
If I’m getting a multi line block of code written automagically for me based on comments and the like, I’d personally value quality over latency and be more than happy to wait on a spinner. And I’d also be happy to map separate shortcuts for when I’m prepared to do so (avoiding the need to detect my intent).
This is great feedback and something we are looking at in regards to Cody. We value developer choice, and at the moment, for Chat, developers can choose between various LLMs (Claude 3 Opus, GPT-4 Turbo, Mixtral 8x7b) that offer different benefits.
For autocomplete, at the moment we only support StarCoder because it has given us the best trade-off of latency and quality, but we'd definitely love to support more (and give users the choice of LLM, so if they prefer waiting longer for higher-quality results, they can).
> We value developer choice, and at the moment, for Chat, developers can choose between various LLMs (Claude 3 Opus, GPT-4 Turbo, Mixtral 8x7b) that offer different benefits.
I wish y'all would put a little more effort into user experience. When you go to subscribe it says:
> Claude Instant 1.2, Claude 2, ChatGPT 3.5 Turbo, ChatGPT 4 Turbo Preview
Trying to figure out what's supported was tedious enough[0] that I just ended up renewing my Copilot subscription instead.
[0] Your contact page for "information about products and purchasing" talks about scheduling a meeting. One of your welcome emails points us to your Discord but then your Discord points us to your forum.
> Because of the language-specific nature of this heuristic, we generally do not support multi-line completions for all languages. However, we’re always happy to extend our list of supported languages and, since Cody is open-source, you can also contribute and improve the list. (link to https://github.com/sourcegraph/cody/blob/main/vscode/src/com...)
That link to the list of supported languages is broken. I couldn't find a similar file elsewhere in the repo: maybe the list got folded up into a function in another file? Also a bit annoying that I couldn't find the info on the company's website (though I gave up pretty quickly).
The list of supported languages is at https://sourcegraph.com/docs/cody/faq#what-programming-langu.... It works for all programming languages, but it works better on some than others, depending on the LLM and the language-specific work we've done on context-fetching, syntax-related heuristics, etc. On the LLM point, Cody supports multiple LLMs for both chat and autocomplete (Claude 3, GPT-4 Turbo, Mixtral, StarCoder, etc.), which is great for users but makes it tough to give any formal definition of "supported language".
Even that link doesn't claim to be comprehensive, including an extremely vague "shell scripting languages (like Bash, PowerShell)," as if I am supposed to cross my fingers and hope this means a nonstandard shell like fish works, when it seems like that could introduce weird or inscrutable bugs coming from incompatible bash syntax. (One under-discussed downside of LLM code generation is that it further penalizes innovation in programming languages.)
What I find frustrating is that Cody is doing deterministic language-specific tricks that don't depend on the underlying LLM, so why not just give a list of all the languages you did deterministic tricks for and call that the supported languages? Why be vague? Why deflect to the underlying LLM to encourage people to try your product with languages you won't do any work to support?
Claiming it works on "all programming languages" is lazy and dishonest, and clearly false. There's nothing magical about LLMs that relieves software developers of the duty to specify the limitations of their software.
Fantastic article and impressive work by this company. They're basically wrapping LLMs with a working memory and tying it to user input. And thus we step a little closer to AGI/ASI.
(I left this comment earlier but I'll c+p here as well)
I wrote a blog post comparing Cody to Copilot a little while ago. Some of the stuff might be outdated now, but I think it still captures the essence of the differences between the two. Obviously I'm a little biased as I work for Sourcegraph, but I tried to be as fair as one could be. Happy to dive deeper into any details.
https://sourcegraph.com/blog/copilot-vs-cody-why-context-mat...
Our biggest differentiators are context, choice, and scale. We've been helping developers find and understand code for the last 10 years and are now applying a lot of that to Cody in regards to fetching the right context. When it comes to choice, we support multiple LLMs and are always on the lookout for the right LLM for the job. We recently rolled out Claude 3 Opus as well as Ollama support for offline/local inference.
Cody also has a free tier where you can give it a try and compare for yourself, which is what I always recommend people do :)
I ended up disabling Copilot. The reason is that the completions do not always integrate with the rest of the code, in particular with non-matching brackets. Often it just repeats some other part of the code. I had far fewer cases of this with Cody, though, arguably, the difference is not huge. But then add the choice of models on top of that.
I noticed I had a lot fewer of these problems these last few weeks. I suspect the Copilot team has put a lot more effort into quality-of-life fixes recently.
For instance, I'd often get a problem where I'd type "foo(", and VS Code would auto-close the parenthesis, so my cursor would be in "foo(|)", but Copilot wouldn't be aware of the auto-close, so it would suggest "bar)" as a completion, leading to "foo(bar))" if I accepted it. But I haven't had this problem in recent versions. Other similar papercuts I'd noticed have been fixed.
I haven't used Cody, though, so I don't know how they compare.
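The tool-side fix for that papercut is presumably a post-processing pass along these lines (a toy sketch, not Copilot's actual code):

    # If the editor already auto-inserted the closing bracket that the
    # completion also ends with, drop it from the completion.
    CLOSERS = {")", "]", "}"}

    def dedupe_autoclose(completion: str, suffix: str) -> str:
        if completion and completion[-1] in CLOSERS:
            if suffix.lstrip()[:1] == completion[-1]:
                return completion[:-1]
        return completion

    assert dedupe_autoclose("bar)", ")") == "bar"   # the "foo(|)" case
    assert dedupe_autoclose("bar)", "") == "bar)"   # nothing after cursor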
I've used Copilot for months and Cody just today. I'm in the habit of using autocomplete to generate multiline chunks of code. So far, Copilot seems a bit better at autocomplete.
In particular, Copilot seems to do better at generating missing TypeScript import statements. These are relative imports of files in the same small repo. Neither of them seems to really understand my codebase in the way that Cody promises - they make up imports of nonexistent files. Copilot sometimes guesses the right file, I think because it understands my naming conventions better.
I switched from Copilot to Supermaven and in my experience it’s more than twice as effective. The suggestions are better and incredibly fast. Copilot was a nice productivity boost but this is next level for me; I’m genuinely building features noticeably faster.
These folks get it, I agree.
I'm curious why you think a big corporation has a higher expected value than a startup.
In the Cody settings.json file you can disable autocomplete on entire languages/file types.
Additionally, we recently rolled out a Cody Ignore file type where you can specify files/folders that Cody will not look at for context. This feature is still in experimental mode though. https://sourcegraph.com/docs/cody/capabilities/ignore-contex...
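For illustration, the per-language toggle lives in VS Code's settings.json and looks roughly like this (key name from memory; check the extension's settings UI for the current schema):

    // VS Code settings.json (JSONC, so comments are allowed)
    {
      "cody.autocomplete.languages": {
        "*": true,          // default: autocomplete on everywhere...
        "plaintext": false, // ...except these VS Code language IDs
        "markdown": false
      }
    }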
This does exist in GitHub Copilot - it’s called content exclusions: https://docs.github.com/en/copilot/managing-github-copilot-i...
I’m not sure if Cody has a similar feature, or if there’s any move towards a standardised solution.
> This feature is available for organization accounts with a Copilot Business subscription.
And even if you exclude files, the moment anyone starts a chat they are read and sent, and could inform suggestions:
> Excluding content from GitHub Copilot currently only affects code completion. GitHub Copilot Chat is not affected by these settings.
Both quotes are from your link.
Or maybe a set of sensible global default ignores, based on the usual suspects from gitignore.io or suchlike?
https://github.com/jimmc414/1filellm
As far as I know, a .env file is the main supported way of setting environment variables in VS Code.
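Concretely, VS Code's debug configurations take an envFile path (the Node and Python debuggers both support this), e.g.:

    // .vscode/launch.json (excerpt)
    {
      "configurations": [
        {
          "name": "Launch app",
          "type": "node",
          "request": "launch",
          "program": "${workspaceFolder}/index.js",
          "envFile": "${workspaceFolder}/.env"
        }
      ]
    }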
You can do that with our local Ollama support, but that's still experimental and YMMV. Here's how to set it up: https://sourcegraph.com/blog/local-code-completion-with-olla...
But this test didn't seem to include TypeScript, so it's obviously not comprehensive. I'm not convinced this information is actually in one place.
Regarding Neovim: Cody does have experimental support for it: https://sourcegraph.com/docs/cody/clients/install-neovim. Not all features are supported as in VS Code though.
FWIW I believe Cody is much more actively developed than Copilot is these days, and so it has a more comprehensive feature set.