I've been using AI to solve isolated problems, mainly as a replacement for search engines, specifically for programming. I'm still not convinced by this "write a whole block of code for me" type of use case. Here are my arguments against the videos from the article.
1. Snake case to camelCase. Even without AI we can already complete these tasks easily. VSCode itself has a "Transform to Camel Case" command for selections. It is nice that the AI can figure out which text to transform based on context, but not too impressive. I could select one ":", use "Select All Occurrences", press left, then ctrl+shift+left to select all the keys.
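For reference, the transformation itself is mechanical enough to script in a few lines (a minimal Python sketch; the example dict is made up):

    def snake_to_camel(name: str) -> str:
        """Convert snake_case to camelCase: 'user_id' -> 'userId'."""
        head, *rest = name.split("_")
        return head + "".join(part.capitalize() for part in rest)

    # Rewrite every key in a config-style mapping.
    data = {"user_id": 1, "push_token": "abc", "created_at": "2024-01-01"}
    data = {snake_to_camel(k): v for k, v in data.items()}
    print(data)  # {'userId': 1, 'pushToken': 'abc', 'createdAt': '2024-01-01'}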
2. Generate boilerplate from documentation. Boilerplate is tedious, but not really time-consuming. How many of you spend 90% of your time writing boilerplate instead of the core logic of the project? If a language/framework (Java used to be one, not sure about now) requires me to spend that much time on boilerplate, that's a language to be ditched/fixed.
3. Turn a problem description into a block of concurrency code. Unlike the boilerplate, this code is more complicated. If I already know the area, I don't need AI's help to begin with. If I don't know it, how can I trust the generated code to be correct? It could miss a corner case that my question didn't specify, one I don't yet know exists myself. In the end, I still need to spend time learning Python concurrency, and then I'll be writing the same code myself in no time.
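To make the corner-case worry concrete, here is a classic Python gotcha that generated code can hit while looking perfectly plausible (a contrived sketch; fetch is a hypothetical flaky call):

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def fetch(url: str) -> str:
        raise IOError(f"boom: {url}")  # stand-in for a flaky network call

    urls = ["a", "b", "c"]

    # Looks fine, but worker exceptions vanish silently: nothing ever
    # calls .result(), so the IOError is never re-raised anywhere.
    with ThreadPoolExecutor() as pool:
        for url in urls:
            pool.submit(fetch, url)

    # The corner case handled: .result() re-raises worker exceptions.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fetch, url) for url in urls]
        for fut in as_completed(futures):
            try:
                fut.result()
            except IOError as exc:
                print("failed:", exc)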
In summary, my experience with AI is that if the question is easy (e.g. it's easy to find the exact same question on Stack Overflow), the answers are highly accurate. But if it is a unique question, their accuracy drops quickly. And the latter case is where we spend most of our time.
I started like this. Then I came around and can’t imagine going back.
It’s kinda like having a really smart new grad, who works instantly, and has memorized all the docs. Yes I have to code review and guide it. That’s an easy trade off to make for typing 1000 tokens/s, never losing focus, and double checking every detail in realtime.
First: it really does save a ton of time for tedious tasks. My best example is test cases. I can write a method in 3 minutes, but Sonnet will write the 8 best test cases in 4 seconds, which would have taken me 10 mins of switching back and forth, looking at branches/errors, and mocking. I can code review and run these in 30s. Often it finds a bug. It’s definitely more patient than me in writing detailed tests.
Instant and pretty great code review: it can understand what you are trying to do, find issues, and fix them quickly. Just ask it to review and fix issues.
Writing new code: it’s actually pretty great at this. I needed a util class for config that had fallbacks to config files, env vars and defaults. And I wanted type checking to work on the accessors. Nothing hard, but it would have taken time to look at docs for yaml parsing, how to find the home directory, which env var API returns null vs errors on blank values, typing, etc. All easy, but it takes time. Instead I described it in about 20 seconds and it wrote it (with tests) in a few seconds.
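Roughly the shape of that util class, as a minimal Python sketch (the file location, key coercion, and accessor style are assumptions here, not the commenter's actual code):

    import os
    from pathlib import Path

    import yaml  # PyYAML, assumed as the parser

    class Config:
        """Lookup order: config file -> environment variable -> default."""

        def __init__(self, path: Path | None = None) -> None:
            path = path or Path.home() / ".myapp.yaml"  # hypothetical location
            raw = yaml.safe_load(path.read_text()) if path.exists() else None
            self._data = raw or {}  # an empty file parses to None

        def get(self, key: str, default):
            if key in self._data:
                return self._data[key]
            env = os.environ.get(key.upper())
            if env is not None:
                # Coerce from string so the result's type matches the default.
                return type(default)(env)
            return default

    cfg = Config()
    port = cfg.get("port", 8000)  # int, from file, PORT env var, or default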
It’s moved well past the "it can answer questions from Stack Overflow" stage. If it has been a while (a while = 6 months in ML), try again with the new Sonnet 3.5.
>My best example is test cases. I can write a method in 3 minutes, but Sonnet will write the 8 best test cases in 4 seconds
For me it doesn't work. The generated tests either fail to run, or they run and fail.
I work in large C# codebases and in each file I have lots of injected dependencies. I have one public method which can call lots of private methods in the same class.
The AI either doesn't properly mock the dependencies, or it ignores what happens in the private methods.
If I take a lot of time guiding it where to look, it can generate unit tests that pass. But it takes longer than if I write the unit tests myself.
I’ve found it better at writing tests because it tests the code you’ve written vs what you intended. I’ve caught logic bugs because it wrote tests with an assertion for a conditional that was backwards. The readable name of the test clearly pointed out that I was doing the wrong thing (and the test passed!).
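A contrived illustration of that effect (hypothetical Python, not from the thread):

    def can_withdraw(balance: float, amount: float) -> bool:
        return balance < amount  # bug: the comparison is backwards

    # A test generated from the code as written. It passes, but its
    # readable name is what tips you off that the logic is inverted.
    def test_can_withdraw_returns_true_when_balance_is_below_amount():
        assert can_withdraw(balance=10.0, amount=100.0) is True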
Maybe a TLDR from all the issues I'm reading in this thread:
- It's gotten way better in the last 6 months. Both models (Sonnet 3.5 and the new October Sonnet 3.5) and tooling (Cursor). If you last tried Copilot, you should probably give it another look. It's also going to keep getting better. [1]
- It can make errors, so expect to do some code review and guiding. However, the error rates are going way, way down [1]. I'd say it's already below humans for a lot of tasks. I'm often doing 2-3 iterations before applying a diff, but a quick comment like "close, keep the test cases, but use the test fixture at the top of the file to reduce repeated code" and 5 seconds is all it takes to get a full refactor. Compared to code-review turnaround with a team, it's magic.
- You need to learn how to use it. Setting the right prompts, adding files to the context, etc. I'd say it's already worth learning.
- It just knows the docs, and that's pretty invaluable. I know 10ish languages, which also means I don't remember the system call to get an env var in any of them. It does, and can insert it a lot faster than I can google it. Again, you'll need to code review, but more and more it's nailing idiomatic error checking in each language.
- You don't need libraries for boilerplate tasks. zero_pad is the extreme/joke example, but a lot more of my code is just using system libraries.
- It can do things other tools can't. Tell it to take the visual style of one blog post and port it to another. Tell it to use a test file I wrote as a style reference, and update 12 other files to follow that style. Read the README and tests, then write pydocs for a library. Write a GitHub action to build docs and deploy to GitHub pages (including suggesting libraries, deploy actions, and offering alternatives). Again: you don't blindly trust anything, you code review, and tests are critical.

[1] https://www.anthropic.com/news/3-5-models-and-computer-use
> Instant and pretty great code review: it can understand what you are trying to do, find issues, and fix them quickly. Just ask it to review and fix issues.
Cursor’s code review is surprisingly good. It’s caught many bugs for me that would have taken a while to debug, like off by one errors or improperly refactored code (like changing is_alive to is_dead and forgetting to negate conditionals)
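The is_alive/is_dead slip, reduced to a runnable toy (hypothetical Python; the original context isn't given):

    from dataclasses import dataclass

    @dataclass
    class Entity:
        is_dead: bool = False  # renamed from is_alive during a refactor

    def take_turns(entities: list[Entity]) -> list[Entity]:
        # The rename happened, but the condition was never negated,
        # so only dead entities get a turn.
        return [e for e in entities if e.is_dead]  # should be: not e.is_dead

    print(take_turns([Entity(is_dead=False), Entity(is_dead=True)]))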
This “really smart new grad” take is completely insane to me, especially if you know how LLMs work. Look at this SQL snippet Claude (the new Sonnet) generated recently.
-- Get recipient's push token and sender's username
SELECT expo_push_token, p.username
INTO recipient_push_token, sender_username
FROM profiles p
WHERE p.id = NEW.recipient_id;
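-- (The bug: both columns are read from the profiles row matching
-- NEW.recipient_id, so sender_username is actually the recipient's
-- username, not the sender's.)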
Seems like the world has truly gone insane and engineers are tuned into some alternate reality a la Fox News. Well... it'll be a sobering day when the other shoe drops.
Another fun example from yesterday: pasted a blog post in markdown into a HTML comment. Selected it and told sonnet to convert it to HTML using another blog post as a style reference.
I replaced SO with ChatGPT and it’s the only good use case I found: finding an answer I can build onto. But outsourcing my reflection? That’s a dangerous path. I tried that on small projects, building a project from scratch with Cursor just to test it. Sometimes it’s spot on, but in many instances it completely misses some cases and edge cases. Impossible to trust blindly. And if I do, and don’t take proper time to read and think about the code, the consequences pile up and waste my time in the long run, because it’s prompt over prompt over prompt to refine it, and sometimes it’s still not exactly right. That messes up my thinking, and I prefer to do it myself and use it as documentation on steroids. I never used Google and SO again for docs. I have the feeling that relying on it too much to write even small blocks of code will make us lose some abilities in the long run, and I don’t think that’s a good thing. Will companies allow us to use AI in code interviews for boilerplate?
The AIs are to a large degree trained on tutorial code, quick examples, how-tos and so on from the net. Code that really should come with a disclaimer: "Don't use in production, example code only."
This leads to your code being littered with problematic edge cases that you still have to learn how to fix. Or, in the worst case, you don't even notice that there are edge cases, because you just copy-pasted the code and it works for you. Your users will find the edge cases with time.
I’m slightly worried that these AI tools will hurt language development. Boilerplate heavy and overly verbose languages are flawed. Coding languages should help us express things more succinctly, both as code writers and as code readers.
If AI tools let us vomit out boilerplate and syntax, I guess that sort of helps with the writing part (maybe. As long as you fully understand what the AI is writing). But it doesn’t make the resulting code any more understandable.
Of course, as is always the case, the tools we have now are the dumbest they’ll ever be. Maybe in the future we can have understandable AI that can be used as a programming language, or something. But AI as a programming language generator seems bad.
I used to agree with this, but the proliferation of JavaScript made me realize that newer/better programming languages were already not coming to save us.
Just wondering: since seniors pair with LLMs, the world needs far fewer juniors. Some juniors will go away to other industries, but some might start projects in new languages without LLM/business support.
Frankly, otherwise I don't see how any new lang corpus might get created.
Before you dismiss all of this because "You could do it by hand just as easily", you should actually try using Cursor. It only takes a few minutes to setup.
I'm only 2 weeks in but it's basically impossible for me to imagine going back now.
It's not the same as GH Copilot, or any of the other "glorified auto-complete with a chatbox" tools out there. It's head and shoulders better than everything else I have seen, likely because the people behind it are actual AI experts and have built numerous custom models for specific types of interactions (vs a glorified ChatGPT prompt wrapper).
> I could select one ":", use "Select All Occurrences"
Only if they're the same occurrences. Cursor can often get the idea of what you want to do with a whole block of different names. Unless you're a vim macro master, that's not easily doable.
> How many of you spend 90% of your time writing boilerplate instead of the core logic of the project?
It doesn't take much time, but it's a distraction. I'd rather tab through some things quickly than context switch to the docs, finding the example, adapting it for the local script, then getting back to what I was initially trying to do. Working memory in my brain is expensive.
I still spend a good amount of time on boilerplate. Stuff that's not thinking hard about the problem I'm trying to solve. Stuff like unit tests, error logging, naming classes, methods and variables. Claude is really pretty good at this, not as good as the best code I've read in my career but definitely better than average.
When I review Sonnet's code, the code is more likely to be correct than if I review my own. If I make a mistake, I'll read what I intended to write, and not what I actually wrote. Whereas when I review Sonnet's, there are two passes, so the chance an error slips through is smaller.
Completely agree. I find it fails miserably at business logic, which is where we spend most of our time. But it does great at generic stuff, which is already trivial to find on Stack Overflow.
> But it does great at generic stuff, which is already trivial to find on Stack Overflow.
The major difference is that with Cursor you just hit "tab", and that thing is done. Vs breaking focus to open up a browser, searching SO, finding an applicable answer (hopefully), translating it into your editor, then reloading context in your head to keep moving.
My experience has been different. My major use case for AI tools these days is writing tests. I've found that the generated test cases are very much in line with the domain. It might be because we've been strictly using domain-driven design principles. It even generates test cases that fail, showing what we've missed.
I have a corporation-sponsored subscription to GitHub Copilot + Rider.
When I'm writing unit tests or integration tests it can guess the boilerplate pretty well.
If I already have an AddUserSucceeds test and I start writing `public void Dele...`, it usually fills in the DeleteUserSucceeds function with pretty good guesses at what Asserts I want there - most times it even guesses the API path/function correctly, because it uses the whole project as context.
I can also open a fresh project I've never seen and ask "Where is DbContext initialised" and it'll give me the class and code snippet directly.
Have you tried recently to start a new web app from scratch? Especially the integration of a frontend framework with styling, and the frontend/backend integration.
Oh my god get ready to waste a full weekend just to setup everything and get a formatted hello world.
That’s why I use Rails for work. But I also had to write a small Node.js project (vite/react + express) recently for a private project, and it has a lot of nice things going for it that make modern frontend dev really easy - but boy is it time-consuming to set up the basics.
That's an indictment of the proliferation of shitty frameworks and documentation. It's not hard to figure out such a combination and then keep a template of it lying around for future projects. You don't have to reach for the latest and shiniest at the start of every project.
Most frontend frameworks come with usable templates. Setting up a new Vite React project and getting to a formatted hello world can be done in half an hour tops.
> Boilerplate is tedious, but not really time-consuming.
In the aggregate, almost no programmer can think up code faster than they can type it in. But being a better typist still helps, because it cuts down on the amount you have to hold in your head.
Similar for automatically generating boilerplate.
> If I don't know, how can I trust the generated code to be correct?
Ask the AI for a proof of correctness. (And I'm only half-joking here.)
In languages like Rust the compiler gives you a lot of help in getting concurrency right, but you still have to write the code. If the Rust compiler approves of some code (AI-generated or artisanally crafted), you are already pretty far along in getting concurrency right.
A great mind can take a complex problem and come up with a simple solution that's easy to understand and obviously correct. AI isn't quite there yet, but getting better all the time.
> In the aggregate, almost no programmer can think up code faster than they can type it in.
And thank god! Code is a liability. The price of code is coming down but selling code is almost entirely supplanted by selling features (SaaS) as a business model. The early cloud services have become legacy dependencies by now (great work if you can get it). Maintaining code is becoming a central business concern in all sectors governed by IT (i.e. all sectors, eating the world and all that).
On a per-feature basis, more code means higher maintenance costs, more bugs and greater demands on developer skills and experience. Validated production code that delivers proven customer value is not something you refactor on a whim (unless you plan to go out of business), and the fact that you did it in an evening thanks to ClippyGPT means nothing: the costly part is always what comes after, demonstrating value or maintaining trust in a competitive market with a much shallower capital investment moat.
> In the aggregate, almost no programmer can think up code faster than they can type it in. But being a better typist still helps, because it cuts down on the amount you have to hold in your head.
I mean, on the big-picture level, sure they can. Or in detail, if it is something they have good experience with. In many cases I get a visual of whole code blocks, and then if I use Copilot I can already predict what it is going to auto-complete for me based on the context, and then I can pretty much know in a second whether it was right or wrong. Of course this is more so for side projects, since I know exactly what I want to do, so it feels like most of the time it just has to vomit all the code out. And I feel impatient, so Copilot helps a lot with that.
I don’t even trust the API-based exercises anymore unless it’s a stable and well-documented API. Too many times I’ve been bitten by an AI mixing and matching method signatures from different versions, using outdated approaches, mixing in APIs from similar libraries, or just completely hallucinating a method. Even if I load the entire library docs into the context, I haven’t found one that’s completely reliable.
> Snake case to camelCase
> VSCode itself has command of "Transform to Camel Case"
I never understand arguments like this. I have no idea what the shortcut for this command is. I could learn this shortcut, sure, but tomorrow I’ll need something totally different. Surely people can see the value of having a single interface that can complete pretty much any small-to-medium-complexity data transformation. It feels like there’s some kind of purposeful gaslighting going on about this and I don’t really get the motive behind it.
Exactly. I think some commenters are taking this example too literally. It's not about this specific transformation, but how often you need to do similar transformations and don't know the exact shortcut or regex or whatever to make it happen. I can describe what I want in three seconds and be done with it. Literal dropbox.png going on in this thread.
If you aren’t using AI for everything, you’re using it wrong. Go learn how to use it better. It’s your job to find out how. Corporations are going to use it to replace your job.
(Just kidding. I’m just making fun of how AI maxis reply to such comments, but they do it more subtly.)
Boilerplate comes up all the time when writing Erlang with OTP behaviors though, and sometimes you have no idea if it really is the right way or not. There are Emacs skeletons for that (through tempo), but it feels like they are sometimes out of date.
1. Is such a quick task for me anyway that I don’t lose much just doing it by hand.
2. The last time I wrote boilerplate-heavy Java code, 15+ years ago, the IDE already generated most of it for me. Nowadays boilerplate comes in two forms for me: new project setup, where I find it far quicker to use a template or just copy and gut an existing project (and it’s not like I start new projects that often anyway), or new components that follow some structure, where AI might actually be useful but I tend to just copy an existing one and gut it.
3. These aren’t tasks I really trust AI for. I still attempt to use AI for them, but 9 out of 10 times I come away disappointed. And the other 1 time, I end up having to change a lot of it anyway.
I find a lot of value from AI, like you, asking it SO-style questions. I do also use it for code snippets, e.g. “do this in CSS”. Its results for that are usually (but not always) reasonably good. I also use it for isolated helper functions (write a function to flood fill a grid where adjacent values match was a recent one; a sketch of roughly that helper follows this comment). The results for this range from a perfect solution first try, to absolute trash. It’s still overall faster than not having AI, though. And I use it A LOT for rubber ducking.
I find AI is a useful tool, but I find a lot of the positive stories to be overblown compared to my experience with it. I also stopped using code assistants and just keep a ChatGPT tab open. I sometimes use Claude but its conversation length limits turned me off.
Looking at the videos in OP, I find the parallelising task to be exactly the kind of tricky and tedious task that I don’t trust AI to do, based on my experience with that kind of task, and with my experience with AI and the subtly buggy results it has given me.
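For reference, that flood-fill helper is about this much code (a minimal Python sketch, not the commenter's actual result):

    def flood_fill(grid: list[list[int]], row: int, col: int, new_value: int) -> None:
        """Replace the connected region of matching values around (row, col)."""
        target = grid[row][col]
        if target == new_value:
            return
        stack = [(row, col)]
        while stack:
            r, c = stack.pop()
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == target:
                grid[r][c] = new_value
                stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])

    grid = [[1, 1, 0],
            [1, 0, 0],
            [0, 0, 1]]
    flood_fill(grid, 0, 0, 7)
    print(grid)  # [[7, 7, 0], [7, 0, 0], [0, 0, 1]]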
It's amazing how many of my colleagues don't use Cursor simply because they haven't taken the 10 minutes to set it up.
It's amazing how many naysayers there are about Cursor. There are many here and they obviously don't use Cursor. I know this because they point out pitfalls that Cursor barely runs into, and their criticism is not about Cursor, but about AI code in general.
Some examples:
"I tried to create a TODO app entirely with AI prompts" - Cursor doesn't work like that. It lets you take the wheel at any moment because it's embedded in your IDE.
"AI is only good for reformatting or boilerplate" - I copy over my boilerplate. I use Cursor for brand new features.
"Sonnet is same as old-timey google" - lol Google never generated code for you in your IDE, instantly, in the proper place (usually).
"the constantly changing suggested completions seem really distracting" - You don't need to use the suggested completions. I barely do. I mostly use the chat.
"IDEs like cursor make you feel less competent" - This is perhaps the strongest argument, since my quarrel is simply philosophical. If you're executing better, you're being more competent. But yes some muscles atrophy.
"My problem with all AI code assistants is usually the context" - In Cursor you can pass in the context or let it index/search your codebase for the context.
You all need to open your minds. I understand change is hard, but this change is WAY better.
Cursor is a tool, and like any tool you need to know how to use it. Start with the chat. Start by learning when/what context you need to pass into chat. Learn when Cmd+K is better. Learn when to use Composer.
I've noticed that tools like Cursor don't really seem to make a difference in the end. The good software developers are still the good software developers, regardless of editors.
I don't think you should be upset or worried that people aren't adopting these tools as you think they should. If the tool really lives up to its hype then the non-adopters will fall behind and, for example, be forced to switch to Cursor. This happened with IDEs (e.g. IntelliSense, jump to definition). It may happen with tools like Cursor.
I certainly don't feel this way, but if I'm proven wrong that's good.
Would being proven wrong mean that Cursor is used by all devs, or that IDEs adopt AI into their workflow?
Like OP, using Cursor has been a huge productivity boost for me. I maintain a few Postgres databases, I work as a fullstack developer, and I manage Kubernetes configs. When using Cursor to write SQL tables or queries, it adopts my way of writing SQL. It analyzed (context) my database folder, and when I ask it to create a query, a function, a table, the output is in my style. This blew me away when I first started with Cursor.
Onto react/nextjs projects. In the same fashion, I have my way of writing components, fetching data, and now writing RSA. Cursor analyzed my src folder, and when asked to create components from scratch the output was again similar to my style. I use raw CSS and class names, and what was once the obstacle of naming has become trivial with Cursor ("add an appropriate class to this component with this styling"). Again, it analyzed all my CSS files and spits out CSS/classes in my writing/formatting style. And working on large projects, it is easy to forget the many, many components, packages, etc. that have already been integrated/written. Again, Cursor comes out on top.
Am I a good developer or a bad developer? Don't know. Don't care. I'm cranking out features faster than I ever have in my decades of development. As has been said before, as a software engineer you spend more time reading code than writing it. The same applies to genAI. It turns out that I can ask Cursor to analyze packages and spit out code, yaml configuration, and SQL, and it gets me 80% of the way compared to writing from scratch. Heck, if I need types to get the full client/server type-completion experience, it does that too! I have removed many dependencies (tailwind, tRPC, react query, prisma, to name a few) because Cursor has helped me overcome the obstacles these tools assisted with (and I still have TypeScript code hints in all my function calls!).
All in all, Cursor has made a huge difference for me. When colleagues ask me to help them optimize SQL, I ask Cursor to help out. When colleagues ask me to write generic types for their components, I ask Cursor to help out. Whether it's Cursor or some other tool, integrating AI with the IDE has been a boon for me.
Cursor's show-stopping problem is not whether it is useful; the problem is that it is proprietary. These sorts of tools are fun to play with to try out things that might be useful in the future, but relying on them puts you at the mercy of a VC-backed company with correspondingly dodgy motivations. The only way these technologies will be acceptable for widespread use is to measure them as we do programming languages and to only adopt free software implementations.
To an ideological position like yours, I would say... maybe? I, for one, am happy to pay for good solutions and let the market figure it out. If there are open source solutions that are just as smooth, that's great. I've seen a few, but none have been as good thus far.
I also find it fascinating how, in almost every LLM-related discussion, there are always people writing arguments to prove that LLMs do not work.
OK, I understand. Maybe they can't get much use of them and that's fine. But why they always insist that the tools don't work for everyone is something I can't make any sense of.
I stopped arguing online about this though. If they don't want to use LLMs that's fine too. Others (we) are taking their business.
I've been a paying customer of JetBrains IDEs for years.
After trying Cursor, I'd say if I were a JetBrains dev I'd be very worried. It's a true paradigm shift. It feels like JetBrains' competitive edge over other editors/IDEs mostly vanished overnight.
Of course JetBrains has its own AI-based solution and I'm sure they'll add more. But I think what JetBrains excels at -- the understanding of semantics -- is no longer that important for an IDE.
Why would they be? Cursor took an existing editor and added some AI features on top of it. Features that are enabled by a third party API with some good prompts, something easily replicable by any editor company. Current LLMs are a commodity.
I am slow to take on new tools and was coding in Notepad for far too long... but I am already on the Cursor boat. Right now, I use it for two things - code completion and pasting error messages into chat.
When people complain about LLMs hallucinating results, that doesn't really apply because it is either guessing wrong on the autocomplete (in which case I just keep typing) or it doesn't instantly point out the bug, in which case I look at the code or jump to Google.
The naysayers used to bother me, but then I realized it’s no skin off my back if they don’t want to become familiar with a transformative technology. Stay with the old tools, people are getting excited for no reason at all, everyone is just pretending to be more productive!
It reminds me of how blackberry users insisted physical keyboards were necessary and smartphone touchscreen users were deluded.
I’m not using Cursor because I don’t want my code to go through yet another middleman I’m not sure I can trust. I can relatively safely put my trust in OpenAI, but Cursor? Not so sure. How do I know they’re secure?
At one company, the CEO said AI tools in general should not be used, due to fear of invalidating a patent application in progress, after the lawyer said it must be kept secret except with NDA partners. I explained that locally run LLMs don't upload anything, so those are OK. This is a company that really needs better development velocity, and is buried alive in reports that need writing.
On the other hand, at another company, where the NDAs are stronger and more one-sided, and there's a stronger culture of code silos, "who needs to know" governing read access to individual code repos, even for mundane things like web dashboards, and higher security in general, I expected nobody would be allowed to use these tools, yet I saw people talking about their Copilot and Cursor use openly on the company Slack.
There was someone sitting next to me using Cursor yesterday. I'd consider hiring them, if they're interested, but there's no way they're going to want to join a company that forbids using AI tools that upload code being worked on.
So I don't think companies are particularly consistent about this at the moment.
(Perhaps Apple's Private Cloud Compute service, and whatever equivalents we get from other cloud vendors, will eventually make a difference to how companies see this stuff. We might also see some interesting developments with fully homomorphic encryption (FHE). That's very slow, but the highly symmetric tensor arithmetic used in ML has potential to work better with FHE than general-purpose compute.)
You wouldn't be the first engineer to fade into irrelevance because they were too proud to adapt to the changing world around them. I'd encourage you to open your mind a bit.
I recently started using Cursor for all my typescript/react personal projects and the increase in productivity has been staggering. Not only has it helped me execute way faster, similar to the OP I also find that it prevents me from getting sidetracked by premature abstraction/optimization/refactoring.
I recently went from an idea for a casual word game (a la Wordle) to a fully polished product in about 2h, which would have taken me 4 or 5 times that if I hadn’t used Cursor. I estimate that 90% of the time was spent thinking about the product, directing the AI, and testing, and about 10% of the time actually coding.
You're getting a lot of snark in the comments, but your excitement is warranted. It's fascinating how any claims of a code tool being useful always seem to offend the ego and bring out all the chest-thumping super-programmers claiming they could do it better.
I would love to see the world where you didn't use AI and instead invested the time to make yourself a stronger programmer. A React Wordle clone isn't something most developers would need 2 hours to make (sure, maybe the styling/hosting AROUND the Wordle clone might take longer). I'm not saying you're a bad programmer or a bad person, but what is the opportunity cost of using AI here? Are you optimising yourself into a local minimum?
They said 90% of it was spent on ideation and exploration.
They didn't specifically mean they built a Wordle clone, just a game like it. If they wanted just a Wordle clone, they would've gotten one within a few minutes of using codegen tools.
I think this excitement reflects the fact that most devs are shoemakers without shoes. They could get cursor-like experience decades ago by preparing snippets, tools, templates, editor configs and knowledge bases. But they used that “a month of work can save two days of planning” principle, so now having a sort of a development toolkit feels refreshing. Those who had it aren’t that impressed.
In my experience, Cursor writes average code. This makes sense, if you think about it. The AI was trained on all the code that is publicly available. This code is average by definition.
I'm below average in a lot of programming languages and tools. Cursor is extremely useful there because I don't have to spend tens of minutes looking up APIs or language syntax.
On the other hand, in areas I know more about, I feel that I can still write better code than Cursor. This applies to general programming as well. So even if Cursor knows exactly how to write the syntax and which function to invoke, I often find the higher-level code structure it creates sub-optimal.
Overall, Cursor is an extremely useful tool. It will be interesting to see whether it will be able to crawl out of the primordial soup of averages.
Exactly right. Cursor makes it easy to get to "adequate." Which in the hundreds of places that I'm not expert or don't have a strong opinion, is regularly as good as and frequently better than my first pass. Especially as it never gets tired whereas I do.
It's a great intern, letting me focus on the few but important places that I add specific value. If this is all it ever does, that's still enormously valuable.
This is true. But with a little push here and there you can usually avoid the sub-optimal high level code structure. That's why it makes so much sense to have it in the IDE.
You can see that, in general, anything AI produces is pretty average.
But people who buy software don't care that the code behind it is average. As long as it works.
Whereas people who buy text, images and video do care.
I've been having some difficulties with deprecated code and old patterns being suggested all the time. But I guess this is an easy issue to fix and will probably be fixed eventually.
I’m doing an experiment in this in real time: I’ve got a bunch of top-flight junior folks, all former Jane and Google and Galois and shit, but all like 24.
I’ve also been logging every interaction with an LLM and the exit status of the build on every mtime of every language mode file and all the metadata: I can easily plot when I lean on the thing and when I came out ahead, I can tag diffs that broke CI. I’m measuring it.
My conclusion is that I value LLMs for coding in exactly the same way that the kids do: you have to break Google in order for me to give a fuck about Sonnet.
LLMs seem like magic unless you remember when search worked.
The best way to get useful answers was (and for me still is) to ask Google for "How do I blah site:stackoverflow.com". Without the site filter, Google results suck or are just a mess, and Stack Overflow's own search is crap.
I don't understand, are you using LLMs purely for information retrieval, like a database (or search index)? I mean sure that's one usecase, but for me the true power of LLMs comes from actually processing and transforming information, not just retrieving it.
I have my dots wired up where I basically fire off a completion request any time I select anything in emacs.
I just spend any amount of tokens to build a database of how 4o behaves, correlated with everything emacs knows, which is everything. I’m putting down tens of megabytes a day on exactly what point it did whatever thing.
What domain/type of software do you and they work on? Cursor has been quite effective for me and many others say the same.
As long as one prompts it properly with sufficient context, reviews the generated code, and asks it to revise as needed, the productivity boost is significant in my experience.
Well, the context is the problem. LLMs will really become useful when they (1) understand the WHOLE codebase and all its context, (2) also understand the changes to it over time (local history and git history), and (3) also use context from Slack - with all of that updating basically in real time.
That will be scary. Until then, it's basically just a better autocomplete for any competent developer.
Watching the videos in the article, the constantly changing suggested completions seem really distracting.
Personally, I find this kind of workflow totally counter-productive. My own programming workflow is ~90% mental work / doing sketches with pen & paper, and ~10% writing the code. When I do sit down to write the code, I know already what I want to write, don't need suggestions.
It's a tool. You get used to new tools. These days I can easily process "did something interesting appear" in the peripheral vision at the same time as continuing to type. But the most useful things don't happen while I write. Instead it's the small edits that immediately come up with "would you also like to change these other 3 things to make your change work?" Those happen in the natural breaks anyway, as I start scanning for those related changes myself.
A tool that forces me to shift from creating solutions to trying to figure out what might be wrong with some code is entirely detrimental to my workflow.
You can create markdown files containing all the planning you did and Cursor will have all of that as context to give you better suggestions. This type of prompting is what gives amazing results - not just relying on out of the box magic, which I think a lot of people are expecting.
Cursor has been an enabler for unfamiliar corners of development. Mind you, it's not a foolproof tool that writes correct code on the first try or anything close to that.
I've been in compilers, storage, and data backends for 15ish years, and had to do a little project that required recording audio clips in a browser and sending them over a websocket. Cursor helped me do it in about 5 minutes, while it would've taken at least 30 min of googling to find the relevant keywords like MediaStream and MediaRecorder, learn enough to whip something up, fail, then try to fix it until it worked.
Then I had to switch to streaming audio in near-realtime... here it wasn't as good: it tried sending segments of MediaRecorder audio which are not suitable for streaming (because of media file headers and stuff). But a bit of Googling, finding out about Web Audio APIs and Audio Worklet, and a bit of prompting, and it basically wrote something that almost worked. Sure it had some concurrency bugs like reading from the same buffer that it's overwriting in another thread. But that's why we're checking the generated code, right?
I've had similar experiences. I've basically disengaged any form of AI code generation. I do find it useful for pointing me to interesting/relevant symbols and APIs, however, but it doesn't save me any time connecting plumbing, nor is that really a difficult thing for any programmer to do.
In the article, you mentioned that you've been writing code for 36 years, so don't IDEs like Cursor make you feel less competent? Meaning, I loved the process of scratching my head over a problem and then coming to a solution, but now we have AI agents solving the problems and optimizing code, which takes the fun out of it.
I feel like in the early stages of becoming a programmer, learning how to do all those little baseline problems is fun.
But eventually you get to a point where you've solved variations of the problem hundreds of times before, and it's just hours of time being burnt away writing it again with small adjustments.
It's like getting into making physical things with only a screwdriver and a hammer. Working with your hands on those little projects is fun. Then eventually you level up your skills and realize making massive things is much easier with a power drill and some automated equipment, and gives you time to focus on the design and intricacies of far more complicated projects. Though there are always those times where you just want to spend a weekend fiddling with basics for fun.
That should be when you move to more sophisticated (and also complex/complicated) languages that relieve you from as much of this boilerplate as possible.
The rest is then general design and architecture, which LLMs really don't help much with. What they are really good for is getting an idea of possible options in spaces where you have little experience, or quickly explaining and summarizing specific solutions and their pros and cons. But I tried to make one pick a solution based on the constraints, and even with many tries and careful descriptions, the results were really bad.
I’ve been using cursor for a while now and I think that if a problem is simple enough for an LLM to work out on its own, it’s probably not worth scratching one’s head over…
I don't think people should expect the AI to produce complicated code.
I think for the most part it's meant to help you "get past" all the generic code you usually write at the beginning of a project, generic functions you need in almost all systems, etc.
I don't agree: in the initial stages, solving problems without LLMs will give you good enough knowledge of the intricacies involved, and it helps you develop a structured approach to solving a problem!
And you get less tired. I can complete more work because I'm not always getting stuck in minutia. I can focus on architecture, structure, and refactoring instead of line-by-line writing of code.
I'm not saying that I don't like writing code. I'm just saying that doing a lot of it can be mentally exhausting. Sometimes I'd just prefer to ship feature-complete stuff on-time and on-budget, then go back to my kids and wife without feeling like my brain is mush.
I think you are still thinking just on another level. E.g. you go on a walk, you fantasize about everything you are going to do, and it builds up in your head, then you come back, it is all in your head and AI will help you get it out quickly, but you have already solved the problem for yourself and so you are also able to validate quickly what the AI does.
yes this!!!
Whenever I write a prompt, I tend to divide it into smaller prompts, and in this process, my brain thinks of multiple ways to solve the problem. So yes, it's not limiting my thought process.
I didn't notice this thing until I read this.
Do they really solve the hard problems though? For me, the LLMs solve the low-level problems. Usually I need to figure out an algorithm, which is the actual problem, and finally give some pseudocode to the LLM, along with the surrounding code, so it can generate a solution that looks idiomatic.
In some cases, LLMs act as a Stack Overflow replacement for me, like „sort this with bubble sort, by property X“. I’d also ask it to write some test cases around that. I won’t import a bubble-sort library just for this, but I also don’t want to spend any more time than necessary implementing this for the nth time.
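That bubble-sort-by-property request, sketched with a test in Python ("property X" assumed here to be a dict key):

    def bubble_sort_by(items: list[dict], key: str) -> list[dict]:
        """Bubble sort a list of dicts by one property."""
        items = list(items)  # don't mutate the caller's list
        for end in range(len(items) - 1, 0, -1):
            for i in range(end):
                if items[i][key] > items[i + 1][key]:
                    items[i], items[i + 1] = items[i + 1], items[i]
        return items

    def test_bubble_sort_by_orders_by_key():
        data = [{"x": 3}, {"x": 1}, {"x": 2}]
        assert bubble_sort_by(data, "x") == [{"x": 1}, {"x": 2}, {"x": 3}]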
I don't find figuring out the syntax of a new language interesting. There's absolutely no fun in that. I know what I want to do and already understand the concepts behind it, that was the fun part to learn.
I do think that is a real risk, yes. I don't want to use LLMs as a crutch to guard against having to ever learn anything new, or having to implement something myself. There is such a thing as productive struggle which is a core part of learning.
That said, I think everyone can relate to wasting an awful lot of time on things that are not "interesting" from the perspective of the project you are working on. For example, I can't count the number of hours I've spent trying to get something specific to work in webpack, and there is no payoff because today the fashionable tool is vite and tomorrow it'll be something else. I still want to know my code inside and out, but writing a deploy script for it should not be something I need to spend time on. If I had a junior dev working for me for pennies a day, I would absolutely delegate that stuff to them.
For a lot of people the fun and rewarding part is actually building and shipping something useful to users, not solving complex puzzles / algorithmic challenges. If AI gets me in front of users faster, then I'm a happier builder.
Was going to ask a similar question. Where in the experience of Cursor do you feel like you're losing some of the agency of solving the harder problems, or is this something you take in mind while using it?
I’ve “only” been coding for 20 years, but it’s the tedious problems, not the actually technically hard problems that cursor solves. I don’t need to debug 5 edge cases any more to feel like I’ve truly done the work, I know I can do that, it’s just time spent. Cursor helps me get the boring and repetitive work out of coding. Now, don’t get me wrong, there was a time where I loved building something lower level line by line, but nowadays it’s very often a “been there, done that” type of thing for me.
If I need an RNG rolled to a standard distribution, I can either spend 5 minutes looking it up, learning how to import and use a library, and adding it to my code, or I can tell Cursor to do it for me.
Crap like that, 100 times a day.
"Walk through this array and pull out every element without an index field and add it to a new array called needsToBeIndexed, send them off to the indexing service, and log any failures to the log file as shown in the function above".
Cursor lets me think closer to the level of architecting software.
Sure, having deep knowledge of my language of choice is fun, and very much needed at times, but for the 40% or so of code that is the boring work of moving data around, Cursor helps a lot.
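As a rough illustration, the quoted prompt above expands to something like this (a Python sketch; the indexing service and log file are hypothetical stand-ins, and the original is presumably in another language):

    import logging

    logging.basicConfig(filename="indexing.log", level=logging.WARNING)

    def send_to_indexing_service(item: dict) -> None:
        """Stand-in for the real service call; assumed to raise on failure."""
        ...

    def index_missing(items: list[dict]) -> list[dict]:
        # "Pull out every element without an index field and add it to a
        # new array called needsToBeIndexed" ...
        needs_to_be_indexed = [item for item in items if "index" not in item]
        # ... "send them off to the indexing service, and log any failures".
        for item in needs_to_be_indexed:
            try:
                send_to_indexing_service(item)
            except Exception:
                logging.exception("indexing failed for %r", item)
        return needs_to_be_indexed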
It can't understand. That's not what LLMs do.
Copy and paste the code to the Claude website? Or use an extension? Or something else?
Done in 5 seconds.
Mo’ code mo’ problems.
I mean on the big picture level sure they can. Or in detail if it is something that they have good experience with. In many cases I get a visual of the whole code blocks, and then if I use copilot I can already predict what it is going to auto complete for me based on the context and then I can pretty much in a second know if it was right or wrong. Of course it is more so for the side projects since I know exactly what I want to do and so it feels most of the time it is having to just vomit all the code out. And I feel impatient, so copilot helps a lot with that.
* Figuring out how to do X in an API, e.g. "write method dl_file(url, file) to download file from url using requests in a streaming manner" (a sketch of the expected result follows below).
* Brainstorming which libraries/tools/approaches exist for a given task. Google can miss some; AI is a nice complement to Google.
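On the first bullet, here's a minimal sketch of what I'd expect back, using the real requests streaming API (the timeout and chunk size are my own choices, not part of the prompt):

```python
import requests

def dl_file(url: str, file: str, chunk_size: int = 8192) -> None:
    # stream=True keeps the whole body out of memory; write it chunk by chunk.
    with requests.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()  # fail loudly on HTTP errors
        with open(file, "wb") as f:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```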
I never understand arguments like this. I have no idea what the shortcut for this command is. I could learn this shortcut, sure, but tomorrow I’ll need something totally different. Surely people can see the value of having a single interface that can complete pretty much any small-to-medium-complexity data transformation. It feels like there’s some kind of purposeful gaslighting going on about this and I don’t really get the motive behind it.
(Just kidding. I’m just making fun of how AI maxis reply to such comments, but they do it more subtly.)
2. The last time I wrote boilerplate-heavy Java code, 15+ years ago, the IDE already generated most of it for me. Nowadays boilerplate comes in two forms for me: new project setup, for which I find it far quicker to use a template or just copy and gut an existing project (and it's not like I start new projects that often anyway), or new components that follow some structure, where AI might actually be useful, but I tend to just copy an existing one and gut it.
3. These aren't tasks I really trust AI for. I still attempt to use AI for them, but 9 times out of 10 I come away disappointed, and the other time I end up having to change a lot of it anyway.
I find a lot of value from AI, like you, asking it SO-style questions. I do also use it for code snippets, e.g. "do this in CSS". Its results for that are usually (but not always) reasonably good. I also use it for isolated helper functions ("write a function to flood-fill a grid where adjacent values match" was a recent one). The results for this range from a perfect solution on the first try to absolute trash. It's still overall faster than not having AI, though. And I use it A LOT for rubber ducking.
I find AI is a useful tool, but I find a lot of the positive stories to be overblown compared to my experience with it. I also stopped using code assistants and just keep a ChatGPT tab open. I sometimes use Claude, but its conversation-length limits turned me off.
Looking at the videos in OP, I find the parallelising task to be exactly the kind of tricky and tedious task that I don't trust AI to do, based on my experience with that kind of task and with the subtly buggy results AI has given me.
It's amazing how many naysayers there are about Cursor. There are many here and they obviously don't use Cursor. I know this because they point out pitfalls that Cursor barely runs into, and their criticism is not about Cursor, but about AI code in general.
Some examples:
"I tried to create a TODO app entirely with AI prompts" - Cursor doesn't work like that. It lets you take the wheel at any moment because it's embedded in your IDE.
"AI is only good for reformatting or boilerplate" - I copy over my boilerplate. I use Cursor for brand new features.
"Sonnet is same as old-timey google" - lol Google never generated code for you in your IDE, instantly, in the proper place (usually).
"the constantly changing suggested completions seem really distracting" - You don't need to use the suggested completions. I barely do. I mostly use the chat.
"IDEs like cursor make you feel less competent" - This is perhaps the strongest argument, since my quarrel is simply philosophical. If you're executing better, you're being more competent. But yes some muscles atrophy.
"My problem with all AI code assistants is usually the context" - In Cursor you can pass in the context or let it index/search your codebase for the context.
You all need to open your minds. I understand change is hard, but this change is WAY better.
Cursor is a tool, and like any tool you need to know how to use it. Start with the chat. Start by learning when/what context you need to pass into chat. Learn when Cmd+K is better. Learn when to use Composer.
I don't think you should be upset or worried that people aren't adopting these tools as you think they should. If the tool really lives up to its hype then the non-adopters will fall behind and, for example, be forced to switch to Cursive. This happened with IDEs (e.g. IntelliSense, jump to definition). It may happen with tools like Cursive.
I certainly don't feel this way, but if I'm proven wrong that's good.
Like OP, using Cursor has been a huge productivity boost for me. I maintain a few Postgres databases, work as a fullstack developer, and manage Kubernetes configs. When I use Cursor to write SQL tables or queries, it adopts my way of writing SQL. It analyzed (context) my database folder, and when I ask it to create a query, a function, or a table, the output is in my style. This blew me away when I first started with Cursor.
On to React/Next.js projects. In the same fashion, I have my way of writing components, fetching data, and now writing RSA. Cursor analyzed my src folder, and when asked to create components from scratch, the output was again similar to my style. I use raw CSS and class names, and what used to be the obstacle of naming has become trivial with Cursor ("add an appropriate class to this component with this styling"). Again, it analyzed all my CSS files and spits out CSS/classes in my writing/formatting style. And when working on large projects it's easy to forget the many, many components, packages, etc. that have already been integrated/written. Again, Cursor comes out on top.
Am I a good developer or a bad developer? Don't know. Don't care. I'm cranking out features faster than I ever have in my decades of development. As has been said before, as a software engineer you spend more time reading code than writing it. The same applies to genAI. It turns out I can ask Cursor to analyze packages and spit out code, YAML configuration, and SQL, and it gets me 80% of the way compared with writing from scratch. Heck, if I need types to get the full client/server type-completion experience, it does that too! I have removed many dependencies (tailwind, tRPC, react query, prisma, to name a few) because Cursor has helped me overcome the obstacles these tools assisted with (and I still have TypeScript code hints in all my function calls!).
All in all, Cursor has made a huge difference for me. When colleagues ask me to help them optimize SQL, I ask Cursor to help out. When colleagues ask me to write generic types for their components, I ask Cursor to help out. Whether it's Cursor or some other tool, integrating AI with the IDE has been a boon for me.
Correct. Because they know they need to use the correct tools for the job.
> If the tool really lives up to its hype then the non-adopters will fall behind
This is already happening. I'm able to out-deploy many of my competitors because I'm using Cursor.
Have you actually spent much time with Cursor? The comparison to "Jump to definition" is pretty bad. You also misspelled its name twice.
OK, I understand. Maybe they can't get much use out of them, and that's fine. But why they always insist that the tools don't work for everyone is something I can't make any sense of.
I stopped arguing online about this though. If they don't want to use LLMs that's fine too. Others (we) are taking their business.
After trying Cursor, I'd say that if I were a JetBrains dev I'd be very worried. It's a true paradigm shift. It feels like JetBrains' competitive edge over other editors/IDEs mostly vanished overnight.
Of course JetBrains has its own AI-based solution, and I'm sure they'll add more. But I think what JetBrains excels at -- the understanding of semantics -- is no longer that important for an IDE.
> I don't use it because I already use JetBrains (Pycharm, mostly). Hard to see any value add of Cursor over that.
lol
When people complain about LLMs hallucinating results, that doesn't really apply because it is either guessing wrong on the autocomplete (in which case I just keep typing) or it doesn't instantly point out the bug, in which case I look at the code or jump to Google.
It reminds me of how blackberry users insisted physical keyboards were necessary and smartphone touchscreen users were deluded.
On the other hand, at another company, where the NDAs are stronger and more one-sided, where there's a stronger culture of code silos ("who needs to know" governing read access to individual code repos, even for mundane things like web dashboards) and higher security in general, I expected nobody would be allowed to use these tools, yet I saw people talking about their Copilot and Cursor use openly on the company Slack.
There was someone sitting next to me using Cursor yesterday. I'd consider hiring them, if they're interested, but there's no way they're going to want to join a company that forbids using AI tools that upload code being worked on.
So I don't think companies are particularly consistent about this at the moment.
(Perhaps Apple's Private Cloud Compute service, and whatever equivalents we get from other cloud vendors, will eventually make a difference to how companies see this stuff. We might also see some interesting developments with fully homomorphic encryption (FHE). That's very slow, but the highly symmetric tensor arithmetic used in ML has the potential to work better with FHE than general-purpose compute does.)
For a lighter-weight IDE I use Zed
I recently went from an idea for a casual word game (aka wordle) to a fully polished product in about 2h, which would have taken me 4 or 5 times that if I hadn't used Cursor. I estimate that 90% of the time was spent thinking about the product, directing the AI, and testing, and about 10% of the time actually coding.
Unless you work in R&D, I've got some bad news for you...
Using AI enabled me to spend more time thinking about game mechanics.
They didn't specifically mean they built a wordle clone, just a game like it. If they wanted just a wordle clone, they would've gotten one within a few minutes of using codegen tools.
Have you written about your experience anywhere in greater length?
I'm below average in a lot of programming languages and tools. Cursor is extremely useful there because I don't have to spend tens of minutes looking up APIs or language syntax.
On the other hand, in areas I know more about, I feel that I can still write better code than Cursor. This applies to general programming as well. So even if Cursor knows exactly how to write the syntax and which function to invoke, I often find the higher-level code structure it creates sub-optimal.
Overall, Cursor is an extremely useful tool. It will be interesting to see whether it will be able to crawl out of the primordial soup of averages.
It's a great intern, letting me focus on the few but important places that I add specific value. If this is all it ever does, that's still enormously valuable.
You can see that, in general, anything AI produces is pretty average.
But people who buy software don't care that the code behind it is average. As long as it works.
Whereas people who buy text, images and video do care.
I’ve also been logging every interaction with an LLM, along with the build’s exit status at every mtime of every language-mode file, and all the metadata: I can easily plot when I lean on the thing and when I come out ahead, and I can tag diffs that broke CI. I’m measuring it.
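A rough sketch of that kind of instrumentation in Python (the `make build` command, the CSV schema, and the function name are placeholders I made up, not the poster's actual setup):

```python
import csv
import subprocess
import time
from pathlib import Path

LOG = Path("llm_metrics.csv")

def record_build(source: Path, llm_assisted: bool) -> int:
    """Run the build and log timestamp, file, mtime, LLM use, and exit status."""
    status = subprocess.run(["make", "build"]).returncode  # placeholder build command
    with LOG.open("a", newline="") as f:
        csv.writer(f).writerow(
            [time.time(), str(source), source.stat().st_mtime, llm_assisted, status]
        )
    return status
```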
My conclusion is that I value LLMs for coding in exactly the same way the kids do: you have to break Google in order for me to give a fuck about Sonnet.
LLMs seem like magic unless you remember when search worked.
Yikes. I didn’t even think about this, but it’s true.
I’m looking for the kinds of answers that Google used to surface from stack overflow
Fully switched over more than a year ago and never looked back.
I'll spend any amount of tokens to build a database of how 4o behaves, correlated with everything emacs knows, which is everything. I'm putting down tens of megabytes a day on exactly when they did whatever thing.
They didn't get ahead by selling you the same thing they use themselves; if they had, Continue would be at parity.
As long as one prompts it properly with sufficient context, reviews the generated code, and asks it to revise as needed, the productivity boost is significant in my experience.
That will be scary. Until then, it's basically just a better autocomplete for any competent developer.
If there’s no computation then there’s no computer science. It may be the case that Excel with attitude was a bubble in hiring.
But Sonnet and 4o both suck at working out why CUDA isn't detected on this SkyPilot resource.
Personally, I find this kind of workflow totally counter-productive. My own programming workflow is ~90% mental work / doing sketches with pen & paper, and ~10% writing the code. When I do sit down to write the code, I know already what I want to write, don't need suggestions.
This seems physiologically unlikely.
I've been in compilers, storage, and data backends for 15ish years, and had to do a little project that required recording audio clips in a browser and sending them over a websocket. Cursor helped me do it in about 5 minutes, while it would've taken at least 30 min of googling to find the relevant keywords like MediaStream and MediaRecorder, learn enough to whip something up, fail, then try to fix it until it worked.
Then I had to switch to streaming audio in near-realtime... here it wasn't as good: it tried sending segments of MediaRecorder audio which are not suitable for streaming (because of media file headers and stuff). But a bit of Googling, finding out about Web Audio APIs and Audio Worklet, and a bit of prompting, and it basically wrote something that almost worked. Sure it had some concurrency bugs like reading from the same buffer that it's overwriting in another thread. But that's why we're checking the generated code, right?
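That particular class of bug, one thread reading a buffer while another is still overwriting it, has a standard fix: hand each chunk off whole through a thread-safe queue. A minimal Python sketch (the `audio_source` iterable and `send` callable are hypothetical stand-ins for the actual streaming pieces):

```python
import queue

# Each put() hands over a complete, private copy of a chunk, so the consumer
# never reads a buffer that the producer is still overwriting.
chunks: queue.Queue = queue.Queue()

def producer(audio_source):
    # audio_source: hypothetical iterable yielding raw audio buffers
    for buf in audio_source:
        chunks.put(bytes(buf))  # copy, so the source buffer can be reused safely

def consumer(send):
    # send: hypothetical callable, e.g. a websocket send function
    while True:
        send(chunks.get())  # blocks until a whole chunk is available
```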
But eventually you get to a point where you've solved variations of the problem hundreds of times before, and it's just hours of time being burnt away writing it again with small adjustments.
It's like getting into making physical things with only a screwdriver and a hammer. Working with your hands on those little projects is fun. Then eventually you level up your skills and realize making massive things is much easier with a power drill and some automated equipment, and gives you time to focus on the design and intricacies of far more complicated projects. Though there are always those times where you just want to spend a weekend fiddling with basics for fun.
The rest is then general design and architecture, which LLMs really don't help much with. What they are really good for is getting an idea of the possible options in spaces where you have little experience, or quickly explaining and summarizing specific solutions and their pros and cons. But I tried to make it pick a solution based on the constraints, and even with many tries and careful descriptions, the results were really bad.
I think for the most part it's meant to help you "get past" all the generic code you usually write at the beginning of a project, generic functions you need in almost all systems, etc.
I'm not saying that I don't like writing code. I'm just saying that doing a lot of it can be mentally exhausting. Sometimes I'd just prefer to ship feature-complete stuff on-time and on-budget, then go back to my kids and wife without feeling like my brain is mush.
If you don't know how to handle a bike, the ebike won't help you in these situations. (You might even get yourself in a tricky spot).
But if you know how to ride, it can be really fun.
Same with code. If you know how to code it can make you much more productive. If you don't know how to code, you get into tricky spots...
In some cases, LLMs act as a Stack Overflow replacement for me, like "sort this with bubble sort, by property X". I'd also ask it to write some test cases around that. I won't import a bubble-sort library just for this, but I also don't want to spend any more time than necessary implementing this for the nth time.
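As a concrete instance, here's roughly what I mean, with one test case (the function and property names are mine, not from any real prompt):

```python
def bubble_sort_by(items: list[dict], prop: str) -> list[dict]:
    """Sort a list of dicts in place by the given property, bubble-sort style."""
    n = len(items)
    for i in range(n):
        # After each outer pass, the largest remaining value has bubbled to the end.
        for j in range(n - 1 - i):
            if items[j][prop] > items[j + 1][prop]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

def test_bubble_sort_by():
    data = [{"x": 3}, {"x": 1}, {"x": 2}]
    assert bubble_sort_by(data, "x") == [{"x": 1}, {"x": 2}, {"x": 3}]
```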
That said, I think everyone can relate to wasting an awful lot of time on things that are not "interesting" from the perspective of the project you are working on. For example, I can't count the number of hours I've spent trying to get something specific to work in webpack, and there is no payoff because today the fashionable tool is vite and tomorrow it'll be something else. I still want to know my code inside and out, but writing a deploy script for it should not be something I need to spend time on. If I had a junior dev working for me for pennies a day, I would absolutely delegate that stuff to them.
Crap like that, 100 times a day.
"Walk through this array and pull out every element without an index field and add it to a new array called needsToBeIndexed, send them off to the indexing service, and log any failures to the log file as shown in the function above".
Cursor lets me think closer to the level of architecting software.
Sure, having deep knowledge of my language of choice is fun, and very needed at times, but for the 40% or so of code that is the boring work of moving data around, Cursor helps a lot.