Converting Codebases with LLMs

> There is a recurring need in the software world for teams to convert a codebase from one language to another.

Sounds more like a sales pitch than a reality. I have often seen developers excited to port code from one language to another, but only because it is an opportunity to learn something new, do something different for a change, and even rewrite old code.

What is the value if it is done automatically, nobody learns anything, and the code is just a transcript of the old one?

temporarely · 2 years ago

Back in the late 80s we were building automated Fortran to C converters. The client was in the aerospace field.

> What is the value if it is done automatically, nobody learns anything, and the code is just a transcript of the old one?

You may be shocked to learn that businesses using software have a different metric for the value of "code" than educating their (transient) code wranglers. The actual value of software is computational work. If a new language affords better tooling and availability of human resources, that is a win.

Closi · 2 years ago

Yes, I was at a company that moved an application from Cobol to Java for exactly that purpose - having a mission-critical application written in Cobol is way harder to maintain than having that exact same application in Java.

codr7 · 2 years ago

Extremely short term, yes.

In the longer perspective, you'll lose most good developers if you don't allow them to evolve and have some fun along the way. And without the developers, the source code is pretty much useless.

Humans are not machines.

simonw · 2 years ago

That was my initial instinct on reading that sentence too - I don't think converting from one language to another is actually very common.

But in this particular case I think they justified doing so: "Our team had a prototype written in the language, R, and wanted to convert this to our standard production tech stack, Golang and ReactJS."

As a Python programmer I tend not to worry about this, because Python is a good language for both prototyping and production - but I can absolutely see the need for this if you're prototyping with tools that you wouldn't want to run in production.

tbrownaw · 2 years ago

One benefit I've heard for using different languages for prototyping and production is that it helps you remember to rewrite things properly rather than just dumping prototype-quality code into prod.

Working around this by using tools that aren't exactly known for code quality in the first place seems like a bit of an odd choice.

disgruntledphd2 · 2 years ago

> "Our team had a prototype written in the language, R, and wanted to convert this to our standard production tech stack, Golang and ReactJS."

It's very hard for me to understand how this would work, unless the R code was very very simple.

Like, R is mostly used for stats, and Go doesn't have all of the stats libraries, so what did the LLM generate?

Maybe it was a pretty simple LoB app written in R (which would be pretty weird, even I as an R-head gave up on writing general purpose software in R some time ago) in which case it makes sense, or else the LLM generated lots and lots of boilerplate for matrix multiplication (I imagine any implementation of `model.matrix` would have been fun).

Very very strange to me, at least.

thiht · 2 years ago

> the code is just a transcript of the old one

That’s a very important point. Every rewrite I’ve been part of needed major architectural changes because of deep issues with the system. Switching the language was just a nice bonus.

john_the_writer · 2 years ago

This was what I was thinking. Moving something from Rails to Elixir as copy-pasta wouldn't give you the things Elixir is great at. You'd just get MVC, with no use of OTP or services.

When I was at uni I wrote an app for a Java course. The prof laughed and said, "You're a C++ programmer, aren't you?" My code smelled of it.

"A language that doesn't affect the way you think about programming, is not worth knowing." - Alan Perlis

singingfish · 2 years ago

Not that long ago I moved a 250k-line codebase from Oracle to Postgres. Yes, SQL embedded in strings and so on.

Towards the end of that process, ChatGPT helped me with that, and it was pretty valuable for some kinds of problems. I still had to watch it like a hawk and specify things really clearly to make sure it didn't go off the rails.

yarg · 2 years ago

There's a huge value in being able to automate conversion, especially on an active project with several teams working on features for several different clients (where downtime simply is not an option).

Having dealt with a similar problem, however, my advice is to stay away from AI and instead perform the conversion by manipulating source code ASTs.
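
As a minimal sketch of what an AST-based rewrite looks like, here is a Python example using the standard `ast` module. The rename rule (`old_api` to `new_api`) and the sample source are made up for illustration. A real cross-language conversion would also need a parser for the source language and a code generator for the target, which this does not attempt.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to one function name into calls to another."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Visit children first so nested calls are rewritten too.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            node.func = ast.copy_location(
                ast.Name(id=self.new_name, ctx=ast.Load()), node.func
            )
        return node


def rewrite(source: str, old_name: str, new_name: str) -> str:
    """Parse, transform, and regenerate source text (Python 3.9+ for unparse)."""
    tree = ast.parse(source)
    tree = RenameCall(old_name, new_name).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# Hypothetical input: the string literal that mentions old_api must stay as it is.
SAMPLE = 'result = old_api(old_api(1), "old_api")'
converted = rewrite(SAMPLE, "old_api", "new_api")
```

Because the transformer only touches `Call` nodes, the string literal is left alone, which a plain text search-and-replace would not guarantee. The output is also deterministic: the same input always produces the same rewrite.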

imvetri · 2 years ago

I don't think anyone in this community has solved doing this automatically.

The value it provides is for the business. If a tool can do it, there is no need to hire or keep an engineering team.

An engineering team has a running cost. Whereas if someone makes a tool and sells it at a price slightly lower than what is spent on the engineering team, doesn't that add value?

First, it's a tool that does it, so it is more reliable than a human.

If it is a sales pitch, someone will get it done, as there is an opportunity.

TechDebtDevin · 2 years ago

It's not reliable and LLMs cannot create anything novel.

Your hypothetical company is going to end up pulling a Crowdstrike with this method and then they definitely won't need an engineering budget!

nerdjon · 2 years ago

> The value it provides is for the business. If a tool can do it, there is no need to hire or keep an engineering team.

What exactly is the "value"? It worked before, so what is the purpose of the change?

> First, it's a tool that does it, so it is more reliable than a human.

Sorry, but are you new to LLMs? Have you seen the recent news? "Reliable" it is not.

If your pitch is that an LLM will just write all of your code in the first place, there really isn't any need to migrate the code to another language when by your logic the LLM could just manage its existing language. The logic here quickly breaks down and doesn't make any sense.

rossant · 2 years ago

It's not rare in academia to translate Matlab scientific libraries into Python.

nerdjon · 2 years ago

Right, I have never been in a situation where a rewrite in another language was considered without some other reason behind it.

Most of the time it starts with something else: we need to change some major functionality, we have an architectural issue, or something along those lines that will require a major rewrite in the first place. So the idea of maybe using a different language comes up.

Rewriting something in another language with identical functionality, just for the sake of using another language, isn't a normal exercise unless you have a CTO pushing for something unnecessarily.

Maybe, maybe I could buy saying we don't want to manage Java servers anymore or something along those lines. But even then, why break something that works?

This seems like such a bad idea: it is going to introduce so many bugs and require a ton of testing, for a minimal gain at best.

And then yeah, who is going to maintain it, given that no one actually wrote the code in the first place? Goodbye historical knowledge and productivity. Hope you don't find a critical bug as soon as you release it that needs to be fixed ASAP.

Don't do this; it's a seriously bad idea. And that assumes it somehow achieves 1:1 functionality, when by now we should be well aware that an LLM is going to make mistakes.

john_the_writer · 2 years ago

My friend wrote in Cobol until the day he retired. Every couple years management would spin up a project to replace the Cobol part. He and his team would consult. In the end he would just use the project to set his watch.

> a recurring need in the software world for teams to convert a codebase from one language to another

Really? I've only seen that twice in my career, and it was due to being written in the most obsolete tech ever.

I have the same comment for the "patterns" that GPT-bros seem to be stuck in all the time. What kind of software are they writing that needs 80% of duplicated/useless code, and 20% of business code? They should first read Refactoring by Martin Fowler, and try to avoid those mistakes in the future because it's bad to rely on an AI for what should be their job, i.e. engineering software.

> the database querying layer was quite verbose and greatly exceeded an LLM’s output token limit

No technical details as usual, only high-level stories. And how is it possible nowadays to have that kind of issue when most languages have their own SQL or REST library to do everything in, at most, 500 lines of code (if the code is duplicated)?

Last but not least, the main web site is a pretty empty page if JS is disabled. They should fix that with an LLM and write a blog post, that would be more interesting.

fhd2 · 2 years ago

That's a concern I have: The pain of writing boilerplate used to make people improve their architecture and frameworks. If the Java ecosystem hadn't been so painful in the early 2000s, would better languages and frameworks have gained traction? Would good refactoring practices have gained traction?

Sometimes refactoring doesn't even cut it, unfortunately. When stuck with a language and/or framework that simply requires lots of boilerplate, there's only two options: Migrate to something else or use/build code generation tools. I've done both with good success. Not sure I'd use a non-deterministic tool (like an LLM) for this, but since deterministic tools are harder to build, we might be looking at a future where a lot of working code is rewritten with automation that introduces subtle problems.

I'm optimistic though. There's always been a lot of terrible software somehow kept under control with high development/testing resources. And then there's always been carefully built good software. I suspect we'll continue to have both.

We'll probably have good software because some managers manage to hire good devs _and_ give them the right direction and support to do good work.

We'll probably have lots of bad software for the same reasons as in the past: Incompetent management, competent management pragmatically sacrificing software quality and/or maintainability, incompetent (or really just impatient/rushed) developers.

I don't think LLMs change the equation that much. Good devs will use them well (or perhaps not at all). Bad devs will use them badly. Good software can give startups an edge, bad (enough) software can bring down incumbents.

blowski · 2 years ago

I've found it to be more common in organisations with an immature microservices culture, where developers seem to think there are awards for the most languages used. At some point, sanity takes hold, and there is a process to standardise - involving lots of rewrites of small codebases.

joshuanapoli · 2 years ago

The JavaScript ecosystem historically had a lot of turnover. Probably there are a lot of applications that were repeatedly ported over the years: Ruby to JavaScript, to CoffeeScript, to Flow types (for React), to TypeScript.

I think that these language ports aren’t as disruptive as architecture changes (waffling on microservices), and they’re driven by availability of talent. Porting to follow the trend makes it easier and much more pleasant to onboard new developers. It usually has a practical benefit to users, because the latest tooling usually has a performance edge, but doesn’t support the old language.

JTyQZSnP3cQGa8B · 2 years ago

I forget "JS and the web" all the time because I've been actively avoiding it for the past 20 years. It happens in other environments but the web seems to encourage "following the trend" and that would make me crazy if I had to do this every day.

blowski · 2 years ago

Yes, good point. Even within React, there's been a big change from class components to functional components and hooks. I imagine LLMs could help with some of that.

onion2k · 2 years ago

I've seen it a lot. Mostly things like moving from PHP 3 to PHP 5, or Python 2 to Python 3, or React 12 to React 17. A language change doesn't have to be between completely different languages to be a pain.

ktzar · 2 years ago

I wonder how many subtle errors will make their way to the new codebase (decimal rounding, a library use where a parameter is ignored and there are no tests for it...) only to be found in production, and AI will be blamed.
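
To make the decimal rounding case concrete, here is a small sketch using Python's standard `decimal` module. The sample values are made up. The point is only that two common rounding modes disagree on ties, and a converter that silently swaps one for the other passes any test that has no tie in its inputs.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_even(value: str) -> Decimal:
    """Round to a whole number; ties go to the nearest even digit."""
    return Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)


def round_half_up(value: str) -> Decimal:
    """Round to a whole number; ties go away from zero."""
    return Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP)


# Ties are the only inputs where the two modes disagree.
ties = ["0.5", "1.5", "2.5"]
half_even = [round_half_even(v) for v in ties]  # 0, 2, 2
half_up = [round_half_up(v) for v in ties]      # 1, 2, 3

# Python's built-in round() also rounds ties to even.
builtin = [round(float(v)) for v in ties]       # 0, 2, 2
```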

Deukhoofd · 2 years ago

I did some converting with Copilot today. The answer is, quite a lot. It'd convert integer types wrong (whoops, lost an unsigned there, etc).
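
To illustrate why a dropped `unsigned` matters, here is a hedged sketch (not the code I was converting). Python integers are unbounded, so the 32-bit behavior is emulated with masks, and the values are invented.

```python
MASK_32 = 0xFFFFFFFF


def as_uint32(value: int) -> int:
    """Interpret the low 32 bits as an unsigned integer."""
    return value & MASK_32


def as_int32(value: int) -> int:
    """Interpret the low 32 bits as a two's-complement signed integer."""
    value &= MASK_32
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


# The same subtraction, read through each type.
difference = 3 - 5
unsigned_result = as_uint32(difference)  # wraps around to 4294967294
signed_result = as_int32(difference)     # stays at -2

# A bounds check that passes with the signed type and fails with the unsigned one.
signed_check = signed_result < 10
unsigned_check = unsigned_result < 10
```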

And then of course there were some parts of the code that dealt with gender, and Copilot just completely refused to do anything with that, because for some reason it's hardcoded to do so.

bloak · 2 years ago

That gender thing is interesting. Could you try renaming some of the variables and substituting words in the comments so that the code no longer obviously appears to be dealing with gender and see if Copilot behaves differently?

If it does behave differently, I'd find that a bit worrying because conversion of a correct program into a different programming language should not depend on how the variables are named or what's in the comments. For example, assuming this is a line from a program written in C that works "correctly", how should it be converted into Go or Rust or whatever?

    int product = a + b; // multiply the numbers

g-b-r · 2 years ago

We're already at https://threadreaderapp.com/thread/1718654143110512741.html ? xD

Tiberium · 2 years ago

I don't doubt that Copilot can make mistakes like this, but you should remember that it's optimized to be used by a lot of people, and for cheap. Models like Claude 3.5 Sonnet are vastly better than Copilot.

Kiro · 2 years ago

Probably less than if a human did it. Compared to my code, AI generated code is much more thorough and takes more edge cases into account. LLMs have no problem writing tedious safe-guards against stuff that lazy humans skip with the argument that it will probably never happen.

Ozzie_osman · 2 years ago

> I wonder how many subtle errors will make their way to the new codebase.

Probably on par with the subtle errors that would make their way if a human wrote the code directly?

bdhcuidbebe · 2 years ago

That is in no way probable.

Slyfox33 · 2 years ago

No?

flir · 2 years ago

Oh that's ok, I'll just have the chatbot write some tests too ;)

tuxracer · 2 years ago

> I wonder how many subtle errors will make their way to the new codebase (decimal rounding, a library use where a parameter is ignored and there are no tests for it...) only to be found in production

Yeah, because human developers never allow mistakes to make it to production. Never happens.

zcbenz · 2 years ago

A few months ago I ported ~15k lines of Python code (10k are tests) to TypeScript, using GPT-4. It cost me ~$70.

The python project is https://github.com/ml-explore/mlx and the converted project is https://github.com/frost-beta/node-mlx

I wrote a long prompt: https://github.com/frost-beta/node-mlx/blob/main/tests/promp...

The first result was almost always bad, but after manually modifying the assistant's answer, the following generations usually went much better.

newzisforsukas · 2 years ago

> Use () => instead of function() for defining functions.

> Use const when possible, but use let if the same name is reused in the same scope.

looks like some of that could have been handled with a linter autofixing afterwards.

$70 seems like a lot for 15,000 lines?

mmastrac · 2 years ago

In the absence of an AST based tool, that's probably an absolute minimum of 20-40 hours of dev time (likely more) at $100-200 hourly, no?

kordlessagain · 2 years ago

It's less than half a cent a line!
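
Spelling out the arithmetic, using only the figures quoted in this thread (the $70 and 15,000 lines from the original comment, and the 20-40 hours at $100-200 per hour estimate above):

```python
total_cost_usd = 70
lines_converted = 15_000

# Cost per converted line, in cents.
cents_per_line = total_cost_usd / lines_converted * 100  # about 0.47

# Range of the manual estimate quoted above.
manual_low_usd = 20 * 100    # 20 hours at $100 per hour
manual_high_usd = 40 * 200   # 40 hours at $200 per hour
```

This compares only the API bill against developer time. It leaves out the time spent writing the prompt and fixing the assistant's answers, which the original comment does not quantify.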

dsp_person · 2 years ago

For the cost, I'm curious what the breakdown is in terms of the specific GPT-4 model and context length?

What was the verification process like?

Also, any thoughts on transpilers? There's Brython for JavaScript, and some others like py2many and mypyc. And the approach in Oil shell: written in Python, translated to C++ with custom tools.

miguelspizza · 2 years ago

This is a perfect use case for LLMs at the moment. I wrote a script to update an Express code base to Hono. I got Claude to write a regex that would match the handler to the route, and called the Claude 3.5 API with an example conversion and some other relevant context.
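
For a rough idea of the route-matching step, here is a sketch in Python. It is not the actual regex from my script: the pattern, the `find_routes` helper, and the sample source are all assumptions, and it only handles simple registrations with a named handler.

```python
import re

# Matches calls such as app.get('/users/:id', getUser) or
# router.post("/users", users.create). Inline arrow-function handlers
# and middleware chains are deliberately not covered.
ROUTE_PATTERN = re.compile(
    r"""\b(?:app|router)\.(?P<method>get|post|put|patch|delete)\(\s*"""
    r"""(?P<quote>['"])(?P<path>[^'"]+)(?P=quote)\s*,\s*"""
    r"""(?P<handler>[A-Za-z_$][\w$.]*)\s*\)"""
)


def find_routes(source: str) -> list[tuple[str, str, str]]:
    """Return (method, path, handler) for each simple route registration."""
    return [
        (m.group("method"), m.group("path"), m.group("handler"))
        for m in ROUTE_PATTERN.finditer(source)
    ]


# Hypothetical Express-style source used only to exercise the pattern.
SAMPLE = """
app.get('/users/:id', getUser);
router.post("/users", users.create);
app.use(express.json());
"""
routes = find_routes(SAMPLE)
```

Each `(method, path, handler)` triple can then be paired with the handler's source and sent to the model together with the example conversion.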

With the right prompt, it produced extremely clean and workable code.

~20 controller files and over 100 route handlers were converted in about 20 minutes and 5 dollars.

The engineering cost of migrating code bases is trending to 0

thesz · 2 years ago

  > The engineering cost of migrating code bases is trending to 0

I work with a code base of >750K LOC of C++ that is 12+ years old and would like to migrate it to something fashionable like Futhark or Python. So, please, tell me more about your wonderful regular expression.

I’m not sure why you would want to migrate a C++ codebase to an interpreted language?

DarkContinent · 2 years ago

It's not clear to me from the article how Mantle was porting the build scripts, infrastructure config files, etc. across languages. Typically these files don't cleanly translate from one framework to another. Was this considered part of the 20% of the project left to human engineering effort?

largbae · 2 years ago

I wonder if LLM language conversions will lead to a consolidation of languages. Suppose that you could prototype in any language and autoconvert that resulting functionality to Rust or another language with the right runtime features, would that be an appealing dev model?

threecheese · 2 years ago

I have the same suspicion; the current ecosystem of computing is very much a product of human constraints, and it may end up being more cost-efficient to have a single standard be used by AI models rather than having them need to match every unique code+libraries+hardware combination that exists or will exist. How this affects the computing ecosystem worries me.

binary132 · 2 years ago

Rust doesn’t have runtime features.

rock_artist · 2 years ago

Recently I converted some app code from Python to Swift. I tried using Gemini and ChatGPT. The time I spent afterwards debugging it to fix the bugs they introduced made it not worth it.

IMHO, this can only work if you have very good test coverage, so you can run the tests against the converted code. Without it, this can easily go off the tracks.
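
One way to get that safety net is to run the old and the new implementation on the same inputs and compare results. Below is a minimal sketch in Python. The `find_mismatches` helper and the two `halve` functions are hypothetical, chosen to mimic a typical porting bug: Python's `//` rounds toward negative infinity, while integer division in Swift or C truncates toward zero.

```python
from typing import Any, Callable, Iterable


def find_mismatches(
    reference: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> list[tuple[Any, Any, Any]]:
    """Run both implementations on each input and collect disagreements."""
    mismatches = []
    for value in inputs:
        expected = reference(value)
        actual = candidate(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Hypothetical original: floor division rounds toward negative infinity.
def original_halve(n: int) -> int:
    return n // 2


# Hypothetical port: truncates toward zero, as the target language would.
def ported_halve(n: int) -> int:
    return int(n / 2)


# Each entry is (input, expected from original, actual from port).
mismatches = find_mismatches(original_halve, ported_halve, range(-3, 4))
```

Only the negative odd inputs disagree here, which is exactly the kind of case that is easy to miss without tests covering it.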