That is an even lower level API that lets you manipulate the bytecode as a byte array. You still need to parse it to do anything useful, hence libraries like ASM. And if you want to compile more code at runtime (or generate bytecode), you'll need some way to do that.
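For example, a minimal ASM pass that just lists the methods in a class file looks roughly like this (a sketch; the bytes are read from disk here but could equally come from a jar entry or an agent):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import org.objectweb.asm.ClassReader;
    import org.objectweb.asm.ClassVisitor;
    import org.objectweb.asm.MethodVisitor;
    import org.objectweb.asm.Opcodes;

    public class ListMethods {
        public static void main(String[] args) throws Exception {
            byte[] bytes = Files.readAllBytes(Path.of(args[0]));   // path to a .class file
            new ClassReader(bytes).accept(new ClassVisitor(Opcodes.ASM9) {
                @Override
                public MethodVisitor visitMethod(int access, String name, String descriptor,
                                                 String signature, String[] exceptions) {
                    System.out.println(name + descriptor);          // e.g. "main([Ljava/lang/String;)V"
                    return null;                                    // don't descend into the method body
                }
            }, 0);
        }
    }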
It is complete, and I’ve found it extremely usable when writing code to trawl over a large number of class files. Looks like it should be good for code generation as well but I haven’t used that yet.
Reminds me of a side project I did when first starting CS! The Java byte code specification is absolutely approachable and if you've never looked at it before I recommend it (although this project says you can still use it without that knowledge)
It seems like Micronaut has been able to avoid runtime bytecode generation by doing everything at compile-time. I wonder if there are things you can't do the Micronaut way.
- There are how many computer architectures? A compile-once-run-anywhere binary looks closer to shipping a fancy interpreter with your code than shipping a compiled project. Runtime bytecode generation is one technique for making that fast.
- More generally, anything you don't know till runtime generates a huge amount of bloat if you handle it at compile-time. Imagine, e.g., a UI for dragging and dropping ML components to create an architecture. For as much compute as you're about to pour into training, even for very simple problems, it's worth something that looks like a compilation pass to appropriately fuse everything together. You could probably get away with literally shipping a compiler, but bytecode generation is a reasonable solution too.
- Some things are literally impossible at compile-time without boxing and other overhead. E.g., once upon a time I made a zero-cost-abstraction library allowing you to specify an ML computational graph using the type system (most useful for problems where you're not just doing giant matmuls all day). It was in a language where mutually recursive generics are lazily generated, so you're able to express arbitrary nth derivatives still in the type system, still with zero overhead. What you can't do though is create a runtime program capable of creating arbitrary derivatives; there must be an upper bound for any finite-sized binary (for sufficiently complex starting functions) -- you could cap it at 2nd derivatives or 10th or whatever, but there would have to be a cap. If you move that to runtime though then you can have your cake and eat it too, less the cost of compiling (i.e., bytecode generation) at runtime.
Etc. It's a tradeoff between binary size (which might have to be infinite in the compiled case) and runtime overhead (having to "compile" for each new kind of input you find).
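To make that last point concrete, the runtime half of the trick is tiny in Java; all the real work is in whatever generator produces the bytes (ASM, ByteBuddy, the JDK's ClassFile API, ...). A minimal sketch, with the class name made up and assuming the generated class implements Runnable:

    // Turns runtime-generated bytecode into a live, JIT-compilable class.
    final class ByteArrayLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            // name must match the class name encoded in the bytes
            return defineClass(name, bytes, 0, bytes.length);
        }

        // bytes = output of your generator for the graph/config you only
        // learned about at runtime; assumed here to implement Runnable.
        static Runnable loadKernel(byte[] bytes) throws ReflectiveOperationException {
            Class<?> cls = new ByteArrayLoader().define("GeneratedKernel", bytes);
            return (Runnable) cls.getDeclaredConstructor().newInstance();
        }
    }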
I haven't used Micronaut specifically, but I remember using Quarkus when it was rather new. It also does a lot at compile-time compared to, say, Spring. The one big disadvantage I noticed was that it's hard to eject if you need to defer something to runtime for some reason. Don't know if it's still an issue, but that's really the only disadvantage I remember.
> The better question is why use Java for anything these days.
Java (the language) is pretty much "C for the JVM." By that, I mean frameworks/libraries intended for maximum potential use in languages running on the JVM (such as Kotlin, Scala, and of course Java) all support Java (the language) interoperability. Many written in alternate languages targeting the JVM, such as Akka[0], typically have some degree of Java (the language) support as well.
While I prefer to program in one of the alternate programming languages targeting the JVM, I understand why many OSS projects are implemented in Java (the language) for the reasons outlined above.

0 - https://github.com/akka/akka
The problem is, if you are trying to optimize for the JVM, you are already down the wrong path. The JVM is useful in a very small niche: when you want something faster than Python/Node but still want cross-platform support and somewhat rapid development. The cases where this applies are rare.
It may allow closer access to the JVM, but the entire ecosystem is a colossal mess. Requiring main() to live inside a wrapping class is pretty dumb, standard stuff like Lombok hacks the AST (not to mention that annotation processors in general work by printing code strings to a file), and the dependency injection frameworks are very much separated from the actual processing, given how much they do in the background.
And then there is the whole Apache foundation, whose software is widely used as a de facto standard. The same foundation where someone wrote code that allows log statements to pull arbitrary code from the internet and execute it, and that change made its way past multiple pairs of eyes into production without a single person realizing how crazy it is.
If you want speed, write stuff in C/Rust/clean C++ (no templates, no C-style memory access, etc.). If you want to be efficient, write stuff in Python/Node.
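To be concrete about the annotation-processor point above: "printing code strings to a file" is literally the model the standard javax.annotation.processing API gives you. A minimal sketch, with a made-up annotation name:

    import java.io.IOException;
    import java.io.Writer;
    import java.util.Set;
    import javax.annotation.processing.AbstractProcessor;
    import javax.annotation.processing.RoundEnvironment;
    import javax.annotation.processing.SupportedAnnotationTypes;
    import javax.annotation.processing.SupportedSourceVersion;
    import javax.lang.model.SourceVersion;
    import javax.lang.model.element.TypeElement;
    import javax.tools.Diagnostic;
    import javax.tools.JavaFileObject;

    @SupportedAnnotationTypes("com.example.GenerateStub")   // hypothetical annotation
    @SupportedSourceVersion(SourceVersion.RELEASE_17)
    public final class StubProcessor extends AbstractProcessor {
        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
            if (roundEnv.processingOver() || annotations.isEmpty()) {
                return false;                                // nothing to do this round
            }
            try {
                // The "generated source" is just a string pushed through the Filer.
                JavaFileObject file = processingEnv.getFiler()
                        .createSourceFile("com.example.GeneratedStub");
                try (Writer w = file.openWriter()) {
                    w.write("package com.example;\npublic final class GeneratedStub {}\n");
                }
            } catch (IOException e) {
                processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, e.toString());
            }
            return false;
        }
    }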
Kotlin is fatter, the compiler is slower, and code completion is slow as hell on large projects. But other than for building small applications, there's really no reason not to use Kotlin, except that you need to actually learn the language or else you're going to end up with a very, very slow codebase where opening a file and waiting for syntax highlighting takes 2-3 seconds and autocomplete is just painfully slow.
A recent good reason for using Java is that frontier LLMs are trained with very large amounts of high quality enterprise Java source code. Claude Code for example loves Java and its static type system.
I constrain my LLM-generated Java code to only static methods of 20 LOC or less, and limit data types to those that are JSON compatible. Both of these lead to more reliable code and data that Claude Code fully understands and generates.
I am preparing to auto-generate an agent-based application that might reach 1.5 million Java LOC. Hard to imagine accomplishing that with JavaScript or Python or C++.
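As a made-up illustration of what that constraint looks like in practice: static, short, and only JSON-shaped types (String, Number, Boolean, Map, List):

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public final class OrderTotals {
        // Sums order amounts per customer. Inputs and outputs are plain
        // JSON-compatible shapes, so the model never has to reason about
        // custom domain classes.
        public static Map<String, Object> totalByCustomer(List<Map<String, Object>> orders) {
            Map<String, Object> totals = new LinkedHashMap<>();
            for (Map<String, Object> order : orders) {
                String customer = (String) order.get("customer");
                double amount = ((Number) order.get("amount")).doubleValue();
                double prev = totals.containsKey(customer)
                        ? ((Number) totals.get(customer)).doubleValue() : 0.0;
                totals.put(customer, prev + amount);
            }
            return totals;
        }
    }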
https://openjdk.org/jeps/484
bytebuddy predates it by at least a decade.
This came to be because Oracle noticed that everyone, including themselves, was depending on ASM, so the JEP was born.
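For reference, generating a hello-world with the new java.lang.classfile API (JEP 484, finalized in JDK 24) looks roughly like this; it's from memory, so treat the exact method names as approximate:

    import java.lang.classfile.ClassFile;
    import java.lang.constant.ClassDesc;
    import java.lang.constant.MethodTypeDesc;
    import static java.lang.constant.ConstantDescs.*;

    ClassDesc system = ClassDesc.of("java.lang.System");
    ClassDesc printStream = ClassDesc.of("java.io.PrintStream");

    byte[] bytes = ClassFile.of().build(ClassDesc.of("Hello"), clb -> clb
        .withFlags(ClassFile.ACC_PUBLIC)
        .withMethodBody("main",
            MethodTypeDesc.of(CD_void, CD_String.arrayType()),
            ClassFile.ACC_PUBLIC | ClassFile.ACC_STATIC,
            cob -> cob.getstatic(system, "out", printStream)
                      .ldc("Hello, world")
                      .invokevirtual(printStream, "println", MethodTypeDesc.of(CD_void, CD_String))
                      .return_()));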
https://asm.ow2.io/
https://github.com/square/javapoet
I've used it to do a mass refactoring of an annotation-based library. Worked pretty great.
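For anyone who hasn't seen JavaPoet, the generation side looks roughly like this (adapted from memory of the project's README hello-world):

    import com.squareup.javapoet.JavaFile;
    import com.squareup.javapoet.MethodSpec;
    import com.squareup.javapoet.TypeSpec;
    import javax.lang.model.element.Modifier;

    MethodSpec main = MethodSpec.methodBuilder("main")
        .addModifiers(Modifier.PUBLIC, Modifier.STATIC)
        .returns(void.class)
        .addParameter(String[].class, "args")
        .addStatement("$T.out.println($S)", System.class, "Hello, JavaPoet!")
        .build();

    TypeSpec helloWorld = TypeSpec.classBuilder("HelloWorld")
        .addModifiers(Modifier.PUBLIC, Modifier.FINAL)
        .addMethod(main)
        .build();

    JavaFile.builder("com.example.helloworld", helloWorld)
        .build()
        .writeTo(System.out);   // or a Filer/Path when run inside an annotation processor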
https://medium.com/@davethomas_9528/writing-hello-world-in-j...
Where did it get it from?