Suboptimal choice. According to AutoCodeBench, for equivalent problem complexity, LLMs generate correct Kotlin code ~70% of the time versus ~40% for Python, and Go scores lower than Python. Kotlin can be executed as a script, with a very fast compilation phase ahead of the evaluation phase, which further reduces the chance of mistakes because type errors surface before anything runs. I don't use tools anymore. I just let my LLMs output Kotlin script directly, together with DSLs tailored to the problem space, which reduces cognitive load for the machine. It works like a charm as a Claude Code replacement: it not only codes autonomously in any language, but also directly scripts DB, data science, Playwright, etc., while reducing context window bloat.
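To make the DSL point concrete, here is a minimal sketch of the kind of problem-specific DSL I mean. Everything in it (Row, Query, keepIf, limitTo, query) is a hypothetical name invented for illustration, not part of any library or of my actual setup. The idea is that the model only has to emit the short block inside query { ... }, and the compiler rejects anything that doesn't fit the DSL before it runs.

```kotlin
// Hypothetical mini-DSL over in-memory rows; all names are illustrative assumptions.
data class Row(val name: String, val score: Int)

class Query(private val rows: List<Row>) {
    private var predicate: (Row) -> Boolean = { true }
    private var limit: Int = Int.MAX_VALUE

    // Keep only rows matching the condition.
    fun keepIf(condition: (Row) -> Boolean) { predicate = condition }

    // Cap the number of returned rows.
    fun limitTo(n: Int) { limit = n }

    fun run(): List<Row> = rows.filter(predicate).take(limit)
}

// Entry point of the DSL: a lambda with receiver configures the query.
fun query(rows: List<Row>, block: Query.() -> Unit): List<Row> =
    Query(rows).apply(block).run()

fun main() {
    val rows = listOf(Row("a", 3), Row("b", 9), Row("c", 7))

    // This block is the only part the model would need to generate.
    val top = query(rows) {
        keepIf { it.score > 5 }
        limitTo(1)
    }
    println(top) // prints [Row(name=b, score=9)]
}
```

A typo such as limitTo("1") fails at compile time instead of halfway through a run, which is the property I'm relying on.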
My advice - embrace TDD. Work with AI on tests, not implementation - your implementation is disposable and can be regenerated, while tests fully specify your system through contracts. This is trickier for UI than for logic. Embracing architectures that allow testing the view model in isolation might help. In general, anything that reduces cognitive load at inference time is worth doing.
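A rough sketch of what testing the view model in isolation can look like, assuming a made-up login screen. LoginState, LoginViewModel, and the validation rules (email contains '@', password at least 8 characters) are assumptions for illustration only. The checks in main are the contract; the body of LoginViewModel is the disposable part that could be regenerated at will.

```kotlin
// Hypothetical view model with no UI framework types, so it runs in plain Kotlin.
data class LoginState(
    val email: String = "",
    val password: String = "",
    val canSubmit: Boolean = false,
)

class LoginViewModel {
    var state = LoginState()
        private set

    fun onEmailChanged(email: String) = update(state.copy(email = email))

    fun onPasswordChanged(password: String) = update(state.copy(password = password))

    // Assumed rule for the example: valid-looking email and a password of 8+ characters.
    private fun update(next: LoginState) {
        state = next.copy(canSubmit = '@' in next.email && next.password.length >= 8)
    }
}

fun main() {
    val vm = LoginViewModel()

    // Contract: submit is disabled until both fields are acceptable.
    check(!vm.state.canSubmit)
    vm.onEmailChanged("user@example.com")
    check(!vm.state.canSubmit)
    vm.onPasswordChanged("correct horse")
    check(vm.state.canSubmit)

    println("contract holds")
}
```

In a real project the same checks would live in a test framework, but the point stands without one: the UI layer only renders LoginState, so the logic can be specified and verified without touching a screen.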