For CGI endpoints that are called infrequently—daily, weekly, or even monthly—the script startup delay is negligible. Even a persistent process waiting to handle a request can have its memory swapped to disk by the OS under varying server load. When a request finally arrives, the delay from swapping that memory back in can negate any theoretical advantage over a cold-starting CGI script.
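To put a rough number on that startup delay, here is a minimal sketch that times how long a fresh process takes to launch and exit. It assumes a Python interpreter stands in for the CGI script. The helper name `cold_start_seconds` and the one-line script are hypothetical, made up for illustration. The result will vary a lot by machine and load, so treat it as a ballpark rather than a benchmark.

```python
import subprocess
import sys
import time


def cold_start_seconds(argv, runs=5):
    """Return the average wall-clock time to launch argv and wait for it to exit."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        # Each run spawns a brand-new process, as a web server would for a CGI request.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        total += time.perf_counter() - start
    return total / runs


if __name__ == "__main__":
    # Hypothetical stand-in for a CGI script: a fresh interpreter that prints
    # a minimal header block and body, then exits.
    script = [sys.executable, "-c", "print('Content-Type: text/plain\\n\\nok')"]
    print(f"average cold start: {cold_start_seconds(script) * 1000:.1f} ms")
```

For an endpoint hit once a day, whatever this prints is the entire cost being argued about.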
I've been impressed with the memory feature, but I can see how it could be dangerous, especially if you use it for both hobby projects and serious projects. I use ChatGPT to generate my weekly status reports for work, which will come in handy when I do my annual review or update my resume. But I would not want that context to spill over into a personal project.