There seems to be a strong "instrument everything" culture that, I think, misses the point. You want simple metrics (machine and service) for everything, but if your service gets an error every million requests or so, it might be overkill to trace every request. And, for the errors, you usually get a nice stack dump telling you where everything went wrong (and giving you a good idea of what was wrong).
At that point, and only at that point, I'd say it's worth TEMPORARILY adding increased logging and tracing. And yes, it's OK to add those and redeploy TO PRODUCTION.
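To make "temporarily" concrete, here's a rough sketch in Python using the standard `logging` module. The names `TemporaryDebugFilter` and `enable_debug_for` are made up for illustration, and it assumes your service logs through a plain `logging.Logger`; the idea is just that the extra DEBUG output switches itself off after a deadline, so nobody has to remember to redeploy to turn it down.

```python
import logging
import time


class TemporaryDebugFilter(logging.Filter):
    """Hypothetical helper: pass DEBUG records only until a deadline.

    Records at INFO and above are always passed through.
    """

    def __init__(self, deadline, clock=time.monotonic):
        super().__init__()
        self.deadline = deadline  # value of clock() after which DEBUG is dropped
        self.clock = clock

    def filter(self, record):
        if record.levelno >= logging.INFO:
            return True
        return self.clock() < self.deadline


def enable_debug_for(logger, seconds, clock=time.monotonic):
    """Turn on DEBUG logging for `logger`, expiring after `seconds`."""
    logger.setLevel(logging.DEBUG)
    debug_filter = TemporaryDebugFilter(clock() + seconds, clock)
    logger.addFilter(debug_filter)
    return debug_filter
```

Something like `enable_debug_for(logging.getLogger("payments"), 15 * 60)` (the logger name is just an example) would give you fifteen minutes of verbose output around the failing code path. Note the filter only applies to records logged directly on that logger, not ones propagated up from child loggers.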
The ideal setup is that you trace as much as you can for some given time frame; if your stack supports compression and tiered storage, it becomes cheaper.
It doesn't matter how hard you guys worked to be profitable; it's about building trust and a relationship with your audience.