Has Cloud Functions solved their logging issues yet?
Last I checked, the lifetime of the function was bound to the lifetime of the request. So you have to fully flush your logs to their log ingress service before you respond to a request, otherwise GCP will suspend the function and eat your logs.
Originally, to my surprise, this left me with very few logs coming from my functions.
Once I discovered what was going on, I had to hold back the response to each request until logging had flushed to GCP's log ingress service, which increased latency.
Are you using a log service directly, or stdout? Stdout has no such issue. On Cloud Run (and therefore Cloud Functions 2nd gen) you get a SIGTERM before the container is shut down. OpenTelemetry is pretty easy to configure to flush on SIGTERM in the background.
My guess would be that their app isn’t running as PID 1. This can happen when services are launched from an entrypoint script in Docker without using exec. If the request is fulfilled and the system sends a SIGTERM to the container, Docker delivers it only to whatever is running as PID 1. If that’s their app, it’ll flush logs and quit. If PID 1 is a shell script that doesn’t forward the signal, the app never sees it, and once the grace period runs out Docker kills the container by force.
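One cheap way to check this theory is to have the app report its own PID at startup. A small Go sketch follows; `signalWarning` is a hypothetical helper, and the check only means something inside a container (on a normal host the process is never PID 1, and behind an init wrapper that forwards signals a non-1 PID is fine).

```go
package main

import (
	"fmt"
	"os"
)

// signalWarning returns a hint when pid is not 1, since in a container
// stop signals are delivered to PID 1 and may never reach other processes.
// It returns an empty string when pid is 1.
func signalWarning(pid int) string {
	if pid == 1 {
		return ""
	}
	return fmt.Sprintf(
		"running as PID %d, not 1: stop signals may not reach this process "+
			"(check that the entrypoint script uses exec)", pid)
}

func main() {
	if msg := signalWarning(os.Getpid()); msg != "" {
		fmt.Fprintln(os.Stderr, msg)
	}
}
```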
I built a service with a few Go-based functions that ran just fine for over a year with no log flushing issues.
I did actually have an issue where it was logging two blank lines, which was annoying. It definitely wasn't coming from my code and it got fixed at one point, then reappeared later on. I gave up trying to resolve it.
That said, their logging system is fantastic. I found it really easy to deal with.