Thundra has more advanced features such as
- Automated instrumentation for tracing (no code change is required)
- Aggregating metrics and logs with traces
- Async monitoring over CloudWatch logs with almost zero overhead
- Distributed tracing support by AWS X-Ray integration
- and others ...
This is cool! Strategically, are you worried about being built on top of an AWS-specific solution? The only reason I ask is that it's pretty inevitable that AWS will provide instrumentation/APM capabilities native to their offering.
Hi,
Thundra is not designed just for AWS Lambda. Even though we currently support it on the AWS Lambda platform, it will be available for other environments, such as containers, soon.
This would be super timely for me if it supported Python, but even without that it's interesting. Coming from the world of servers (virtual or physical) I find the logging latency and the lack of "observability" in general quite maddening in Lambda.
Two things I wonder about this:
1) The "no code changes" part implies it's adding another layer of code somewhere, which means another thing to trust and probably not be able to audit, which is not automatically OK outside the "move fast break things" world. Am I misreading it?
2) How can you add metrics with "no overhead?" Surely there must be some overhead... and then it would be nice to be able to measure how much.
For me personally, #1 is a real issue and #2 is just hypothetical, but I would expect it to be the other way around for a fast-growing startup.
You are right about the lack of observability tools for AWS Lambda. Thundra was born out of the same frustration on our side. These are very solid questions that we are happy to answer:
1) Automatic instrumentation is what makes "no code changes" possible. For Java, once you add Thundra to your environment variables, you can change the monitoring settings with annotations or additional environment variables, with no code change. For Node.js and Go, you simply wrap your functions with our agents. With manual instrumentation, on the other hand, you can also add code blocks to inspect your variables.
You can check our code here: https://github.com/thundra-io. We are open to sharing more if you have any further questions.
2) Zero overhead is another of our strong points. You need to switch to async monitoring for this, which means adding "our" Lambda to your environment (please see: https://github.com/thundra-io/serverless-plugin-thundra-moni...). That Lambda then sends your function's logs to us, so the overhead of sending and retrying failed log deliveries never happens inside your function. The only overhead comes from making your code gather more logs for our Lambda to read, and that is truly negligible.
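The cost tradeoff described above can be sketched as follows. This is an illustration of the idea, not Thundra's implementation: synchronous publishing pays for a network round trip inside the invocation, while async publishing only pays for a `console.log`; a separate collector Lambda subscribed to the CloudWatch log group does the actual shipping and retrying. The `MONITORING` prefix is an arbitrary choice:

```javascript
// Inside the monitored function: the only added cost per invocation is
// one stdout write, which CloudWatch Logs captures off the request path.
function emitMonitorRecord(record) {
  console.log("MONITORING " + JSON.stringify(record));
}

// Inside the collector Lambda: pick the monitoring records out of the
// forwarded log events (shipping them to the backend is omitted here).
function parseMonitorRecords(logEvents) {
  return logEvents
    .filter((e) => e.message.startsWith("MONITORING "))
    .map((e) => JSON.parse(e.message.slice("MONITORING ".length)));
}
```

Retries against the backend then happen in the collector, never inside the user's invocation.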
1) "No code changes" means that no code change is required from you, because Thundra makes the changes automatically. This approach is not new among existing APM and instrumentation tools, but it is new for AWS Lambda, which is what makes Thundra unique (and first) here at the moment. Additionally, Thundra can do method-level CPU profiling without instrumenting the user code. This is supported for Java at the moment, but we will support other languages (Node.js, Go, Python) soon.
2) "No overhead" means almost no overhead. As far as we have measured, it is just a few microseconds on average (note: microseconds, not milliseconds) to execute our interceptors and print trace data to the console. Compared to publishing monitoring data outside the container synchronously, publishing it over CloudWatch Logs asynchronously can be considered no overhead in practice.
I think it implies little overhead, and no overhead when your servers are busy serving actual requests. Think UDP-based communication and temporarily disabling or batching traces when the network or server is busy. That said, I don't know how Thundra does it.
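The "shed traces when busy" idea mentioned above can be sketched as a bounded buffer that drops new traces instead of blocking real requests. This is a generic illustration (the `TraceBuffer` class is hypothetical, and not how Thundra actually works):

```javascript
// Best-effort trace batching: buffer up to maxSize traces and flush
// them in one batch; when the buffer is full (the system is busy),
// drop new traces rather than slowing down the request path.
class TraceBuffer {
  constructor(maxSize, flushFn) {
    this.maxSize = maxSize;
    this.flushFn = flushFn; // e.g. a fire-and-forget UDP send
    this.buffer = [];
    this.dropped = 0;
  }

  add(trace) {
    if (this.buffer.length >= this.maxSize) {
      this.dropped += 1; // shed load instead of blocking
      return false;
    }
    this.buffer.push(trace);
    return true;
  }

  flush() {
    if (this.buffer.length === 0) return;
    this.flushFn(this.buffer);
    this.buffer = [];
  }
}
```

Dropping under load is the usual price of "zero overhead" observability: traces become best-effort samples rather than a complete record.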
If you're in a rush, IOpipe[1] already has Python support offering profiling, tracing, and other debugging and observability tools for AWS Lambda. Co-founder of IOpipe here, feel free to ask anything.
I've built a CloudWatch metrics stack that reads out of Lambda's logs. There's no other way to get data out of your Lambda for "free". X-Ray is an option but doesn't cover everything you'd want.
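The log-based metrics approach described above usually boils down to printing a parseable line to stdout (which lands in CloudWatch Logs at no extra publishing cost) and extracting values later with a metric filter or a reader Lambda. A minimal sketch; the `METRIC` prefix and field layout are arbitrary choices for illustration, not a standard format:

```javascript
// Emit side: one parseable line per metric, written to stdout.
function formatMetricLine(name, value, unit) {
  return `METRIC ${name}=${value} unit=${unit}`;
}

// Read side: a metric filter pattern or a reader Lambda recovers the
// structured values from the raw log lines; unrelated lines yield null.
function parseMetricLine(line) {
  const m = /^METRIC (\S+)=(\S+) unit=(\S+)$/.exec(line);
  if (!m) return null;
  return { name: m[1], value: Number(m[2]), unit: m[3] };
}
```

CloudWatch metric filters can turn such lines into real CloudWatch metrics without any code running in the hot path.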
I went through the trouble of setting up X-Ray for three of our Serverless projects (~50 Lambda functions) and can't say that we've gotten much value from it. The UI is... lacking.
You can select a method and see the logs generated for that method. The idea is to have all the information - request parameters, return values, logs, metrics - in one place to make it easier to troubleshoot problems.
You can watch it here: https://www.youtube.com/watch?v=bmAw24uquK0
We tried using a similar service, IOpipe, last year and really had trouble with their offering. https://www.iopipe.com/
It would be nice to see more companies providing this kind of observability service (especially for multiple cloud providers).
Check www.thundra.io out for more
I thought AWS integrated Lambda with X-Ray. Is that enough for your use cases, along with CloudWatch?
https://docs.aws.amazon.com/xray/latest/devguide/xray-servic...
Re: No overhead
I think it implies little overhead and no overhead if your servers is busy performing actual requests. Think UDP based communication and temporarily disabling or batching traces when the network or server is busy. That said, I don't know how Thundra does it.
Also see:
opencensus.io
https://research.google.com/pubs/pub36356.html
[1] - https://www.iopipe.com
This looks really interesting!
Also, the title here says "Python", but I don't see Python in the docs. I assume it is similar to Go, where you just import a library?
Can you try enabling 3rd-party cookies or whitelisting Auth0's domain in your browser?
Any info on the Python question? Thanks!
Now PLEASE tell me your logo will be a cat?? (Yes I can see that you seem to going for 'bird' sigh)