Thundra has more advanced features such as
- Automated instrumentation for tracing (no code change is required)
- Aggregating metrics and logs with traces
- Async monitoring over CloudWatch logs with almost zero overhead
- Distributed tracing support by AWS X-Ray integration
- and others ...
This is cool! Strategically, are you worried about being built on top of an AWS-specific solution? The only reason I ask is that it's pretty inevitable that AWS will provide instrumentation/APM capabilities native to their offering.
Hi,
Thundra is not designed just for AWS Lambda. Even though we currently support it on the AWS Lambda platform, it will be available for other environments, such as containers, soon.
This would be super timely for me if it supported Python, but even without that it's interesting. Coming from the world of servers (virtual or physical) I find the logging latency and the lack of "observability" in general quite maddening in Lambda.
Two things I wonder about this:
1) The "no code changes" part implies it's adding another layer of code somewhere, which means another thing to trust and probably not be able to audit, which is not automatically OK outside the "move fast break things" world. Am I misreading it?
2) How can you add metrics with "no overhead?" Surely there must be some overhead... and then it would be nice to be able to measure how much.
For me personally, #1 is a real issue and #2 is just hypothetical, but I would expect it to be the other way around for a fast-growing startup.
You are right about the lack of observability tools for AWS Lambda. Thundra was born out of the same frustration on our side. These are very solid questions that we are happy to answer:
1) Automatic instrumentation is what makes "no code changes" possible. For Java, once you add Thundra to your environment variables, you can change the monitoring settings with annotations or additional environment variables, with no code change. For Node.js and Go, you simply wrap your functions with our agents. With manual instrumentation, on the other hand, you can also add code blocks to inspect your variables.
You can check our code here: https://github.com/thundra-io. We are open to sharing more if you have any further questions.
2) Zero overhead is another of our strong points. You need to switch to async monitoring for this, which means adding "our" Lambda to your environment (please see: https://github.com/thundra-io/serverless-plugin-thundra-moni...). That Lambda then sends your function's logs to us, so the overhead of sending and retrying failed log deliveries never happens inside your function. The only overhead comes from making your code gather more logs for our Lambda to read, and that is truly negligible.
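The cost tradeoff described above can be sketched as follows. This is an illustration of the idea, not Thundra's implementation: synchronous publishing pays for a network round trip inside the invocation, while async publishing only pays for a `console.log`; a separate collector Lambda subscribed to the CloudWatch log group does the actual shipping and retrying. The `MONITORING` prefix is an arbitrary choice:

```javascript
// Inside the monitored function: the only added cost per invocation is
// one stdout write, which CloudWatch Logs captures off the request path.
function emitMonitorRecord(record) {
  console.log("MONITORING " + JSON.stringify(record));
}

// Inside the collector Lambda: pick the monitoring records out of the
// forwarded log events (shipping them to the backend is omitted here).
function parseMonitorRecords(logEvents) {
  return logEvents
    .filter((e) => e.message.startsWith("MONITORING "))
    .map((e) => JSON.parse(e.message.slice("MONITORING ".length)));
}
```

Retries against the backend then happen in the collector, never inside the user's invocation.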
1) "No code changes" means that no code change is required from you, because Thundra makes the changes automatically. This approach is not new among existing APM and instrumentation tools, but it is new for AWS Lambda, which is what makes Thundra unique (and first) here at the moment. Additionally, Thundra can do method-level CPU profiling without instrumenting the user code. This is supported for Java at the moment, but we will support other languages (Node.js, Go, Python) soon.
2) "No overhead" means almost no overhead. As far as we have measured, it is just a few microseconds on average (note: microseconds, not milliseconds) to execute our interceptors and print trace data to the console. Compared to publishing monitoring data outside the container synchronously, publishing it over CloudWatch Logs asynchronously can be considered no overhead in practice.
I think it implies little overhead, and no overhead when your servers are busy serving actual requests. Think UDP-based communication and temporarily disabling or batching traces when the network or server is busy. That said, I don't know how Thundra does it.
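The "shed traces when busy" idea mentioned above can be sketched as a bounded buffer that drops new traces instead of blocking real requests. This is a generic illustration (the `TraceBuffer` class is hypothetical, and not how Thundra actually works):

```javascript
// Best-effort trace batching: buffer up to maxSize traces and flush
// them in one batch; when the buffer is full (the system is busy),
// drop new traces rather than slowing down the request path.
class TraceBuffer {
  constructor(maxSize, flushFn) {
    this.maxSize = maxSize;
    this.flushFn = flushFn; // e.g. a fire-and-forget UDP send
    this.buffer = [];
    this.dropped = 0;
  }

  add(trace) {
    if (this.buffer.length >= this.maxSize) {
      this.dropped += 1; // shed load instead of blocking
      return false;
    }
    this.buffer.push(trace);
    return true;
  }

  flush() {
    if (this.buffer.length === 0) return;
    this.flushFn(this.buffer);
    this.buffer = [];
  }
}
```

Dropping under load is the usual price of "zero overhead" observability: traces become best-effort samples rather than a complete record.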
If you're in a rush, IOpipe[1] already has Python support offering profiling, tracing, and other debugging and observability tools for AWS Lambda. Co-founder of IOpipe here, feel free to ask anything.
I've built a CloudWatch metrics stack that reads out of Lambda's logs. There's no other way to get data out of your Lambda for "free". X-Ray is an option but doesn't cover everything you'd want.
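The log-based metrics approach described above usually boils down to printing a parseable line to stdout (which lands in CloudWatch Logs at no extra publishing cost) and extracting values later with a metric filter or a reader Lambda. A minimal sketch; the `METRIC` prefix and field layout are arbitrary choices for illustration, not a standard format:

```javascript
// Emit side: one parseable line per metric, written to stdout.
function formatMetricLine(name, value, unit) {
  return `METRIC ${name}=${value} unit=${unit}`;
}

// Read side: a metric filter pattern or a reader Lambda recovers the
// structured values from the raw log lines; unrelated lines yield null.
function parseMetricLine(line) {
  const m = /^METRIC (\S+)=(\S+) unit=(\S+)$/.exec(line);
  if (!m) return null;
  return { name: m[1], value: Number(m[2]), unit: m[3] };
}
```

CloudWatch metric filters can turn such lines into real CloudWatch metrics without any code running in the hot path.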
I went through the trouble of setting up X-Ray for three of our Serverless projects (~50 Lambda functions) and can't say that we've gotten much value from it. The UI is... lacking.
You can select a method and see the logs generated for that method. The idea is to have all the information - request parameters, return values, logs, metrics - in one place to make it easier to troubleshoot problems.
You can watch it here: https://www.youtube.com/watch?v=bmAw24uquK0
We tried using a similar service, IOpipe, last year and really had trouble with their offering. https://www.iopipe.com/
It would be nice to see more companies providing this kind of observability service (especially for multiple cloud providers).
Check www.thundra.io out for more
I thought AWS integrated Lambda with X-Ray. Is that enough for your use cases, along with CloudWatch?
https://docs.aws.amazon.com/xray/latest/devguide/xray-servic...
Re: No overhead
I think it implies little overhead and no overhead if your servers is busy performing actual requests. Think UDP based communication and temporarily disabling or batching traces when the network or server is busy. That said, I don't know how Thundra does it.
Also see:
opencensus.io
https://research.google.com/pubs/pub36356.html
[1] - https://www.iopipe.com
This looks really interesting!
Also, the title here says "Python", but I don't see Python in the docs. I assume it is similar to Go, where you just import a library?
Can you try enabling 3rd-party cookies or whitelisting Auth0's domain in your browser?
Any info on the Python question? Thanks!
Now PLEASE tell me your logo will be a cat?? (Yes I can see that you seem to going for 'bird' sigh)