Guillermo Ojeda
Cloudy Things: How to build on AWS

How to Use AWS X-Ray to Monitor and Trace an Event-Driven Architecture on AWS

Guillermo Ojeda
·Dec 20, 2022·

7 min read

An event-driven architecture is a design pattern in which events are the central component of the system. These events are typically generated by external sources and trigger a sequence of actions within the system. In an event-driven architecture, components don't directly invoke other components. Instead, components emit events with no knowledge of what other components might be subscribed to those events. This reduces coupling between components, allowing them to scale independently.

Challenges in Event-Driven Architectures

Managing an event-driven architecture can be quite a challenge due to its inherent complexity and distributed nature. With components spread across many services, tracking and debugging issues becomes much more difficult. What you need is proper tracing and monitoring.

Monitoring means collecting data and metrics about the system, so you can better understand its performance. This can include things like:

  • Tracking the number of events processed per second

  • Monitoring the response time of the system

  • Monitoring the overall health of the system

Tracing is the process of tracking a specific request or event as it travels through the system. This helps you understand how the system is processing a particular event, and identify bottlenecks and other issues.

Since event-driven architectures are complex and distributed, identifying the root cause of an issue becomes especially difficult. With no proper tracing mechanisms in place, this ranges from very hard to nearly impossible. I mean, what are you going to do? Combine logs from a dozen systems, hoping to match them for the same event based on timestamps? That can work for systems processing 1 request per minute, but when you're handling multiple requests per second, it's not going to get you anywhere.
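To make the contrast with timestamp matching concrete, here is a minimal sketch of what a shared trace ID buys you. The log records, component names, and the traceId field are all made up for illustration; they are not X-Ray's actual data format.

```javascript
// Hypothetical log records from two components. The traceId field is an
// assumption for illustration: one ID shared by everything a request touches.
const logs = [
  { component: 'order-processor', traceId: 't-1', timestamp: 1000, message: 'order received' },
  { component: 'order-processor', traceId: 't-2', timestamp: 1001, message: 'order received' },
  { component: 'email-sender', traceId: 't-2', timestamp: 1450, message: 'email sent' },
  { component: 'email-sender', traceId: 't-1', timestamp: 1460, message: 'email sent' },
];

// Group records by trace ID, then sort each group by time.
function groupByTrace(records) {
  const byTrace = new Map();
  for (const record of records) {
    if (!byTrace.has(record.traceId)) byTrace.set(record.traceId, []);
    byTrace.get(record.traceId).push(record);
  }
  for (const group of byTrace.values()) {
    group.sort((a, b) => a.timestamp - b.timestamp);
  }
  return byTrace;
}

const traces = groupByTrace(logs);
console.log(traces.get('t-1').map((r) => r.component));
// [ 'order-processor', 'email-sender' ]
```

Note that the two orders finish in the opposite order from how they started, which is exactly the situation where matching on timestamps alone would pair the wrong records.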

Monitoring an event-driven architecture is another problem altogether. Individual components can be monitored just fine, but a holistic view of the system is much harder to achieve. System-wide metrics such as time to order completion become a guessing game if you can't identify which order is being processed at every component.
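As a rough illustration of that last point, a metric like time to order completion is easy to compute once every event carries the order it belongs to. The event names and the orderId and at fields below are hypothetical.

```javascript
// Hypothetical events emitted by different components. The orderId field and
// the event type names are assumptions for illustration.
const events = [
  { orderId: 'A', type: 'OrderPlaced', at: 0 },
  { orderId: 'B', type: 'OrderPlaced', at: 200 },
  { orderId: 'A', type: 'EmailSent', at: 900 },
  { orderId: 'B', type: 'EmailSent', at: 700 },
];

// Returns a Map of orderId -> milliseconds between the start and end events.
function completionTimes(allEvents, startType, endType) {
  const started = new Map();
  for (const e of allEvents) {
    if (e.type === startType) started.set(e.orderId, e.at);
  }
  const durations = new Map();
  for (const e of allEvents) {
    if (e.type === endType && started.has(e.orderId)) {
      durations.set(e.orderId, e.at - started.get(e.orderId));
    }
  }
  return durations;
}

const durations = completionTimes(events, 'OrderPlaced', 'EmailSent');
console.log(Object.fromEntries(durations)); // { A: 900, B: 500 }
```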

This is where AWS X-Ray comes in.

What is AWS X-Ray

AWS X-Ray is a service that enables you to monitor and trace applications and microservices, including serverless applications. It provides detailed visibility into the request and event flow of the system, giving you the tools you need to troubleshoot issues in event-driven architectures.

With AWS X-Ray you can trace requests and events as they travel through the system, view performance metrics in real time, and generate reports. This is especially useful for event-driven architectures, where understanding how requests travel through the system is particularly difficult.

Let's consider an example event-driven architecture built on AWS using Node.js. We're gonna make a system that processes user orders. When a user places an order, it triggers an event that is picked up by an AWS Lambda function. This function then writes the order to a DynamoDB table and sends a message to an SNS topic. The SNS topic triggers another Lambda function, which sends a confirmation email to the user.
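Before adding any tracing, the first function might look roughly like this. It's a sketch under assumptions: the event shape, the table name, and the topic ARN are invented, and the dynamo and sns parameters are expected to behave like AWS SDK v2 clients (DynamoDB.DocumentClient and SNS). They're passed in so the function can be tried out with stand-ins.

```javascript
// Sketch of the order-processing function. `dynamo` and `sns` are assumed to
// behave like AWS SDK v2 clients; tableName and topicArn are hypothetical.
function createOrderHandler({ dynamo, sns, tableName, topicArn }) {
  return async function handler(event) {
    const order = { orderId: event.orderId, email: event.email, items: event.items };

    // 1. Write the order to the DynamoDB table.
    await dynamo.put({ TableName: tableName, Item: order }).promise();

    // 2. Publish the order to the SNS topic that the email function listens to.
    await sns.publish({ TopicArn: topicArn, Message: JSON.stringify(order) }).promise();

    return { status: 'accepted', orderId: order.orderId };
  };
}
```

In a real Lambda function you would export the result as the handler, passing in new AWS.DynamoDB.DocumentClient() and new AWS.SNS() from the aws-sdk package.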

But wait! Let's imagine now that there's a problem with this system. Most orders are being processed successfully, but sometimes a user is not receiving the confirmation email. Without proper monitoring and tracing, it can be very difficult to identify the root cause of this problem.

Let's add AWS X-Ray and see what happens.

How to set up AWS X-Ray

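First, we'll set up AWS X-Ray in our order-processing Lambda function. Besides the code changes below, the function needs active tracing enabled in its configuration, and its execution role must be allowed to send trace data to X-Ray.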

How to set up AWS X-Ray for Node.js Lambda functions

Install the AWS X-Ray SDK for Node.js:

npm install aws-xray-sdk

Require the AWS X-Ray SDK at the top of your Lambda function code:

const AWSXRay = require('aws-xray-sdk');

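Wrap the code you want to trace in an X-Ray subsegment. Lambda creates the segment for each invocation itself, so your code only adds subsegments to it:

AWSXRay.captureAsyncFunc('my-function-name', function(subsegment) {
  // Your code
  subsegment.close(); // Close the subsegment once the work is done
});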
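In an async handler it can be handy to get a promise back from captureAsyncFunc. Here's a minimal sketch of a helper that does that, assuming the function passed to captureAsyncFunc receives the new subsegment, as it does in the X-Ray SDK for Node.js. The helper name traced is made up, and the SDK is passed in as the xray parameter so the helper can be tried with a stand-in outside Lambda.

```javascript
// `xray` is expected to be the aws-xray-sdk module. `traced` is a hypothetical
// helper, not part of the SDK.
function traced(xray, name, work) {
  return new Promise((resolve, reject) => {
    xray.captureAsyncFunc(name, (subsegment) => {
      Promise.resolve()
        .then(() => work(subsegment))
        .then(
          (result) => {
            // The subsegment can be missing when no segment is available.
            if (subsegment) subsegment.close();
            resolve(result);
          },
          (error) => {
            // Passing the error to close() records it on the subsegment.
            if (subsegment) subsegment.close(error);
            reject(error);
          }
        );
    });
  });
}
```

Inside a handler this would read something like: const saved = await traced(AWSXRay, 'save-order', () => saveOrder(order)); where saveOrder is whatever async work you want to show up as its own step in the trace.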

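If you're using the AWS SDK (for example to write to our DynamoDB table), you need to wrap it so its calls are traced. With version 3 of the SDK that means wrapping each client with captureAWSv3Client; with version 2, used in this article, wrap the whole SDK with the captureAWS function: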

const AWS = AWSXRay.captureAWS(require('aws-sdk'));

If you are using HTTP requests, you can wrap the https package with the captureHTTPsGlobal function:

AWSXRay.captureHTTPsGlobal(require('https'));

Here's a complete example of a Lambda function that has been modified to use the AWS X-Ray SDK:

const AWSXRay = require('aws-xray-sdk');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));
AWSXRay.captureHTTPsGlobal(require('https'));

exports.handler = async (event) => {
  // Lambda owns the segment for this invocation; we add a subsegment to it
  const subsegment = AWSXRay.getSegment().addNewSubsegment('my-function-name');

  try {
    // Your code goes here
  } catch (error) {
    console.error(error);
    subsegment.addError(error); // Record the error on the trace
    throw error;
  } finally {
    subsegment.close();
  }
};

After we've added AWS X-Ray to our Node.js functions, the remaining pieces are the DynamoDB table and the SNS topic. The table needs no changes, while the topic needs tracing turned on. That can be done through the AWS Management Console or the AWS CLI.

How to set up AWS X-Ray for DynamoDB

DynamoDB has no X-Ray setting to switch on at the table level. Calls to the table are traced from the caller's side: because the function's AWS SDK is wrapped with captureAWS, every DynamoDB request is recorded as a subsegment of the function's trace, and the table shows up as a node in the X-Ray service map. If the table is missing from your traces, check that the code uses the wrapped AWS object rather than an unwrapped client.

How to set up AWS X-Ray for SNS

  1. Open the Amazon SNS console and choose "Topics".

  2. In the list of topics, select the SNS topic and choose "Edit".

  3. Expand the "Active tracing" section.

  4. Select "Use active tracing".

  5. Choose "Save changes".
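Tracing on the topic can also be switched on from code rather than the console. This is a sketch, assuming an AWS SDK v2 SNS client is passed in as sns. It sets the topic's TracingConfig attribute to Active through the SetTopicAttributes API; the function names and the topic ARN you pass in are your own.

```javascript
// Builds the parameters for the SNS SetTopicAttributes call that turns on
// active tracing for a topic.
function activeTracingParams(topicArn) {
  return {
    TopicArn: topicArn,
    AttributeName: 'TracingConfig',
    AttributeValue: 'Active',
  };
}

// `sns` is expected to be an AWS SDK v2 SNS client, e.g. new AWS.SNS().
async function enableActiveTracing(sns, topicArn) {
  await sns.setTopicAttributes(activeTracingParams(topicArn)).promise();
}
```

Keeping the parameters in their own function makes it easy to check what will be sent before running it against a real topic.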

Finally, we need to modify our email-sending function to use the X-Ray SDK. To do this, repeat the steps used for the first function.
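For reference, the email-sending function could end up looking something like this. It's a sketch with assumptions: the message body is taken to be a JSON-encoded order with email and orderId fields, sendEmail stands in for whatever actually sends the email, and the X-Ray SDK is passed in as xray so the handler can be tried with a stand-in.

```javascript
// Sketch of the email-sending function. `xray` is expected to be the
// aws-xray-sdk module; `sendEmail` is a hypothetical async function.
function createEmailHandler({ xray, sendEmail }) {
  return async function handler(event) {
    // Lambda delivers SNS messages as event.Records[i].Sns.Message.
    for (const record of event.Records) {
      const order = JSON.parse(record.Sns.Message);
      const subsegment = xray.getSegment().addNewSubsegment('send-confirmation-email');
      try {
        await sendEmail(order.email, order.orderId);
      } catch (error) {
        subsegment.addError(error); // Make the failure visible in the trace
        throw error;
      } finally {
        subsegment.close();
      }
    }
  };
}
```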

Once AWS X-Ray is set up in every component, we can use it to solve the problem of some users not receiving the confirmation email.

We can use the console to view a trace of a request that is experiencing problems and see exactly where the request is getting stuck. We might find that the message sent to the SNS topic is not reaching its intended destination. Or perhaps the second Lambda function is encountering an error when trying to send the confirmation email.
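The same search can be done from code. Here's a sketch that asks the X-Ray GetTraceSummaries API for recent traces where a given service had a fault. The xrayClient parameter is assumed to be an AWS SDK v2 X-Ray client (new AWS.XRay()), and the function names are made up.

```javascript
// Builds GetTraceSummaries parameters for traces in which the named service
// reported a fault during the last `minutesBack` minutes.
function faultedTraceQuery(serviceName, minutesBack, now = new Date()) {
  return {
    StartTime: new Date(now.getTime() - minutesBack * 60 * 1000),
    EndTime: now,
    FilterExpression: `service("${serviceName}") { fault }`,
  };
}

// `xrayClient` is expected to be an AWS SDK v2 X-Ray client.
// Returns the IDs of the matching traces (first page of results only).
async function findFaultedTraces(xrayClient, serviceName, minutesBack) {
  const params = faultedTraceQuery(serviceName, minutesBack);
  const result = await xrayClient.getTraceSummaries(params).promise();
  return result.TraceSummaries.map((summary) => summary.Id);
}
```

For our example you would pass the name of the email-sending function as serviceName, then open the returned trace IDs in the console.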

Once we know where the problem is, we can set out to fix it. We can modify the Lambda function to retry the request if it fails, or add additional error handling logic for uncaught errors.
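If the fix turns out to be retrying, a small helper along these lines is one way to do it. It's a generic sketch, not something the X-Ray SDK provides: the helper name, the number of attempts, and the delays are arbitrary choices.

```javascript
// Retries an async operation with exponential backoff.
// Rethrows the last error if every attempt fails.
async function withRetry(operation, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Wait 100 ms, 200 ms, 400 ms, ... between attempts by default.
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Because the error is rethrown after the last attempt, a failure that survives all retries still surfaces in the function's trace instead of being swallowed.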

AWS X-Ray Best Practices

  • Enable X-Ray for all relevant components: To get the most out of X-Ray, it's important to enable it for all relevant components of your system. In our example that means both Lambda functions, the AWS SDK clients that call DynamoDB, and the SNS topic.

  • Use the X-Ray console and API to view and analyze trace data: Use the console to view a visual representation of your trace data. Use the API to programmatically access and manipulate trace data, and automate tasks such as performance analysis or error reporting.

  • Use segments and subsegments to add context to your trace data: Use segments to represent the overall flow of a request through the system. Use subsegments to represent specific actions within that flow. By adding context to your trace data, you can more easily understand and troubleshoot issues that may be occurring.

  • Tune sampling to reduce the overhead of tracing: Tracing can add overhead to your system, particularly if you are instrumenting a large number of functions or if your functions are being invoked frequently. To keep this in check, X-Ray samples requests instead of recording all of them. By default it traces the first request each second plus 5% of any additional requests, and you can adjust the sampling rules to trace more or less. Tracing only a subset of requests can significantly reduce the overhead of tracing, as well as the costs.

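  • Use annotations and metadata to add context to your trace data: These are key-value pairs that you attach to segments and subsegments. Annotations are indexed, so you can search for traces by their values. Metadata is not indexed and is better suited to larger objects you only need when inspecting a trace. Use them to record details about your requests, such as user IDs or request IDs.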

  • Use the X-Ray SDK to add error handling to your Lambda functions: The X-Ray SDK provides a captureFunc method that you can use to capture errors and add them to trace data.
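In the X-Ray SDK for Node.js, the key-value pairs mentioned above are added with addAnnotation and addMetadata on a segment or subsegment. A minimal sketch, assuming an order object with orderId and items fields. The tagOrder helper is made up for illustration.

```javascript
// Attaches order details to an X-Ray subsegment. `tagOrder` is a hypothetical
// helper; the order fields are assumptions.
function tagOrder(subsegment, order) {
  // Annotations are indexed, so traces can later be filtered by orderId.
  subsegment.addAnnotation('orderId', order.orderId);
  // Metadata is stored with the trace but not indexed.
  subsegment.addMetadata('items', order.items);
}
```

With the order ID stored as an annotation, a single order can be looked up in the console with a filter expression such as annotation.orderId = "o-1".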

In conclusion, AWS X-Ray is an essential tool for monitoring and tracing event-driven architectures in AWS. As we've seen, event-driven architectures are distributed, and can get pretty complex. That makes it difficult to understand how requests and events travel through the system. AWS X-Ray provides detailed visibility into the request and event flow of the system. That way you can troubleshoot issues more easily, and better understand the performance of your system.

Keep in mind that AWS X-Ray is not just for Lambda functions. It also integrates with AWS services such as SNS and DynamoDB. This enables you to monitor and trace your whole event-driven architecture on AWS.

If you're working with an event-driven architecture on AWS, give AWS X-Ray a try. Don't let the complexity and distributed nature of event-driven architectures hold you back.


Thanks for reading!

If you're interested in building on AWS, check out my newsletter: Simple AWS.

It's free, runs every Monday, and tackles one use case at a time, with all the best practices you need.

If you want to know more about me, visit my website, www.guilleojeda.com