Hacker Noon: Serverless App: AWS CloudTrail Log Analytics using Amazon Elasticsearch Service

  • Monday, 12 February 2018 12:36
In this article, I will talk about how you can build a serverless application using the AWS Serverless Application Model (SAM) to perform log analytics on AWS CloudTrail data using Amazon Elasticsearch Service. The application creates a CloudTrail trail, sets log delivery to an S3 bucket that it creates, and configures SNS notification whenever a CloudTrail log file is written to S3. The app also creates an Amazon Elasticsearch domain and an AWS Lambda function that is triggered by the SNS message: the function gets the S3 file location from the message, reads the contents of the S3 file, and writes the data to Elasticsearch for analytics.

Let's learn about AWS CloudTrail, Elasticsearch, Amazon Elasticsearch Service, AWS Lambda, and AWS SAM.

What is AWS CloudTrail?

AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. This event history simplifies security analysis, resource change tracking, and troubleshooting.

What is Elasticsearch?

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

What is Amazon Elasticsearch Service?

Amazon Elasticsearch Service makes it easy to deploy, secure, operate, and scale Elasticsearch for log analytics, full-text search, application monitoring, and more.
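Before going further, it helps to know the shape of the data CloudTrail produces. Each log file that CloudTrail delivers to S3 is a gzipped JSON document with a top-level "Records" array, one entry per API event. The sketch below parses such a document; the field names (eventID, eventName, eventTime, eventSource) are standard CloudTrail fields, but the values are made up for illustration:

```python
import json

# Illustrative sample of the JSON that CloudTrail delivers to S3 (gzipped).
# Field names are real CloudTrail fields; the values here are made up.
sample_log_file = json.dumps({
    "Records": [
        {
            "eventID": "11111111-2222-3333-4444-555555555555",
            "eventTime": "2018-02-12T12:36:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "GetBucketAcl",
            "awsRegion": "us-east-1",
        }
    ]
})

def extract_events(log_file_contents):
    """Return (eventID, eventName) pairs from a CloudTrail log file body."""
    records = json.loads(log_file_contents)["Records"]
    return [(r["eventID"], r["eventName"]) for r in records]

print(extract_events(sample_log_file))
# → [('11111111-2222-3333-4444-555555555555', 'GetBucketAcl')]
```

The Lambda function we build later walks exactly this "Records" array and indexes each record individually.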
Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch's easy-to-use APIs and real-time analytics capabilities alongside the availability, scalability, and security that production workloads require.

What is AWS Lambda?

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume; there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service, all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

What is the AWS Serverless Application Model?

The AWS Serverless Application Model (AWS SAM) prescribes rules for expressing serverless applications on AWS. The goal of AWS SAM is to define a standard application model for serverless applications.

Now let's look at how we can build a serverless app to perform log analytics on AWS CloudTrail data using Amazon Elasticsearch Service. This is the architecture of the CloudTrail Log Analytics serverless application:

[Architecture diagram: Serverless Application — CloudTrail Log Analytics using Elasticsearch]

An AWS SAM template is an AWS CloudFormation template. Before we look at the SAM template, let's package our AWS Lambda function. On your workstation, create a working folder for building the serverless application, then create a file called index.py for the AWS Lambda function:

```python
"""
This module reads the SNS message to get the S3 file location of the
CloudTrail log, and stores the log records into Elasticsearch.
"""
from __future__ import print_function

import json
import boto3
import logging
import datetime
import gzip
import urllib
import os
import traceback
from StringIO import StringIO

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client('s3', region_name=os.environ['AWS_REGION'])

# Sign Elasticsearch requests with the Lambda function's IAM credentials.
awsauth = AWS4Auth(os.environ['AWS_ACCESS_KEY_ID'],
                   os.environ['AWS_SECRET_ACCESS_KEY'],
                   os.environ['AWS_REGION'], 'es',
                   session_token=os.environ['AWS_SESSION_TOKEN'])

es = Elasticsearch(
    hosts=[{'host': os.environ['es_host'], 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)


def handler(event, context):
    logger.info('Event: ' + json.dumps(event, indent=2))
    message = json.loads(event['Records'][0]['Sns']['Message'])
    s3Bucket = message['s3Bucket'].encode('utf8')
    s3ObjectKey = urllib.unquote_plus(message['s3ObjectKey'][0].encode('utf8'))
    logger.info('S3 Bucket: ' + s3Bucket)
    logger.info('S3 Object Key: ' + s3ObjectKey)
    try:
        response = s3.get_object(Bucket=s3Bucket, Key=s3ObjectKey)
        content = gzip.GzipFile(fileobj=StringIO(response['Body'].read())).read()
        # Index each CloudTrail record into a daily index, e.g. ct-2018-02-12.
        for record in json.loads(content)['Records']:
            recordJson = json.dumps(record)
            logger.info(recordJson)
            indexName = 'ct-' + datetime.datetime.now().strftime("%Y-%m-%d")
            res = es.index(index=indexName, doc_type='record',
                          id=record['eventID'], body=recordJson)
            logger.info(res)
        return True
    except Exception as e:
        logger.error('Something went wrong: ' + str(e))
        traceback.print_exc()
        return False
```

Create a file called requirements.txt for the Python packages that are needed:

```
elasticsearch>=5.0.0,<6.0.0
requests-aws4auth
```

With the requirements file created in your workspace, run the command below to install the required packages into the working folder:

```shell
python -m pip install -r requirements.txt -t ./
```

Next, create a file called
template.yaml that will store the code for AWS SAM:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: >
  This SAM example creates the following resources:
    S3 Bucket: S3 bucket to hold the CloudTrail logs
    CloudTrail: CloudTrail trail for all regions, configured to deliver logs to the above S3 bucket
    SNS Topic: SNS topic that receives a notification when a CloudTrail log file is created in S3
    Elasticsearch Domain: Elasticsearch domain to hold the CloudTrail logs for advanced analytics
    IAM Role: IAM role for Lambda execution, with read-only S3 permissions
    Lambda Function: function that gets triggered when SNS receives a notification, reads the contents from S3 and stores them in the Elasticsearch domain

Outputs:
  S3Bucket:
    Description: "S3 Bucket Name where CloudTrail Logs are delivered"
    Value: !Ref S3Bucket
  LambdaFunction:
    Description: "Lambda Function that reads CloudTrail logs and stores them into Elasticsearch Domain"
    Value: !GetAtt Function.Arn
  ElasticsearchUrl:
    Description: "Elasticsearch Domain Endpoint that you can use to access the CloudTrail logs and analyze them"
    Value: !GetAtt ElasticsearchDomain.DomainEndpoint

Resources:
  SNSTopic:
    Type: AWS::SNS::Topic

  SNSTopicPolicy:
    Type: "AWS::SNS::TopicPolicy"
    Properties:
      Topics:
        - Ref: "SNSTopic"
      PolicyDocument:
        Version: "2008-10-17"
        Statement:
          - Sid: "AWSCloudTrailSNSPolicy"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Resource: "*"
            Action: "SNS:Publish"

  S3Bucket:
    Type: AWS::S3::Bucket

  S3BucketPolicy:
    Type: "AWS::S3::BucketPolicy"
    Properties:
      Bucket:
        Ref: S3Bucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "AWSCloudTrailAclCheck"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:GetBucketAcl"
            Resource: !Sub |-
              arn:aws:s3:::${S3Bucket}
          - Sid: "AWSCloudTrailWrite"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:PutObject"
            Resource: !Sub |-
              arn:aws:s3:::${S3Bucket}/AWSLogs/${AWS::AccountId}/*
            Condition:
              StringEquals:
                s3:x-amz-acl: "bucket-owner-full-control"

  CloudTrail:
    Type: AWS::CloudTrail::Trail
    DependsOn:
      - SNSTopicPolicy
      - S3BucketPolicy
    Properties:
      S3BucketName:
        Ref: S3Bucket
      SnsTopicName:
        Fn::GetAtt:
          - SNSTopic
          - TopicName
      IsLogging: true
      EnableLogFileValidation: true
      IncludeGlobalServiceEvents: true
      IsMultiRegionTrail: true

  FunctionIAMRole:
    Type: "AWS::IAM::Role"
    Properties:
      Path: "/"
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
        - "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "AllowLambdaServiceToAssumeRole"
            Effect: "Allow"
            Action:
              - "sts:AssumeRole"
            Principal:
              Service:
                - "lambda.amazonaws.com"

  ElasticsearchDomain:
    Type: AWS::Elasticsearch::Domain
    DependsOn:
      - FunctionIAMRole
    Properties:
      DomainName: "cloudtrail-log-analytics"
      ElasticsearchClusterConfig:
        InstanceCount: "2"
      EBSOptions:
        EBSEnabled: true
        Iops: 0
        VolumeSize: 20
        VolumeType: "gp2"
      AccessPolicies:
        Version: "2012-10-17"
        Statement:
          - Sid: "AllowFunctionIAMRoleESHTTPFullAccess"
            Effect: "Allow"
            Principal:
              AWS: !GetAtt FunctionIAMRole.Arn
            Action: "es:ESHttp*"
            Resource: !Sub |-
              arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/cloudtrail-log-analytics/*
          - Sid: "AllowFullAccesstoKibanaForEveryone"
            Effect: "Allow"
            Principal:
              AWS: "*"
            Action: "es:*"
            Resource: !Sub |-
              arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/cloudtrail-log-analytics/_plugin/kibana
      ElasticsearchVersion: "5.5"

  Function:
    Type: 'AWS::Serverless::Function'
    DependsOn:
      - ElasticsearchDomain
      - FunctionIAMRole
    Properties:
      Handler: index.handler
      Runtime: python2.7
      CodeUri: ./
      Role: !GetAtt FunctionIAMRole.Arn
      Events:
        SNSEvent:
          Type: SNS
          Properties:
            Topic: !Ref SNSTopic
      Environment:
        Variables:
          es_host:
            Fn::GetAtt:
              - ElasticsearchDomain
              - DomainEndpoint
```

Packaging artifacts and uploading them to S3

Run the following command to upload your artifacts to S3 and output a
packaged template that can be readily deployed to CloudFormation:

```shell
aws cloudformation package \
    --template-file template.yaml \
    --s3-bucket bucket-name \
    --output-template-file serverless-output.yaml
```

Deploying the AWS SAM template to AWS CloudFormation

You can use the aws cloudformation deploy CLI command to deploy the SAM template. Under the hood, it creates and executes a change set and waits until the deployment completes; it also prints debugging hints when the deployment fails. Run the following command to deploy the packaged template to a stack called cloudtrail-log-analytics:

```shell
aws cloudformation deploy \
    --template-file serverless-output.yaml \
    --stack-name cloudtrail-log-analytics \
    --capabilities CAPABILITY_IAM
```

Refer to the documentation for more details. I recommend reading the documentation on Amazon Elasticsearch Service access policies and modifying the access policy of the Elasticsearch domain to further fine-tune access.

Once the serverless application is deployed in your AWS account, it will automatically store the AWS CloudTrail data in Amazon Elasticsearch Service as soon as each log file is delivered to S3. With the data in Elasticsearch, you can use Kibana to visualize it and create the dashboards you need on top of the AWS CloudTrail data.

The above Serverless Application Model app is available at the GitHub repo ExpediaDotCom/cloudtrail-log-analytics.

Serverless App: AWS CloudTrail Log Analytics using Amazon Elasticsearch Service was originally published in Hacker Noon on Medium.
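As a closing aside for local development: you can sanity-check the Lambda's parsing logic without deploying anything, by simulating what the function does with the S3 object body: gunzip it, walk the Records array, and derive the daily index name. This is a hedged sketch in Python 3 (the deployed function targets python2.7, and the helper name here is illustrative, not part of the app):

```python
import datetime
import gzip
import io
import json

def parse_cloudtrail_object(body_bytes):
    """Mimic the Lambda's parsing step: gunzip an S3 object body and
    yield (index_name, event_id, record) for each CloudTrail record."""
    content = gzip.GzipFile(fileobj=io.BytesIO(body_bytes)).read()
    # Same daily index scheme as the Lambda, e.g. 'ct-2018-02-12'.
    index_name = 'ct-' + datetime.datetime.now().strftime('%Y-%m-%d')
    for record in json.loads(content)['Records']:
        yield index_name, record['eventID'], record

# Simulate a gzipped CloudTrail log file (field values are made up).
fake_log = json.dumps({'Records': [{'eventID': 'abc-123',
                                    'eventName': 'ConsoleLogin'}]})
body = gzip.compress(fake_log.encode('utf-8'))

for index_name, event_id, record in parse_cloudtrail_object(body):
    print(index_name, event_id, record['eventName'])
```

Running this prints one line per record with today's ct-YYYY-MM-DD index name, which matches what you should later see as indices in the Elasticsearch domain.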
