2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article shows how to use Python to fetch AWS log data. It walks through the setup in detail and should be a useful reference for interested readers.
This is the age of the cloud, and many companies deploy their IT architecture on infrastructure-as-a-service (IaaS) platforms. Well-known IaaS providers include Amazon, Azure and IBM, and there are also providers in China such as Aliyun. Amazon is the market leader here.
AWS offers a wide range of services, well ahead of its competitors. It also provides a very rich API. The API is REST-based, so it is easy to call from different languages and platforms.
In today's big data era, using data to drive decisions is the core value of big data. AWS provides many services for obtaining its operational data; CloudTrail and CloudWatch are the most commonly used. CloudTrail is the log of all API calls made to AWS, and CloudWatch provides performance data for monitoring AWS services. (The newer Config service can be used to monitor resource changes in AWS.)
Today we'll look at how to use Python (with Boto, AWS's open source Python SDK) to configure the CloudTrail service automatically and retrieve the log content.
Let's first look at the concept of CloudTrail and the related configuration.
S3 Bucket
When you enable the CloudTrail service, you need to specify an S3 bucket. S3 is a storage service provided by Amazon, and you can think of it as a cloud-based file system. CloudTrail stores its API call logs in the bucket you specify as compressed files.
SNS
SNS is a notification service provided by Amazon that uses the publish/subscribe model. When you create a CloudTrail trail, you can optionally associate an SNS topic with it. The advantage is that you are notified as soon as a new log file is delivered. Different kinds of clients can subscribe to SNS notifications, such as email, mobile notification services, and SQS.
SQS
SQS is a queuing service provided by Amazon. In this article, we subscribe an SQS queue to the SNS topic so that our Python program can read the notifications from the queue.
Configure CloudTrail
First we need to create the SNS topic and attach the appropriate policy. The code is as follows:

    import json

    import boto.sns

    key_id = 'yourawskeyid'
    secret_key = 'yourawssecretkey'
    region_name = "eu-central-1"
    trail_topic_name = "topicABC"
    sns_policy_sid = "snspolicy0001"

    sns_conn = boto.sns.connect_to_region(region_name,
                                          aws_access_key_id=key_id,
                                          aws_secret_access_key=secret_key)
    sns_topic = sns_conn.create_topic(trail_topic_name)

    # Get the ARN of the SNS topic
    sns_arn = sns_topic['CreateTopicResponse']['CreateTopicResult']['TopicArn']

    # Read the current topic policy
    attrs = sns_conn.get_topic_attributes(sns_arn)
    policy = attrs['GetTopicAttributesResponse']['GetTopicAttributesResult']['Attributes']['Policy']
    policy_obj = json.loads(policy)
    statements = policy_obj['Statement']

    # Copy the default statement and turn it into a publish permission
    default_statement = statements[0]
    new_statement = default_statement.copy()
    new_statement["Sid"] = sns_policy_sid
    new_statement["Action"] = "SNS:Publish"
    new_statement["Principal"] = {"AWS": [
        "arn:aws:iam::903692715234:root",
        "arn:aws:iam::035351147821:root",
        "arn:aws:iam::859597730677:root",
        "arn:aws:iam::814480443879:root",
        "arn:aws:iam::216624486486:root",
        "arn:aws:iam::086441151436:root",
        "arn:aws:iam::388731089494:root",
        "arn:aws:iam::284668455005:root",
        "arn:aws:iam::113285607260:root",
    ]}
    new_statement.pop("Condition", None)

    # Append the new statement and write the policy back to the topic
    statements.append(new_statement)
    new_policy = json.dumps(policy_obj)
    sns_conn.set_topic_attributes(sns_arn, "Policy", new_policy)
CloudTrail is region-specific: each region has its own CloudTrail service, so the SNS topic must be created in the same region as the trail.
Note that we add a new policy statement that gives CloudTrail permission to publish messages (Action = "SNS:Publish") to the topic we created. The approach is to copy the default statement from the existing policy, change its Action and Sid (any name that is not already in use), and set Principal to the list of CloudTrail account ARNs. That list is hard-coded here; AWS may change it, but in the current environment the values are fixed. Finally, we remove the Condition element and append the new statement to the original policy.
Next we need to create an SQS queue and subscribe it to the SNS topic we created. This step is relatively simple.
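The policy edit described above can be tried without touching AWS. The following is a minimal sketch, assuming the default policy is a JSON object with a Statement list; the helper name add_publish_statement, the sample policy and the account ID 123456789012 are hypothetical and used only for illustration.

```python
import copy
import json


def add_publish_statement(policy_json, sid, principals):
    """Return a new policy JSON string with an extra SNS:Publish statement.

    The new statement is a copy of the first (default) statement with its
    Sid, Action and Principal replaced and any Condition removed.
    """
    policy = json.loads(policy_json)
    statement = copy.deepcopy(policy["Statement"][0])
    statement["Sid"] = sid
    statement["Action"] = "SNS:Publish"
    statement["Principal"] = {"AWS": list(principals)}
    statement.pop("Condition", None)
    policy["Statement"].append(statement)
    return json.dumps(policy)


# Hypothetical default policy, made up for this example.
sample_policy = json.dumps({
    "Version": "2008-10-17",
    "Statement": [{
        "Sid": "__default_statement_ID",
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": ["SNS:Subscribe", "SNS:Receive"],
        "Resource": "arn:aws:sns:eu-central-1:123456789012:topicABC",
        "Condition": {"StringEquals": {"AWS:SourceOwner": "123456789012"}},
    }],
})

updated = json.loads(add_publish_statement(
    sample_policy, "snspolicy0001", ["arn:aws:iam::903692715234:root"]))
print(len(updated["Statement"]))
```

The sketch uses a deep copy so that nested values in the default statement are never shared with the new one; the default statement itself, including its Condition, is left unchanged.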
    import boto.sqs

    sqs_queue_name = "sqs_queue"

    sqs_conn = boto.sqs.connect_to_region(region_name,
                                          aws_access_key_id=key_id,
                                          aws_secret_access_key=secret_key)
    sqs_queue = sqs_conn.create_queue(sqs_queue_name)

    # Subscribe the queue to the SNS topic created earlier
    sns_conn.subscribe_sqs_queue(sns_arn, sqs_queue)
Then we need to create an S3 bucket to store the log files generated by CloudTrail. As before, we need to attach a corresponding policy so that CloudTrail has permission to write the log files.

    import boto

    bucket_name = "bucket000"
    policy_sid = "testpolicy000"

    s3_conn = boto.connect_s3(aws_access_key_id=key_id,
                              aws_secret_access_key=secret_key)
    bucket = s3_conn.create_bucket(bucket_name)

    # Policy template; %Sid% and %bucket_name% are replaced below
    bucket_policy = '''{
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "%Sid%GetPolicy",
                "Effect": "Allow",
                "Principal": {"AWS": [
                    "arn:aws:iam::903692715234:root",
                    "arn:aws:iam::035351147821:root",
                    "arn:aws:iam::859597730677:root",
                    "arn:aws:iam::814480443879:root",
                    "arn:aws:iam::216624486486:root",
                    "arn:aws:iam::086441151436:root",
                    "arn:aws:iam::388731089494:root",
                    "arn:aws:iam::284668455005:root",
                    "arn:aws:iam::113285607260:root"
                ]},
                "Action": "s3:GetBucketAcl",
                "Resource": "arn:aws:s3:::%bucket_name%"
            },
            {
                "Sid": "%Sid%PutPolicy",
                "Effect": "Allow",
                "Principal": {"AWS": [
                    "arn:aws:iam::903692715234:root",
                    "arn:aws:iam::035351147821:root",
                    "arn:aws:iam::859597730677:root",
                    "arn:aws:iam::814480443879:root",
                    "arn:aws:iam::216624486486:root",
                    "arn:aws:iam::086441151436:root",
                    "arn:aws:iam::388731089494:root",
                    "arn:aws:iam::284668455005:root",
                    "arn:aws:iam::113285607260:root"
                ]},
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::%bucket_name%/*",
                "Condition": {"StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}}
            }
        ]
    }'''
    bucket_policy = bucket_policy.replace("%bucket_name%", bucket_name)
    bucket_policy = bucket_policy.replace("%Sid%", policy_sid)
    bucket.set_policy(bucket_policy)

Here we take a default policy template and substitute the corresponding fields.
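As an alternative to substituting placeholders in a string, the same policy can be built as a Python dictionary and serialized with json.dumps, which avoids missing commas or braces in a hand-written template. This is only a sketch: the function name build_bucket_policy is hypothetical, and the example passes a single principal for brevity.

```python
import json


def build_bucket_policy(bucket_name, sid, principals):
    """Build the CloudTrail bucket policy as a JSON string."""
    principal = {"AWS": list(principals)}
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets CloudTrail check the bucket ACL
                "Sid": sid + "GetPolicy",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetBucketAcl",
                "Resource": "arn:aws:s3:::" + bucket_name,
            },
            {
                # Lets CloudTrail write log objects into the bucket
                "Sid": sid + "PutPolicy",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::" + bucket_name + "/*",
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }
    return json.dumps(policy, indent=2)


policy_text = build_bucket_policy(
    "bucket000", "testpolicy000", ["arn:aws:iam::903692715234:root"])
print(policy_text)
```

The resulting string could then be passed to bucket.set_policy in place of the template-based one.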
Finally, we create the CloudTrail trail and start logging:

    import boto.cloudtrail

    trail_name = "Trailabc"
    log_prefix = "log"

    cloudtrail_conn = boto.cloudtrail.connect_to_region(region_name,
                                                        aws_access_key_id=key_id,
                                                        aws_secret_access_key=secret_key)
    # cloudtrail_conn.describe_trails()
    cloudtrail_conn.create_trail(trail_name, bucket_name,
                                 s3_key_prefix=log_prefix,
                                 sns_topic_name=trail_topic_name)
    cloudtrail_conn.start_logging(trail_name)

Now that CloudTrail is configured and our SQS queue is subscribed to the associated SNS topic, we can fetch the logs.
Get log data
When API calls are made, CloudTrail writes the corresponding log file to the S3 bucket we created and publishes a message to our SNS topic. Because our SQS queue subscribes to that topic, we can get the log data by reading messages from the queue.
First, connect to the SQS queue and read messages from it:

    import boto.sqs

    sqs_queue_name = "sqs_queue"

    sqs_conn = boto.sqs.connect_to_region(region_name,
                                          aws_access_key_id=key_id,
                                          aws_secret_access_key=secret_key)
    sqs_queue = sqs_conn.get_queue(sqs_queue_name)
    notifications = sqs_queue.get_messages()

Then we read the location of the corresponding log file in S3 from the message, and use it to fetch the log file from S3.

    import gzip
    import io
    import json

    for notification in notifications:
        # The SQS body is the SNS envelope; its Message field is JSON again
        envelope = json.loads(notification.get_body())
        message = json.loads(envelope['Message'])
        bucket_name = message['s3Bucket']
        s3_bucket = s3_conn.get_bucket(bucket_name)
        for key in message['s3ObjectKey']:
            s3_file = s3_bucket.get_key(key)
            # Log files are gzip-compressed JSON
            with io.BytesIO(s3_file.read()) as bfile:
                with gzip.GzipFile(fileobj=bfile) as gz:
                    logjson = json.loads(gz.read())
logjson holds the parsed JSON content of the corresponding log file. Here's an example.
    {
        "Records": [
            {
                "eventVersion": "1.0",
                "userIdentity": {
                    "type": "IAMUser",
                    "principalId": "EX_PRINCIPAL_ID",
                    "arn": "arn:aws:iam::123456789012:user/Alice",
                    "accessKeyId": "EXAMPLE_KEY_ID",
                    "accountId": "123456789012",
                    "userName": "Alice"
                },
                "eventTime": "2014-03-06T21:22:54Z",
                "eventSource": "ec2.amazonaws.com",
                "eventName": "StartInstances",
                "awsRegion": "us-west-2",
                "sourceIPAddress": "205.251.233.176",
                "userAgent": "ec2-api-tools 1.6.12.2",
                "requestParameters": {
                    "instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}
                },
                "responseElements": {
                    "instancesSet": {
                        "items": [
                            {
                                "instanceId": "i-ebeaf9e2",
                                "currentState": {"code": 0, "name": "pending"},
                                "previousState": {"code": 80, "name": "stopped"}
                            }
                        ]
                    }
                }
            },
            ... additional entries ...
        ]
    }
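The two decoding steps used in the loop above (unwrapping the SNS envelope and decompressing the log file) can be exercised offline. The following is a sketch for Python 3, assuming the message layout shown in this article; the function names and the sample data are made up for illustration.

```python
import gzip
import io
import json


def log_keys_from_notification(body):
    """Return (bucket, keys) from an SQS message body wrapping an SNS notification."""
    envelope = json.loads(body)
    message = json.loads(envelope["Message"])
    return message["s3Bucket"], message["s3ObjectKey"]


def parse_log_bytes(raw):
    """Decompress a gzipped CloudTrail log file and parse it as JSON."""
    with gzip.GzipFile(fileobj=io.BytesIO(raw)) as gz:
        return json.loads(gz.read().decode("utf-8"))


# Made-up sample data standing in for a real notification and log file.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "s3Bucket": "bucket000",
        "s3ObjectKey": ["log/AWSLogs/example.json.gz"],
    }),
})
sample_log = gzip.compress(
    json.dumps({"Records": [{"eventName": "StartInstances"}]}).encode("utf-8"))

bucket, keys = log_keys_from_notification(sample_body)
records = parse_log_bytes(sample_log)["Records"]
print(bucket, keys, len(records))
```

Keeping the parsing separate from the S3 and SQS calls makes it possible to test it with local data before pointing the program at a live queue.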
You can use the code above to monitor all CloudTrail logs. The logs, which are in JSON format, can be stored in your database (MongoDB is a good fit) and then analyzed with your BI tool.
Note that you can skip SNS and SQS altogether and scan the contents of the bucket directly. This is easier to configure, but it is less timely, scanning the bucket takes extra work, and you have to keep the scan state locally, so the code becomes more complex.
With CloudTrail logs you can do a lot of things, such as checking for unauthorized logins and seeing how often each service is used. In short, when you have enough data, you can find plenty of value in it.
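To give an idea of the local scan state mentioned above, here is a minimal sketch that remembers which object keys have already been processed in a JSON file. The function name unprocessed_keys and the file name scan_state.json are hypothetical; in a real scan the list of keys would come from listing the bucket rather than from hard-coded strings.

```python
import json
import os
import tempfile


def unprocessed_keys(all_keys, state_path):
    """Return keys not seen in earlier scans and record them in state_path."""
    seen = set()
    if os.path.exists(state_path):
        with open(state_path) as f:
            seen = set(json.load(f))
    fresh = [k for k in all_keys if k not in seen]
    # Persist the union so the next scan skips everything seen so far
    with open(state_path, "w") as f:
        json.dump(sorted(seen.union(fresh)), f)
    return fresh


state_file = os.path.join(tempfile.mkdtemp(), "scan_state.json")
first = unprocessed_keys(["log/a.json.gz", "log/b.json.gz"], state_file)
second = unprocessed_keys(
    ["log/a.json.gz", "log/b.json.gz", "log/c.json.gz"], state_file)
print(first, second)
```

This sketch records a key as seen before it is actually processed, so a production version would need to decide what happens if processing fails partway through.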
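As a small illustration of the usage-frequency idea, the sketch below counts the records in a parsed log per eventSource. The function name count_by_source and the sample records are made up; a real log would come from the retrieval code earlier in the article.

```python
from collections import Counter


def count_by_source(log):
    """Count the records in a parsed CloudTrail log per eventSource."""
    return Counter(
        record.get("eventSource", "unknown")
        for record in log.get("Records", []))


# Made-up records following the field names in the example log.
sample = {"Records": [
    {"eventSource": "ec2.amazonaws.com", "eventName": "StartInstances"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "StopInstances"},
    {"eventSource": "s3.amazonaws.com", "eventName": "CreateBucket"},
]}
counts = count_by_source(sample)
print(counts.most_common())
```

The same pattern works for other fields, for example counting by eventName or by userIdentity values.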
That covers how to use Python to fetch AWS log data. I hope the content above is of some help to you.
© 2024 shulou.com SLNews company. All rights reserved.