Detecting Recon and Scanning Activity in AWS: A Crash Course
As more and more infrastructure transitions into the cloud, blue teams can be left with a huge blind spot when it comes to finding nefarious activity within cloud environments. Combine this with the rapid, near-instant deployment of services and instances, and things can get sticky fairly quickly. In this post, I will go over four AWS-specific detections you can use to monitor potentially malicious activity within your AWS environment using Splunk and our risk-based ShadowHawk platform. The first three are built from AWS’s Best Practices recommendations, with the last being a custom detection. You will also see how each of these detections adds to the risk score, if you’re using risk-based alerting.
Our first three detections are based on GuardDuty ‘Findings’, where activity performed by a principal within the AWS environment appears suspicious. This activity is compared to a baseline within GuardDuty, which triggers a finding when an anomaly is found. In most cases, this means the principal (or user, in our case) has never performed said activity in the past. Following AWS best practices, we will alert off of recon activity for Network Permissions, Resource Permissions, and User Permissions. By combining these three we will be able to see who is poking around the environment, what they are poking at, and where they are poking from.
We start our search by looking in our AWS index for two similarly named detail fields (note the difference) - detail-type="GuardDuty Finding" and detail.type="Recon:IAMUser/NetworkPermissions" - and filter out any principals that are instances. We then rename a lot of fields because JSON nesting is fun, and regex out the src-user from our principalId field. Next we coalesce the src-user and userType fields. We do this because, later in the search, we want to add extra risk to this event should the principalId performing the activity be the Root account. We won’t go into why seeing Root activity is bad (sometimes it’s legit), but just know that if you don’t have an uneasy feeling when you see it, you should ;) . After all that mess, we upper the src-user for later formatting purposes, perform a DNS lookup on the src-ip, replace/remove any internal domain names from the lookup results, and eval out the src-system using coalesce once more. The next block, starting at the join, uses a custom lookup table to map internal information about the AWS tenant using the tenant-id field (renamed from detail.accountId). This tells us the tenant’s name, what it’s used for, who owns it, and the support queue in which to direct any frustrations. Finally, we normalize the src-user against LDAP to give us the user’s internal ID, department, and title.
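As a rough sketch, the steps above might look something like the following SPL. The index name, the lookup names (aws_tenant_info, ldap_users), the internal domain, and several of the JSON field paths are placeholders based on my description rather than exact values, so adjust them to your environment:

```
index=aws "detail-type"="GuardDuty Finding" "detail.type"="Recon:IAMUser/NetworkPermissions"
    NOT "detail.resource.resourceType"="Instance"
| rename detail.accountId AS "tenant-id",
         detail.service.action.awsApiCallAction.remoteIpDetails.ipAddress AS "src-ip",
         detail.resource.accessKeyDetails.userType AS userType
| rex field=detail.resource.accessKeyDetails.principalId "(?<src-user>[^:]+)$"
| eval 'src-user'=coalesce('src-user', userType)
| eval 'src-user'=upper('src-user')
| lookup dnslookup clientip AS src-ip OUTPUT clienthost AS src-dns
| eval 'src-dns'=replace('src-dns', "\.corp\.example\.com$", "")
| eval 'src-system'=coalesce('src-dns', 'src-ip')
| join type=left tenant-id
    [| inputlookup aws_tenant_info ]
| lookup ldap_users user AS src-user OUTPUTNEW employee_id, department, title
```

The coalesce of src-user and userType is what lets the Root account surface here: Root activity has no normal principalId-derived user, so the userType value carries through and can be keyed on later when scoring risk.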
If you’re not using risk-based alerting you could stop here and work these as one-off alerts. We do use risk-based alerting, however, so the remainder of the search is used to attribute points to the risk objects. We can see that our Root account would increase the impact and confidence ratings, as mentioned previously, and we attribute a score to the src-user (the user performing the actions), dest-system (the AWS tenant), and src-system (where the user is performing the actions from). It’s important to note that we do not attribute any risk to the src-system when the external IP is one of ours. If you don’t do this, you may find your external IPs generating a lot of alerts as the risk object, depending on how different pieces of your infrastructure are configured.
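The tail of the search might look roughly like this. The point values, the CIDR range standing in for "our" external IP space, and the risk field names are illustrative placeholders only, not our actual scoring:

```
| eval impact=if(userType=="Root", "high", "medium"),
       confidence=if(userType=="Root", "high", "medium")
| eval risk_score=case(impact=="high" AND confidence=="high", 80,
                       true(),                                50)
| eval 'src-system-score'=if(cidrmatch("203.0.113.0/24", 'src-ip'), 0, risk_score)
```

The cidrmatch() check at the end is what zeroes out the src-system score when the external IP belongs to us, which prevents our own egress IPs from constantly bubbling up as risk objects.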
I won’t provide the search for each of the three Recon:IAMUser findings since it stays exactly the same for each; the only difference is the detail.type field (for example, replace Recon:IAMUser/NetworkPermissions with Recon:IAMUser/ResourcePermissions). You can find all of the Recon detail.types in the AWS GuardDuty documentation.
Once we run our search, we will see the results below (hope y’all like redactions!). These results are from the Recon:IAMUser/ResourcePermissions finding and tell us which API calls each user was performing and where they were coming from. What is important to look at in our results is the source location and whether the named user should be performing these API calls. These GuardDuty findings are essentially looking for the possibility of credential sharing/theft and should be investigated as such.
High Connections Ingress - Web
The last detection we will look at alerts on a high number of ingress connections on ports 80 or 443 to our AWS instances. This is a custom detection, meaning there is no AWS service such as GuardDuty or CloudTrail generating the initial alert, and it has been found to complement a detection of the same type for connections to on-prem systems. High numbers in the results could mean possible communication with a C2 server, or <insert-nation-state-here> knocking on the door, but will normally consist of shotgun-style scanning activity from various sources. This type of mass scanning can usually be confirmed by seeing your on-prem firewalls alert off the same source. That said, you can still see some fairly concerning events with this search.
For this search, we stay in our AWS index and sourcetype, filter out any internal source IPs, and look only at ports 80 and 443. Next we stats out the needed fields, creating a few custom fields and renaming others in the process, then use the accept-count and reject-count fields we created on the fly to do one more filter. We specifically want to look for connections where the accept and reject counts differ, as an equal number indicates all connections were blocked. Moving on, we perform another DNS lookup and convert our earliest and latest times to a readable format.
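A sketch of that search against VPC Flow Log data follows. The index, sourcetype, and field names here are assumptions (they vary by how your AWS add-on and flow logs are configured), and the internal CIDR ranges are just the RFC 1918 blocks as an example:

```
index=aws sourcetype=aws:cloudwatchlogs:vpcflow (dest_port=80 OR dest_port=443)
| where NOT (cidrmatch("10.0.0.0/8", src_ip)
          OR cidrmatch("172.16.0.0/12", src_ip)
          OR cidrmatch("192.168.0.0/16", src_ip))
| stats earliest(_time) AS first_seen, latest(_time) AS last_seen,
        sum(eval(if(action=="ACCEPT",1,0))) AS "accept-count",
        sum(eval(if(action=="REJECT",1,0))) AS "reject-count",
        dc(dest_ip) AS "dest-count", dc(account_id) AS "tenant-count"
        BY src_ip
| where 'accept-count' > 0 AND 'accept-count' != 'reject-count'
| lookup dnslookup clientip AS src_ip OUTPUT clienthost AS src_dns
| convert ctime(first_seen), ctime(last_seen)
```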
The remainder of the search is used to attribute risk to the source system based on connection count, as described with the previous searches above. Something to note in this section is the variable impact rating. Depending on the frequency of the search, you can (and should) adjust these ratings accordingly; they affect your risk scoring, and if set too low the potentially malicious activity may never bubble up to the surface and alert you.
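That variable impact rating could be expressed as something like the following. The thresholds and scores here are made up for illustration and should be tuned to your search frequency and environment:

```
| eval impact=case('accept-count' > 100, "high",
                   'accept-count' > 25,  "medium",
                   true(),               "low")
| eval risk_score=case(impact=="high",   60,
                       impact=="medium", 40,
                       true(),           10)
```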
Running our search provides the results below. Re-sorting by descending accept-count shows a few interesting events that we may want to act upon. The first IP in our list shows 21 ACCEPT and 4 REJECT counts, across 17 different AWS IPs, over a time frame of roughly 6am to 3pm on the same day. We can also see that this source IP performed this activity across 4 different AWS tenants (in my experience, this is classic scanner activity). All of these data points together in one event look a bit suspicious, but nothing screams BAD!. However, when you start digging into the first IP you’ll find that it is part of the China Unicom backbone and, at the time of writing, comes across multiple analysis sites as ‘actively hostile’ and/or is listed on various blacklists (trust me, look it up). That’s enough BAD! for me to warrant sending a Block Request to whoever I need to get it blocked. Especially since, in the case of this IP, it can also be seen performing the same type of scanning activity against the on-prem firewalls.
By the way, the other 2 IPs are super dirty too.
As mentioned in the title, this post was simply a crash course on a few things worth looking at within your AWS environment. If nothing else, I hope you’ve reached this sentence with a few ideas on how to look for bad stuff in the cloud. These four detections are far fewer than what AWS recommends as Best Practice, and far fewer than what we currently have in place, but they should provide a good start to locking down the monitoring side of your cloud services.
Thanks for checking out my first blog post with my current team, and keep your eyes open for more cool cloud stuff. I gots lots!