Just this week, AWS added support for IAM Roles to be able to report on when they were last used. This feature is particularly useful in environments with a lot of activity and where you want to manage the usage of roles and limit the number of “stale” roles.

As the number of roles in your AWS estate grows, so does the potential for a security breach if there is no governance over how they are used: roles with lots of access get left lying around long after they're needed.

So today's post is going to take a look at how we can use this new feature to detect roles that haven't been used for a certain number of days and delete them.

We are also going to make use of AWS X-Ray as this is a service I’ve been meaning to dig into for a little while and think this sort of exercise might be the perfect pet-project to have a go at finding out more and seeing what we can do with it.

Getting “Last Used” Date for your Roles

As the API feature is fairly new - you'll want to make sure you've upgraded to the latest version of the CLI/SDK you would normally use. For the purpose of this post, I will be using Python - Boto3 is currently at 1.10.25 for me.

To dig into what kind of information we get - I first got a list of all IAM roles with the Boto3 IAM list_roles() API and then printed out the last-used data for each one:

    import boto3

    client = boto3.client('iam')
    resource = boto3.resource('iam')

    roles = client.list_roles()
    for r in roles['Roles']:
        role = resource.Role(name=r['RoleName'])
        # role_last_used is an attribute (a dict), not a method
        print(role.role_last_used)

Running this prints the last-used data for each role:

    ➜ role-destroyer python destroy.py
    {'LastUsedDate': datetime.datetime(2019, 8, 21, 11, 54, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 11, 22, 10, 19, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 3, 4, 17, 43, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 3, 3, 20, 3, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 3, 3, 19, 48, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 3, 3, 19, 30, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {'LastUsedDate': datetime.datetime(2019, 3, 3, 19, 22, tzinfo=tzutc()), 'Region': 'eu-west-1'}
    {}
    {}

Perfect! A timestamp of last use and even the region it was last used in - we won’t do anything with the region just now as we are writing this code to clean up but you could easily turn this around into a detective tool and alert on region usage outside a list you want to restrict against.
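As a rough sketch of that detective idea, the function below takes a mapping of role name to the last-used dictionary shown above and returns the roles whose most recent use was outside an allow-list of regions. The names `ALLOWED_REGIONS` and `roles_used_outside` are made up for illustration, and since only the most recent use is visible, treat this as a coarse signal rather than an audit trail.

```python
from datetime import datetime, timezone

# Hypothetical allow-list; adjust to the regions you actually operate in.
ALLOWED_REGIONS = {"eu-west-1", "eu-west-2"}


def roles_used_outside(last_used_by_role, allowed_regions=ALLOWED_REGIONS):
    """Return {role_name: region} for roles last used outside allowed_regions.

    last_used_by_role maps a role name to its last-used dictionary.
    Roles with no usage data (an empty dictionary) are skipped.
    """
    offenders = {}
    for name, last_used in last_used_by_role.items():
        region = last_used.get("Region")
        if region is not None and region not in allowed_regions:
            offenders[name] = region
    return offenders


# Sample data in the same shape as the output above (role names are invented).
sample = {
    "deploy-role": {
        "LastUsedDate": datetime(2019, 11, 22, 10, 19, tzinfo=timezone.utc),
        "Region": "eu-west-1",
    },
    "legacy-role": {
        "LastUsedDate": datetime(2019, 3, 4, 17, 43, tzinfo=timezone.utc),
        "Region": "us-east-1",
    },
    "unused-role": {},
}
print(roles_used_outside(sample))
```

From here you could send the result to whatever alerting channel you already use.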

Something to note - if a role has no usage info, we get an empty dictionary.
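One more thing to watch: list_roles() returns results in pages (100 roles per call by default), so a single call only sees the first page in a larger account. Here is a minimal sketch using boto3's built-in paginator. `iter_role_names` is a hypothetical helper name, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
def iter_role_names(iam_client):
    """Yield every role name in the account, following pagination."""
    paginator = iam_client.get_paginator("list_roles")
    for page in paginator.paginate():
        for role in page["Roles"]:
            yield role["RoleName"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   for name in iter_role_names(boto3.client("iam")):
#       print(name)
```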

Next, we want to work out for each role if it is older than a certain time range - for this, I am going to say 100 days. As we already have the date of usage in a Python timestamp, this becomes fairly simple - we need the date/time right now, then we simply subtract the two to figure out how long ago the role usage was from today.

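First up, let us fix our imports and create a variable for today's date/time and the number of days we will check for:

    from datetime import datetime, timedelta, timezone
    import boto3

    # Naive UTC "now", so it lines up with LastUsedDate (UTC) once its tzinfo is stripped
    time_now = datetime.now(timezone.utc).replace(tzinfo=None)
    days_to_delete = 100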

Now within our roles loop:

    ...
    for r in roles['Roles']:
        role = resource.Role(name=r['RoleName'])
        try:
            if role.role_last_used:
                time_diff = time_now - \
                    role.role_last_used['LastUsedDate'].replace(tzinfo=None)
                if time_diff.days >= days_to_delete:
                    role.delete()
        except Exception as e:
            print(e)

If you run the code now you'll likely see a bunch of errors, because we are trying to delete roles that still have IAM policies in place:

    An error occurred (DeleteConflict) when calling the DeleteRole operation: Cannot delete entity, must detach all policies first.
    An error occurred (DeleteConflict) when calling the DeleteRole operation: Cannot delete entity, must detach all policies first.
    An error occurred (DeleteConflict) when calling the DeleteRole operation: Cannot delete entity, must detach all policies first.
    ...
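The age check in the loop can also be pulled out into a small pure function, which makes it easy to try against sample data before letting it near real roles. This is only a sketch: `days_since_last_use` is a made-up name, and it compares timezone-aware datetimes directly instead of stripping the tzinfo. It returns None for the empty-dictionary case, so roles with no usage data are left alone, just as in the loop.

```python
from datetime import datetime, timezone


def days_since_last_use(role_last_used, now=None):
    """Return whole days since the role was last used, or None if there is no data."""
    if not role_last_used or "LastUsedDate" not in role_last_used:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - role_last_used["LastUsedDate"]).days


# Fixed "now" so the example is repeatable.
now = datetime(2019, 11, 25, 12, 0, tzinfo=timezone.utc)
used = {
    "LastUsedDate": datetime(2019, 8, 21, 11, 54, tzinfo=timezone.utc),
    "Region": "eu-west-1",
}
print(days_since_last_use(used, now))
print(days_since_last_use({}, now))
```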

To fix this - we need to first detach any policies from the roles we want to delete. To save us some pain and time too, I know we need to remove the role from any IAM Instance Profiles as well as delete any inline policies - let’s get cleaning!

    ...
    for r in roles['Roles']:
        role = resource.Role(name=r['RoleName'])
        try:
            if role.role_last_used:
                time_diff = time_now - \
                    role.role_last_used['LastUsedDate'].replace(tzinfo=None)
                if time_diff.days >= days_to_delete:
                    print(f"Attempting to delete Role {role.name}.")
                    # Managed policies must be detached before the role can go
                    for policy in role.attached_policies.all():
                        print(f"Removing Managed Policy from {role.name}")
                        role.detach_policy(PolicyArn=policy.arn)
                    # The role must be removed from any instance profiles
                    for profile in role.instance_profiles.all():
                        print(f"Removing role from InstanceProfile {profile.name}")
                        profile.remove_role(RoleName=role.name)
                    # Inline policies are deleted rather than detached
                    for role_policy in role.policies.all():
                        print(f"Deleting Policy {role_policy.name}")
                        role_policy.delete()
                    role.delete()
                    print(f"{role.name} deleted\n")
        except Exception as e:
            print(e)

Looking good!
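Since this code deletes things for real, it may be worth having a dry-run mode first. The sketch below wraps the same four steps (detach managed policies, remove from instance profiles, delete inline policies, delete the role) in a hypothetical `delete_role` helper that only reports what it would do unless `dry_run=False` is passed. The `role` argument is assumed to behave like the boto3 Role resource used above.

```python
def delete_role(role, dry_run=True):
    """Detach everything from an IAM role, then delete it.

    With dry_run=True nothing is changed; the planned steps are returned.
    """
    steps = []
    for policy in role.attached_policies.all():
        steps.append(f"detach managed policy {policy.arn}")
        if not dry_run:
            role.detach_policy(PolicyArn=policy.arn)
    for profile in role.instance_profiles.all():
        steps.append(f"remove from instance profile {profile.name}")
        if not dry_run:
            profile.remove_role(RoleName=role.name)
    for inline in role.policies.all():
        steps.append(f"delete inline policy {inline.name}")
        if not dry_run:
            inline.delete()
    # The role itself goes last, once nothing is attached to it.
    steps.append(f"delete role {role.name}")
    if not dry_run:
        role.delete()
    return steps
```

Printing the returned steps for each candidate role gives you a chance to review the list before running again with `dry_run=False`.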

Before we move much further let’s pause here and start looking at AWS X-Ray.

AWS X-Ray

AWS X-Ray is a collection of SDKs, daemons and an AWS service that allows you to set up application code tracing. X-Ray can provide a lot of insight into what is going on in your apps and makes troubleshooting when something goes wrong much easier.

I would highly recommend reading the docs on X-Ray and checking out some videos (See the bottom of this post for more links) but I thought it would be cool to get some data into X-Ray with this role-destroyer script.

Setting Up X-Ray

Before we can get any data into X-Ray from our code when running locally, we need to get the X-Ray Daemon running locally which will act as a proxy between our code and the X-Ray service.

The install and setup of the X-Ray Daemon are clearly covered in the AWS X-Ray documentation.

Once you’ve followed the steps and have the Daemon Running - we can continue!

You should have something like this once the Daemon is working:

Adding X-Ray To Our Code

Now we have the daemon running - time to start sending some data to AWS.

Integration is super simple - we first need the Python SDK, which you can install with pip install aws-xray-sdk

Then add this to the top of your code:

    from datetime import datetime, timedelta, timezone
    from aws_xray_sdk.core import patch_all
    from aws_xray_sdk.core import xray_recorder
    import boto3

    # Naive UTC "now", so it lines up with LastUsedDate (UTC) once its tzinfo is stripped
    time_now = datetime.now(timezone.utc).replace(tzinfo=None)
    days_to_delete = 200
    ...

Now we have to tell X-Ray we would like it to include events from supported modules within our X-Ray traces, boto3 being one of them. See here for more info.

    from datetime import datetime, timedelta, timezone
    from aws_xray_sdk.core import patch_all
    from aws_xray_sdk.core import xray_recorder
    import boto3

    patch_all()

Now it’s just a case of updating our code to start recording segments within our code as we see fit.

I have updated the code from before to include this new capability and refactored it slightly to make it a bit more DRY (see if you can spot the difference!)

    from datetime import datetime, timedelta, timezone
    from aws_xray_sdk.core import patch_all
    from aws_xray_sdk.core import xray_recorder
    import boto3

    patch_all()

    # Naive UTC "now", so it lines up with LastUsedDate (UTC) once its tzinfo is stripped
    time_now = datetime.now(timezone.utc).replace(tzinfo=None)
    days_to_delete = 200

    client = boto3.client('iam')
    resource = boto3.resource('iam')

    segment = xray_recorder.begin_segment('Old Roles Destroyer')
    roles = client.list_roles()
    for r in roles['Roles']:
        role = resource.Role(name=r['RoleName'])
        subsegment = xray_recorder.begin_subsegment(role.name)
        subsegment.put_annotation('RoleArn', role.arn)
        try:
            if role.role_last_used:
                subsegment.put_annotation(
                    'RoleLastUsed', str(role.role_last_used['LastUsedDate']))
                time_diff = time_now - \
                    role.role_last_used['LastUsedDate'].replace(tzinfo=None)
                if time_diff.days >= days_to_delete:
                    print(f"Attempting to delete Role {role.name}.")
                    print(f"Removing Managed Policies from {role.name}")
                    [role.detach_policy(PolicyArn=policy.arn)
                     for policy in role.attached_policies.all()]
                    print("Removing role from InstanceProfiles")
                    [profile.remove_role(RoleName=role.name)
                     for profile in role.instance_profiles.all()]
                    print("Deleting Inline Policies")
                    [role_policy.delete() for role_policy in role.policies.all()]
                    role.delete()
                    print(f"{role.name} deleted\n")
        except Exception as e:
            print(e)
        # Close this role's subsegment whether or not the delete succeeded
        xray_recorder.end_subsegment()
    xray_recorder.end_segment()

Nice! You should see some new additions to the code:

segment = xray_recorder.begin_segment('Old Roles Destroyer')

Set up the segment - our app is tiny so this is basically going to contain all trace data

subsegment = xray_recorder.begin_subsegment(role.name)

For each role I am working with, I create a new subsegment with a name of the Role name - useful for later…

subsegment.put_annotation('RoleArn', role.arn)

Adding the RoleArn as an annotation to the subsegment. Because I can? Annotations are searchable too as they're indexed

xray_recorder.end_subsegment()

End the Subsegment for this role

xray_recorder.end_segment()

Finish up our recording for the segment - we’re done.
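If your version of the SDK provides the `in_segment` and `in_subsegment` context managers on the recorder, the begin/end pairs can be written as `with` blocks so that a subsegment is always closed, even when a call raises. The sketch below is an assumption-laden variation rather than the code used in this post: `trace_roles` and `handle_role` are hypothetical names, and the recorder is passed in as an argument (in real use you would pass `xray_recorder`).

```python
def trace_roles(recorder, role_names, handle_role):
    """Run handle_role(name) for each role inside its own subsegment.

    `recorder` is expected to behave like aws_xray_sdk.core.xray_recorder.
    Returns a list of (role_name, error_message) for roles that failed.
    """
    failed = []
    with recorder.in_segment("Old Roles Destroyer"):
        for name in role_names:
            try:
                with recorder.in_subsegment(name) as subsegment:
                    subsegment.put_annotation("RoleName", name)
                    handle_role(name)
            except Exception as exc:
                # The with block has already closed the subsegment here.
                failed.append((name, str(exc)))
    return failed


# Usage (assumes aws-xray-sdk is installed and the daemon is running):
#   from aws_xray_sdk.core import xray_recorder
#   trace_roles(xray_recorder, ["some-role"], print)
```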

What Can we see?

If you take a look at the AWS X-Ray console, you should see something similar to below:

This shows which services have been active within the time filter. Old Roles Destroyer shows up for me with a couple of nodes - one being our code, the other being the IAM service, which shows up because boto3 is one of the libraries the X-Ray SDK can patch.

If you click on a Segment, you will see the option to View Traces which will then show a trace for our code run.

Drilling into the trace, you’ll see all of the roles we processed (my view will obviously be different to what you might see)

Here we see each role as a sub-segment and the different IAM calls as sub-segments of that role. The view summarises each sub-segment's status, how long it took to run and where in the trace it ran.

Have a play around - You can dig into each sub-segment and see what’s going on. When they’re successful, all is green. When any calls fail - we can see those as red with a bit of extra info:

This shows the actual Exception that occurred, when it occurred and most importantly where (what line etc). This gives us plenty of information to dig into the issue and fix it if need be.

Lastly, the annotations we added to each sub-segment show up too - you can see RoleLastUsed and RoleArn have been added appropriately to each sub-segment.

Wrapping Up

So that’s all for now. A quick run into a couple of new APIs with some funkiness of AWS X-Ray thrown in.

Hopefully, you’ve enjoyed this post - Let me know through the usual channels what you think and what you’d like to see more of.

In future posts, I’ll dive deeper into X-Ray and show you some of the nifty analysis features as well as how we might integrate more parts of a bigger application into our traces - Stay tuned!

More importantly, we have cleaned up old roles and started the building blocks for something we can refactor later to run periodically and even act as a monitor for tracking role usage between accounts - I’d like to keep at this idea in future posts and show how we can take things to the next level.

Useful Links:

 Neil