An Intrusion on Cornell AWS Turf

By Paul Allen

So, it’s finally happened. Earlier this week Cornell University had an intrusion into one of the 80+ AWS accounts that Cornell uses. With that kind of scale, it was only a matter of time until a human made a mistake.

What Happened?

The short story is that a Cornell developer accidentally included AWS credentials in source code committed to a public git repository. Within minutes, the intruder had found those credentials and begun using them to launch large EC2 instance types across all AWS regions. Presumably, the goal was to steal compute cycles for cryptocurrency mining. The developer working with the account noticed a problem when the intruder terminated the EC2 instances hosting development, test, and production deployments.

The Good

The good news is that it could have been worse.

Once the intrusion was noticed, it took only minutes to identify the IAM credentials the intruder was using and deactivate them. Then cleanup began: shutting down the EC2 instances the intruder had launched, to stop the AWS charges from mounting further.

More good news is that the leaked credentials had privileges limited to the AWS EC2 service. That limited the blast radius, so forensics were not required for the myriad other AWS services available.

A few other ways we were lucky:

  • The affected AWS account wasn’t holding or processing any Cornell confidential data; that would have complicated things immensely.
  • The intruder terminated existing instances, triggering a look at what was going on in the account.
  • Beyond terminating those instances, the intruder didn’t seem to be bent on destroying existing resources, like EBS volumes or snapshots. This made recovery a bit easier.
  • The soft limits that AWS imposes on the number of instances of each type that can run at any given time restricted how many instances the intruder was able to start. No doubt the bill would have been much larger if those limits weren’t in place.
  • Because the intruder only had privileges to EC2, he couldn’t create new IAM policies, users, or credentials, saving us from playing whack-a-mole with a myriad of other IAM principals or credentials beyond the one originally leaked.
  • The affected AWS account had CloudTrail turned on so it was easy to see how the leaked credentials were being used. This is part of the standard audit configuration we enforce for Cornell AWS accounts.

The Bad

The total AWS bill resulting from the intrusion was about $6,500. Normal spend in the affected AWS account was about $200/day, and the bill ballooned to about $6,700 on the day of the intrusion. Ouch!

The Ugly

This intrusion resulted from a simple mistake, and mistakes happen whether we are newbies or experts. The challenge we face is determining what practical mechanisms should be in place to reduce the likelihood of mistakes or to limit the impact of those mistakes.

The leaked credentials granted a lot of privileges

The credentials that were compromised had privileges associated with the AmazonEC2FullAccess managed policy. That’s a lot of privileges. The blast radius of the intrusion could have been limited by better scoping those privileges in some way:

  • Often, privileges can be scoped to a specific AWS region without impairing their usability, and it is easy to do. Our intruder was able to launch instances in every public AWS region; restricting the privileges to a single region of interest would have reduced the cost impact of our intrusion by over 90%.
  • The privileges granted by the compromised credentials could have been scoped to work only from a limited set of source IPs. Most of us at Cornell use AWS from a limited set of IP addresses (Cornell campus IPs and maybe a home IP address), so scoping to those IPs wouldn’t have hindered legitimate use of the credentials. A sketch of such a policy follows this list.
  • Was full access to all EC2 operations required for the intended use of these credentials? Maybe, maybe not. While it can be challenging to itemize the specific AWS operations an IAM principal needs, doing so may have helped limit the scope of this intrusion. For example, our intruder was able to create new EC2 key pairs with which to connect to his stolen instances. Did the intended use of these credentials require the ec2:CreateKeyPair permission? Were other limitations possible?
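
As an illustrative sketch, here is what a policy combining the region and source-IP ideas might look like. The region and CIDR ranges are placeholders (not an official Cornell policy), and not every EC2 action honors the ec2:Region condition key, so treat this as a starting point rather than a complete solution:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:*",
      "Resource": "*",
      "Condition": {
        "StringEquals": { "ec2:Region": "us-east-1" },
        "IpAddress": { "aws:SourceIp": ["128.84.0.0/16", "203.0.113.0/24"] }
      }
    }
  ]
}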

Avoid leaking credentials in the first place

It is well known that bad actors watch commits to public git repos for accidentally committed credentials and other secrets. However, help is available.

  • Tools like git-secrets can be used to avoid accidentally committing secrets like AWS credentials to git repos; see the sketch after this list.
  • It is often possible to use temporary credentials instead of static, fixed credentials. Tools like aws-vault make it possible to improve your AWS credential hygiene in a variety of scenarios. In fact, one team at Cornell has integrated aws-vault into their development processes.
  • You can avoid AWS credentials entirely if you are operating from an EC2 instance. In that case, you can use IAM instance profiles to give your EC2 instance privileges to perform AWS operations without having to configure AWS access key credentials.
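
For example, a minimal git-secrets setup in an existing repository looks like this, using the commands from the tool’s documentation:

# Install the git hooks into the current repository.
git secrets --install

# Register the AWS credential patterns that git-secrets ships with.
git secrets --register-aws

# Scan the existing history for anything already committed.
git secrets --scan-history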

Better monitoring of activity in AWS accounts

The intrusion could have been recognized earlier.

  • We use CloudCheckr at Cornell for campus AWS accounts, giving our AWS users access to reports on billing, inventory, utilization, security, and best practices. CloudCheckr has a lot of information and could have alerted us about unexpected spending (among other things), but it isn’t designed as a real-time tool, so it wouldn’t have provided timely alerts in this situation.
  • AWS account users can set up billing alarms in CloudWatch to get alerted based on the billing parameters they choose. Unlike CloudCheckr alerts, these are based on real-time data, so they could have provided a warning that something untoward was happening; see the sketch after this list.
  • The free tier of AWS Trusted Advisor provides Service Limit checks that could have told us that the EC2 instance limits had been reached in AWS regions around the globe.  CloudWatch Alarms can be combined with Trusted Advisor metrics to get notifications about that situation.
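
As an illustration, here is a minimal sketch of creating a billing alarm with the Ruby SDK. The threshold and SNS topic ARN are placeholder values, and billing metrics live in us-east-1:

require 'aws-sdk'

# Alarm when estimated charges exceed a chosen threshold.
cloudwatch = Aws::CloudWatch::Client.new(region: 'us-east-1')
cloudwatch.put_metric_alarm(
  alarm_name: 'billing-over-500-usd',
  namespace: 'AWS/Billing',
  metric_name: 'EstimatedCharges',
  dimensions: [{ name: 'Currency', value: 'USD' }],
  statistic: 'Maximum',
  period: 21_600,                # six hours; billing data updates a few times a day
  evaluation_periods: 1,
  threshold: 500.0,
  comparison_operator: 'GreaterThanThreshold',
  alarm_actions: ['arn:aws:sns:us-east-1:123456789012:billing-alerts'] # placeholder
)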

Conclusion

It is easy to come up with a list of things that could have been done differently in a situation like this. The items above are just a starting point, and there are a gabillion compilations of best practices for AWS security. It can be overwhelming.

Even though this intrusion happened elsewhere on campus, I am going to take Cornell’s brush with an intruder as an opportunity to critically review the practices that I and my team use. I won’t let myself get overwhelmed with all that could be done. Instead we’ll focus on incremental improvement to the practices we have in place and assess where more effort might have a disproportionately great payoff. If you or your team at Cornell would like help doing the same in AWS, please reach out to the Cloud Team.

Using Cornell Shibboleth for Authentication in Your Custom Application

by Shawn Bower

Introduction

When working on your custom application at Cornell, you have multiple options for integrating authentication with the Cornell central identity store.  For a long time the default choice was Cornell Web Authentication (CUWA), a module that can be plugged into either Apache or IIS.  This is a perfectly fine solution, but it often leads to running an Apache server as a reverse proxy just to supply CUWA.  Another method is integrating your application directly with Cornell’s Shibboleth IDP.  Using Shibboleth can help reduce the complexity of your infrastructure, and as an added bonus you can grant access to users from other institutions that are members of the InCommon Federation.

How Shibboleth Login Works

Key Terms

  • Service Provider (SP) – requests and obtains an identity assertion from the identity provider. On the basis of this assertion, the service provider can make an access control decision – in other words it can decide whether to perform some service for the connected principal.
  • Identity Provider (IDP) – also known as Identity Assertion Provider, is responsible for (a) providing identifiers for users looking to interact with a system, and (b) asserting to such a system that such an identifier presented by a user is known to the provider, and (c) possibly providing other information about the user that is known to the provider.
  • Single Sign On (SSO) – is a property of access control of multiple related, but independent software systems.
  • Security Assertion Markup Language (SAML) – is an XML-based, open-standard data format for exchanging authentication and authorization data between parties, in particular, between an identity provider and a service provider.

[Diagram: Shibboleth service provider login flow]

  1. A user requests a resource from your application, which is acting as an SP
  2. Your application constructs an authentication request to the Cornell IDP and redirects the user there for login
  3. The user logs in using Cornell SSO
  4. A SAML assertion is sent back to your application
  5. Your application handles the authorization and redirects the user appropriately.

Shibboleth Authentication for Your Application

Your application will act as a service provider.  As an example, I have put together a Sinatra application in Ruby that can act as a service provider and will use the Cornell Test Shibboleth IDP.  The example source can be found here.  For this example I am using the ruby-saml library provided by OneLogin; there are other libraries that are compatible with Shibboleth, such as omniauth.  Keep in mind that Shibboleth is SAML 2.0 compliant, so any library that speaks SAML should work.  One reason I chose the ruby-saml library is that the folks at OneLogin provide similar libraries in various other languages; you can get more info here.  Let’s take a look at the code:

First, it is important to configure the SAML environment, as this data will be needed to send and receive information to the Cornell IDP.  We can auto-configure the IDP settings by consuming the IDP metadata.  We then have to provide an endpoint for the Cornell IDP to send the SAML assertion to; in this case I am using “https://shib.srb55.cs.cucloud.net/saml/consume”.  We also need to provide the IDP with an endpoint that allows it to consume our metadata.  Finally, by adding PasswordProtectedTransport to your request, the IDP knows it has to authenticate the user through login/password, protected by SSL/TLS.
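
A minimal sketch of that configuration with OneLogin::RubySaml is below; the test IDP metadata URL is a placeholder, not the real Cornell endpoint:

require 'onelogin/ruby-saml'

# Build the SAML settings: the IDP half comes from consuming the Cornell
# Test IDP metadata; the SP half is our own endpoints.
def saml_settings
  parser = OneLogin::RubySaml::IdpMetadataParser.new
  # Placeholder metadata URL; substitute the real Cornell Test IDP metadata URL.
  settings = parser.parse_remote('https://idp-test.example.cornell.edu/idp/shibboleth')
  settings.assertion_consumer_service_url = 'https://shib.srb55.cs.cucloud.net/saml/consume'
  settings.issuer = 'https://shib.srb55.cs.cucloud.net/saml/metadata'
  # PasswordProtectedTransport tells the IDP to authenticate the user with
  # login/password, protected by SSL/TLS.
  settings.authn_context = 'urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport'
  settings
end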

We will need to create an endpoint that serves the metadata for our service provider.  With OneLogin::RubySaml this is super simple, as the library generates it from the SAML settings we configured earlier.  We simply create “/saml/metadata”, which listens for GET requests and returns the auto-generated metadata.
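
A sketch of that endpoint, reusing the saml_settings helper above:

require 'sinatra'

# Serve the SP metadata generated from our SAML settings.
get '/saml/metadata' do
  content_type 'application/samlmetadata+xml'
  OneLogin::RubySaml::Metadata.new.generate(saml_settings)
end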

Next, let’s create an endpoint in our application that will redirect the user to the Cornell IDP.  We create “/saml/authentication_request”, which listens for GET requests and uses OneLogin::RubySaml to create an authentication request.  This is done by reading the SAML settings, which include information on the IDP’s endpoints.
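
Sketched out, that endpoint can be as short as this:

# Build a SAML AuthnRequest from the settings (which include the IDP's
# SSO endpoint) and redirect the user's browser to the Cornell IDP.
get '/saml/authentication_request' do
  redirect OneLogin::RubySaml::Authrequest.new.create(saml_settings)
end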

Then we need a callback endpoint that the IDP will send the SAML assertion to after the user has authenticated.  We create “/saml/consume”, which listens for a POST from the IDP.  If we receive a valid response from the IDP, we create a page that displays a success message along with the first name of the authenticated user.  You might be wondering where “urn:oid:2.5.4.42” comes from.  The information returned in the SAML assertion contains attributes agreed upon by InCommon as well as some attributes provided by Cornell.
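
A sketch of the consume endpoint, again assuming the saml_settings helper from the first sketch:

post '/saml/consume' do
  # Validate the posted assertion against our SAML settings.
  response = OneLogin::RubySaml::Response.new(params[:SAMLResponse], settings: saml_settings)
  if response.is_valid?
    # urn:oid:2.5.4.42 is the givenName (first name) attribute.
    "Authentication succeeded. Hello, #{response.attributes['urn:oid:2.5.4.42']}!"
  else
    halt 401, 'Invalid SAML response'
  end
end

The full list of attributes is: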

Attribute in enterprise directory            Attribute name in SAML assertion
eduPersonPrimaryAffiliation                  urn:oid:1.3.6.1.4.1.5923.1.1.1.5
commonName                                   urn:oid:2.5.4.3
eduPersonPrincipalName (netid@cornell.edu)   urn:oid:1.3.6.1.4.1.5923.1.1.1.6
givenName (first name)                       urn:oid:2.5.4.42
surname (last name)                          urn:oid:2.5.4.4
displayName                                  urn:oid:2.16.840.1.113730.3.1.241
uid (netid)                                  urn:oid:0.9.2342.19200300.100.1.1
eduPersonOrgDN                               urn:oid:1.3.6.1.4.1.5923.1.1.1.3
mail                                         urn:oid:0.9.2342.19200300.100.1.3
eduPersonAffiliation                         urn:oid:1.3.6.1.4.1.5923.1.1.1.1
eduPersonScopedAffiliation                   urn:oid:1.3.6.1.4.1.5923.1.1.1.9
eduPersonEntitlement                         urn:oid:1.3.6.1.4.1.5923.1.1.1.7

Conclusion

We covered the basic concepts of using Shibboleth and SAML by creating a simple demo application.  Shibboleth is a great choice when architecting your custom application in the cloud, as it drastically simplifies the authentication infrastructure while still using Cornell SSO.  An important note is that in this example we used the Cornell Test Shibboleth IDP, which allows us to create an anonymous service provider.  When moving your application into production you will need to register your service provider with the Identity Management team (idmgmt@cornell.edu).  For more information about Shibboleth at Cornell, please take a look at IDM’s Confluence page.

Using Shibboleth for AWS API and CLI access

by Shawn Bower

This post is heavily based on “How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS” by Quint Van Derman. I have used his blueprint to create a solution that works using Shibboleth at Cornell.

TL;DR

You can use Cornell Shibboleth login for both API and CLI access to AWS.  I built Docker images, which will be maintained by the Cloud Services team, that can be used for this; it is as simple as running the following command:

docker run -it --rm -v ~/.aws:/root/.aws dtr.cucloud.net/cs/samlapi

After this command has been run, it will prompt you for your NetID and password, which are used to log you in to Cornell Shibboleth. You will get a push from DUO.  Once you have confirmed the DUO notification, you will be prompted to select the role you wish to use for login; if you have only one role, it will be chosen automatically.  The credentials will be placed in the default credential file (~/.aws/credentials) and can be used as follows:

aws --profile saml s3 ls

NOTE: In order for the script to work you must have at least two roles; we can add you to an empty second role if need be.  Please contact cloud-support@cornell.edu if you need to be added to a role.

If there are any problems please open an issue here.

Digging Deeper

All Cornell AWS accounts set up by the Cloud Services team are configured to use Shibboleth for login to the AWS console. This same integration can be used for API and CLI access, allowing folks to leverage AD groups and AWS roles for users. Another advantage is that this eliminates the need to monitor and rotate IAM access keys, as the credentials provided through SAML expire after one hour. It is worth noting that non-human user IDs will still have to be created for automating tasks where it is not possible to use EC2 instance roles.

When logging into the AWS management console, the federation process looks like this:

[Diagram: SAML-based SSO to the AWS console]

  1. A user goes to the URL for the Cornell Shibboleth IDP
  2. The user is authenticated against Cornell AD
  3. The IDP returns a SAML assertion which includes the user’s roles
  4. The data is posted to AWS, which matches roles in the SAML assertion to IAM roles
  5. AWS Security Token Service (STS) issues temporary security credentials
  6. A redirect is sent to the browser
  7. The user is now in the AWS management console

In order to automate this process, we need to interact with the Shibboleth endpoint as a browser would.  I decided to use Ruby for my implementation; typically I would use a lightweight framework like Ruby Mechanize to interact with webpages.  Unfortunately, the DUO integration is done in an iframe using JavaScript, which makes things gross: it means we need a full browser. I decided to use Selenium WebDriver to do the heavy lifting. I was able to script the login to Shibboleth as well as hitting the button for a DUO push notification:
[Screenshot: scripting the DUO push prompt]

In development I was able to run this on a Mac just fine, but I also realize it can be onerous to install the dependencies needed to run Selenium WebDriver.  To make distribution simple, I decided to create a Docker image that has everything installed and can just be run.  This meant I needed a way to run Selenium WebDriver and Firefox inside a container.  To do this I used Xvfb to create a virtual framebuffer, allowing Firefox to run without a graphics card.  As this may be useful to other projects, I made this a separate image that you can find here.  Now I could create a Dockerfile with the dependencies necessary to run the login script:

[Screenshot: samlapi Dockerfile]

The helper script starts Xvfb, sets the correct environment variables, and then launches the main Ruby script.  With these pieces I was able to get the SAML assertion from Shibboleth, and the rest of the script mirrors what Quint Van Derman had done.  It parses the assertion looking for all the role attributes, then presents the list of roles to the user, who can select which role they wish to assume.  Once the selection is made, a call is made to the Security Token Service (STS) to get the temporary credentials, and the credentials are stored in the default AWS credentials file.
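
The STS exchange at the heart of the script looks roughly like the sketch below; the ARNs are placeholders, and the arguments are assumed to have been extracted from the parsed assertion:

require 'aws-sdk'

# Exchange the Base64-encoded SAML assertion for temporary credentials
# (valid for one hour) and append them to the default credentials file
# under a "saml" profile. A real script would replace an existing profile
# rather than blindly appending.
def save_saml_credentials(role_arn, principal_arn, saml_assertion)
  sts = Aws::STS::Client.new(region: 'us-east-1')
  resp = sts.assume_role_with_saml(
    role_arn: role_arn,           # e.g. 'arn:aws:iam::123456789012:role/shib-admin'
    principal_arn: principal_arn, # e.g. 'arn:aws:iam::123456789012:saml-provider/cornell'
    saml_assertion: saml_assertion
  )

  File.open(File.expand_path('~/.aws/credentials'), 'a') do |f|
    f.puts '[saml]'
    f.puts "aws_access_key_id = #{resp.credentials.access_key_id}"
    f.puts "aws_secret_access_key = #{resp.credentials.secret_access_key}"
    f.puts "aws_session_token = #{resp.credentials.session_token}"
  end
end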

Conclusion

Now you can manage your CLI and API access the same way you manage your console access. The code is open source, so please feel free to contribute: https://github.com/CU-CloudCollab/samlapi. Note that I have not tested this on Windows, but it should work if you change the volume mount to the default credential file location on Windows. I can see future enhancements such as the ability to filter the role list before displaying it, so stay tuned for updates. As always, if you have any questions about this or any other cloud topics, please email cloud-support@cornell.edu.

The Cornell “Standard” AWS VPC

by Paul Allen

This post describes the standard AWS Virtual Private Cloud (VPC) provisioned for Cornell AWS customers by the Cornell Cloudification Service Team. This “standard” VPC is integrated with Cornell network infrastructure and provides several benefits over the default VPC provisioned to all AWS customers when a new AWS account is created.

So we don’t get confused, let’s call the VPC provisioned by the Cornell Cloudification Service Team the “Cornell VPC” and the VPC automatically provisioned by AWS the “default VPC”. AWS itself calls this latter VPC by the same name (i.e., default VPC). See the AWS documentation about default VPCs.

Automatic OS Patching

by Shawn Bower

One of the challenges as you move your infrastructure to the cloud is keeping your EC2 machines patched.  One highly effective strategy is to embrace ephemerality and keep rotating machines in and out of service.  This strategy is especially easy to employ using Amazon Linux; check out this excerpt from the FAQ (https://aws.amazon.com/amazon-linux-ami/faqs/):

On first boot, the Amazon Linux AMI installs from the package repositories any user space security updates that are rated critical or important, and it does so before services, such as SSH, start.

Another strategy, the one we will focus on in this post, is to patch your EC2 instances in place.  This can be useful when your infrastructure has long-running 24×7 machines.  It can be accomplished using Amazon’s SSM service and the AWS CLI or SDK.  First we have to install the SSM agent on all the machines we wish to patch.  For example, to install the agent (as root) on an Amazon Linux machine in us-east-1 you would:

  1. curl https://amazon-ssm-us-east-1.s3.amazonaws.com/latest/linux_amd64/amazon-ssm-agent.rpm -o /tmp/amazon-ssm-agent.rpm
  2. rpm -i /tmp/amazon-ssm-agent.rpm
  3. rm /tmp/amazon-ssm-agent.rpm

This will download the agent, install it, and turn on the service.  The agent contacts AWS over the public internet, so make sure your EC2 instance can send outbound traffic.  Once your machine registers with SSM you should see it listed in the Run Command GUI (EC2 > Commands > Run a command).

[Screenshot: SSM Run Command console]

You can also verify that your instance has registered correctly using the AWS CLI, for example:

aws ssm describe-instance-information --instance-information-filter-list key=InstanceIds,valueSet=<instance-id>

If everything goes well you will get JSON like the example below:

[Screenshot: terminal output from describe-instance-information]
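
The response looks roughly like this (illustrative values, trimmed to the most useful fields):

{
  "InstanceInformationList": [
    {
      "InstanceId": "i-0123456789abcdef0",
      "PingStatus": "Online",
      "AgentVersion": "1.2.0.0",
      "PlatformType": "Linux",
      "PlatformName": "Amazon Linux AMI",
      "PlatformVersion": "2016.03"
    }
  ]
}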

If the instance cannot be found you will receive an error; if the agent registered but is no longer responding, you will see a PingStatus of Inactive.  Once we have the SSM agent installed, we are able to send commands to our EC2 machines, including commands to patch them.  For our implementation I decided to use tags in EC2 to denote which machines should be auto-patched.  I also added a tag called “os” so we could support patching multiple OSes.  Our EC2 console looks like this:

[Screenshot: EC2 console showing the auto_patch and os tags]

With the tags in place and the SSM agent installed, I used the Ruby SDK to create a script that looks for EC2 instances with the auto_patch tag and then determines the patching strategy based on the OS.  This script uses the cucloud module that is being collaboratively developed by folks all over campus.  The main script is:

[Screenshot: auto_patch.rb]
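
The screenshot above shows the cucloud-based script; as a rough sketch of the same idea using the plain Ruby SDK, finding the tagged instances might look like this (the tag value “true” is an assumption):

require 'aws-sdk'

# Find running instances tagged for automatic patching and group them by OS.
ec2 = Aws::EC2::Client.new(region: 'us-east-1')
resp = ec2.describe_instances(
  filters: [
    { name: 'tag:auto_patch', values: ['true'] },
    { name: 'instance-state-name', values: ['running'] }
  ]
)

instances_by_os = Hash.new { |hash, key| hash[key] = [] }
resp.reservations.each do |reservation|
  reservation.instances.each do |instance|
    os_tag = instance.tags.find { |tag| tag.key == 'os' }
    instances_by_os[os_tag ? os_tag.value : 'unknown'] << instance.instance_id
  end
end

# Each OS gets its own patch command (see the send-patch sketch below).
instances_by_os.each { |os, ids| puts "#{os}: #{ids.join(', ')}" }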

Finally, the function that sends the command via SSM is:

[Screenshot: send-patch function]
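
A minimal sketch of such a function, assuming Amazon Linux instances that can be patched with yum:

# Send a patch command to the given instances via SSM Run Command.
# The yum invocation assumes Amazon Linux; other OSes would need their
# own command strings.
def send_patch_command(instance_ids)
  ssm = Aws::SSM::Client.new(region: 'us-east-1')
  ssm.send_command(
    instance_ids: instance_ids,
    document_name: 'AWS-RunShellScript',
    parameters: { 'commands' => ['yum update -y'] }
  )
end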

This is just one example of how to keep your machines up to date and to improve your security posture.  If you have any questions about patching in AWS or using the cloud here at Cornell please send a note to the Cloudification team.