AWSCLI S3 Backup via AWS Direct Connect

By Ken Lassey, Cornell EMCS / IPP

Problem

The AWSCLI tools default to using the Internet for their connections when reaching out to services like S3, because S3 only provides public endpoints for network access to the service.  This is an issue if your device is in Cornell 10-space, as it cannot get on the Internet.  It can also be a security concern depending on the data, although the AWSCLI S3 commands do use HTTPS for their connections.

Another concern is the potential data egress charges for transferring large amounts of data from AWS back to campus. If you need to restore an on-premise system directly from S3, this transfer would accrue egress charges.

Solution

Assuming you have worked with Cloudification to enable Cornell’s AWS Direct Connect service in your VPC, you can force the connection through Direct Connect with a free-tier-eligible t2.micro Linux instance. By running the AWSCLI commands from this EC2 instance, you can pull data from your on-premise system over Direct Connect and drop it into S3 all in one go.

Note that if your on-premise systems have only public routable IP Addresses, and no 10-Space addresses, your on-premise systems will not be able to route over Direct Connect (unless Cloudification has worked with you to enable special routing). In most campus accounts, VPC’s are only configured to route 10-Space traffic over Direct Connect. If the systems you want to back up have a 10-Space address, you are good to go!

Example Cases

  1. You have several servers that back up to a local storage server. You want to copy the backup data into AWS S3 buckets for disaster recovery.
  2. You want to back up several servers directly into AWS S3 buckets.

In either case, either you do not want your data going in or out over the Internet, or it cannot because the servers are in Cornell 10-space.

AWSCLI Over Internet

aws s3 sync <backupfolder> <s3bucket>

By running the AWS CLI command from the backup server or an individual server, the data goes out over the Internet.

Example Solution

By utilizing a free-tier-eligible t2.micro Linux server, you can force the traffic over Cornell's Direct Connect to AWS.

AWS CLI Over Direct Connect

By running the AWS CLI commands from the t2.micro instance, you force the data to traverse AWS Direct Connect.

On your local Windows server (backup or individual), you need to ‘share’ the folder you want to copy to S3.  You should use a service account rather than your own NetID. Alternatively, you can enable NFS on the folder if you are copying files from a Linux server, or even just tunnel through SSH. The following examples copy from a Windows share.

In AWS:

  1. Create an s3 bucket to hold your data

On your systems:

  1. Share the folders to be backed up
  2. Create a service account for the backup process
  3. Give the service account at least read access to the shared backup folder
  4. If needed allow access through the Managed Firewall from your AWS VPC to the server(s) or subnet where the servers reside

On the t2.micro instance you need to:

  1. Install the CIFS Utilities:
    sudo yum install cifs-utils
  2. Create a mount point for the backup folder:
    mkdir /mnt/<any mountname you want>
  3. Mount the shared backup folder:
    sudo mount.cifs //<servername or IP>/<shared folder> /mnt/<mountname> -o \
    username="<service account>",password="<service account password>",domain="<your domain>"
  4. Run the s3 sync from the t2.micro instance:
    aws s3 sync /mnt/<mountname> s3://<s3-bucket-name>

 

Instead of running this manually, I created a script on the t2.micro instance to perform the backup.  The script can be added as a cron job and run on a schedule (an example cron entry follows Sample Script 1 below).

Sample Script 1

  1. I used nano to create backup.sh
       #!/bin/sh
       backupfolder=/mnt/<sharedfoldername>
       s3path=s3://<s3bucketname>
       aws s3 sync $backupfolder $s3path
  2. The script needs to be executable, so run this command on the t2.micro instance:
    sudo chmod +x <scriptname>
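
If you schedule the script with cron as mentioned above, a minimal sketch of the crontab entry (the script path and log location are placeholders, not values from this post) would be:

    # Run the S3 backup script every night at 2:00 AM
    0 2 * * * /home/ec2-user/backup.sh >> /home/ec2-user/backup.log 2>&1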

Sample Script 2

This script mounts the share, backs up multiple folders on my backup server, and then unmounts the shared folder.

#!/bin/bash
sudo mount.cifs //128.253.109.11/ALC /mnt/<mountname> -o user="<serviceaccount>",password="<serviceaccount pwd>",domain="<your domain>"
srvr=(FOLDER1 FOLDER2 FOLDER3 FOLDER4 FOLDER5 FOLDER6)
for s in "${srvr[@]}"; do
    bufolder=/mnt/<mountname>/$s
    s3path=s3://<s3bucketname>/$s
    aws s3 sync $bufolder $s3path
done
sudo umount /mnt/<mountname> -f

Disaster Recovery in AWS

By Scott Ross

In this post we are going to cover considerations a unit implementing AWS should be aware of when moving into the cloud.  Developing a disaster recovery (DR) plan for the cloud encompasses the same considerations as an on-premise DR plan.  But obviously there are differences between on-premise and cloud environments.  These differences usually manifest themselves in questions like: who is responsible, how does a unit implement a DR solution, and how does one's mindset change when thinking about DR in the cloud?

It is true that DR in the cloud is more robust than on campus (another blog post…), but with this power comes more responsibility.  More tools are at the disposal of your development/operations team, and those tools are more flexible and dynamic. There are more architectural patterns, and some of those patterns differ dramatically from on-premise solutions (with which we have a long history).  And, of course, you are now dependent on more factors – networking, Amazon itself, and many, many others.

Let’s start by reintroducing what disaster recovery is, and what it is not.

Disaster Recovery (DR) is defined as the process, policies, and procedures that are related to preparing for recovery or continuation of technology infrastructure, which are vital to an organization after a natural or human induced crisis.

DR is not backups.
DR complements other high availability (HA) services, but while HA deals with disaster prevention, DR is for those times when prevention has failed.

Disaster Recovery planning is always a trade-off between Recovery Time Objective (RTO) and Recovery Point Objective (RPO) vs. cost/complexity.  At a very basic level, DR cannot be done without input from your business partners – they will need to be aware of Risks/Rewards, and thus will drive appropriate IT strategy for disaster recovery (in another post, we will discuss ways you can categorize your services for DR).

RTO, RPO and Data Replication are key conceptual terms when dealing with DR.  Here is a brief introduction:

Recovery Point Objective (RPO) – the maximum targeted period in which data might be lost from an IT service due to a major incident.

Recovery Time Objective (RTO) – the targeted duration of time and service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity.

Data Replication – sharing information so as to ensure consistency between redundant resources.

 

AWS Basics 

  • AWS Regions
    • 9 Regions worldwide (see map)
    • Completely independent with different teams and infrastructure
  • AWS Availability Zones (AZ)
    • Each region contains 1 or more AZs
    • Physically separated, but in the same geographical location
    • Shared teams and software infrastructure
  • US-EAST-1 (region)
    • Located in Virginia
    • 5 availability zones (as of October 2016)
  • AWS provides dynamic resource allocation
    • Pay for resources as you go
    • Create/Destroy resources quickly using dashboard or various CLI tools

Example AWS Disasters

  • Single-Resource Disaster

What is it?  A single resource stops functioning … (EBS, ELB, EC2..)

How to prepare?  Make sure that no single resource is a single point of failure.  Use AMIs and autoscaling to help you make stateless instances (or use Docker!).  Configure RAID for volumes.  Use AWS managed services (like RDS).

  • Single-AZ Disaster

What is it?  A whole AZ goes down

How to prepare?  Build your system so that it’s spread across multiple AZs and can survive downtime to any single AZ.  Connect subnets in different AZs to your ELB and turn on multi-AZ for RDS.
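
As a minimal sketch of the Multi-AZ step (the instance identifier is a placeholder), an existing RDS instance can be converted from the CLI:

    # Convert an existing RDS instance to Multi-AZ
    aws rds modify-db-instance --db-instance-identifier my-db --multi-az --apply-immediately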

  • Single-service Disaster

What is it?  A whole AWS service (such as RDS) goes down across the entire region.

How to prepare?  First, resist the urge to use AWS resources for everything (but remember this is a cost/complexity question as well).  Be ready to recreate your system in a different region (by using automation).  Invest in the upfront preparation to do this (the VPC setup, networking, and other pieces that need to be in place to bring up a new region).

  • Whole-Region disaster

What is it?  A whole AWS region goes down, taking all the applications running in it with it.

How to prepare? Implement a cross-region DR methodology.  Take snapshots of your instances and copy them to different regions.  Use automation tools (CloudFormation, Ansible, Chef, etc.) to define your application stack.  Copy AMIs to a different region.  Understand the AWS services you are using and their ability to be used in different regions (RDS, for example).
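
As a sketch of the snapshot and AMI copying mentioned above (the IDs, names, and regions are placeholders, not values from this post):

    # Copy an AMI from us-east-1 into a recovery region
    aws ec2 copy-image --source-region us-east-1 --source-image-id ami-0123456789abcdef0 \
        --region us-west-2 --name my-app-dr-copy

    # Copy an EBS snapshot into the recovery region as well
    aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef0 \
        --region us-west-2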

Disaster Recovery Patterns 

  • Backup and Restore
    • Advantage
      • Simple to get started
      • Very cost effective
    • Disadvantage
      • Likely not real time
      • Downtime will occur
    • Preparation Phase
      • Take backups of current systems (and schedule these backups)
      • Use S3 as the storage mechanism
      • Describe procedure to restore
    • Process…
      • Retrieve backups from S3
      • Bring up required infrastructure
        • EC2 instances with prepared AMIs, load balancing, etc.
      • Restore system from backup
      • Switch over to the new system
    • Objectives…
      • RTO: As long as it takes to bring up infrastructure and restore system from backups
      • RPO: From last backup
  • “Pilot Light” in different regions
    • Advantages:
      • Reduces RTO and RPO
      • Resources just need to be turned on
    • Disadvantage:
      • Cost
    • Preparation
      • Enable replication of data across regions
      • Automation of services in backup region
      • Switch over DNS to other region when downtime occurs
  • Fully working low-capacity standbys
    • Advantages:
      • Reduces RTO and RPO
      • Resources just need to be turned on
      • Can handle production traffic
    • Disadvantage:
      • Cost
      • All necessary components are running 24/7
    • Preparation
      • Enable replication of data across regions
      • Automation of services in backup region
      • Switch over DNS to other region when downtime occurs
  • Multi-Site hot backup
    • Same as the low-capacity standby, but fully scaled (a sketch of the DNS switchover used by these patterns follows this list)
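
Several of the patterns above end with switching DNS over to the standby region. As a rough sketch (the hosted zone ID, record name, and target endpoint are placeholders, not values from this post), the switchover could be driven from the Route 53 CLI:

    # Repoint the application record at the standby region's load balancer
    aws route53 change-resource-record-sets \
      --hosted-zone-id Z1EXAMPLE \
      --change-batch '{
        "Changes": [{
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "app.example.cornell.edu.",
            "Type": "CNAME",
            "TTL": 60,
            "ResourceRecords": [{"Value": "standby-elb.us-west-2.elb.amazonaws.com"}]
          }
        }]
      }'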

Tooling and Consistency

There is a lot of tooling in the industry to help implement HA and DR.  Here, we are going to cover some of the tools used on campus to build infrastructure as code and the automation that's critical to successful DR plans.  People on campus know these tools:

  • AWS Cloudformation
  • Configuration Management: Ansible, Chef, Puppet
  • Continuous Integration/deployment: Jenkins
  • Containerization: Docker
  • Preferred Languages / AWS SDK: Ruby, Python

For more, see https://confluence.cornell.edu/display/CLOUD/Cloud+Service+Matrix

AWS services DR checklists

Finally, these are some basic checklists that might help you as you discover the best strategies to implement DR.  

RDS DR Check List:

  • Set up production instances in a multi-AZ architecture
  • Enable RDS daily backups
  • Create a cross-region read replica in the recovery region of your choice (remember to consider your application in this! Data, without an app, doesn't work! A CLI sketch follows this checklist.)
  • Maintain DB settings in both regions (better yet – script those settings! Use infrastructure as code!)
  • Make sure you have the appropriate alerting / monitoring setup; especially to monitor sync errors between the regions
  • Schedule drills!  Create failovers!  Practice!
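
The cross-region read replica item above can be scripted with the AWS CLI. A minimal sketch (the instance identifiers, account number, and regions are placeholders):

    # Create a read replica of a us-east-1 database in the recovery region
    aws rds create-db-instance-read-replica \
      --db-instance-identifier my-db-replica \
      --source-db-instance-identifier arn:aws:rds:us-east-1:111111111111:db:my-db \
      --region us-west-2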

Basic Checklist:

  • Design for failure
  • Place services in multiple availability zones
  • Have a de-coupled architecture to reduce single points of failure
  • Use load balancers, autoscaling, and health monitoring rules for HA
  • Use CloudFormation templates
  • Maintain up-to-date AMIs
  • Back up critical data sets to S3
  • Run “warm” servers in other regions for recovery
  • Have a written plan
  • Test that plan often!

Some Takeaways – 

  1. Design DR (& HA) into your architecture when moving to the cloud. You can “lift and shift”, but you are likely not taking advantage of the cloud and are still susceptible to the same disaster risks as on premise.
  2. Take advantage of what AWS offers you and use those as your building blocks.
  3. However, understand the impact of relying on these services and building blocks. Don’t treat AWS as a ‘black box’.
  4. Exercise your DR solution!!!! (that’s one of the many benefits of the cloud; we can do this now)

Class Roster – Launching Scheduler in the Cloud

by Eric Grysko

Introduction

[Screenshot: Class Roster – classes.cornell.edu]

In Student Services IT, we develop applications that support the student experience. Class Roster, classes.cornell.edu, was launched in late 2014 after several months of development in coordination with the Office of the University Registrar. It was deployed on-premises and faced initial challenges handling load just after launch. Facing limited options to scale, we provisioned for peak load year-round, despite predictable cyclical load following the course-enrollment calendar.

By mid-2015, Cornell had signed a contract with Amazon Web Services, and Cornell’s Cloudification Team was inviting units to collaborate and pilot use of AWS. Working with the team was refreshing, and Student Services IT dove in. We consumed training materials, attended re:Invent, threw away our old way of doing things, and began thinking cloud and DevOps.

By late 2015, we were starting on the next version of Class Roster – “Scheduler”. Display of class enrollment status (open, waitlist, closed) with near real-time data meant we couldn’t rely on long cache lifetimes, and new scheduling features were expected to grow peak concurrent usage significantly. We made the decision that Class Roster would be our unit’s first high-profile migration to AWS.

Using Docker Datacenter to launch load test

by Shawn Bower

As we at Cornell move more of our workloads to the cloud, an important step in this process is to run load tests against our on-premise infrastructure and the proposed AWS infrastructure. There are many tools that can aid in load testing; one we use frequently is called Neustar. This product is based on Selenium and allows us to spin up a number of automated browser users. It occurred to me that a similar solution could be developed using Docker and Docker Datacenter.

To get started I took a look at the Docker containers provided by Selenium; I love companies that ship their product in containers!  I was able to get a quick test environment up and running locally.  I decided to use Selenium Grid, which provides a hub server that nodes can register with.  Each node registers and lets the hub know what kind of traffic it can accept.  In my case I used nodes running Firefox on Linux.  To test the setup I created a simple Ruby script using the Selenium Ruby bindings to send commands to a node.

[Screenshot: sample-test.rb]

This simple test will navigate to Google, search for my name, and then wait for the window title to include my name.  While testing locally I was able to get the hub and node up and running with the following commands:

docker run -d -p 4444:4444 --name selenium-hub selenium/hub:2.53.0
docker run --rm --name=fx --link selenium-hub:hub selenium/node-firefox:2.53.0

I was then able to run my script (exporting HUB_IP=localhost) and life is good.  This approach could be great for integration tests in your application but in my case I wanted to be able to throw a bunch of load at an application.  Since we have some large Docker Datacenter clusters it seemed to make sense to use that spare capacity to generate load.  In order to deploy the grid/hub to our cluster I created a docker-compose.yaml file.


[Screenshot: docker-compose.yaml]

One thing to note is that I’m using a customized version of the node container; I will come back to this later.  Now I am able to bring up the grid and node as such:

[Screenshot: docker-compose up output]

I can now use the docker-compose ps command to find out where my hub is running.

[Screenshot: docker-compose ps output]

Now I’m ready to launch my test.  Since all the things must run inside containers I created a simple Dockerfile to encapsulate my test script.

Then I can kick off the test and when it finishes I want to grab the logs from the firefox node.

[Screenshot: terminal session]

docker build -t st .
docker run -e "HUB_IP=10.92.77.33" st
docker logs ldtest_ff_1

We can see how Selenium processes the script on Firefox. Note that “get title” is executed multiple times.  This is because of the waiter that is looking for my name to show up in the page title.  Sweet!  Now that we have it up and running, we can scale out the Firefox nodes; this is super easy using Docker Datacenter!

[Screenshot: scaling the Firefox nodes in Docker Datacenter]
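
As a sketch of that scale-out step (the compose project appears to be named ldtest and the Firefox service ff, judging from the container name ldtest_ff_1 above), the Compose v1 command would look something like:

    # Scale the Firefox node service out to 20 containers (the count is arbitrary)
    docker-compose scale ff=20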

Now we can ramp up our load!  I took the script above, ran it in a loop with a small sleep at the end, and then spun up 20 threads to run that script.  In order to get everything working in Docker Datacenter I did have to modify the node startup script to register using the IP of the container on the overlay network.  It turns out this is a simple modification, made by adding an environment variable for the IP:

export IP=`hostname -I | perl -lne 'print $1 if /(10\.\d+\.\d+\.\d+)/'`

Then when the node is launched you need to add “-host $IP”

When we are finished, we can quickly bring everything down.

[Screenshot: bringing the environment down]

 

Conclusion

It is relatively simple to set up a load driver using Docker Datacenter.  The code used for this example can be found here: https://github.com/sbower/docker-selenium-load-test.  This is super bare bones.  Some neat ideas for extensions would be: a mechanism to ramp the load, a mechanism to create a load profile comprised of multiple scripts, and a mechanism to store response time data.  Some useful links for using Selenium with Ruby and Docker:

  • https://github.com/SeleniumHQ/selenium/wiki/Ruby-Bindings
  • https://github.com/SeleniumHQ/docker-selenium
  • https://gist.github.com/kenrett/7553278

Configure Jenkins to use Cornell Shibboleth Authentication

by Brett Haranin

Introduction

At RAIS, Jenkins has become integral to our development workflow.  Developers use it throughout the day to view the output of CI build/test jobs and to deploy application changes.  The basic Jenkins user management tools worked great when only one or two developers were using it; however, as we grew our team and our usage of Jenkins, we wondered if there might be a way to integrate with Cornell central authentication.  We’re happy to report that there is a SAML plugin for Jenkins that works well with the Cornell Shibboleth service.

This post will focus on the “how” of setting up Shibboleth with Jenkins.  For a deeper look at Shibboleth and how it works, see Shawn’s post here: https://blogs.cornell.edu/cloudification/2016/07/11/using-cornell-shibboleth-for-authentication-in-your-custom-application

Getting Started – Basic Test IdP Setup

Add and configure plugin

The first step is to install the Jenkins SAML plugin under Manage Jenkins -> Manage Plugins. The plugin information page is available here: https://wiki.jenkins-ci.org/display/JENKINS/SAML+Plugin. Note, v0.6 or higher is required for compatibility with Cornell Shibboleth.

[Screenshot: installing the SAML plugin]

Next, let’s configure the plugin. Go to Manage Jenkins -> Configure Global Security. Scroll to “Access Control” as shown here:

[Screenshot: Access Control section of Configure Global Security]

For this step, the IdP metadata field should be filled with the Cornell IdP test metadata available here: https://shibidp-test.cit.cornell.edu/idp/shibboleth.  Just copy it and paste it in the text box.

Fill in the Cornell Shibboleth attributes that best map to the Display Name, Group and Username attributes — the attributes we used are:

displayName: urn:oid:2.16.840.1.113730.3.1.241
edupersonprimaryaffiliation (for group): urn:oid:1.3.6.1.4.1.5923.1.1.1.5
uid (netid): urn:oid:0.9.2342.19200300.100.1.1

Finally, if you are using matrix/project based authorization (we recommend it!) ensure that usernames are lowercase netids.  Also, make sure that you have at least one user (presumably yours, if you are doing administration) with administrative rights to Jenkins.  After saving these settings, you will no longer be able to login with Jenkins usernames/passwords.

Save and Test

To save these settings, click “Apply”.  Then, navigate to your Jenkins instance in a fresh browser (perhaps Incognito/Private Browsing).  You should be redirected to a CUWebAuth screen for the test shibboleth instance.

[Screenshot: CUWebAuth TEST INSTANCE login screen]

After logging in, you should be redirected back to Jenkins and your session should be mapped to the user in Jenkins that matches your netid.  There are some notable limitations to the Shibboleth test system: newer employees (hired in the last several months) will not yet be synced to the TEST directory (i.e., they won’t be able to login), and the login screen will show “TEST INSTANCE” as pictured above.  In the next section, we’ll move to the PROD system for full functionality.

Next Steps – Move to Production IdP

Now that you’ve implemented and tested against the TEST instance of Shibboleth, the next step is to register your application with Cornell Identity Management so that you can use the PROD login systems.

Generate Metadata

First, you’ll need to output the metadata for your Jenkins SAML service provider.  To do that, go back to Manage Jenkins -> Configure Global Security.  Then click “Service Provider Metadata” (highlighted below).

[Screenshot: Service Provider Metadata link]

This will output a block of XML that looks like this, which is the metadata for your Jenkins SAML service provider:

[Screenshot: generated service provider metadata XML]

Register metadata with Cornell Identity Management

Next, we need to register this metadata with Identity Management (IDM).

Go to: https://shibrequest.cit.cornell.edu/

Fill out the initial page of contact information and choose the scope of netid roles (Staff, Faculty, etc) that should be allowed to use the tool.  Note – ultimately Jenkins will authorize users based on their netid — the settings here are simply another gate to prevent folks that aren’t in specified roles or groups from logging in at all (i.e., their assertion will never even be sent to Jenkins).  You can safely leave this blank if you don’t have specific restrictions you want to apply at the CUWebAuth level.

On the second page, specify that the metadata has not been published to InCommon, then paste your metadata into the provided box:

[Screenshot: pasting metadata into the registration form]

Click Next and submit your request.  Submitting the request will open a ticket with Identity Management and you should receive confirmation of this via email.  Once your application is fully registered with Cornell Shibboleth, you will receive confirmation from IDM staff (normally a quick process ~1 day).

Tell Jenkins to use PROD IdP

Finally, once you receive confirmation from IDM that your application is registered, you’ll need to update Jenkins to use the production IDP metadata.

Go here to get the PROD Shibboleth metadata: https://shibidp.cit.cornell.edu/idp/shibboleth

Then, in Jenkins, return to Manage Jenkins -> Configure Global Security.  Paste the metadata in the IdP metadata block (everything else can stay the same):

[Screenshot: IdP metadata field]

Save, then test that you are able to login.  This time you should see the regular CUWebAuth screen, rather than the “TEST INSTANCE” version.

Advanced Setup Options/Notes

  • It is possible to configure your endpoint metadata to request additional attributes from the Cornell directory.  If you would like to map a different value for group and then use Jenkins group permissions, you can work with IDM to get the desired attribute included in SAML assertions sent to your Jenkins endpoint, then specify those attributes on the Configure Global Security screen.
  • The underlying SAML library can be very sensitive (with good reason) about any mismatch between the stated URL in a user’s browser, and the URL that the webserver believes it is running at.  For example, if you terminate SSL at an ELB, a user may be visiting https://yourjenkins/jenkins/, but the webserver will be listening on port 80 and using the http scheme (i.e., http://yourjenkins/jenkins/).  This manifests with an error like “SAML message intended destination endpoint did not match recipient endpoint”.  Generally, the fix for this is to tell the Tomcat connector that it is being proxied (proxyPort and scheme attributes).  More here: http://beckje01.com/blog/2013/02/03/saml-matching-endpoints-with-tomcat/

That’s It

Now your team should be able to login to Jenkins with their Cornell NetID and Password.  Additionally, if using DUO, access will be two-factor, which is a great improvement.

For more information about Cornell Shibboleth, see the Confluence page here: https://confluence.cornell.edu/display/SHIBBOLETH/Shibboleth+at+Cornell+Page

Docker + Puppet = Win!

by Shawn Bower

On many of our Cloudification projects we use a combination of Docker and Puppet to achieve infrastructure as code. We use a Dockerfile to create the infrastructure: all the packages required to run the application, along with the application code itself. We run Puppet inside the container to put down environment-specific configuration. We also use a tool called Rocker that adds some handy directives for use in our Dockerfile.  Most importantly, Rocker adds a directive called MOUNT which is used to share volumes between builds.  This allows us to mount local disk space, which is ideal for secrets that we do not want copied into the Docker image.  Rocker has to be cool, so they use a default filename of Rockerfile.   Let’s take a look at a Rockerfile for one of our PHP applications, dropbox:

[Screenshot: dropbox Rockerfile]

 

This image starts from our standard PHP image, which is kept up to date and patched weekly by the Cloud Services team.  From there we enable a couple of Apache modules that are needed by this application.  Then the application is copied into the directory ‘/var/www/’.

Now we mount a local directory that contains our SSH key and encryption keys.   After that we go into the Puppet setup.  For our projects we use a masterless Puppet setup which relies on the librarian-puppet module.  The advantage is that we do not need to run a Puppet server and we can configure the node at the time we build the image.  For librarian-puppet we need to use a Puppetfile; for this project it looks like this:

[Screenshot: dropbox Puppetfile]

The Puppetfile lists all the modules that you wish Puppet to have access to and the git path to those modules.  In our case we have a single module for the dropbox application.  Since the dropbox module is stored in a private GitHub repository, we will use the SSH key we mounted earlier to access it.  In order to do this we need to add GitHub to our known hosts file.  Running the command ‘librarian-puppet install’ will read the Puppetfile and install the modules into /modules.  We can then use puppet apply to apply the module to our image.  We can control which environment-specific config to install using the “--environment” flag; you can see in our Rockerfile that the variable is templated out with “{{ .environment }}”.  This allows us to specify the environment at build time.  After Puppet runs, we clean up some permissions issues and then copy in our image startup script.  Finally we specify the ports that should be exposed when this image is run.  The build is run with a command like “rocker -var environment=development”.
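
As a rough sketch of the librarian-puppet and puppet apply steps described above (the paths, hiera config location, and entry-point class are assumptions, not the exact contents of our Rockerfile):

    # Resolve the modules listed in the Puppetfile into /modules
    librarian-puppet install --path /modules

    # Apply the dropbox module for a given environment, using the custom hiera config
    puppet apply --modulepath=/modules --hiera_config=/etc/puppet/hiera.yaml \
        --environment=development -e "include dropbox"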

It is outside the scope of this article to detail how Puppet is used; you can find details on Puppet here. The Puppet module is laid out like this:

[Screenshot: dropbox Puppet module layout]

The files directory is used to store static files, hiera-data is used to store our environment-specific config, manifests stores the Puppet manifests, spec is for spec tests, templates is for storing dynamically generated files, and tests is for tests that are run to check for compilation errors.  Under hiera-data we will find an eyaml (encrypted YAML) file for each environment.  For instance, let us look at the one for dev:

[Screenshot: dev.eyaml]

You can see that the file format is that of a typical YAML file, with the exception of the fields we wish to keep secret.  These are encrypted by the hiera-eyaml plugin.  Earlier in the Rockerfile we mounted a “keys” folder which contains the private key to decrypt these secrets when Puppet runs.  In order for hiera-eyaml to work correctly we have to adjust the Hiera config; we store the following in our Puppet project:

[Screenshot: hiera.yaml]

The backends are the order in which to prefer files; in our case we want to give precedence to eyaml.  Under the eyaml config we have to specify where the data files live as well as where to find the encryption keys.  When we run puppet apply we have to specify the path to this config file with the “--hiera_config” flag.

With this process we can use the same basic infrastructure to build out multiple environments for the dropbox application.  Using the hiera-eyaml plugin, we can store the secrets safely in our Puppet repository on GitHub, as they are encrypted.  Using Rocker, we can keep our keys out of the image, which limits the exposure of secrets if the image were to be compromised.  Now we can either build this image on the host it will run on or push it to our private repository for later distribution.  Given that the image contains secrets like the database password, you should give careful consideration to where the image is stored.

Using Cornell Shibboleth for Authentication in your Custom Application.

by Shawn Bower

Introduction

When working on your custom application at Cornell the primary option for integrating authentication with the Cornell central identity store is using Cornell’s Shibboleth Identity Provider (IDP).  Using Shibboleth can help reduce the complexity of your infrastructure and as an added bonus you can enable access to your site to users from other institutions that are members of the InCommon Federation.

How Shibboleth Login Works

Key Terms

  • Service Provider (SP) – requests and obtains an identity assertion from the identity provider. On the basis of this assertion, the service provider can make an access control decision – in other words it can decide whether to perform some service for the connected principal.
  • Identity Provider (IDP) – also known as Identity Assertion Provider, is responsible for (a) providing identifiers for users looking to interact with a system, and (b) asserting to such a system that such an identifier presented by a user is known to the provider, and (c) possibly providing other information about the user that is known to the provider.
  • Single Sign On (SSO) – is a property of access control of multiple related, but independent software systems.
  • Security Assertion Markup Language (SAML) – is an XML-based, open-standard data format for exchanging authentication and authorization data between parties, in particular, between an identity provider and a service provider.

[Diagram: SAML authentication flow between service provider and identity provider]

  1. A user requests a resource from your application, which is acting as an SP
  2. Your application constructs an authentication request to the Cornell IDP and redirects the user for login
  3. The user logs in using Cornell SSO
  4. A SAML assertion is sent back to your application
  5. Your application handles the authorization and redirects the user appropriately.

Shibboleth Authentication for Your Application

Your application will act as a service provider.  As an example I have put together a Sinatra application in Ruby that can act as a service provider and will use the Cornell Test Shibboleth IDP.  The example source can be found here.  For this example I am using the ruby-saml library provided by OneLogin; there are other libraries that are compatible with Shibboleth, such as omniauth.  Keep in mind that Shibboleth is SAML 2.0 compliant, so any library that speaks SAML should work.  One reason I chose the ruby-saml library is that the folks at OneLogin provide similar libraries in various other languages; you can get more info here.  Let’s take a look at the code:

First it is important to configure the SAML environment, as this data will be needed to send and receive information to the Cornell IDP.  We can auto-configure the IDP settings by consuming the IDP metadata.  We then have to provide an endpoint for the Cornell IDP to send the SAML assertion to; in this case I am using “https://shib.srb55.cs.cucloud.net/saml/consume”.  We also need to provide the IDP with an endpoint that allows it to consume our metadata.
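
As a quick sketch of that auto-configuration step, the test IdP metadata can be pulled straight from the metadata URL (the same test IdP metadata URL referenced elsewhere on this blog) and fed to the SAML library:

    # Fetch the Cornell test IdP metadata used to auto-configure the SAML settings
    curl -s https://shibidp-test.cit.cornell.edu/idp/shibboleth -o idp-metadata.xml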

We will need to create an endpoint that contains the metadata for our service provider.  With OneLogin::RubySaml this is super simple, as it will create it based on the SAML settings we configured earlier.  We simply create “/saml/metadata”, which will listen for GET requests and provide the auto-generated metadata.

Next let’s create an endpoint in our application that will redirect the user to the Cornell IDP.  We create “/saml/authentication_request”, which will listen for GET requests and then use OneLogin::RubySaml to create an authentication request.  This is done by reading the SAML settings, which include information on the IDP’s endpoints.

Then we need a callback hook that the IDP will send the SAML assertion to after the user has authenticated.  We create “/saml/consume”, which will listen for a POST from the IDP.  If we receive a valid response from the IDP then we will create a page that will display a success message along with the first name of the authenticated user.  You might be wondering where “urn:oid:2.5.4.42” comes from.  The information returned in the SAML assertion will contain attributes agreed upon by InCommon as well as some attributes provided by Cornell.  The full list is:

Attribute name in the enterprise directory, followed by the attribute name in the SAML assertion:
edupersonprimaryaffiliation urn:oid:1.3.6.1.4.1.5923.1.1.1.5
commonName urn:oid:2.5.4.3
eduPersonPrincipalName (netid@cornell.edu) urn:oid:1.3.6.1.4.1.5923.1.1.1.6
givenName (first name) urn:oid:2.5.4.42
surname(last name) urn:oid:2.5.4.4
displayName urn:oid:2.16.840.1.113730.3.1.241
uid (netid) urn:oid:0.9.2342.19200300.100.1.1
eduPersonOrgDN urn:oid:1.3.6.1.4.1.5923.1.1.1.3
mail urn:oid:0.9.2342.19200300.100.1.3
eduPersonAffiliation urn:oid:1.3.6.1.4.1.5923.1.1.1.1
eduPersonScopedAffiliation urn:oid:1.3.6.1.4.1.5923.1.1.1.9
eduPersonEntitlement urn:oid:1.3.6.1.4.1.5923.1.1.1.7

Conclusion

We covered the basic concepts of using Shibboleth and SAML by creating a simple demo application.  Shibboleth is a great choice when architecting your custom application in the cloud, as it drastically simplifies the authentication infrastructure while still using Cornell SSO.  An important note is that in this example we used the Cornell Test Shibboleth IDP, which allows us to create an anonymous service provider.  When moving your application into production you will need to register your service provider at https://shibrequest.cit.cornell.edu.  For more information about Shibboleth at Cornell please take a look at IDM’s Confluence page.


This article was updated May 2021 to remove outdated information.

Benchmarking Network Speeds for Traffic between Cornell and “The Cloud”

by Paul Allen

As Cornell units consider moving various software and services to the cloud, one of the most common questions the Cloudification Services Team gets is “What is the network bandwidth between cloud infrastructure and campus?” Bandwidth to cloud platforms like Amazon Web Services and Microsoft Azure seems critical now, as units are transitioning operations. It’s during that transition that units will have hybrid operations–part on-premise and part in-cloud–and moving or syncing large chunks of data is common.


Using Shibboleth for AWS API and CLI access

by Shawn Bower


Update 2019-11-06: We now recommend using awscli-login to obtain temporary AWS credentials via SAML. See our wiki page Access Keys for AWS CLI Using Cornell Two-Step Login (Shibboleth).


This post is heavily based on “How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS” by Quint Van Derman. I have used his blueprint to create a solution that works using Shibboleth at Cornell.

TL;DR

You can use Cornell Shibboleth login for both API and CLI access to AWS.  I built Docker images, which will be maintained by the Cloud Services team, that can be used for this; it is as simple as running the following command:

docker run -it --rm -v ~/.aws:/root/.aws dtr.cucloud.net/cs/samlapi

After this command has been run it will prompt you for your NetID and password.  These will be used to log you in to Cornell Shibboleth. You will get a push from DUO.  Once you have confirmed the DUO notification, you will be prompted to select the role you wish to use for login; if you have only one role it will be chosen automatically.  The credentials will be placed in the default credential file (~/.aws/credentials) and can be used as follows:

aws --profile saml s3 ls
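
To confirm which account and role the temporary credentials map to, a quick check (using the same profile name) is:

    aws --profile saml sts get-caller-identity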

NOTE: In order for the script to work you must have at least two roles; we can add you to an empty second role if need be.  Please contact cloud-support@cornell.edu if you need to be added to a role.

If there are any problems please open an issue here.

Digging Deeper

All Cornell AWS accounts that are set up by the Cloud Services team use Shibboleth for login to the AWS console. This same integration can be used for API and CLI access, allowing folks to leverage AD groups and AWS roles for users. Another advantage is that this eliminates the need to monitor and rotate IAM access keys, as the credentials provided through SAML expire after one hour. It is worth noting that non-human user IDs will still have to be created for automating tasks where it is not possible to use EC2 instance roles.

When logging into the AWS management console, the federation process looks like this:

[Diagram: SAML-based SSO to the AWS console]

  1. A user goes to the URL for the Cornell Shibboleth IDP
  2. That user is authenticated against Cornell AD
  3. The IDP returns a SAML assertion which includes your roles
  4. The data is posted to AWS which matches roles in the SAML assertion to IAM roles
  5. AWS Security Token Service (STS) issues temporary security credentials
  6. A redirect is sent to the browser
  7. The user is now in the AWS management console

In order to automate this process we will need to be able to interact with the Shibboleth endpoint as a browser would.  I decided to use Ruby for my implementation; typically I would use a lightweight framework like Mechanize to interact with web pages.  Unfortunately the DUO integration is done in an iframe using JavaScript, which makes things gross, as it means we need a full browser. I decided to use Selenium WebDriver to do the heavy lifting. I was able to script the login to Shibboleth as well as hitting the button for a DUO push notification:
[Screenshot: scripted DUO push]

In development I was able to run this on my Mac just fine, but I also realize it can be onerous to install the dependencies needed to run Selenium WebDriver.  In order to make distribution simple I decided to create a Docker image that has everything installed and can just be run.  This meant I needed a way to run Selenium WebDriver and Firefox inside a container.  To do this I used Xvfb to create a virtual framebuffer, allowing Firefox to run without a graphics card.  As this may be useful to other projects I made this a separate image that you can find here.  Now I could create a Dockerfile with the dependencies necessary to run the login script:

[Screenshot: samlapi Dockerfile]

The helper script starts Xvfb, sets the correct environment variable, and then launches the main Ruby script.  With these pieces I was able to get the SAML assertion from Shibboleth, and the rest of the script mirrors what Quint Van Derman had done.  It parses the assertion looking for all the role attributes.  Then it presents the list of roles to the user, who can select which role they wish to assume.  Once the selection is done, a call is made to the Security Token Service (STS) to get the temporary credentials, and then the credentials are stored in the default AWS credentials file.
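
A minimal sketch of what such a helper script could look like (the display number and the Ruby script path are assumptions, not the actual contents of the image):

    #!/bin/bash
    # Start a virtual framebuffer so Firefox can run without a graphics card
    Xvfb :99 -screen 0 1280x1024x24 &
    export DISPLAY=:99
    # Launch the main Ruby login script (path is hypothetical)
    exec ruby /opt/samlapi/login.rb "$@"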

Conclusion

Now you can manage your CLI and API access the same way you manage your console access. The code is available and is open source, so please feel free to contribute: https://github.com/CU-CloudCollab/samlapi. Note I have not tested this on Windows, but it should work if you change the volume mount to the default credential file location on Windows. I can see the possibility of future enhancements, such as adding the ability to filter the role list before displaying it, so stay tuned for updates. As always, if you have any questions with this or any other cloud topics please email cloud-support@cornell.edu.

How to run Jenkins in ElasticBeanstalk

by Shawn Bower

The Cloud Services team in CIT maintains Docker images for common pieces of software like Apache, Java, Tomcat, etc.  One of the images that we maintain is a Cornellized Jenkins image.  This image contains Jenkins with the Oracle client and Cornell OID baked in.  One of the easiest ways to get up and running in AWS with this Jenkins instance is to use Elastic Beanstalk, which will manage the infrastructure components.  Using Elastic Beanstalk you don’t have to worry about patching, as it will manage the underlying OS of your EC2 instances.  The Cloud Services team releases a patched version of the Jenkins image on a weekly basis. If you want to stay current, you just need to kick off a new deploy in Elastic Beanstalk.  Let’s walk through the process of getting this image running on Elastic Beanstalk!

A.) Save Docker Hub credentials to S3

INFO:

Read about using private Docker repos with Elastic Beanstalk.

We need to make our DTR credentials available to Elastic Beanstalk, so automated deployments can pull the image from the private repository.

  1. Create an S3 bucket to hold Docker assets for your organization; we use cu-DEPT-dockercfg
  2. Log in to Docker: docker login dtr.cucloud.net
  3. Upload the local credentials file ~/.docker/config.json to the S3 bucket cu-DEPT-dockercfg/.dockercfg

    Unfortunately, Elastic Beanstalk uses an older version of this file, named .dockercfg.  The formats are slightly different; you can read about the differences here.

    For now, you’ll need to manually create .dockercfg and upload it to the S3 bucket cu-DEPT-dockercfg/.dockercfg
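
A sketch of that upload using the AWS CLI (the bucket name matches the example policy below):

    # Upload the legacy-format Docker credentials file for Elastic Beanstalk to read
    aws s3 cp ~/.dockercfg s3://cu-DEPT-dockercfg/.dockercfg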

B.) Create IAM Policy to Read The S3 Bucket

    1. Select Identity and Access Management in the AWS management console
    2. Select Policies
    3. Select “Create Policy”
    4. Select “Create Your Own Policy”
    5. Create a policy named “DockerCFGReadOnly”; see the example policy below.
Below is an example policy for reading from an S3 bucket.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1466096728000",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::cu-DEPT-dockercfg"
            ]
        },
        {
            "Sid": "Stmt1466096728001",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:HeadObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::cu-DEPT-dockercfg/.dockercfg"
            ]
        }
    ]
}

 

C.) Setup the Elastic Beanstalk environment

  1. Create a Dockerrun.aws.json file.  Here’s an example:

    {
      "AWSEBDockerrunVersion": "1",
      "Image": {
        "Name": "dtr.cucloud.net/cs/jenkins:latest"
      },
      "Ports": [
        {
          "ContainerPort": "8080"
        }
      ],
      "Authentication": {
        "Bucket": "cu-DEPT-dockercfg",
        "Key": ".dockercfg"
      },
      "Volumes": [
        {
          "HostDirectory": "/var/jenkins_home",
          "ContainerDirectory": "/var/jenkins_home"
        },
        {
          "HostDirectory": "/var/run/docker.sock",
          "ContainerDirectory": "/var/run/docker.sock"
        }
      ]
    }
    

    The Authentication section refers to the Docker Hub credentials that were saved to S3.

    The Image section refers to the Docker image that was pushed to Docker Hub.

  2. We will also need to do some setup on the instance using .ebextensions.  Create a folder called “.ebextensions” and inside that folder create a file called “instance.config”.  Add the following to the file:

    container_commands:
      01-jenkins-user:
        command: useradd -u 1000 jenkins || echo 'User already exists!'
      02-jenkins-user-groups:
        command: usermod -aG docker jenkins
      03-jenkins-home:
        command: mkdir /var/jenkins_home || echo 'Directory already exists!'
      04-changeperm:
        command: chown jenkins:jenkins /var/jenkins_home

  3. Finally, create a zip file with the Dockerrun.aws.json file and the .ebextensions folder:

    zip -r jenkins-stalk.zip Dockerrun.aws.json .ebextensions/

 

 

D.) Setup Web Server Environment

  1. Choose Docker & Load balancing, autoscaling
  2. Select your local zip file that we created earlier (jenkins-stalk.zip) as the “Source” in the application version section
  3. Set the appropriate environment name; for example, you could use jenkins-prod
  4. Complete the configuration details

    NOTE: There are several options beyond the scope of this article.

    We typically configure the following: [Screenshot: deployment configuration settings]

  5. Complete the Elastic Beanstalk wizard and launch.  If you are working with a standard Cornell VPC configuration, make sure the ELB is in the two public subnets while the EC2 instances are in the private subnets.
  6. NOTE: You will encounter additional AWS features like security groups etc… These topics are beyond the scope of this article.  If presented with a check box for launching inside a VPC you should check this box.

    [Screenshot: Create Application wizard]

    The container will not start properly the first time. Don’t panic.  
     
    We need to attach the IAM policy we built earlier to the instance role used by Elastic Beanstalk.

  7. Select Identity & Access Management in the AWS management console
  8. Select “Roles”, then select “aws-elasticbeanstalk-ec2-role”
  9. Attach the “DockerCFGReadOnly” policy to the role

 

E.) Re-run the deployment in Elastic Beanstalk.  You can just redeploy the current version.

 

  1. Now find the URL to your Jenkins environment

    [Screenshot: environment URL in the Elastic Beanstalk dashboard]

  2. And launch Jenkins

    [Screenshot: Jenkins up and running]

SUCCESS !

 

F.) (optional) Running docker commands inside Jenkins

The Jenkins image comes with Docker preinstalled, so you can run Docker builds and deploys from Jenkins.  In order to use it we need to make a small tweak to the Elastic Beanstalk configuration.  This is because we keep the Docker version inside the image patched and on the latest commercially supported release, whereas Elastic Beanstalk currently supports Docker 1.9.1.  To get things working we need to add an environment variable to use an older Docker API.  First go to the environment configuration and select the cog icon under Software Configuration.

[Screenshot: Elastic Beanstalk Software Configuration]
Now we need to add a new environment variable, DOCKER_API_VERSION, and set its value to 1.21.
[Screenshot: DOCKER_API_VERSION environment variable]

That is it! Now you will be able to use the Docker CLI in your Jenkins jobs.
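
As a sketch, a Jenkins job’s “Execute shell” build step could then drive image builds and pushes against the team registry (the repository path and tag are placeholders):

    # Build and push an image from a Jenkins job using the Docker CLI
    docker build -t dtr.cucloud.net/DEPT/myapp:$BUILD_NUMBER .
    docker push dtr.cucloud.net/DEPT/myapp:$BUILD_NUMBER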

 

Conclusion

Within a few minutes you can have a managed Jenkins environment hosted in AWS.
There are a few changes you may want to consider for this environment.

  • Changing the autoscaling group to min 1 and max 1 makes sense since the Jenkins state data is stored on a local volume.  Having more than one instance in the group would not be useful.
  • Also, considering that the state data, including job configuration, is stored on a local volume, you will want to make sure to back up the EBS volume for this instance.  You could also look into a NAS solution such as Elastic File System (EFS) to store state for Jenkins; this would require a modification to the /var/jenkins_home path.
  • It is strongly encouraged that an HTTPS (SSL) listener is used for the Elastic Load Balancer (ELB) and that the HTTP listener is turned off, to avoid sending credentials in plain text.

 

The code used in this blog is available at https://github.com/CU-CloudCollab/jenkins-stalk. Please feel free to use and enhance it.

If you have any questions or issues please contact the Cloud Services team at cloud-support@cornell.edu