
How To: Launch A Jump Host In AWS Using Terraform

I have been a HashiCorp fanboy for a couple of years now. I am impressed with, and happy about, pretty much everything they have done, from Vagrant to Consul and beyond. In short, they make the DevOps world a better place. That said, this article is about the aptly named Terraform product. Here is how HashiCorp describes Terraform in their own words…

“Terraform enables you to safely and predictably create, change, and improve production infrastructure. It is an open source tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.”

Interestingly enough, it doesn’t point out (though it implies, by omitting any mention of specific providers) that Terraform is multi-cloud, or cloud agnostic. Terraform works with AWS, GCP, Azure, and OpenStack. In this article we will be covering how to use Terraform with AWS.

Step 1: download Terraform. I am not going to cover that part 😉
https://www.terraform.io/downloads.html

Step 2: configuration…

Configuration

HashiCorp uses their own configuration language (HCL) for Terraform. It is fully JSON compatible, which is nice. The details are covered here: https://github.com/hashicorp/hcl.

After downloading and installing Terraform, it’s time to start generating the configs.

AWS IAM Keys

AWS keys are required to do anything with Terraform. You can read about how to generate an access key / secret key for a user here: http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey

Terraform Configuration Files Overview

When you execute Terraform commands, you should be in a directory containing Terraform configuration files. Files ending in a ‘.tf’ extension will be loaded by Terraform in alphabetical order.

Before we jump into the configuration files for our jump box, it may be helpful to review a quick primer on the syntax here: https://www.terraform.io/docs/configuration/syntax.html

and on some more advanced features, such as lookup, here: https://www.terraform.io/docs/configuration/interpolation.html

Most users of Terraform choose to make their configs modular, resulting in multiple .tf files with names like main.tf, variables.tf, and data.tf. This is not required; you can choose to put everything in one big Terraform file, but I caution you that modularity/compartmentalization is always a better approach than one monolithic file. The layout used in this walkthrough is shown below.
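For reference, this walkthrough ends up with three files, each covered in the sections that follow; the directory layout looks roughly like this:

jump/
├── main.tf
├── variables.tf
└── output.tf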

Main.tf

Typically, if you only see one Terraform config file, it is called main.tf. Most commonly there will be at least one other file called variables.tf, used specifically for providing values for variables referenced in other TF files such as main.tf. Let’s take a look at our main.tf file section by section.

Provider

The provider keyword is used to identify the platform (cloud) you will be talking to, whether it is AWS or another cloud. In our case it is AWS, and we define three configuration items: an access key, a secret key, and a region. All of these are set to variables, which will later be looked up / translated into real values via variables.tf.

provider "aws" {
        access_key = "${var.aws_access_key_id}"
        secret_key = "${var.aws_secret_access_key}"
        region = "${var.aws_region}"
}

Resource aws_instance

This section defines a resource, which is an “aws_instance” that we are calling “jump_box”. We define all configuration requirements for that instance, substituting variable names where necessary; in some cases we just hard-code the value. Notice we are attaching two security groups to the instance to allow ICMP & SSH. We are also tagging our instance, which is critical in an AWS environment so your administrators/teammates have some idea of what the machine is and what it is used for.

resource "aws_instance" "jump_box" {
        ami = "${lookup(var.ami_base, "centos-7")}"
        instance_type = "t2.medium"
        key_name = "${var.key_name}"
        vpc_security_group_ids = [
                "${lookup(var.security-groups, "allow-icmp-from-home")}",
                "${lookup(var.security-groups, "allow-ssh-from-home")}"
        ]
        subnet_id = "${element(var.subnets_private, 0)}"
        root_block_device {
                volume_size = 8
                volume_type = "standard"
        }
        user_data = <<-EOF
                                #!/bin/bash
                                yum -y update
                                EOF
        tags = {
                Name = "${var.instance_name_prefix}-jump"
                ApplicationName = "jump-box"
                ApplicationRole = "ops"
                Cluster = "${var.tags["Cluster"]}"
                Environment = "${var.tags["Environment"]}"
                Project = "${var.tags["Project"]}"
                BusinessUnit = "${var.tags["BusinessUnit"]}"
                OwnerEmail = "${var.tags["OwnerEmail"]}"
                SupportEmail = "${var.tags["SupportEmail"]}"
        }
}

Btw, this resource type is provided by the AWS provider and documented here: https://www.terraform.io/docs/providers/aws/r/instance.html

If your configuration references any modules, you have to download them using terraform get (which can be done once you have written some config files and simply type terraform get 🙂).

Also, note the usage of ‘user_data’ here to update the machine’s packages at boot time. This is an AWS feature that is exposed through the AWS provider in Terraform.

Resource aws_security_group

Next we define a new security group (vs. attaching an existing one as in the section above). We are creating this new security group so that other VMs in the environment can attach it later, allowing SSH from the jump host to those VMs; a sketch of how another instance might reference it follows the resource definition below.

Also notice that under cidr_blocks we define a single IP address, a /32 of our jump host. More important is how we determine that jump host’s IP address: we use .private_ip to access the attribute of the “jump_box” aws_instance we are creating/just created in AWS. That is pretty cool.

resource "aws_security_group" "jump_box_sg" {
        name = "${var.instance_name_prefix}-allow-ssh-from-jumphost"
        description = "Allow SSH from the jump host"
        vpc_id = "${var.vpc_id}"

        ingress {
                from_port = 22
                to_port = 22
                protocol = "tcp"
                cidr_blocks = ["${aws_instance.jump_box.private_ip}/32"]
        }

        tags = "${var.tags_infra_default}"
}
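As an illustration, another instance elsewhere in the environment could allow SSH from the jump host simply by referencing this group’s ID. The resource below is a hypothetical sketch (the name app_box and its arguments are not part of this walkthrough), just to show the reference:

resource "aws_instance" "app_box" {
        ami = "${lookup(var.ami_base, "centos-7")}"
        instance_type = "t2.medium"
        key_name = "${var.key_name}"
        subnet_id = "${element(var.subnets_private, 1)}"
        # Attach the security group created above so SSH from the jump host is allowed
        vpc_security_group_ids = ["${aws_security_group.jump_box_sg.id}"]
}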

Resource aws_route53_record

The last entry in our main.tf creates a DNS entry for our jump host in Route53. Again, notice we are specifying a name of ‘jump.’ as a prefix to the entry, but the remainder of the FQDN is figured out by the lookup function. The lookup function is used to look up values inside of a map; in this case the map is defined in our variables.tf, which we will review next.

resource "aws_route53_record" "jump_box_dns" {
        zone_id = "${lookup(var.route53_zone, "id")}"
        type = "A"
        ttl = "300"
        name = "jump.${lookup(var.route53_zone, "name")}"
        records = ["${aws_instance.jump_box.private_ip}"]
}

Variables.tf

I will attempt to match the section structure I used above for main.tf when explaining the variables in variables.tf, though the file itself is not laid out in sections quite as clearly.

Provider variables

When Terraform runs, it loads all .tf files and replaces any reference to a variable with the value it finds declared (in our case) in variables.tf via the variable keyword. Notice that the first two variables are empty; they have no value defined. Why? Terraform supports taking input at runtime: by leaving these values blank, Terraform will prompt us for the values. Region is pretty straightforward: default is the value returned, and description in this case is really unused except as a comment.

variable "aws_access_key_id" {}
variable "aws_secret_access_key" {}

variable "aws_region" {
        description = "AWS region to create resources in"
        default = "us-east-1"
}

I would like to demonstrate the behavior of Terraform described above when the variables are left empty:

➜  jump terraform plan
var.aws_access_key_id
  Enter a value: aaaa

var.aws_secret_access_key
  Enter a value: bbbb

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.

...

At this phase you would enter your AWS key info, and Terraform would ‘plan’ out your deployment, meaning it runs through your configs and prints its plan to your screen, but does not actually change any state in AWS. That is the difference between terraform plan and terraform apply.
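If you want the subsequent apply to execute exactly what the plan showed, you can also save the plan to a file and feed it to apply; a minimal sketch (the file name jump.plan is arbitrary):

terraform plan -out=jump.plan   # review the proposed changes, saved to jump.plan
terraform apply jump.plan       # apply exactly the saved plan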

Resource aws_instance variables

Here we define the values of our AMI, SSH key, instance name prefix, tags, security groups, and subnets. Again, this should be pretty straightforward; no magic here, just the use of string variables & maps where necessary.

variable "ami_base" {
        description = "AWS AMIs for base images"
        default = {
                "centos-7" = "ami-2af1ca3d"
                "ubuntu-14.04" = "ami-d79487c0"
        }
}

variable "key_name" {
        default = "tuxninja-rsa-2048"
}

variable "instance_name_prefix" {
        default = "tuxlabs-"
}

variable "tags" {
        type = "map"
        default = {
                        ApplicationName = "Jump"
                        ApplicationRole = "jump box - bastion"
                        Cluster = "Jump"
                        Environment = "Dev"
                        Project = "Jump"
                        BusinessUnit = "TuxLabs"
                        OwnerEmail = "tuxninja@tuxlabs.com"
                        SupportEmail = "tuxninja@tuxlabs.com"
        }
}

variable "tags_infra_default" {
        type = "map"
        default = {
                        ApplicationName = "Jump"
                        ApplicationRole = "jump box - bastion"
                        Cluster = "Jump"
                        Environment = "DEV"
                        Project = "Jump"
                        BusinessUnit = "TuxLabs"
                        OwnerEmail = "tuxninja@tuxlabs.com"
                        SupportEmail = "tuxninja@tuxlabs.com"
        }
}

variable "security-groups" {
        description = "maintained security groups"
        default = {
                "allow-icmp-from-home" = "sg-a1b75ddc"
                "allow-ssh-from-home" = "sg-aab75dd7"
        }
}

variable "vpc_id" {
        description = "VPC us-east-1-vpc-tuxlabs-dev01"
        default = "vpc-c229daa5"
}

variable "subnets_private" {
        description = "Private subnets within us-east-1-vpc-tuxlabs-dev01 vpc"
        default = ["subnet-78dfb852", "subnet-a67322d0", "subnet-7aa1cd22", "subnet-75005c48"]
}

variable "subnets_public" {
        description = "Public subnets within us-east-1-vpc-tuxlabs-dev01 vpc"
        default = ["subnet-7bdfb851", "subnet-a57322d3", "subnet-47a1cd1f", "subnet-73005c4e"]
}

It’s important to note the variables above are also used in other sections as needed, such as the aws_security_group section in main.tf.

Resource aws_route53_record variables

Here we define the ID & Name that are used in the ‘lookup’ functionality from our main.tf Route53 section above.

variable "route53_zone" {
        description = "Route53 zone used for DNS records"
        default = {
                id = "Z1ME2RCUVBYEW2"
                name = "tuxlabs.com"
        }
}

It’s important to note that Terraform does not care when or where things are loaded. All .tf files are loaded, and variables require no specific order relative to any other part of the configuration. All that is required is that every variable you reference has a value declared via the variable keyword in a TF file somewhere.

Output.tf

Again, I want to remind folks you could put all of this Terraform syntax in one file if you wanted to, but I choose to split things up for readability and simplicity. So we have an output.tf file specifically for outputs; there is only one, which lists the results of our Terraform run upon success (you can also re-display it later, as shown after the code below).

output "jump-box-details" {
	value = "${aws_route53_record.jump_box_dns.fqdn} - ${aws_instance.jump_box.private_ip} - ${aws_instance.jump_box.id} - ${aws_instance.jump_box.availability_zone}"
}
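Outputs are printed at the end of a successful apply, and you can re-display a named output at any time afterwards from the same directory, for example:

terraform output jump-box-details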

OK, so let’s run this and see how it looks. First a reminder: to test your config you can run terraform plan first. It will tell you the changes it is going to make. Example:

➜  jump terraform plan
var.aws_access_key_id
  Enter a value: blahblah

var.aws_secret_access_key
  Enter a value: blahblahblahblah

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.


The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ aws_instance.jump_box
    ami:                                       "ami-2af1ca3d"
    associate_public_ip_address:               "<computed>"
    availability_zone:                         "<computed>"
    ebs_block_device.#:                        "<computed>"
    ephemeral_block_device.#:                  "<computed>"
    instance_state:                            "<computed>"
    instance_type:                             "t2.medium"

...

Plan: 3 to add, 0 to change, 0 to destroy.

If everything looks good & is green, you are ready to apply.

aws_security_group.jump_box_sg: Creation complete
aws_route53_record.jump_box_dns: Still creating... (10s elapsed)
aws_route53_record.jump_box_dns: Still creating... (20s elapsed)
aws_route53_record.jump_box_dns: Still creating... (30s elapsed)
aws_route53_record.jump_box_dns: Still creating... (40s elapsed)
aws_route53_record.jump_box_dns: Still creating... (50s elapsed)
aws_route53_record.jump_box_dns: Creation complete

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate

Outputs:

jump-box-details = jump.tuxlabs.com - 10.10.195.46 - i-037f5b15bce6cc16d - us-east-1b
➜  jump

Congratulations, you now have a jump box in AWS using Terraform. Remember to attach the required security group to each machine you want to grant access to, and start locking down your jump box / bastion and VMs.

Outro

Remember, if you take the above config and try to run it, swapping out only the variables, it may error about a required module. Downloading required modules is as simple as typing ‘terraform get’, which I believe the error message even tells you 🙂

So again, this was a brief intro to Terraform; it does a lot and is extremely powerful. One of the things I did when setting up a Mongo cluster using Terraform was to take advantage of a map to change node count per region. So if you wanted to deploy a different number of instances in different regions, your config might look something like this…

main.tf

  count = "${var.region_instance_count[var.region_name]}"

variables.tf

variable "region_instance_count" {
  type = "map"
  default = {
    us-east-1 = 2
    us-west-2 = 1
    eu-central-1 = 1
    eu-west-1 = 1
  }
}

Terraform also supports split if you want to treat a single string variable as multiple values; a small sketch follows.
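This is a minimal sketch assuming a comma-delimited string variable named subnet_csv (hypothetical, not part of the config above):

variable "subnet_csv" {
        default = "subnet-78dfb852,subnet-a67322d0"
}

# elsewhere, split turns the string into a list, e.g. to pick the first entry:
#   subnet_id = "${element(split(",", var.subnet_csv), 0)}"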

Another couple of things before I forget: terraform apply doesn’t just set up new infrastructure, it can also be used to modify existing infrastructure, which is a very powerful feature. So if you have something deployed and want to make a change, terraform apply is your friend.

And finally, when you are done with the infrastructure you spun up or it’s time to bring her down… ‘terraform destroy’

➜  jump terraform destroy
Do you really want to destroy?
  Terraform will delete all your managed infrastructure.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

var.aws_access_key_id
  Enter a value: blahblah

var.aws_secret_access_key
  Enter a value: blahblahblahblah

...

Destroy complete! Resources: 3 destroyed.
➜  jump

I hope this article helps.

Happy Terraforming 😉

 


How To: Launch EC2 Instances In AWS Using The AWS CLI

 

It occurred to me recently that while I have written articles on Boto for AWS (the Python SDK), I have yet to write articles on how to use the AWS CLI, Terraform, and the Go SDK. All of that will come in due time; for starters, this article is going to be about the AWS CLI.

To start, you will need to install the AWS CLI by following these links:
https://aws.amazon.com/cli/
https://github.com/aws/aws-cli

Note you will need to make sure you have an account with an access key and have set up the required credentials under ~/.aws/ for the CLI to work. How to do this is covered near the end of the second link above (the git repo); a minimal example of those files is shown below.
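The two files generally look something like this (the key values and region here are placeholders, not real credentials):

~/.aws/credentials
[default]
aws_access_key_id = AKIAEXAMPLEKEY
aws_secret_access_key = exampleSecretKey

~/.aws/config
[default]
region = us-east-1
output = json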

After that is done you are ready to rock and roll. To test it out you can run…

aws ec2 describe-instances

Assuming your default region and profile settings are correct, it should output JSON.
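If the full JSON dump is too noisy, the CLI’s --query option (a JMESPath expression) can trim it down; for example, to list just the instance IDs of running instances:

aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" --query 'Reservations[].Instances[].InstanceId' --output text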

Launching an EC2 instance

To launch an EC2 instance from the command line, use the command below, replacing the variables preceded with $ with their real values.

aws --profile $account --region $region ec2 run-instances --image-id $image_id --count $count --instance-type $instance_type --key-name $ssh_key_name --subnet-id $subnet_id

(Assuming you have set up the required dependencies, like uploading your SSH key to AWS and specifying its name in the command above, this should launch your VM.)

It should be noted there is a lot more you can do to tweak your instance, such as changing the EBS volume size of the root disk that is launched, or tagging. You will see examples of this in my shell script. The purpose of this article is to share a shell script I have written and use whenever I want to quickly launch a test VM (which is common). For more permanent things I use an infrastructure-as-code approach via Terraform. But the need for launching quick test VMs never goes away, thus this shell script was born. You will notice my script auto-tags our VMs. I do this because in our environment, if your VM isn’t tagged appropriately it is deleted; plus, it’s courtesy in an AWS environment to tag your resources, otherwise no one will ever know what tree to bark up when there is a problem such as ‘are you still using this, cause it looks idle?’ 🙂

My Shell Script for Launching EC2 VM’s

#!/bin/bash

# Global Settings
account="my-account"
region="us-east-1"

# Instance settings
image_id="ami-03ebd214" # ubuntu 14.04
ssh_key_name="my_ssh_key-rsa-2048"
instance_type="m4.xlarge"
subnet_id="subnet-b8214792"
root_vol_size=20
count=1

# Tags
tags_Name="my-test-instance"
tags_Owner="tuxninja"
tags_ApplicationRole="Testing"
tags_Cluster="Test Cluster"
tags_Environment="dev"
tags_OwnerEmail="tuxninja@tuxlabs.com"
tags_Project="Test"
tags_BusinessUnit="Cloud Platform Engineering"
tags_SupportEmail="tuxninja@tuxlabs.com"

echo 'creating instance...'
id=$(aws --profile $account --region $region ec2 run-instances --image-id $image_id --count $count --instance-type $instance_type --key-name $ssh_key_name --subnet-id $subnet_id --block-device-mapping "[ { \"DeviceName\": \"/dev/sda1\", \"Ebs\": { \"VolumeSize\": $root_vol_size } } ]" --query 'Instances[*].InstanceId' --output text)

echo "$id created"

# tag it

echo "tagging $id..."

aws --profile $account --region $region ec2 create-tags --resources $id --tags Key=Name,Value="$tags_Name" Key=Owner,Value="$tags_Owner" Key=ApplicationRole,Value="$tags_ApplicationRole" Key=Cluster,Value="$tags_Cluster" Key=Environment,Value="$tags_Environment" Key=OwnerEmail,Value="$tags_OwnerEmail" Key=Project,Value="$tags_Project" Key=BusinessUnit,Value="$tags_BusinessUnit" Key=SupportEmail,Value="$tags_SupportEmail"

echo "storing instance details..."
# store the data
aws --profile $account --region $region ec2 describe-instances --instance-ids $id > instance-details.json

echo "create termination script"
echo "#!/bin/bash" > terminate-instance.sh
echo "aws --profile $account --region $region ec2 terminate-instances --instance-ids $id" >> terminate-instance.sh
chmod +x terminate-instance.sh

After substituting the required variables at the top with your real values, you can run this script. Notice that after creating the VM I capture the instance ID in a variable so I can subsequently tag it, store the full instance details in a file, and then create a termination script. This makes for very simple operations when you need to repeatedly start and then kill/destroy/delete a VM.

These scripts should come in quite handy. A copy of create-instance.sh can be found on my GitHub here.

One other thing… I use the normal AWS CLI for automation as shown here, but for poking around interactively I use something called ‘aws-shell’, formerly ‘saws’. Check it out and you won’t be disappointed!

My next post will be on Terraform or the Go SDK…but both are coming soon!


Setting up Netflix’s Edda (CMDB) in AWS on Ubuntu

If you are running any kind of environment with more than 10 servers, then you need a CMDB (Configuration Management DataBase). CMDBs are the brain of your fleet and its environment. You can store anything in a CMDB, but commonly the metadata consists of some of the following: physical & digital asset inventory, software licenses, software configuration data, policy information, relationships (e.g. this VM -> compute -> rack -> availability zone -> datacenter), automation metadata, and more. CMDBs also commonly provide change history for changes in your environment.

In the world of infrastructure as code, CMDB is king.

CMDBs enable endless automation possibilities; without them you are stuck gathering and collecting ‘current’ configuration state about your infrastructure every time you want to perform an automated change or run an audit/report. In my career I have built or been a part of CMDB efforts at nearly every company I have worked for. They are simply necessary, and by their nature they tend to force the choice of ‘build it ourselves’ vs. ‘buy or run something existing’.

However, if you have the luxury of only running in AWS, you are in luck, because Netflix (the AWS poster child) open sourced Edda in 2012 for this purpose!

Rather than talk about the specific features of Edda (refer to the blog post or documentation for those), I want to keep this article short and jump right into setting up Edda, which is a bit tricky because the documentation is out of date!

Setting Up Edda (2016)

First, in AWS you need to set up an EC2 VM that has at least 6 GB of disk for the OS + dependencies (including Mongo), plus however much disk you need to store the metadata for your environment (keep in mind it keeps change history). Personally, I just created a 100 GB root partition to keep things simple. For the instance type I used ‘m4.xlarge’, and the Ubuntu version is 14.04.

After booting the VM, SSH to it and create a directory, wherever your storage is allocated partition-wise, to store Edda & its dependencies. I will be using /cmdb/ in my example.

Initial Install Steps

mkdir /cmdb
cd /cmdb
export JAVA_OPTS="-Xmx1g -XX:MaxPermSize=256M"
git clone https://github.com/Netflix/edda.git
sudo add-apt-repository -y ppa:webupd8team/java &> /dev/null
sudo apt-get update
sudo debconf-set-selections <<< 'oracle-java8-installer shared/accepted-oracle-license-v1-1 boolean true'
sudo apt-get install -y oracle-java8-installer
sudo apt-get install -y scala
sudo apt-get install make

cd /cmdb/edda
make build

For the record, the Edda wiki has the build steps wrong; it appears they are no longer using Gradle but have switched to SBT. Which reminds me: be aware that Edda is written in Scala, which isn’t as popular as Java, Python, etc. In addition, it’s functional programming, which I don’t personally know a lot about, but I hear it has quite the learning curve. So beware if you need to make custom code changes; I would not recommend it unless you know Scala! 🙂

After the build of Edda succeeds, install Mongo:

sudo apt-get install -y mongodb

That’s it for dependencies

Configuring Mongo

For Edda to use Mongo, all we need to do is ‘use’ the database we want for Edda & create an associated user. (Mongo will auto-create DBs upon insert.)

mongo

> use edda
> db.addUser({user:'edda',pwd:'t00t0ri4l',Roles: { edda: ['readWrite']}, roles: []})

You can test the user is working by doing… 

$ mongo edda -u edda -p
MongoDB shell version: 2.4.9
Enter password:
connecting to: edda
Server has startup warnings:
Sat Dec 10 00:53:21.093 [initandlisten]
Sat Dec 10 00:53:21.094 [initandlisten] ** WARNING: You are running on a NUMA machine.
Sat Dec 10 00:53:21.094 [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
Sat Dec 10 00:53:21.094 [initandlisten] **              numactl --interleave=all mongod [other options]
Sat Dec 10 00:53:21.094 [initandlisten]
>

Configuring Edda

Under /cmdb/edda/src/main/resources we need to modify ‘edda.properties’ with valid config values for accounts, regions & mongo access.

Relevant Mongo Values

edda.mongo.address=127.0.0.1:27017
edda.mongo.database=edda
edda.mongo.user=edda
edda.mongo.password=t00t0ri4l

Account & Region Values 

edda.accounts=dev.us-east-1
edda.dev.us-east-1.region=us-east-1
edda.dev.us-east-1.aws.accessKey=fakeaccesskey
edda.dev.us-east-1.aws.secretKey=fakesecret

The above example is using one account and only one region. The Edda configuration uses generic labels; they are very flexible, but when using them you might be confused by reading a label’s name as its intent. Don’t fall into that trap; I did, and then I found this post on Google Groups… Check it out to gain more insight on how the configuration works and how it can be tweaked for your needs. There is also the standard documentation, but it’s a little light, IMO.

Running Edda

Congrats, you made it; time to run Edda! Again, the documentation has this wrong (listed as Gradle & Jetty); instead we’re using SBT + Jetty…

$ cd /cmdb/edda/
$ ./project/sbt
> jetty:start

If everything goes smoothly you will start to see logs about crawling AWS APIs spewing to your screen 🙂 After about 2 minutes you should see data. You can check by doing a curl:

curl http://127.0.0.1:8080/api/v2/view/instances

This API URL should return a JSON object with instance IDs for the account & region specified.

Additionally, Edda is listening on whatever private IP address you have set up; you will just need to modify the default security group to allow 8080 to your machine.
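If you manage security groups with the AWS CLI, opening 8080 would look something like the following (the group ID and source CIDR are placeholders for your own values):

aws ec2 authorize-security-group-ingress --group-id sg-abc12345 --protocol tcp --port 8080 --cidr 10.0.0.0/16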

I get a bit frustrated with out-of-date documentation, so I hope this helps! Happy automating!


AWS, Google Cloud, Azure, and the centralized future of the Internet

I have just left AWS re:Invent and I wanted to give my brief thoughts on the future of cloud computing. I believe in the next few years the shift we have been witnessing will be completed. That is to say that the thousands of enterprises and small businesses alike will finish their migrations to public clouds, simply because the benefits are far too great: fewer people, less hardware, less glue code, more functionality, more value, etc. AWS will be the dominant public cloud for the next couple of years minimum due to their first-to-market advantage, and if you look at the announcements at re:Invent 2016, you see a series of products that solve common problems. In fact, a lot of the “innovations” AWS announced today replace many SaaS solutions which, ironically (or maybe not so much), are hosted on AWS. None of this concerns me; this is great disruption. AWS is teaching these businesses to move even further up and away from creating DevOps tooling as products (AWS will take care of that) and to focus on products that provide value differently, like sifting through massive amounts of data, increasing its quality, and providing intelligence from that data. This is all great, and it’s definitely where things are headed; kudos to Amazon for guiding folks.

BUT here is what is disturbing to me, and it seems like no one talks about it. It’s as if they can’t see the elephant in the room.

The elephant in the room is that every company in the world is converging on a smaller number (and fewer types) of physical devices, paths, datacenters, etc. This means the global failure domains that should be distributed in nature are actually becoming more centralized; therefore the risk of massive security or availability (outage) events is higher.

If you really think about it, cloud was always the return of utility computing (mainframes), and as we go down this journey it’s becoming more evident that it’s simply a more distributed version of the mainframe, and in my opinion that is currently giving people a false sense of comfort.

Early in my career almost 20 years ago, I was working at an ISP, and one of the core services for the Internet (DNS) was directly attacked. There was of course a widespread failure and what we soon realized was the Internet had more of a shared fate than most believed.

Fast forward to the present day, and it just happened again with Dyn, which hosted DNS for some very critical companies. This problem hasn’t been solved; it is getting worse.

This is the same problem we are going to have with AWS, Google Cloud and Azure.

As companies & governments converge on the same datacenters, and those datacenters connect to common interconnected fabrics (aka the Internet itself) and resources, the Internet is becoming far more grouped, far more shared & central, and thus the shared fate of the Internet will lie solely on the shoulders of giants, or as we like to call them in our industry, monoliths.

Public cloud monoliths, monopolies…etc

Perhaps my worries will be mitigated by fantastic diversification and investment in truly distributed, distinct network paths, independent power plants, etc. BUT my fear is that the convergence is happening so fast the providers won’t be able to make that a reality quickly enough; and what’s the incentive for them? They would have to invest a tremendous amount of capital when they are already successful, and this problem has not publicly and visibly humiliated us yet. But I fear that it will in the next few years…

So Godspeed to the journeymen of the cloud, as we enjoy the luxuries that AWS, Azure, & Google Cloud offer us. We are entering a beautiful and dangerous time. Beware, and hedge your company & product by distributing them as much as you can to avoid these central dependencies. Avoid these massive shared, global failure domains and ensure you are diversified to avoid increased security risk.


How to use Boto to Audit your AWS EC2 instance security groups

Boto is a Software Development Kit for accessing the AWS APIs using Python.

https://github.com/boto/boto3

Recently, I needed to determine how many of my EC2 instances were spawned in a public subnet and also had security groups with wide-open access to the instances on any port via any protocol. Because I have an IGW (Internet Gateway) in my VPCs and properly set up routing tables, instances launched in the public subnets with wide-open security groups (allowing ingress traffic from any source) are a really bad thing 🙂

Here is the code I wrote to identify these naughty instances. It will require slight modifications for your own environment, to match your public subnet IP ranges, EC2 tags, and account names.

#!/usr/bin/env python
__author__ = 'Jason Riedel'
__description__ = 'Find EC2 instances provisioned in a public subnet that have security groups allowing ingress traffic from any source.'
__date__ = 'June 5th 2016'
__version__ = '1.0'

import boto3

def find_public_addresses(ec2):
    public_instances = {}
    instance_public_ips = {}
    instance_private_ips = {}
    instance_ident = {}
    instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running'] }])

    # Ranges that you define as public subnets in AWS go here.
    public_subnet_ranges = ['10.128.0', '192.168.0', '172.16.0']

    for instance in instances:
        # I only care if the private address falls into a public subnet range
        # because if it doesnt Internet ingress cant reach it directly anyway even with a public IP
        if any(cidr in instance.private_ip_address for cidr in public_subnet_ranges):
            owner_tag = "None"
            instance_name = "None"
            if instance.tags:
                for i in range(len(instance.tags)):
                    #comment OwnerEmail tag out if you do not tag your instances with it.
                    if instance.tags[i]['Key'] == "OwnerEmail":
                        owner_tag = instance.tags[i]['Value']
                    if instance.tags[i]['Key'] == "Name":
                        instance_name = instance.tags[i]['Value']
            instance_ident[instance.id] = "Name: %s\n\tKeypair: %s\n\tOwner: %s" % (instance_name, instance.key_name, owner_tag)
            if instance.public_ip_address is not None:
                values=[]
                for i in range(len(instance.security_groups)):
                    values.append(instance.security_groups[i]['GroupId'])
                public_instances[instance.id] = values
                instance_public_ips[instance.id] = instance.public_ip_address
                instance_private_ips[instance.id] = instance.private_ip_address

    return (public_instances, instance_public_ips,instance_private_ips, instance_ident)

def inspect_security_group(ec2, sg_id):
    sg = ec2.SecurityGroup(sg_id)

    open_cidrs = []
    for i in range(len(sg.ip_permissions)):
        to_port = ''
        ip_proto = ''
        if 'ToPort' in sg.ip_permissions[i]:
            to_port = sg.ip_permissions[i]['ToPort']
        if 'IpProtocol' in sg.ip_permissions[i]:
            ip_proto = sg.ip_permissions[i]['IpProtocol']
            if '-1' in ip_proto:
                ip_proto = 'All'
        for j in range(len(sg.ip_permissions[i]['IpRanges'])):
            cidr_string = "%s %s %s" % (sg.ip_permissions[i]['IpRanges'][j]['CidrIp'], ip_proto, to_port)

            if sg.ip_permissions[i]['IpRanges'][j]['CidrIp'] == '0.0.0.0/0':
                #preventing an instance being flagged for only ICMP being open
                if ip_proto != 'icmp':
                    open_cidrs.append(cidr_string)

    return open_cidrs


if __name__ == "__main__":
    #loading profiles from ~/.aws/config & credentials
    profile_names = ['de', 'pe', 'pde', 'ppe']
    #Translates profile name to a more friendly name i.e. Account Name
    translator = {'de': 'Platform Dev', 'pe': 'Platform Prod', 'pde': 'Products Dev', 'ppe': 'Products Prod'}
    for profile_name in profile_names:
        session = boto3.Session(profile_name=profile_name)
        ec2 = session.resource('ec2')

        (public_instances, instance_public_ips, instance_private_ips, instance_ident) = find_public_addresses(ec2)

        for instance in public_instances:
            for sg_id in public_instances[instance]:
                open_cidrs = inspect_security_group(ec2, sg_id)
                if open_cidrs: #only print if there are open cidrs
                    print "=================================="
                    print " %s, %s" % (instance, translator[profile_name])
                    print "=================================="
                    print "\tprivate ip: %s\n\tpublic ip: %s\n\t%s" % (instance_private_ips[instance], instance_public_ips[instance], instance_ident[instance])
                    print "\t=========================="
                    print "\t open ingress rules"
                    print "\t=========================="
                    for cidr in open_cidrs:
                        print "\t\t" + cidr

To run this you also need to set up your ~/.aws/config and ~/.aws/credentials files.

http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-config-files

Email me tuxninja [at] tuxlabs.com if you have any issues.
Boto is awesome 🙂 Note: so is the AWS CLI, but it requires some shell scripting, as opposed to Python, to do cool stuff.
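For instance, a rough CLI sketch of one part of this audit (listing security groups that have an inbound rule open to the world) might look like the following; the ip-permission.cidr filter matches groups with a 0.0.0.0/0 ingress CIDR:

aws ec2 describe-security-groups --filters Name=ip-permission.cidr,Values=0.0.0.0/0 --query 'SecurityGroups[].[GroupId,GroupName]' --output text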

The GitHub repo for this code is here: https://github.com/jasonriedel/AWS/blob/master/sg-audit.py

Enjoy!
