Systems Administration

Preventing (bind9) DNS Naughty-ness (named.conf & iptables/ufw) on Ubuntu

If you run a DNS server on the Internet with a default configuration, many people/robots will take advantage of you. The same is true for mail, but that is another article. Needless to say, if you are running a service on the Internet, the naughty goblins will find you. To thwart these dirty criminals, all that's necessary is to configure your named.conf properly. However, since these robots are being naughty, there is a high degree of certainty they are infected endpoints, and as such I really don't want them coming anywhere near me or my machines. After all, for humanity's sake we don't want to be infected by the deadly plague! This article is short and sweet: here is how to protect your DNS server & your server in one article using named.conf & ufw (iptables).

 

Named.conf.options

Nowadays named.conf is really just a file that includes 3 other files: named.conf.local, named.conf.options, and named.conf.default-zones. The one we are going to fix is named.conf.options. The configuration below should only be applied in a scenario where you want to run an authoritative nameserver and a caching nameserver, but the key is you only want to allow queries against the cache from people you know personally (or that are you), vs. allowing the entire Internet, because then bad things happen. If this is not the setup you are going for, don't do this 🙂 But if it is, follow along.

Add the following section with the proper IPs to the top of the file

acl "trusted" {
192.241.206.98;
localhost;
localnets;
};

Note: you can also add a CIDR block for a whole subnet, like 192.168.0.0/16.
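For example, a hypothetical trusted acl that also allows a private /16 subnet would look like this:

acl "trusted" {
192.241.206.98;
192.168.0.0/16;
localhost;
localnets;
};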

After that's done, under the options {} section, make it look like this

        allow-query { any; };
        allow-recursion { trusted; };
        allow-query-cache { trusted; };
        allow-transfer { 202.157.182.142; };

Note: allow-transfer is only necessary if you have a secondary nameserver that needs to receive zone transfers. Now restart bind9

tuxninja@tlprod1:/etc/bind$ sudo service bind9 restart
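Optionally, you can sanity-check the config and then verify the new behavior. named-checkconf runs on the server itself; the dig queries are best run from a host that is not in your trusted acl (the zone name below is hypothetical, and 192.241.206.98 is the server from the acl above):

tuxninja@tlprod1:/etc/bind$ sudo named-checkconf

dig @192.241.206.98 yourzone.example.com SOA
dig @192.241.206.98 google.com

named-checkconf prints nothing when the syntax is clean. The first dig (a zone you are authoritative for) should still answer for anyone; the second (a recursive lookup) should now come back refused for untrusted clients.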

Once this is in place, cache and recursive queries from non-trusted clients will be denied. To confirm it is working, check your /var/log/syslog and you will see some denies like this

Nov 11 16:00:31 tlprod1 named[952]: client 192.163.221.224#80 (hehehey.ru): query (cache) 'hehehey.ru/ANY/IN' denied
Nov 11 16:00:31 tlprod1 named[952]: client 192.163.221.224#80 (hehehey.ru): query (cache) 'hehehey.ru/ANY/IN' denied
Nov 11 16:00:31 tlprod1 named[952]: client 104.37.29.110#4761 (hehehey.ru): query (cache) 'hehehey.ru/ANY/IN' denied

Now, the above is from my actual log file. I was quite annoyed that clients were basically abusing the hell out of hehehey.ru… so I decided I don't want to talk to those people at all. To those people I should be a blackhole. To do this I used UFW, which is short for Uncomplicated Firewall and essentially makes dealing with iptables much, much nicer. It's only my 2nd time using UFW, but I've been using iptables for well over a decade. Anyway, here is the simple setup with UFW that I came up with.

tuxninja@tlprod1:/etc/bind$ sudo ufw default deny incoming
Default incoming policy changed to 'deny'
(be sure to update your rules accordingly)

tuxninja@tlprod1:/etc/bind$ sudo ufw default allow outgoing
Default outgoing policy changed to 'allow'
(be sure to update your rules accordingly)

tuxninja@tlprod1:/etc/bind$ sudo ufw allow ssh
Rules updated
Rules updated (v6)

tuxninja@tlprod1:/etc/bind$ sudo ufw allow 80
Rules updated
Rules updated (v6)

So we are configuring the default policy to deny all incoming traffic and allow all outgoing traffic, and then allowing SSH & web (Apache) traffic in. Next I created a script called block.sh to add ufw deny rules for the bad actors I parsed out of my log. Here's what block.sh looks like

# cat block.sh 
#!/bin/bash

# read IP addresses (one per line) from stdin and add a ufw deny rule for each
while read line; do
	ufw deny from "$line"
done

Don’t forget to chmod +x your shell script. Then I did this… blocking all bad actors…

root@tlprod1:~# cat /var/log/syslog | grep hehehey.ru | grep -v repeated | awk -F ' ' '{print $7}' | cut -d '#' -f 1 | ./block.sh

Note: use sudo if you don't run this as root. This will go through my log, find all the bad requests, and block each requestor. It's quite aggressive, so be careful: make sure you thoroughly limit your parsing with grep to only block things you really don't want talking to your server, because this blocks ALL traffic from the requestor, not just DNS.
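A slightly safer variation of the same pipeline deduplicates the offending IPs with sort -u before handing them to block.sh, so each address only generates one rule:

root@tlprod1:~# grep hehehey.ru /var/log/syslog | grep -v repeated | awk -F ' ' '{print $7}' | cut -d '#' -f 1 | sort -u | ./block.sh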

Once that is complete you need to finally permit good DNS requests by running

ufw allow 53

And then finally enable your firewall

ufw enable

If you are successful you should see entries in your log that look like this

Nov 11 15:10:35 tlprod1 kernel: [1652178.544292] [UFW BLOCK] IN=eth0 OUT= MAC=04:01:63:57:8a:01:3c:8a:b0:0d:3f:f0:08:00 SRC=65.60.18.103 DST=192.241.206.198 LEN=72 TOS=0x00 PREC=0x00 TTL=247 ID=31303 PROTO=UDP SPT=20225 DPT=53 LEN=52

You can also view all your firewall rules by running

sudo ufw status numbered
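If you ever block a legitimate client by mistake, the numbered output also lets you remove a single rule by its number, for example (rule number 3 here is hypothetical):

sudo ufw delete 3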

Happy Blocking !


Runner Features Have Been Updated !

Runner Reminder

Runner is a command line tool for running commands on thousands of devices that support SSH. I wrote Runner and use it every single day, because unlike Ansible, Runner truly has no dependencies on the client or server side other than SSH. I have used Runner to build entire datacenters, so it is proven and tested and has a lot of well thought out features, which brings me to today's post. Since I initially debuted Runner I have added a lot of features, but I had yet to check them into GitHub, until now. Here is a rundown of Runner's features.

Features

  • Runner takes your login credentials & doesn’t require you to setup SSH keys on the client machines/devices.
  • Runner can be used through a bastion/jump host via an SSH tunnel (see prunner.py)
  • Runner reads its main host list from a file ~/.runner/hosts/hosts-all
  • Runner can accept custom hosts lists via -f
  • -e can be used to echo a command before it is run, this is useful for running commands on F5 load balancers for example, when no output is returned on success.
  • -T will allow you to tune the number of threads, but be careful: you can easily exhaust your system or site resources (i.e. do NOT DoS your LDAP authentication servers by trying to run hundreds of threads across thousands of machines, unless you know they can handle it 😉 ).
  • -s is for sudo for those users who have permissions in the sudoers file.
  • -1 reduces any host list down to one host per pool. It uses a regex, which you will likely have to modify for your own host / device naming standard.
  • -r can be used to supply a regular expression for matching hosts. Remember sometimes you have to quote the regex and/or escape the shell when using certain characters.
  • -c will run a single command on many hosts, but -cf will run a series of commands listed in a file on any hosts specified. This is particularly useful for automations. For example, I used it to build out load balancer virtuals and pools on an F5.
  • -p enables you to break apart the number of hosts to run at a time using a percentage. This is a handy & more humanized way to ensure you do not kill your machine or the infrastructure you are managing when you crank threads through the roof 😉

Now that I have taken the time to explain some of those cool features, here’s an example of what it looks like in action.

Runner Demo

Host List

➜  ~  runner -l -r tuxlabs                               
tuxlabs.com
old.tuxlabs.com

There were 2 hosts listed.
➜

Basic Run Using Only -c, -u and defaults

Note: the user defaults to whoever you are logged in as if you don't specify -u. Since I am logged in as 'jriedel', I have specified the user tuxninja instead.

➜  ~  runner -r tuxlabs -c 'id' -u tuxninja
RUNNER [INFO]: MATCHING HOSTNAMES WITH 'tuxlabs'
RUNNER [INFO]: 2 HOSTS HAVE BEEN SELECTED
RUNNER [INFO]: LOGFILE SET - /Users/jriedel/.runner/logs/runner.log.2015-08-25.01:59:59
RUNNER [INFO]: USER SET - tuxninja
RUNNER [INFO]: SSH CONNECT TIMEOUT is: 10 seconds
RUNNER [INFO]: THREADS SET - 20
RUNNER [INPUT]: Please Enter Site Pass: 
tuxlabs.com: uid=1000(tuxninja) gid=1000(tuxninja) groups=1000(tuxninja),27(sudo)
old.tuxlabs.com: uid=1000(tuxninja) gid=1000(tuxninja) groups=1000(tuxninja),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),114(sambashare),1001(admin)

RUNNER [RESULT]: Successfully logged into 2/2 hosts and ran your command(s) in 0:00:03 second(s)
RUNNER [RESULT]: There were 0 login failures.


RUNNER [INFO]: Your logfile can be viewed @ /Users/jriedel/.runner/logs/runner.log.2015-08-25.01:59:59
➜  ~

The Same Run Using Sudo 

Note: I just realized if you do not prompt for a password for sudo it will fail, I will have to fix that ! Whoops ! P.S. You should always prompt for a password when using sudo !

➜  ~  runner -r tuxlabs -c 'id' -u tuxninja -s
RUNNER [INFO]: MATCHING HOSTNAMES WITH 'tuxlabs'
RUNNER [INFO]: 2 HOSTS HAVE BEEN SELECTED
RUNNER [INFO]: LOGFILE SET - /Users/jriedel/.runner/logs/runner.log.2015-08-25.02:10:27
RUNNER [INFO]: USER SET - tuxninja
RUNNER [INFO]: SSH CONNECT TIMEOUT is: 10 seconds
RUNNER [INFO]: THREADS SET - 20
RUNNER [INFO]: SUDO IS ON
RUNNER [INPUT]: Please Enter Site Pass: 
tuxlabs.com: uid=0(root) gid=0(root) groups=0(root)
old.tuxlabs.com: uid=0(root) gid=0(root) groups=0(root)

RUNNER [RESULT]: Successfully logged into 2/2 hosts and ran your command(s) in 0:00:03 second(s)
RUNNER [RESULT]: There were 0 login failures.


RUNNER [INFO]: Your logfile can be viewed @ /Users/jriedel/.runner/logs/runner.log.2015-08-25.02:10:27
➜  ~

Runner with a command file in super quiet mode  !

➜  ~  cat lets-run-these 
uptime
who
date
uptime -s
➜  ~  

➜  ~  runner -r tuxlabs -cf lets-run-these -T 2 -p 50 -qq -u tuxninja
RUNNER [INPUT]: Please Enter Site Pass: 
tuxlabs.com:  05:17:51 up 14 days,  4:45,  0 users,  load average: 0.00, 0.01, 0.05
old.tuxlabs.com:  03:17:52 up 80 days, 47 min,  1 user,  load average: 0.42, 0.69, 0.78
old.tuxlabs.com: root     pts/0        2015-08-25 02:23 (173.224.162.99)
tuxlabs.com: Tue Aug 25 05:17:52 EDT 2015
old.tuxlabs.com: Tue Aug 25 03:17:53 CST 2015
tuxlabs.com: 2015-08-11 00:32:18
old.tuxlabs.com: 2015-06-06 02:30:42
➜  ~

Example of a simple regex & a failure

➜  ~  runner -r old.* -l
zsh: no matches found: old.*
➜  ~  

➜  ~  runner -r 'old.*' -l
old.tuxlabs.com

There was 1 host listed.
➜  ~

I hope you enjoyed the overview and new features. You can clone Runner on github.

Enjoy,
Jason Riedel


SSH Tunneling

In my last post about Runner I briefly explained needing to modify your ~/.ssh/config to use a ProxyCommand to allow for automatic tunneling with SSH.

Host tlbastion
User tuxninja
ForwardAgent yes
HostName tlbastion.tuxlabs.com
DynamicForward 8081

Host *.tuxlabs.com
User tuxninja
ProxyCommand /usr/local/bin/sconnect -4 -w 4 -S localhost:8081 %h %p

What I didn't explain is that there is an alternative method that is arguably simpler. It requires creating three small shell scripts & placing them in your path or a common host path like /usr/local/bin/ with the chmod +x permission. Here is the script that sets up the ssh tunnel.

Script: starttunnel

$ cat /usr/local/bin/starttunnel 
ssh -o ServerAliveInterval=300 -CfgNTL -D 8081 tlbastion.tuxlabs.com
$

Running starttunnel will connect you to your bastion/jump box and background the connection with keepalives on. It listens locally on port 8081 and dynamically forwards SSH requests through tlbastion.tuxlabs.com. Additionally, if you want to tunnel a web port on a machine that sits within your network back to the machine you are tunneling from, you can add it to the script, such that the required host/port always gets tunneled and is available on your machine when you run starttunnel. An example config would look like this.

Script: starttunnel + forwarding http

 

$ cat /usr/local/bin/starttunnel 
ssh -o ServerAliveInterval=300 -CfgNTL 8080:tuxlabs1.tuxlabs.com:80 -D 8081 tlbastion.tuxlabs.com
$
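With that version of starttunnel running, the internal web server is reachable on your own machine via the forwarded port, e.g.:

$ curl http://localhost:8080/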

Now that you have authenticated to your bastion and have a working tunnel, you need to get SSH requests to go through this tunnel. However, if you're like me, you still want the ability to ssh to other stuff without going through that tunnel. So I created a new script called 'sshp'. When I want to ssh through the tunnel/proxy I use 'sshp'; when I want to ssh somewhere else on the Internet or another network I use plain old 'ssh'. Here is my sshp script, used to connect to machines behind the bastion.

Script: sshp

$ cat /usr/local/bin/sshp 
#!/bin/sh

ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=no -o CheckHostIP=no -o ServerAliveInterval=300 -o "ProxyCommand /bin/nc -X 5 -x localhost:8081 %h %p" $1

$

Now, when you run sshp tuxlabs1@tuxlabs.com you will be connected through the tuxlabs bastion into tuxlabs1. Also notice in my previous post I used sconnect as the proxy command; in this one we are using 'nc', aka netcat. I have found this method of tunneling to be the most simplistic and effective in my daily life. One more script you need: if you want to copy files you need to use scp, so you have to make a similar command, 'scpp', for tunneling your file copies. Here's the script.

Script: scpp

$ cat /usr/local/bin/scpp 
#!/bin/sh

scp -pr -o ConnectTimeout=3 -o StrictHostKeyChecking=no -o CheckHostIP=no -o "ProxyCommand /bin/nc -x localhost:8081 %h %p" $1 $2
$

One final note: if you need to use '*' (aka splat) for copying many files, you cannot use the script above, because the script only passes two arguments ($1 and $2), so the shell's glob expansion does not make it through correctly. Instead, just use the full command yourself from the command line.

scp’ing with *

$ scp -pr -o ConnectTimeout=3 -o StrictHostKeyChecking=no -o CheckHostIP=no -o "ProxyCommand /bin/nc -x localhost:8081 %h %p" copy.all.* tuxninja@tlbastion.tuxlabs.com:

This would copy all files named 'copy.all.<whatever>' to the bastion. Hope this helps the folks out there feeling limited by bastions. They provide great security and are an absolute requirement in secure environments, so learning tricks that make sure you only need to authenticate once for an extended period of time can come in real handy.
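If you would rather keep a script for that case too, a small variant of scpp that passes all of its arguments through with "$@" (instead of just $1 and $2) lets the expanded glob survive. A sketch, not something in the repo:

$ cat /usr/local/bin/scppm
#!/bin/sh

# same as scpp, but forwards every argument so shell-expanded globs work
scp -pr -o ConnectTimeout=3 -o StrictHostKeyChecking=no -o CheckHostIP=no -o "ProxyCommand /bin/nc -x localhost:8081 %h %p" "$@"
$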

Enjoy,
Jason Riedel


Runner: Multi-threaded SSH with Sudo support using Python & Paramiko

Example of Runner

$ runner -r web1 -c "whoami" -s
RUNNER [INFO]: MATCHING HOSTNAMES WITH 'web1'
RUNNER: 1 HOSTS HAVE BEEN SELECTED
RUNNER [INFO]: LOGFILE SET - logs/runner.log.2015-01-17.03:10:00
RUNNER [INFO]: USER SET - tuxninja
RUNNER [INFO]: SSH CONNECT TIMEOUT is: 5 seconds
RUNNER [INFO]: THREADS SET - 20
RUNNER [INFO]: SUDO IS ON
RUNNER [INPUT]: Please Enter Site Pass: 
web1.tuxlabs.com: 
web1.tuxlabs.com: root
web1.tuxlabs.com: [tuxninja@web1 ~]$ 

RUNNER [RESULT]: Successfully logged into 1/1 hosts and ran your commands in 0:00:08 second(s)
RUNNER [RESULT]: There were 0 login failures.

Why Runner ?

I have been working as a Systems & Network Administrator since 1999. In that time I have repeatedly had the need for rapidly executing commands across thousands of servers. There are many applications out there that solve this problem in various ways, to name a few: pdsh, Ansible, Salt, Chef, Puppet (mcollective), even Cfengine and more. Some require agents running on the machines, some use SSH but require keys… or learning curves. Alternatively, you can write your own code to solve this problem, which is what I did, mostly for fun. I don't recommend re-inventing the wheel if you need this for your job; just use what is already out there, or download Runner and hack it to your heart's content for your purposes.

Fabric vs. Paramiko

Because I use Python for most of my work these days, I decided to write my multi-threaded SSH command runner in Python; this way I can use Runner for parallel SSH transport & easily bolt on my other Python scripts for additional functionality. Python has fantastic support for SSH via two libraries, Fabric & Paramiko. Fabric is built on top of Paramiko and provides a simpler interface for doing just about anything you can think of. Create a fabfile, run it, and voila, instant results from commands run via SSH. Fabric is really great for running & re-running a set of commands to automate an install or reporting, for example. All that being said, I still chose Paramiko over Fabric for three reasons.

  1. I don’t like abstraction. Fabric hides the ugly-ness of Paramiko, which I prefer to understand better.
  2. Writing this using Paramiko lent itself better to a command line utility used for adhoc commands than Fabric did.
  3. I wasn’t sure if Fabric’s abstraction would limit me later based on needing custom functionality. So for Runner I chose Paramiko, but to be clear, 9 times out of 10 I think I would choose Fabric.

Bastions

A bastion or jump box is a machine that is used as the gatekeeper of access to the rest of the machines in your network. In secure environments where your corp network is separate from your production network, you will have to SSH into a bastion, which usually has some form of 2-factor authentication (at least it should!), and then from there you may SSH into other hosts. A bastion can throw a real wrench in trying to manage thousands of machines in seconds, because you would have to authenticate to the bastion 1000 times! The way around this is setting up your SSH config to proxy commands.

ProxyCommand & Sconnect

Sconnect (or connect.c) is a binary that is most commonly used as the proxy command for SSH. You can download / read more about sconnect here : https://bitbucket.org/gotoh/connect/wiki/Home and it will also tell you how to setup your SSH config. Using a ProxyCommand with Runner is required, you can however use any ProxyCommand you would like. Really quickly here is what you basically need to do.

  1. Download / Compile connect.c
  2. Copy it to /usr/local/bin/sconnect and set executable permissions
  3. In your SSH Config (.ssh/config) add…
    1. Host <ssh-config-profile-name>
      User tuxninja
      ForwardAgent yes
      HostName <bastion_name>
      DynamicForward 8081 (any uncommon port is fine)
    2. Host *.tuxlabs.com
      User tuxninja
      ProxyCommand /usr/local/bin/sconnect -4 -w 4 -S localhost:8081 %h %p

That is basically it. Then you should start a screen session so you can background the SSH session, since you will leave this open for other SSH sessions to proxy through so you don’t have to go through 2-factor authentication more than once. So something like…

screen -S sshsession
ssh <ssh-config-profile-name>

After you authenticate, detach yourself from the screen using CTRL A then D. Now you can ssh to anything at that domain name, in my case tuxlabs.com, and it will forward through the bastion. At this point you still have to authenticate using a username/password, which is fine; Runner deals with this.
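A quick way to confirm the proxy is wired up correctly is to run a one-off command against any host behind the bastion (the hostname here is hypothetical):

ssh web1.tuxlabs.com uptime

If the ProxyCommand is doing its job, you will be prompted for that host's password and get the uptime back, without touching 2-factor authentication again.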

Hosts

Runner requires a hosts file to run. By default it is configured to look in hosts/hosts-all for a list of all hosts. I use a script called 'update-runner-hosts.pl', included in my GitHub repo, to gather hosts from a URL and update the required hosts file. Once you have populated hosts/hosts-all with the FQDN for your hosts, you are ready to use Runner.

Note: You can use ‘-f’ to provide a custom location for your hosts file.
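The hosts file itself is just one fully qualified hostname per line, e.g. (hypothetical hosts):

$ cat hosts/hosts-all
web1.tuxlabs.com
web2.tuxlabs.com
db1.tuxlabs.com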

Great Flags / Features

So some of the really great features of Runner are threading (-T), sudo (-s), list only mode (-l) and regular expression matching (-r). -r is for pattern matching your host lists, which is incredibly handy and absolutely required in an environment with hundreds to thousands of hosts, where you only want to select the hosts with, say, 'web' in their names.

(-1) one host per pool mode is a great feature, however it is dependent on understanding your environments hostname pattern so you will have to modify the regular expression in the code to make sure it works for you. It is currently setup to identify hostnames in pools when the naming convention is something like apache1234.tuxlabs.com.

Ok, I could go on and on about Runner, but it's better to just share the code at this point and let you go! Note the statically defined proxy_command in the code; you may need to change this if you didn't use sconnect or the same port.

Note: by default runner uses the user you are logged in as to SSH, you can prompt input for a different user with ‘-u’.

All code and accessories are available for download on github : https://github.com/jasonriedel/tuxlabs/tree/master/runner

Email tuxninja@tuxlabs.com with any question ! Happy SSH’ing admins!

Note: In various versions of this code I had a ‘-h’ allowing you to pass a CSV list of hosts, somehow I let that drop out of this version, sorry ! Feel free to re-add it !

The Runner Code

#!/usr/bin/env python
#Author: Jason Riedel

import paramiko
import getpass
import Queue
import threading
import argparse
import os.path
import time
import logging
import re
import datetime

## SETUP AVAILABLE ARGUMENTS ##
parser = argparse.ArgumentParser()
parser.add_argument('-f', action="store", dest="file_path", required=False, help="Specify your own path to a hosts file")
parser.add_argument('-l', action="store_true", dest="list_only", required=False, help="List all known hosts")
parser.add_argument('-q', action="store_true", dest="quiet_mode", required=False, help="Quiet mode: turns off RUNNER INFO messages.")
parser.add_argument('-qq', action="store_true", dest="super_quiet_mode", required=False, help="Super Quiet mode: turns off ALL RUNNER messages except [INPUT].")
parser.add_argument('-r', action="store", dest="host_match", required=False, help="Select Hosts matching supplied pattern")
parser.add_argument('-c', action="store", dest="command_string", required=False, help="Command to run")
parser.add_argument('-s', action="store_true", dest="sudo", required=False, help="Run command inside root shell using sudo") 
parser.add_argument('-t', action="store", dest="connect_timeout", required=False, help="ssh timeout to hosts in seconds")
parser.add_argument('-T', action="store", dest="threads", required=False, help="# of threads to run (don't get crazy)")
parser.add_argument('-u', action="store", dest="site_user", required=False, help="Specify a username (by default I use who you are logged in as)")
parser.add_argument('-1', action="store_true", dest="host_per_pool", required=False, help="One host per pool")
args = parser.parse_args()

##GLOBAL##
logging.getLogger('paramiko.transport').addHandler(logging.NullHandler())

stime = time.time()

## SET TIMEOUT ##
connect_timeout = 5
if args.connect_timeout:
    connect_timeout = int(args.connect_timeout)

## SET THREADS / WORKERS ##
workers = 20
if args.threads:
    workers = int(args.threads)

## SET USER / PASS ##
site_user = getpass.getuser()
site_passwd = ''
if args.site_user:
    site_user = args.site_user

failed_logins = []
successful_logins = []

tstamp = datetime.datetime.now().strftime("%Y-%m-%d.%H:%M:%S")
logfile_dir = 'logs'
if not os.path.exists(logfile_dir):
    os.makedirs(logfile_dir)
logfile_path = '%s/runner.log.%s' % (logfile_dir, tstamp)
logfile = open(logfile_path, 'w')

## END GLOBAL ##

def ssh_to_host(hosts, site_passwd):
    for i in range(workers):
        t = threading.Thread(target=worker, args=(site_user, site_passwd))
        t.daemon = True
        t.start()

    for hostname in hosts:
        hostname = hostname.rstrip()
        q.put(hostname)

    q.join()

def worker(site_user, site_passwd):
    while True:
        hostname = q.get()
        node_shell(hostname, site_user, site_passwd)
        q.task_done()


def node_shell(hostname, site_user, site_passwd):
    ssh = paramiko.SSHClient()
    proxy_command = "sconnect -4 -w 4 -S localhost:8081 %s %s" % (hostname,'22')
    proxy_sock = paramiko.ProxyCommand(proxy_command)
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        ssh.connect(hostname, username=site_user, password=site_passwd, timeout=connect_timeout, sock=proxy_sock)
        transport = ssh.get_transport()
        transport.set_keepalive(1)

        cmd = args.command_string
        if args.sudo:
            try:
                ## have to use invoke_shell for sudo due to ssh config on machines requiring a TTY
                channel = ssh.invoke_shell()
                sudocmd = 'sudo ' + cmd

                channel.send(sudocmd + '\n')

                ## wait for the sudo password prompt, then send the site password
                buff = ''
                while not '[sudo] password' in buff:
                    resp = channel.recv(9999)
                    buff += resp

                channel.send(site_passwd + '\n')

                ## read output until the shell prompt returns
                buff = ''
                while not buff.endswith('$ '):
                    resp = channel.recv(9999)
                    buff += resp

                for line in buff.split('\n'):
                    log_and_print("%s: %s" % (hostname, line))

            except Exception as e:
                log_and_print("ERROR: Sudo failed: %s" % (e))

        else:
            (stdin, stdout, stderr) = ssh.exec_command(cmd)

            ## stdout
            for line in stdout.readlines():
                line = line.rstrip()
                log_and_print("%s: %s" % (hostname, line))
            ## stderr
            for line in stderr.readlines():
                line = line.rstrip()
                log_and_print("%s: %s" % (hostname, line))

        successful_logins.append(hostname)
        ssh.close()

    except Exception as e:
        log_and_print("%s: failed to login : %s" % (hostname, e))
        failed_logins.append(hostname)
        ssh.close()

def log_and_print(message):
    if args.super_quiet_mode or args.list_only:
        if "RUNNER [INPUT]" in message or "RUNNER [ERROR]" in message or "RUNNER" not in message:
            print message
            logfile.write(message + '\n')
    elif args.quiet_mode or args.list_only:
        if "RUNNER [INFO]" not in message:
            print message
            logfile.write(message + '\n')
    else:
        print message
        if not args.list_only:
            logfile.write(message + '\n')

def get_hosts(file_path):
    if os.path.exists(file_path):
        hosts = open(file_path)
        selected_hosts = []
        if not args.host_match:
            selected_hosts = list(hosts)
            log_and_print("RUNNER [INFO]: SELECTING ALL HOSTS")
        else:
            host_match = args.host_match
            for host in hosts:
                if re.search(host_match, host):
                    selected_hosts.append(host)
            log_and_print("RUNNER [INFO]: MATCHING HOSTNAMES WITH '%s'" % (host_match))
    else:
        log_and_print("RUNNER [ERROR]: %s does not exist ! Try running ./update-runner-hosts" % (file_path))
        exit()

    ## Select one host per pool
    if args.host_per_pool:
        seen = {}
        host_per_pool = []
        for host in selected_hosts:
	    # Here strip values that make hostnames unique like #'s
	    # That way the dict matches after 1 host per pool has been seen 
            nhost = re.sub("\d+?\.", ".", host) #Removing #'s in a hostname like host1234.tuxlabs.com
            if not nhost in seen:
                seen[nhost] = 1
                host_per_pool.append(host)
        selected_hosts = host_per_pool

    log_and_print("RUNNER: %s HOSTS HAVE BEEN SELECTED" % (len(selected_hosts)))
    return selected_hosts

if __name__ == "__main__":
    file_path = 'hosts/hosts-all' ## update-hosts-all creates the DIR 

    if args.file_path:
        file_path = args.file_path
        if '~' in file_path:
            print "RUNNER [ERROR]: -f does not support '~'"
            exit()

    if args.list_only or args.command_string:
        selected_hosts = get_hosts(file_path)
        if args.list_only:
            for host in selected_hosts:
                host = host.rstrip()
                log_and_print(host)
            log_and_print("\nThere were %s hosts listed." % (len(selected_hosts)))
            exit()

        else:
            log_and_print("RUNNER [INFO]: LOGFILE SET - %s" % (logfile_path))
            log_and_print("RUNNER [INFO]: USER SET - %s" % (site_user))
            log_and_print("RUNNER [INFO]: SSH CONNECT TIMEOUT is: %s seconds" % (connect_timeout))
            log_and_print("RUNNER [INFO]: THREADS SET - %s" % (workers))
            if args.sudo:
                log_and_print("RUNNER [INFO]: SUDO IS ON")

            site_passwd = getpass.getpass("RUNNER [INPUT]: Please Enter Site Pass: ")

            q = Queue.Queue()

            ssh_to_host(selected_hosts,site_passwd)

            etime=time.time()
            run_time = int(etime-stime)

            timestamp = str(datetime.timedelta(seconds=run_time))
            log_and_print("\nRUNNER [RESULT]: Successfully logged into %s/%s hosts and ran your commands in %s second(s)" % (len(successful_logins), len(selected_hosts), timestamp))
            log_and_print("RUNNER [RESULT]: There were %s login failures.\n" % (len(failed_logins)))
            if len(failed_logins) > 0:
                for failed_host in failed_logins:
                    log_and_print("RUNNER [RESULT]: Failed to login to: %s" % (failed_host))
    else:
        parser.print_help()
        output = "\nRUNNER [INFO]: Either -l (list hosts only) or -c (run command string) is required.\n"
        log_and_print(output)


How To: Add A Compute Node To Openstack Icehouse Using Packstack


Pre-requisites

This article is a continuation of the previous article I wrote on how to do a single-node all-in-one (AIO) Openstack Icehouse install using Red Hat's packstack. A working Openstack AIO installation using packstack is required for this article. If you do not already have a functioning AIO install of Openstack, please refer to the previous article before continuing on to this article's steps.

Preparing Our Compute Node

Much like in our previous article we first need to go through and setup our system and network properly to work with Openstack. I started with a minimal CentOS 6.5 install, and then configured the following

  1. resolv.conf
  2. sudoers
  3. my network interfaces: eth0 (the 192.168.1.x external network) and eth1 (the 10.0.0.x internal network)
    1. Hostname: ruby.tuxlabs.com ( I also setup DNS for this )
    2. EXT IP: 192.168.1.11
    3. INT IP: 10.0.0.2
  4. A local user + added him to wheel for sudo
  5. I installed these handy dependencies
    1. yum install -y openssh-clients
    2. yum install -y yum-utils
    3. yum install -y wget
    4. yum install -y bind-utils
  6. And I disabled SELinux (see the sketch below)
    1. Don’t forget to reboot after
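For reference, the SELinux step on CentOS 6 usually amounts to something like this (a sketch of the typical approach):

[root@ruby ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
[root@ruby ~]# reboot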

To see how I setup the above pre-requisites see the “Setting Up Our Initial System” section on the previous controller install here : http://tuxlabs.com/?p=82

Adding Our Compute Node Using PackStack

For starters we need to follow the steps in this link  https://openstack.redhat.com/Adding_a_compute_node

I am including the link for reference, but you don’t have to click it as I will be listing the steps below.

On your controller node ( diamond.tuxlabs.com )

First, locate your answers file from your previous packstack all-in-one install.

[root@diamond tuxninja]# ls *answers*
packstack-answers-20140802-125113.txt
[root@diamond tuxninja]#

 Edit the answers file

Change lo to eth1 (assuming that is your private 10. interface) for both CONFIG_NOVA_COMPUTE_PRIVIF & CONFIG_NOVA_NETWORK_PRIVIF

[root@diamond tuxninja]# egrep 'CONFIG_NOVA_COMPUTE_PRIVIF|CONFIG_NOVA_NETWORK_PRIVIF' packstack-answers-20140802-125113.txt
CONFIG_NOVA_COMPUTE_PRIVIF=eth1
CONFIG_NOVA_NETWORK_PRIVIF=eth1
[root@diamond tuxninja]#

Change CONFIG_COMPUTE_HOSTS to the ip address of the compute node you want to add. In our case ‘192.168.1.11’. Additionally, validate the ip address for CONFIG_NETWORK_HOSTS is your controller’s ip since you do not run a separate network node.

[root@diamond tuxninja]# egrep 'CONFIG_COMPUTE_HOSTS|CONFIG_NETWORK_HOSTS' packstack-answers-20140802-125113.txt
CONFIG_COMPUTE_HOSTS=192.168.1.11
CONFIG_NETWORK_HOSTS=192.168.1.10
[root@diamond tuxninja]#

That’s it. Now run packstack again on the controller

[tuxninja@diamond yum.repos.d]$ sudo packstack --answer-file=packstack-answers-20140802-125113.txt

When that completes, ssh into or switch terminals over to your compute node you just added.

On the compute node ( ruby.tuxlabs.com )

Validate that the relevant openstack compute services are running

[root@ruby ~]# openstack-status
== Nova services ==
openstack-nova-api:                     dead      (disabled on boot)
openstack-nova-compute:                 active
openstack-nova-network:                 dead      (disabled on boot)
openstack-nova-scheduler:               dead      (disabled on boot)
== neutron services ==
neutron-server:                         inactive  (disabled on boot)
neutron-dhcp-agent:                     inactive  (disabled on boot)
neutron-l3-agent:                       inactive  (disabled on boot)
neutron-metadata-agent:                 inactive  (disabled on boot)
neutron-lbaas-agent:                    inactive  (disabled on boot)
neutron-openvswitch-agent:              active
== Ceilometer services ==
openstack-ceilometer-api:               dead      (disabled on boot)
openstack-ceilometer-central:           dead      (disabled on boot)
openstack-ceilometer-compute:           active
openstack-ceilometer-collector:         dead      (disabled on boot)
== Support services ==
libvirtd:                               active
openvswitch:                            active
messagebus:                             active
Warning novarc not sourced
[root@ruby ~]#

 Back on the controller ( diamond.tuxlabs.com )

We should now be able to validate that ruby.tuxlabs.com has been added as a compute node hypervisor.

[tuxninja@diamond ~]$ sudo -s
[root@diamond tuxninja]# source keystonerc_admin
[root@diamond tuxninja(keystone_admin)]# nova hypervisor-list
+----+---------------------+
| ID | Hypervisor hostname |
+----+---------------------+
| 1  | diamond.tuxlabs.com |
| 2  | ruby.tuxlabs.com    |
+----+---------------------+
[root@diamond tuxninja(keystone_admin)]# nova-manage service list
Binary           Host                                 Zone             Status     State Updated_At
nova-consoleauth diamond.tuxlabs.com                  internal         enabled    :-)   2014-10-12 20:48:34
nova-conductor   diamond.tuxlabs.com                  internal         enabled    :-)   2014-10-12 20:48:35
nova-scheduler   diamond.tuxlabs.com                  internal         enabled    :-)   2014-10-12 20:48:27
nova-compute     diamond.tuxlabs.com                  nova             enabled    :-)   2014-10-12 20:48:32
nova-cert        diamond.tuxlabs.com                  internal         enabled    :-)   2014-10-12 20:48:31
nova-compute     ruby.tuxlabs.com                     nova             enabled    :-)   2014-10-12 20:48:35
[root@diamond tuxninja(keystone_admin)]#

Additionally, you can verify it in the Openstack Dashboard

[Screenshot: the Hypervisors list in the Openstack Dashboard]

Next we are going to try to boot an instance using the new ruby.tuxlabs.com hypervisor. To do this we will need a few pieces of information. First let’s get our OS images list.

[root@diamond tuxninja(keystone_admin)]# glance image-list
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| ID                                   | Name                | Disk Format | Container Format | Size      | Status |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| 0b3f2474-73cc-4df2-ad0e-fdb7a7f7c8a1 | cirros              | qcow2       | bare             | 13147648  | active |
| 737a0060-6e80-415c-b66b-a20893d9888b | Fedora 6.4          | qcow2       | bare             | 210829312 | active |
| 952ac512-19da-47a7-81a4-cfede18c7f45 | ubuntu-server-12.04 | qcow2       | bare             | 260964864 | active |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
[root@diamond tuxninja(keystone_admin)]#

Great, now we need the ID of our private network

[root@diamond tuxninja(keystone_admin)]# neutron net-show private
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | d1a89c10-0ae2-43f0-8cf2-f02c20e19618 |
| name                      | private                              |
| provider:network_type     | vxlan                                |
| provider:physical_network |                                      |
| provider:segmentation_id  | 10                                   |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | b8760f9b-3c0a-47c7-a5af-9cb533242f5b |
| tenant_id                 | 7bdf35c08112447b8d2d78cdbbbcfa09     |
+---------------------------+--------------------------------------+
[root@diamond tuxninja(keystone_admin)]#

Ok now we are ready to proceed with the nova boot command.

[root@diamond tuxninja(keystone_admin)]#  nova boot --flavor m1.small --image 'ubuntu-server-12.04' --key-name cloud --nic net-id=d1a89c10-0ae2-43f0-8cf2-f02c20e19618 --hint force_hosts=ruby.tuxlabs.com test
+--------------------------------------+------------------------------------------------------------+
| Property                             | Value                                                      |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                     |
| OS-EXT-AZ:availability_zone          | nova                                                       |
| OS-EXT-SRV-ATTR:host                 | -                                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                                          |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000019                                          |
| OS-EXT-STS:power_state               | 0                                                          |
| OS-EXT-STS:task_state                | scheduling                                                 |
| OS-EXT-STS:vm_state                  | building                                                   |
| OS-SRV-USG:launched_at               | -                                                          |
| OS-SRV-USG:terminated_at             | -                                                          |
| accessIPv4                           |                                                            |
| accessIPv6                           |                                                            |
| adminPass                            | XHUumC5YbE3J                                               |
| config_drive                         |                                                            |
| created                              | 2014-10-12T20:59:47Z                                       |
| flavor                               | m1.small (2)                                               |
| hostId                               |                                                            |
| id                                   | f7b9e8bb-df45-4b94-a896-5600f47c269b                       |
| image                                | ubuntu-server-12.04 (952ac512-19da-47a7-81a4-cfede18c7f45) |
| key_name                             | cloud                                                      |
| metadata                             | {}                                                         |
| name                                 | test                                                       |
| os-extended-volumes:volumes_attached | []                                                         |
| progress                             | 0                                                          |
| security_groups                      | default                                                    |
| status                               | BUILD                                                      |
| tenant_id                            | 7bdf35c08112447b8d2d78cdbbbcfa09                           |
| updated                              | 2014-10-12T20:59:47Z                                       |
| user_id                              | 6bb8fcf3ce9446838e50a6b98fbb5afe                           |
+--------------------------------------+------------------------------------------------------------+
[root@diamond tuxninja(keystone_admin)]#

Fantastic. That command should look familiar from our previous tutorial; it is the standard command for launching new VM instances using the command line, with one exception: '--hint force_hosts=ruby.tuxlabs.com'. This part of the command forces the scheduler to use ruby.tuxlabs.com as the hypervisor.

Once the VM is building we can validate that it is on the right hypervisor like so.

[root@diamond tuxninja(keystone_admin)]# nova hypervisor-servers ruby.tuxlabs.com
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| f7b9e8bb-df45-4b94-a896-5600f47c269b | instance-00000019 | 2             | ruby.tuxlabs.com    |
+--------------------------------------+-------------------+---------------+---------------------+
[root@diamond tuxninja(keystone_admin)]# nova hypervisor-servers diamond.tuxlabs.com
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| a4c67465-d7ef-42b6-9c2a-439f3b13e841 | instance-00000017 | 1             | diamond.tuxlabs.com |
| 0c34028d-dfb6-4fdf-b9f7-daade66f2107 | instance-00000018 | 1             | diamond.tuxlabs.com |
+--------------------------------------+-------------------+---------------+---------------------+
[root@diamond tuxninja(keystone_admin)]#

You can see from the output above I have 2 VM’s on my existing controller ‘diamond.tuxlabs.com’ and the newly created instance is on ‘ruby.tuxlabs.com’ as instructed, awesome.
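You can also check a single instance directly; nova show exposes the same OS-EXT-SRV-ATTR:hypervisor_hostname field you saw in the boot output, which should now report ruby.tuxlabs.com for the test instance:

[root@diamond tuxninja(keystone_admin)]# nova show test | grep hypervisor_hostname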

Now that you are sure you set up your compute node correctly, and can boot a VM on a specific hypervisor via the command line, you might be wondering how this works using the GUI. The answer is: a little differently 🙂

The Openstack Nova Scheduler

The Nova Scheduler in Openstack is responsible for determining which compute node a VM should be created on. If you are familiar with VMware, this is like DRS, except it only happens on initial creation; there is no rebalancing as resources are consumed over time. Using the Openstack Dashboard GUI I am unable to tell nova to boot off a specific hypervisor; to do that I have to use the command line above (if someone knows of a way to do this using the GUI let me know, I have a feeling if it is not added already, they will add the ability to send a hint to nova from the GUI in a later version). In theory you can trust the nova-scheduler service to automatically balance the usage of compute resources (CPU, memory, disk, etc.) based on its default configuration. However, if you want to ensure that certain VMs live on certain hypervisors, you will want to use the command line above. For more information on how the scheduler works see : http://cloudarchitectmusings.com/2013/06/26/openstack-for-vmware-admins-nova-compute-with-vsphere-part-2/
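If you are curious what that default configuration actually is on a packstack install, you can peek at the scheduler-related options nova is running with on the controller (exact option names vary by release, so treat this as a quick sketch):

[root@diamond tuxninja]# grep -i scheduler /etc/nova/nova.conf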

The End

That is all for now, hopefully this tutorial was helpful and accurately assisted you in expanding your Openstack compute resources & knowledge of Openstack. Until next time !
