How to use Linux instances on EC2 with Active Directory

Linux instances running on Amazon EC2 can now be joined to Simple AD directories from the AWS Directory Service.

This lets users log in to all of their EC2 instances with a single set of domain credentials (no key pair needed), and lets domain admins set access controls that determine which users can access particular instances.

Simple AD is a managed directory powered by a Samba 4 Active Directory-compatible server. It supports commonly used features such as user accounts, group memberships, domain-joining Amazon EC2 instances running Linux and Microsoft Windows, Kerberos-based single sign-on (SSO), and Group Policies.

This makes it much easier for companies to manage Amazon EC2 instances in the Amazon Web Services cloud.

Amazon have more details on their blog.

Hardening RedHat (CentOS) Linux for use on Cloud

If you need to deploy Linux in the cloud, you should consider hardening the instance before deployment. Below are guidelines we have pulled together for hardening a Red Hat or CentOS instance.

Red Hat Linux hardening guidelines

Enable SELinux

Ensure that /etc/selinux/config includes the following lines:
SELINUX=enforcing
SELINUXTYPE=targeted

Run the following on the command line to allow httpd to make outbound network connections (add -P to make the change persist across reboots)
setsebool httpd_can_network_connect=1

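Check the current status using
sestatus
To switch enforcing mode on or off at runtime, write 1 (enforcing) or 0 (permissive) to the enforce file; setenforce 1 and setenforce 0 do the same
echo 1 >/selinux/enforce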
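
As a quick sanity check, the two config settings above can be verified with a small script. This is only a minimal sketch: the function name selinux_config_ok is hypothetical, and the file path is passed as an argument so the check can also be run against a copy of the file.

```shell
# selinux_config_ok FILE
# Succeeds only if FILE sets SELINUX=enforcing and SELINUXTYPE=targeted.
# (Hypothetical helper; pass /etc/selinux/config on a real system.)
selinux_config_ok() {
    cfg="$1"
    # Anchoring at line start means commented-out entries such as
    # "#SELINUX=enforcing" are not counted.
    grep -Eq '^[[:space:]]*SELINUX=enforcing[[:space:]]*$' "$cfg" &&
        grep -Eq '^[[:space:]]*SELINUXTYPE=targeted[[:space:]]*$' "$cfg"
}
```

For example, selinux_config_ok /etc/selinux/config && echo "config OK". Note that this only inspects the file; sestatus remains the way to see what the running kernel is actually enforcing.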

Disable unneeded services

chkconfig anacron off
chkconfig autofs off
chkconfig avahi-daemon off
chkconfig gpm off
chkconfig haldaemon off
chkconfig mcstrans off
chkconfig mdmonitor off
chkconfig messagebus off
chkconfig readahead_early off
chkconfig readahead_later off
chkconfig xfs off
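
The same list can be driven from a loop, which makes it easier to review before anything is changed. The sketch below is an illustration only: the names SERVICES and disable_services are made up for this example, and by default it just prints the commands rather than running them.

```shell
# Services to switch off, as listed above.
SERVICES="anacron autofs avahi-daemon gpm haldaemon mcstrans mdmonitor messagebus readahead_early readahead_later xfs"

# disable_services [RUNNER]
# Runs "chkconfig <service> off" for each service. RUNNER defaults to
# "echo", so the commands are only printed (dry run); pass an empty
# string to really execute them (requires root).
disable_services() {
    runner="${1-echo}"
    for svc in $SERVICES; do
        # $runner is deliberately unquoted so an empty value vanishes.
        $runner chkconfig "$svc" off
    done
}
```

Run disable_services to preview and disable_services "" to apply. chkconfig only changes what starts at boot, so a service that is already running stays up until it is stopped or the instance is rebooted.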

Disable SUID and SGID Binaries

chmod -s /bin/ping6
chmod -s /usr/bin/chfn
chmod -s /usr/bin/chsh
chmod -s /usr/bin/chage
chmod -s /usr/bin/wall
chmod -s /usr/bin/rcp
chmod -s /usr/bin/rlogin
chmod -s /usr/bin/rsh
chmod -s /usr/bin/write
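
Before and after stripping the bits, it can help to list which binaries still carry them. A minimal sketch using find follows; the function name list_setid is hypothetical.

```shell
# list_setid DIR
# Prints every regular file under DIR that has the SUID or SGID bit set.
# -xdev keeps find on a single filesystem (so /proc, NFS mounts and
# similar are skipped); permission errors are silenced.
list_setid() {
    find "$1" -xdev -type f \( -perm -4000 -o -perm -2000 \) -print 2>/dev/null
}
```

Running list_setid / as root gives the full inventory for the root filesystem, which can then be compared against the chmod list above.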

Set Kernel parameters

At boot, the system reads and applies a set of kernel parameters from /etc/sysctl.conf. Add the following lines to that file to prevent certain kinds of attacks:

net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.all.accept_source_route=0
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_messages=1
kernel.exec-shield=1
kernel.randomize_va_space=1
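
To confirm that a sysctl.conf actually contains these settings, something like the following sketch can be used. The names WANTED_SYSCTLS and missing_sysctls are invented for this example, and the check is purely textual: it does not query the running kernel.

```shell
# Settings recommended above, one per line.
WANTED_SYSCTLS="net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.all.accept_source_route=0
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_messages=1
kernel.exec-shield=1
kernel.randomize_va_space=1"

# missing_sysctls FILE
# Prints each wanted setting that FILE does not contain, ignoring
# spaces around "=". Prints nothing if all settings are present.
missing_sysctls() {
    # Normalise "key = value" to "key=value" and trim each line.
    normalised=$(sed 's/[[:space:]]*=[[:space:]]*/=/; s/^[[:space:]]*//; s/[[:space:]]*$//' "$1")
    printf '%s\n' "$WANTED_SYSCTLS" | while IFS= read -r want; do
        printf '%s\n' "$normalised" | grep -qxF -e "$want" || printf '%s\n' "$want"
    done
}
```

For example, missing_sysctls /etc/sysctl.conf lists whatever still needs adding. After editing the file, sysctl -p (as root) applies the settings without waiting for a reboot.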

Disable IPv6

Unless your policy or network configuration requires it, disable IPv6. To do so, prevent the kernel module from loading by adding the following line to /etc/modprobe.conf:
install ipv6 /bin/true
Next, add or change the following lines in /etc/sysconfig/network:
NETWORKING_IPV6=no
IPV6INIT=no

Nessus PCI Scan

Upgrade OpenSSH to the latest version
http://www.thecpaneladmin.com/upgrading-openssh-on-centos-5/

Upgrade bash to the latest version

Turn off version information in HTTP headers

In /etc/httpd/conf/httpd.conf set the following values
ServerTokens Prod
ServerSignature Off
TraceEnable off

In /etc/php.ini set
expose_php = Off
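
One way to check the result is to look at the response headers after restarting httpd. The filter below is a rough sketch (the function name leaky_headers is hypothetical) that flags Server or X-Powered-By headers still carrying a version number.

```shell
# leaky_headers
# Reads HTTP response headers on stdin and prints any Server or
# X-Powered-By header that still contains a digit, which usually
# means a version number is being disclosed.
leaky_headers() {
    # Strip carriage returns first: HTTP header lines end in CRLF.
    tr -d '\r' | grep -iE '^(Server|X-Powered-By):.*[0-9]'
}
```

Typical use would be curl -sI http://localhost/ | leaky_headers, where the URL is just a placeholder for your own site. No output (and a non-zero exit status from grep) means nothing was flagged.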

Change MySQL to listen only on localhost

Edit /etc/my.cnf and add the following to the [mysqld] section
bind-address = 127.0.0.1
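
After restarting MySQL, the listening address can be confirmed from netstat output. The sketch below assumes MySQL is on its default port 3306; the function name mysql_exposed is made up for this example.

```shell
# mysql_exposed
# Reads `netstat -tln` output on stdin and prints the local address of
# any listener on port 3306 (assumed default MySQL port) that is not
# bound to 127.0.0.1. Prints nothing when MySQL is localhost-only.
mysql_exposed() {
    # Field 4 of netstat -tln output is the local address:port.
    awk '$4 ~ /:3306$/ && $4 != "127.0.0.1:3306" { print $4 }'
}
```

Use it as netstat -tln | mysql_exposed. Any output, such as 0.0.0.0:3306, means the server is still reachable on a non-loopback address.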

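Make sure only ports 80, 443 and 21 are open

vi /etc/sysconfig/iptables
and add the following rules above the final REJECT rule (use RH-Firewall-1-INPUT in place of INPUT if that is the chain your file already uses), then run service iptables restart
-A INPUT -m state --state NEW -m tcp -p tcp --dport 443 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 21 -j ACCEPT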

Finding disk bottlenecks on Linux

We recently had a client whose site was slowing down. They initially thought it was due to MySQL locks, but this was not the case. It was clear that the problem was with the disk: some process was using it heavily. When running top we could see that the CPU I/O wait time was 25-30%.

Running vmstat also showed that the wait time was quite high, so the question was which process was causing the issue. Interestingly, a Google web search brings up almost no coherent posts on finding disk bottlenecks. The usual answer is good old iostat, which reports disk reads and writes per partition but does not tell you which process is causing the disk I/O. Later versions of the Linux kernel provide quite good diagnostic information about disk I/O, but this is not documented in the reasonably popular older posts on the subject of disk thrashing.
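
To watch the wait figure mentioned above without reading the whole vmstat table, the "wa" column can be pulled out by name. This is a small sketch; the function name iowait_column is hypothetical.

```shell
# iowait_column
# Reads vmstat output on stdin and prints the "wa" (I/O wait) value of
# each sample line. The column is located by name from the header row,
# so it does not depend on how many columns this vmstat version prints.
iowait_column() {
    awk '
        # Header row (starts with "r"): remember which field is "wa".
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "wa") col = i; next }
        # Sample rows start with a number; print the remembered field.
        col && $1 ~ /^[0-9]+$/ { print $col }
    '
}
```

For example, vmstat 5 3 | iowait_column prints three values, five seconds apart. The first sample vmstat reports is an average since boot, so the later ones are the ones to watch.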

For the latest kernel versions you can use iotop to pinpoint the process that is doing the disk I/O. To do this:

1. Start iotop.

2. Press the left arrow twice so that the sort field is on disk write.

3. iotop now shows in real time which process is writing to the disk.

4. If you wish to get a historic view of writes to date, press ‘a’ (press ‘a’ one more time to switch back).

In this client's case the issue was that their temp directory was on the same physical drive as their site and MySQL DB. Moving the temp directory to a separate drive resolved the issue.
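
Where iotop is not installed, a rough equivalent of its accumulated view can be built from the per-process I/O counters that newer kernels expose in /proc/PID/io. The sketch below is an assumption-laden illustration, not a replacement for iotop: the function name top_writers is made up, the kernel must have task I/O accounting enabled, and reading other users' processes needs root.

```shell
# top_writers [PROC_ROOT]
# Prints "write_bytes pid" for every process whose io file is readable,
# largest writer first. PROC_ROOT defaults to /proc.
top_writers() {
    root="${1:-/proc}"
    for io in "$root"/[0-9]*/io; do
        [ -r "$io" ] || continue
        # Derive the PID from the path, e.g. /proc/1234/io -> 1234.
        pid=${io%/io}; pid=${pid##*/}
        # write_bytes is cumulative since the process started.
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$io" 2>/dev/null)
        [ -n "$bytes" ] && printf '%s %s\n' "$bytes" "$pid"
    done | sort -rn
}
```

For example, top_writers | head -5 shows the five heaviest writers so far. Because the counters are totals since each process started, run it twice a few seconds apart and compare if you want current activity.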

 

EC2 Linux Monitoring & Tuning Tips

When deploying on EC2, even though Amazon provides the hardware infrastructure, you still need to tune your instance's operating system and monitor your application. You should review your hardware/software requirements, your application design and your deployment strategy.

The Operating System

Change ulimit

‘ulimit -n’ specifies the maximum number of open files that a process may have. If the value is too low, a file open error, memory allocation failure, or connection establishment error might be reported. By default this is set to 1024; normally you should increase it to at least 8096.

Issue the following command to set the value for the current shell session.

ulimit -n 8096

Use the ulimit -a command to display the current values for all limits on system resources.
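
A start-up script can check the limit before launching the application. The following is a minimal sketch; the function name nofile_ok is hypothetical, and 8096 is simply the figure suggested above.

```shell
# nofile_ok WANT [HAVE]
# Succeeds if the soft open-file limit HAVE is at least WANT.
# HAVE defaults to this shell's current soft limit (ulimit -Sn).
nofile_ok() {
    want="$1"
    have="${2:-$(ulimit -Sn)}"
    # "unlimited" is a valid answer and always satisfies the check.
    [ "$have" = "unlimited" ] || [ "$have" -ge "$want" ]
}
```

For example, nofile_ok 8096 || echo "open-file limit too low". Since ulimit -n only affects the current shell and its children, a lasting change is usually made in /etc/security/limits.conf with nofile entries for the relevant user.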

Tune the Network

A good, detailed reference for Linux IP tuning is here. Some of the important parameters to change for distributed applications are below:

TCP_FIN_TIMEOUT

The tcp_fin_timeout variable tells the kernel how long to keep sockets in the FIN-WAIT-2 state if you were the one closing the socket. It takes an integer value, which defaults to 60 seconds. To set the value to 30, issue the command

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

TCP_KEEPALIVE_INTERVAL

The tcp_keepalive_intvl variable tells the kernel how long to wait for a reply to each keepalive probe before sending the next one. This value is therefore extremely important when you calculate how long it will take before a connection dies a keepalive death. The variable takes an integer value and the default is 75 seconds. To set the value to 15, issue the following command

echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl

TCP_KEEPALIVE_PROBES

The tcp_keepalive_probes variable tells the kernel how many TCP keepalive probes to send out before it decides a specific connection is broken.
This variable takes an integer value. The default is to send out 9 probes before telling the application that the connection is broken. To change the value to 5, use the following command.

echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
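
The interval and probe count combine with the idle time before the first probe (tcp_keepalive_time, not covered above; its usual default of 7200 seconds is assumed here) to give the time until a dead connection is dropped. A small worked sketch, with a hypothetical function name:

```shell
# keepalive_death_seconds IDLE INTVL PROBES
# Seconds until an idle connection to an unresponsive peer is declared
# dead: IDLE seconds before the first probe, then PROBES probes sent
# INTVL seconds apart.
keepalive_death_seconds() {
    echo $(( $1 + $2 * $3 ))
}
```

With the defaults, keepalive_death_seconds 7200 75 9 gives 7875 seconds; with the values set above, keepalive_death_seconds 7200 15 5 gives 7275. Values written to /proc are lost at reboot, so to keep them add the matching net.ipv4.tcp_fin_timeout, net.ipv4.tcp_keepalive_intvl and net.ipv4.tcp_keepalive_probes lines to /etc/sysctl.conf.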

 

Monitoring

You can monitor system resources from the command line, but monitoring systems make life easier. A couple of free, open source monitoring tools that we use:

  • Ganglia: a free monitoring system
  • Hyperic: available in both commercial and free editions

 

Logging

You will be amazed how few projects care about logging until they hit a problem. Have a consistent logging procedure in place to collect the logs from different machines so that you can troubleshoot when a problem occurs.

Linux Commands

Some Linux commands that we use regularly and that you might find useful. More details can be found here, here and here

  • top: Display Linux tasks
  • vmstat: Report virtual memory statistics
  • free: Display amount of free and used memory in the system
  • netstat: Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
  • ps: Report a snapshot of the current processes
  • iostat: Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions
  • sar: Collect, report, or save system activity information
  • tcpdump: Dump traffic on a network
  • strace: Trace system calls and signals