Thursday, 24 July 2008

RedHat Installer aka Anaconda

What an annoying, inflexible tool! It reminds me of Windows in its methodology, e.g. abstract everything from the user so that he doesn't have to know anything, hide real problems from the user so he doesn't know what's broken and won't worry, and stop the user from manually setting anything up so he can never fix the problems caused by the automated way the installer works. What happened to the rescue shell that used to start back in the day, giving you some flexibility? What happened to the option of loading another driver at install time from a supplemental CD/DVD or the Internet? What happened to the ability to partition using fdisk or something separate from the anaconda script?

IMHO all this type of thing breeds is a new type of Linux admin, very similar to the Windows admin of late, who can only follow others' step-by-step work and is heavily reliant on wizards and management GUIs, while pissing off the experienced Open Source community members who are forced to use distributions like these because of their support structure and market share.

Monday, 21 July 2008

Dell Hell

Had to contact Dell today for some assistance with the new PowerEdge 2950 servers we have just bought. We planned to run RAID6 across all 6 drives and use that for both the OS and storage.

Seemed like a good plan till I realised partitions >= 2TB are not supported without EFI support but, more importantly, grub and lilo can't boot EFI/GPT configured drives.
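
The ceiling on a classic MBR (msdos) partition table comes from its 32-bit sector counts. A quick sanity check of the arithmetic, assuming the usual 512-byte sectors (an assumption; the post does not say what sector size the controller presents):

```shell
#!/bin/sh
# MBR stores sector counts in 32 bits; 512-byte sectors are assumed here.
max_sectors=4294967296      # 2^32
sector_bytes=512
limit_bytes=$((max_sectors * sector_bytes))
limit_tib=$((limit_bytes / 1024 / 1024 / 1024 / 1024))
echo "MBR limit: ${limit_tib} TiB"
```

Anything larger than that needs a GPT label, which is where the boot loader trouble starts.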

There is however a workaround for this in the form of a Linux kernel driver for EFI support called efivars, which gives access to those big partitions, and a boot manager called efibootmgr which seems like a modified version of lilo.

The kernel module required seems to be included with the current Linux kernel in RH/CentOS 5.1 but efibootmgr needs to be installed and the disk partitioned with an EFI partition for boot at the start of the disk as outlined in

Seems simple enough but as an admin used to fdisk I could not achieve this in DiskDruid without some further instruction.

On calling Dell support I was told that support cannot be given as I'm not using their OEM RH 5.1 CD but rather a downloaded version without their branding. That would have no effect on the issues I could face, yet they still leave me unsupported!! After 10 minutes of explaining, the situation did not change, and asking to speak to a supervisor resulted in the original operator rejoining the conversation to reiterate that Dell support will not help to resolve this issue, and ending the conversation in a manner that led me to believe that was the last I'd hear from Dell.

However, today, just as I had finished cussing Dell support, I had a phone call from a knowledgeable chap with an Irish accent who talked me through the possible options, including one I was not aware of before. The RAID controller supplied with the PE 2950 can be configured to create virtual disks on the RAID arrays and, unlike I had guessed, you can have more than one of these per array, so it's possible to create a small virtual disk that grub/lilo can boot and still use one RAID6 array so no space is wasted :)

Just wanted to add that the first support op I spoke to called me back yesterday to confirm my resolution was successful. I'm still not clear whether I misunderstood the original operator and I was always going to be helped, he just wasn't positive I'd get the support I needed as other customers haven't had success, or whether my call was listened to and a different action to the usual decided on since I presented myself as a very influential client.

Friday, 11 July 2008

Stolen Laptop data Recovery Script

I was bored last night and got inspired by the post
here about someone having a laptop stolen, so I created a script to recover data from a stolen laptop and alert me of its current IP.

#!/bin/bash
# Description: Stolen Laptop data Recovery Script (untested)
# Author:
# Date: 2008/07/10 21:16 BST
# Usage: Copy to /usr/local/bin/ and crontab with the following:
# */10 * * * * /usr/local/bin/ >/dev/null 2>&1
# Requirements: You must have wget, tcping and nc installed and in the crontab user's path
# Variables (the values are not shown in this post; set them before use):
# Your home directory you want backed up. Remember the crontab user must have read rights to this directory.
home=''
# The host you want to back up your data to and are able to run nc exposed to the internet on.
backuphost=''
# The file you need to create to enable the recovery script. Just echo 1 to this.
stolenfile=''
# Your email address. I personally use one that SMSs my phone or set a rule to SMS me on my mail server
email=''
# URL of a page that shows your public IP inside a <b> tag (placeholder name; the address used is not shown in this post)
ipcheckurl=''
# The mails themselves. You may customise these if you so wish
subject='Your STOLEN Laptop is Online!'
ip=`wget -q -o /dev/null -O - "$ipcheckurl" | grep '<b>' | cut -d '>' -f3 | cut -d '<' -f1`
body="Your stolen laptop is now online at $ip\n
Please logon to $backuphost and run 'nc -k -l 31337 > home.tar.bz2'\n"
subject1='Your STOLEN Laptop Backup is now Complete!'
body1="You may now remove $stolenfile and quit your 'nc -l 31337 > home.tar.bz2' command\n"
echo -e $body > /var/tmp/body.txt
whois $ip >> /var/tmp/body.txt
echo -e $body1 > /var/tmp/body1.txt
# The recovery only runs while $stolenfile exists and can be fetched
wget -q "$stolenfile" -O /dev/null
if [ $? -eq 0 ]
then
    # Mail the address whenever the public IP has changed since the last run
    myip=''
    if [ -f /var/tmp/myip ]
    then
        myip=`cat /var/tmp/myip`
    fi
    if [ "$ip" != "$myip" ]
    then
        mail -s "$subject" "$email" < /var/tmp/body.txt
        echo "$ip" > /var/tmp/myip
    fi
    # Only send the backup if the listener is up and no transfer is already running
    tcping "$backuphost" 31337
    if [ $? -eq 0 ]
    then
        ps ax | grep "nc $backuphost 31337" | grep -v grep
        if [ $? -eq 1 ]
        then
            if [ ! -f /var/tmp/backupcomplete ]
            then
                tar -cjvf - "$home" | nc "$backuphost" 31337
                if [ $? -eq 0 ]
                then
                    mail -s "$subject1" "$email" < /var/tmp/body1.txt
                    echo 1 > /var/tmp/backupcomplete
                fi
            fi
        fi
    fi
else
    # Laptop is not flagged as stolen: clear any state left over from a previous run
    rm -f /var/tmp/backupcomplete /var/tmp/myip /var/tmp/body.txt /var/tmp/body1.txt
fi
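
The ip= line in the script scrapes the address out of a web page with grep and cut. Here is a small sketch of how that pipeline behaves on a made-up sample line. The real lookup page and its markup are not given in the post, so the HTML below is an assumption, and 203.0.113.7 is a documentation-only address:

```shell
#!/bin/sh
# Hypothetical page content: one tag before <b>, so the IP lands in field 3
# when the line is split on '>'.
sample='<html><b>203.0.113.7</b></html>'

# Same pipeline as the script, fed from the sample instead of wget.
ip=`printf '%s\n' "$sample" | grep '<b>' | cut -d '>' -f3 | cut -d '<' -f1`
echo "$ip"
```

If the page layout differs, the -f3 field number has to change to match.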

Thursday, 10 July 2008

CentOS 5 GFS Install

I have decided to include the steps I felt I needed to take to make a test install of GFS before I decided not to use it. It doesn't cover clvm configuration but it's very similar to LVM2 which is well documented on the net so I decided not to include it here.

#NTP is needed on every node in the cluster sync'd to the same place of course
yum -y install ntp
#Needed for gfs clustering
yum -y groupinstall Clustering
#GFS kernel module and FS utils
yum -y install gfs-kmod
yum -y install gfs-utils

#Needed to get the stupid system-config-cluster running for configuration of the cluster
yum -y install xorg-x11-xauth xorg-x11-fonts-base xorg-x11-fonts-Type1

#Enable NTP at boot and start it now
chkconfig ntpd on
service ntpd start

#Stop updatedb trawling our gfs mounts
echo 'PRUNEFS = "auto afs iso9660 sfs udf"
PRUNEPATHS = "/afs /media /net /sfs /tmp /udev /var/spool/cups /var/spool/squid /var/tmp /cvp /mnt/cvp /media"' > /etc/updatedb.conf

# Create the filesystem with 125 journals (one per node), cluster name coull and fs name cvp on /dev/hdb
gfs_mkfs -p lock_dlm -t coull:cvp -j 125 /dev/hdb

#Configure /etc/cluster/cluster.conf using system-config-cluster by forwarding the X connection back to your machine via SSH

#add cluster hosts to /etc/hosts not DNS as this introduces a point of failure and some slowdown.

#Start clustering and GFS services
service cman start
service clvmd start
service gfs start

#Mount SAN device which should be a clustered logical volume (CLVM)
mount -t gfs /dev/san1/lvol0 /san -o noatime

# Memcache anyone? Not sure what the options are for yet as I've never set it up before
/usr/bin/memcached -d -m 512 -l -p 11211 -u nobody
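
For the /etc/hosts step above, something like the following could keep the entries consistent across nodes. It is only a sketch: the add_host helper, the node names and the 192.0.2.x addresses are all made up for illustration, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# add_host FILE IP NAME: append "IP NAME" to FILE unless NAME is already listed.
add_host() {
    file=$1
    addr=$2
    name=$3
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s\t%s\n' "$addr" "$name" >> "$file"
    fi
}

# Demo against a scratch file; point it at /etc/hosts on a real node.
hosts=$(mktemp)
add_host "$hosts" 192.0.2.11 node1
add_host "$hosts" 192.0.2.12 node2
add_host "$hosts" 192.0.2.11 node1   # already present, so nothing is added
count=$(wc -l < "$hosts" | tr -d ' ')
echo "$count entries"
```

Running the same calls on every node means a repeated run does not pile up duplicate lines.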

GFS - Global File System

Thought I'd add my thoughts on GFS since the documentation on the net currently seems very fragmented and incomplete:

Large files can cause slowdown issues.

Machines really need two NICs: one for client access and one for access to the SAN, cluster communications and communication with the fence device.

The fence device is in most cases an APC power strip, so when a node fails the other nodes can reboot that machine as part of the failover process. The other option is to close the switch port of that machine so it can't access the SAN data, which IMO is worse than a reboot since recovery needs manual intervention.

When creating the GFS you need to have an idea how many nodes you are going to need, as adding more later is awkward (gfs_jadd needs the underlying volume grown first). I have read in places that there are limits on the number of nodes; it used to be very small (16) but now it's over 100 for sure, might even be 300 odd, but each place I look I get a different value. We need to decide on how many we may eventually have. I'm thinking just set it to 125 as I doubt we would ever have 125 web servers, but then if things go well this statement could come back to haunt me in my sleep ;)
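
One cost of a generous journal count is disk space, since every journal is allocated when the filesystem is made. A rough estimate, assuming the gfs_mkfs default journal size of 128 MB (check the -J option in your gfs_mkfs man page; the post does not state the size used):

```shell
#!/bin/sh
journals=125          # value passed to -j in the post
journal_mb=128        # assumed gfs_mkfs default journal size (-J)
total_mb=$((journals * journal_mb))
echo "Journal space: ${total_mb} MB"
```

Under that assumption, 125 journals take a little under 16 GB of the volume before any data is written.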

If we need to give access to the data to machines not in the cluster, perhaps ones that don't use the data much, we can use GNBD on each of the servers in the GFS cluster so they can be connected to as a network block device and data read and written (albeit more slowly than if GFS alone was used, and creating more traffic on the client side of the network).

I can see no other real alternatives to GFS other than Veritas' expensive offering.

I can see now, though, after some heavy googling that it should be possible to install gfs+clvm on both NAS servers to share the space they have in them; anything Linux can then use gfs to talk to these, but Windows clients would be forced to use samba running on the NAS servers.

Wednesday, 9 July 2008

Handy shell commands

Dealing with broken things always tends to further my knowledge. Recently I discovered the following tips and tricks:

ls -d */ = Lists directories in the current directory.
disown = Removes a background job from the bash session's job table so it is not killed at logout; redirect its output first if you have not already.
nohup = Starts a process immune to hangups, writing its output to nohup.out unless redirected; add & to background it.
mkdir -p = Makes a directory and makes the parent directories if missing.
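
A quick way to try these out safely is in a scratch directory. The directory and file names below are made up for the demo:

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

mkdir -p a/b/c      # creates a, a/b and a/b/c in one go
mkdir d
touch file.txt

dirs=$(ls -d */)    # only the directories, not file.txt
echo "$dirs"

# nohup keeps the command alive after logout; & puts it in the background.
nohup sleep 1 >/dev/null 2>&1 &
wait
```

In an interactive bash session you would follow a plain `command &` with `disown` to get a similar effect after the fact.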

Rsync documentation bug

It seems developers still haven't got the hang of documentation ;)

Rsync is supposed to support the --password-file option and an environment variable (RSYNC_PASSWORD), and for about a day I just assumed it was broken. It is not; it seems the above options are only for use with rsyncd and not sshd, forcing the user to use keys if they need to script rsync connections via ssh.
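
The difference between the two transports can be sketched as a pair of wrapper functions. The key path, password file, user and host names here are hypothetical, and the functions are only defined, not run:

```shell
#!/bin/sh
# Hypothetical values, not from the original post.
key="$HOME/.ssh/backup_key"
remote="backup@backuphost.example"

# Over ssh (single colon): authenticate with a key; BatchMode stops ssh
# from prompting for a password when run from a script.
ssh_sync() {
    rsync -az -e "ssh -i $key -o BatchMode=yes" "$1" "$remote:$2"
}

# Against an rsync daemon (double colon): --password-file applies here.
daemon_sync() {
    rsync -az --password-file="$HOME/.rsync.pass" "$1" "$remote::$2"
}
```

Usage would be along the lines of `ssh_sync /data/ backups/data/` for ssh, or `daemon_sync /data/ module/data/` where module is a share defined on the rsync daemon.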

Windows users and administrators

Until recently I had very little time for Windows users and administrators. I felt they had very little real skill and that this was the reason they were more than happy to use broken crap every day rather than learn a real OS. I now see that Windows users and administrators have a very difficult job, having to learn lots of irrelevant crap and hacks to get around the problems of their chosen operating system. The GUI in Windows has turned into such a mess that it doesn't even make sense to new users of computing, as it once did and as it should, meaning that if it wasn't for price Mac would already be well on its way to taking the new-user market share. Of course, why use Windows and struggle every day? Lack of education about choice, or software support, which IMO equates to lack of skill. So these users still start off clueless as I had surmised, but they develop skill equal to those using other OSes, just not as useful or really necessary, and the fear of change once at a certain level stops them from migrating even though it makes sense.