Questions
and Answers
Amy Rich
Q We're installing a number
of Solaris 8 boxes on our LAN, and we want to verify that the speed
and duplex on the switch match that of the Sun network interfaces
during auto-negotiation. If we have to, we'll explicitly set
both sides and turn off auto-negotiation. These machines have ce0
interfaces, but ndd doesn't seem to be useful here.
I tried:
ndd /dev/ce link_status
ndd /dev/ce link_mode
ndd /dev/ce link_speed
The link_speed shows up as 0. That doesn't seem to be meaningful,
since it only seems to have a value of 0 or 1, and there are more
than two speeds available. Is there another variable I should be looking
at with ndd, or does link_speed have more than a 0/1 setting?
A The ndd command doesn't
work with ce interfaces. Use the -k switch to netstat
to obtain the information you're looking for:
netstat -k ce0 | egrep 'link_speed|link_status|link_duplex'
The output has the following meaning:
link_up - 0 down, 1 up
link_speed - speed in Mbit/s
link_duplex - 1 half duplex, 2 full duplex, 0 down
Q I just tried upgrading from FreeBSD
5.1 to FreeBSD 5.2 by obtaining the source from CVS and building things
from scratch. I followed the documentation at:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
but it crashed during make installworld. It looks like things
installed ok, but obviously the crash should not have happened. Does
the normal build procedure not apply to FreeBSD 5 because it's
not a STABLE release yet?
A When upgrading FreeBSD from source,
always be sure to read /usr/src/UPDATING. The UPDATING file from
5.2 contains the following warning about the build and install process:
20031112:
The statfs structure has been updated with 64-bit fields to allow
accurate reporting of multi-terabyte filesystem sizes. You should
build world, then build and boot the new kernel BEFORE doing a
'installworld' as the new kernel will know about binaries using
the old statfs structure, but an old kernel will not know about
the new system calls that support the new statfs structure.
Note that the backwards compatibility is only present when
the kernel is configured with the COMPAT_FREEBSD4 option. Since
even /bin/sh will not run with a new kernel without said option
you're pretty much dead in the water without it. Make sure you have
COMPAT_FREEBSD4! Running an old kernel after a 'make world' will
cause programs such as 'df' that do a statfs system call to fail with
a bad system call. Marco Wertejuk <wertejuk@mwcis.com> also reports
that cfsd (ports/security/cfs) needs to be recompiled after these
changes are installed.
****************************DANGER*******************************
DO NOT make installworld after the buildworld w/o building and
installing a new kernel FIRST. You will be unable to build a
new kernel otherwise on a system with new binaries and an old
kernel.
Q I'm running sendmail 8.12.10
on our primary MX hosts and our backup MX hosts. Even though our primary
MX machines are well connected, we're seeing a great deal of
spam mail traveling through our backup MX machines. Is there some
vulnerability that spammers are scanning for? I can't figure
out why they're hitting both our backup and primary MX hosts.
A Many spammers use the backup
MX hosts because some sites that are doing spam filtering are only
doing it on their primary MX hosts. Spammers especially try to circumvent
hosts using DNSBL rulesets on the primary MX hosts only. If spammers
inject mail through the unprotected backup MX hosts, then the primaries
will see the spam as coming from the backup MX hosts and not the
original blocked host.
Q I'm running an ftp server
that allows uploads to a central area. I have a cron job that checks
this area and copies files off to a staging directory every half
hour. Occasionally files get copied in the midst of uploading, and
the copied file is therefore incomplete and corrupted. Is there
an easy way to move only files that have been sitting around long
enough to have transferred completely? I'm not sure how long
that time frame would need to be, though, because uploads vary in
size.
A Instead of waiting a specific
period of time, check whether the file is open before copying it
off into your staging area. You don't say which operating system
you're running, but the tool lsof runs on most Unix-like systems.
You may also have a tool that comes with the OS, like fuser, that
will suffice. Here's some sample output for /var/log/syslog
on a Solaris 8 machine:
lsof /var/log/syslog
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
syslogd 150 root 9w VREG 85,4 0 106343 /var/log/syslog
fuser /var/log/syslog
/var/log/syslog: 150o
Here's some output for /etc/passwd, which isn't open:
lsof /etc/passwd
fuser /etc/passwd
/etc/passwd:
If lsof or fuser doesn't show the file as open, then it's
safe to copy. If the file is open, lsof shows the process that has
the file open, the PID of the process, the user, the file descriptor
number and method of access, the type of node associated with the
file, the device numbers of the file, the size of the file or the
file offset in bytes, the node number of the file, and the file name.
The contents of the FD field have the following meaning:
FD is the File Descriptor number of the file or:
cwd -- Current working directory
Lnn -- Library references (AIX)
ltx -- Shared library text (code and data)
Mxx -- Hex memory-mapped type number xx
m86 -- DOS Merge mapped file
mem -- Memory-mapped file
mmap -- Memory-mapped device
pd -- Parent directory
rt -- Droot directory
txt -- Program text (code and data)
v86 -- VP/ix mapped file
FD is followed by one of these characters, describing the
mode under which the file is open:
r -- For read access
w -- For write access
u -- For read and write access
space -- If unknown and no lock character
- -- If unknown and lock character
If the file is open, the fuser output displays the process ID
of the program using the file followed by a letter specifying the
type of access (some types are only available on certain operating
systems):
a -- If the process is using the file as its trace
file in /proc
c -- If the process is using the file as its current
directory
e -- If the process is using the file as the executable
being run
f -- If the process is using the file as an open file
m -- If the process is using the file as a mmaped or
shared lib
o -- If the process is using the file as an open file
p -- If the process is using the file as the parent
of its current directory
r -- If the process is using the file as root directory
s -- If the process is using the file as a shared lib
t -- If the process is using the file as its text file
y -- If the process is using the file as its controlling
terminal
Q I'm running OpenSSH 3.7.1p2
on a Beowulf cluster of Linux machines. When a machine drops out
of the cluster and jobs fail over to another node, users are no
longer able to connect to the logical cluster name using ssh because
the host key has changed. It's non-optimal for them to have
to know the name of each cluster node (and self-defeating, because
a node might be up or down at any given time), but ssh clearly needs
that granularity. The sys admins also need that granularity, since
we need to occasionally need to operate on specific machines instead
of on the cluster as a whole. What's the best method to deal
with a situation like this?
A The issue, as you pointed out,
is that the users who are experiencing problems have a known_hosts
hostname/key pair entry for the cluster name, and not an individual
machine name. When the key changes but the host name stays the same,
ssh believes that an IP/hostname hijacking may have taken
place and won't allow password authentication. There are a
few methods to work around this depending on whether you're
more concerned about security or ease of use and whether you have
control over the machines from which the users are ssh'ing.
The first method is to put the same host key(s) on every system.
Check sshd_config for any HostKey lines and be sure
to copy those and the corresponding .pub files to each machine.
Now users will always see the same host key no matter which node
they connect to. You can also connect to the node by name and not
have any problems. The downside is that once the key is compromised,
all hosts are compromised.
The second method is to modify each user's known_hosts
file so that each node has its own entry and is also tagged with
the cluster name. This takes effort on the user's part and
may be difficult to keep up with if new nodes are being added or
changed continuously. If you rarely add or change nodes, this can
be accomplished through a large one-time addition, supplied by the
sys admins, to each user's known_hosts file. The new
hostname/key pairs would change from:
hostname,IP key-type key
to:
hostname,IP,cluster-name key-type key
For example:
node1,192.168.1.1,cluster,cluster.my.domain ssh-dsa AAAA...
node2,192.168.1.2,cluster,cluster.my.domain ssh-dsa BBBB...
...
nodeN,192.168.1.N,cluster,cluster.my.domain ssh-dsa ZZZZ...
If all of the users are connecting from centralized Unix-like machines
that the systems administrators control, add the above entries to
the global ssh_known_hosts file instead of modifying the known_hosts
file of each individual user. If possible, this is the best option
because no effort is required on the part of the user, and the integrity
of each node is not compromised when one key is cracked.
Q I have a laptop that I connect
to the serial ports of our machines in the datacenter to get console
access. Right now I'm using Hyperterm, but it's pretty
rotten and I'd like to find something better. Any suggestions
for something that will handle Solaris servers and perhaps do ssh,
too?
A Since you list Hyperterm, I'm
going to presume you're using some form of Microsoft Windows
on your laptop. If you want to replace Hyperterm for serial and
ssh, try using one of the following:
- The free ssh plugin to Teraterm, TTSSH, (http://www.zip.com.au/~roca/ttssh.html)
only supports SSH v1.5, so it's not useful if you need to
connect to SSH v2 servers, which are more secure.
- SecureCRT (http://www.vandyke.com/products/securecrt/)
is a commercial product but supports both versions of the SSH
protocol as well as serial, telnet, and a few other protocols.
The best solution, though, is to connect all of your machines
to one (or multiple, if you have a large number of machines) dedicated
console server that supports ssh. That way any machine that supports
ssh can remotely connect to a machine's console and there's
no need for anyone's laptop to support a serial connection
except when configuring the console server for the first time. If
you go this route, PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/)
might also be worth a look. It's a free product that supports
both versions of the SSH protocol. It currently has no serial interface,
but it's on the developer's wishlist.
Q I install Solaris on a number
of machines and then mirror the boot disks. The disks are always
identical, so is there an easy way to format them identically at
the same time, or do I have to format by hand on all the mirror
disks?
A If you're doing a large
number of installations, you should look into using JumpStart. Here's
a partial JumpStart profile that would do exactly what you want.
It also sets aside 20M of space in slice 7 of each disk for the
state databases. I happen to name the spare slices and mount them
during the install, because I have finish scripts that automatically
install and configure DiskSuite/Volume Manager for me:
install_type initial_install
system_type standalone
partitioning explicit
filesys rootdisk.s0 1024 / logging
filesys rootdisk.s1 1024 swap size=512m
filesys rootdisk.s3 1024 /usr logging
filesys rootdisk.s4 1024 /var logging
filesys rootdisk.s7 20 /spare/md0
filesys rootdisk.s5 free /files logging
filesys c0t1d0s0 1024 /spare/root
filesys c0t1d0s1 1024 unnamed
filesys c0t1d0s3 1024 /spare/usr
filesys c0t1d0s4 1024 /spare/var
filesys c0t1d0s7 10 /spare/md1
filesys c0t1d0s5 free /spare/files
If you opt not to use JumpStart or need to partition a disk after
the fact, you can still get the job done with one command chain instead
of using format. This example assumes that your boot disk is c0t0d0
and your mirror disk is c0t3d0:
prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t3d0s2
Amy Rich, president of the Boston-based Oceanwave Consulting, Inc.
(http://www.oceanwave.com), has been a UNIX systems administrator
for more than 10 years. She received a BSCS at Worcester Polytechnic
Institute, and can be reached at: qna@oceanwave.com.
|