Cover V14, i10


Questions and Answers

Amy Rich

In the July 2005 issue, I answered a question dealing with sudo and shell redirection. I thank John Rouillard for submitting another solution to the same issue using the tee command. He says:

You rightly mention that redirection is done by the shell without the privileges to write to the file, but a sudo idiom using tee can get around this. Replace:

sudo command > /tmp/foo
or:

sudo command >> /tmp/foo
with:

sudo command | sudo /usr/bin/tee /tmp/foo > /dev/null
or:

sudo command | sudo /usr/bin/tee -a /tmp/foo > /dev/null
Now, since it's not a good idea to give just anybody access to tee (they could then overwrite any file on the system), you can define the allowed arguments to tee as well. So, to allow the use of tee to append to the file /tmp/foo, you would put (something like):

allowed_user  HOST = NOPASSWD: /usr/bin/tee -a /tmp/foo, \
                     PASSWD: /other/commands
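As an illustration of why the argument restriction matters (the target file names below are only examples):

```shell
# With an unrestricted rule for tee, an allowed user could clobber
# any file on the system, for example:
echo "anything" | sudo /usr/bin/tee /etc/hosts > /dev/null

# With the restricted rule, only this exact invocation is allowed
# without a password:
echo "log entry" | sudo /usr/bin/tee -a /tmp/foo > /dev/null
```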
Q We just installed Solaris 9 on a new Sun Fire V440 with an attached D1000 JBOD array. After mirroring all of the disks, we rebooted the machine and some of the database replicas were tagged as invalid. The error message says I need to run "metadevadm -u c2t1d0" to fix it. I tried running this and rebooted and still had the same issue. To make sure, we went through this again, and the exact same thing happened. What happened to the replicas, and how do we get them to stay current so that the machine doesn't continually have issues upon reboot?

A I'm guessing that you've used raidctl to configure an internal RAID device and then put your metadb replicas on it. This is not supported by Sun and will result in the behavior you describe. From:

http://sunsolve.sun.com/search/document.do?assetkey=1-9-77518-1
When probed by the operating system upon reboot, the internally mirrored device on the Sun Fire V440 server does not report a consistent device identity. For this reason, the metadbs previously created on this device are no longer valid.

The following is an example of the console output, which appears in /var/adm/messages:

Jun 25 10:15:02 hostname metadevadm: Invalid device relocation
information detected in Solaris Volume Manager
Jun 25 10:15:02 hostname metadevadm: Please check the status of the following disk(s):
Jun 25 10:15:02 hostname metadevadm:      c#t#d#
The metadevadm command can be used to determine if the information for device paths has changed. The -n option emulates the update, so any changes are not implemented. For example:

hostname# metadevadm -n -u c#t#d#
Updating Solaris Volume Manager device relocation information for c#t#d#
Old device reloc information:
   id1,sd@#############################
New device reloc information:
   id1,sd@#############################
   
The preceding output shows that either of the following conditions has occurred:
  • The pathname stored in the metastate database no longer correctly addresses the device.
  • The device ID for a disk drive has changed.

Without the -n flag, the metadevadm command updates the device information. The update, however, does not resolve the problem, and the metadbs will still be invalid.

The metadb output will also show an unknown block count. For example:

hostname# metadb -i
     flags     first blk    block count
 M     p       20           unknown         /dev/dsk/c#t#d#s7
 M     p       7110         unknown         /dev/dsk/c#t#d#s7
 M     p       15612        unknown         /dev/dsk/c#t#d#s7
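The way out (a sketch; the slice numbers and replica count are assumptions) is to delete the invalid replicas and recreate them on slices of disks that are not part of the raidctl-created volume, so that they sit on devices with stable device IDs:

```shell
# Remove the invalid replicas from the internal RAID device
# (the c#t#d#s7 placeholder matches the metadb output above)
metadb -d -f c#t#d#s7

# Recreate three replicas on a slice of a disk outside the
# raidctl-created volume, then confirm a valid block count
metadb -a -f -c 3 c2t1d0s7
metadb -i
```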
 
Q We're setting up two SPARC machines as part of a Sun Cluster. Because of hardware limitations, we're using the onboard ce interfaces and a qfe quad card in each machine. For redundancy's sake, we do not want to put all the private interconnects on the qfe card; however, the documentation I've read says that all the private cluster interconnects must be running at the same speed/duplex. Is this true, or can we use one of the ce interfaces as a private cluster interconnect, too?

A You've read the documentation correctly; all cluster private interconnects must be running at the same speed/duplex. I'm assuming that you're running one of the two ce interfaces at gigabit speeds and that you're doing point-to-point connections for the cluster private interconnects and/or auto-negotiation isn't going to work. If so, you can still individually set the other interface (the one you'd like to use as a cluster private interconnect) to 100FDX by using ndd. To have the modification persist through reboots, add an init file such as the following:

#!/sbin/sh
#
# Kernel/device tuning script
#
case "$1" in
  'start')
      echo "Setting ce1 to 100FDX"
      ndd -set /dev/ce instance 1
      ndd -set /dev/ce adv_1000fdx_cap 0
      ndd -set /dev/ce adv_1000hdx_cap 0
      ndd -set /dev/ce adv_100fdx_cap 1
      ndd -set /dev/ce adv_100hdx_cap 0
      ndd -set /dev/ce adv_10fdx_cap 0
      ndd -set /dev/ce adv_10hdx_cap 0
      ndd -set /dev/ce adv_autoneg_cap 0
      ;;

  'stop')
      ;;

  *)
      echo "Usage: $0 start"
      exit 1
      ;;
esac

exit 0
If you call this script, say, /etc/init.d/kernelmods, then you'd also want to link /etc/rc2.d/S69kernelmods (or earlier) to it so that it gets run before most of the multiuser network services.
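Installing and spot-checking the script might look like this (the ndd parameters shown are the standard ce link properties; exact output encodings vary by driver revision, so I've left them out):

```shell
# Install the script and create the rc2.d link so it runs before
# most multiuser network services
chmod 744 /etc/init.d/kernelmods
ln /etc/init.d/kernelmods /etc/rc2.d/S69kernelmods

# Run it once by hand, then read back the negotiated settings
/etc/init.d/kernelmods start
ndd -set /dev/ce instance 1
ndd -get /dev/ce link_speed
ndd -get /dev/ce link_mode
```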

Q Is there a program similar to split that will work on regular expressions instead of a fixed number of lines? We have one file full of three different kinds of data, but each type of data is separated by an identifiable expression. Here's a sample data set:

xxx1
aaa,bbb,ccc,ddd
aba,bab,cdc,dcd
xxx2
eee,fff,ggg
xxx3
hhh,iii
xxx2
eef,ffg,gge,fff
The separators are the xxx{num} tokens, where {num} is 1, 2, or 3 (depending on the kind of data). I want to end up with three separate files containing the different kinds of records.

A One portable shell solution is to use the csplit program. First, split on your record separator (I've made this match any number after xxx, in case you ever need more than three types of data):

#!/bin/sh
recsep='xxx[0-9]*'
numrecs=`grep -c "^${recsep}" $1`
if [ ${numrecs} -le 1 ]; then
  echo "Only one type of record, not splitting file."
  exit 0
elif [ ${numrecs} -eq 2 ]; then
  csplit -skf data $1 "%^${recsep}%" "/^${recsep}/"
else
  splitcount=`expr ${numrecs} - 2`
  csplit -skf data $1 "%^${recsep}%" "/^${recsep}/" \{${splitcount}\}
fi
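For example, run against the sample data from the question (recreated inline so this stands alone), there are four separator lines, so the splitcount is 2 and the first pass produces files data00 through data03:

```shell
# Recreate the sample data set from the question
cat > sample.txt <<'EOF'
xxx1
aaa,bbb,ccc,ddd
aba,bab,cdc,dcd
xxx2
eee,fff,ggg
xxx3
hhh,iii
xxx2
eef,ffg,gge,fff
EOF

# One output file (data00, data01, ...) per record group;
# each dataNN file starts with its separator line
csplit -skf data sample.txt "%^xxx[0-9]*%" "/^xxx[0-9]*/" "{2}"
head -1 data00
```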
This will leave you with some number of files, each containing a single type of record. Second, you can recombine all of the records into files named after the record type and remove the now-unnecessary record separators:

for ifile in data*; do
  ofile=`head -1 ${ifile}`
  grep -v "^${ofile}$" ${ifile} >> ${ofile}
done
Now you'll have the following files and output, given your sample data:

xxx1:
  aaa,bbb,ccc,ddd
  aba,bab,cdc,dcd
xxx2:
  eee,fff,ggg
  eef,ffg,gge,fff
xxx3:
  hhh,iii
Other solutions to your problem include writing an awk script and modifying FS and RS, or using some other language (Perl, Python, C, whatever) to print out the data between record separators.
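As a sketch of the awk approach (the data file name is an assumption; the sample input is recreated so the snippet stands alone), one line does the whole job because each separator simply names the current output file:

```shell
# Recreate the sample data, one input line per printf argument
printf '%s\n' xxx1 aaa,bbb,ccc,ddd aba,bab,cdc,dcd xxx2 eee,fff,ggg \
    xxx3 hhh,iii xxx2 eef,ffg,gge,fff > data.txt

# Each separator line names the current output file; every other
# line is written to the most recently seen separator's file.
awk '/^xxx[0-9]+$/ { out = $0; next } out != "" { print > out }' data.txt
```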

Q I just purchased a used Sun box from eBay for a project we're doing at work. This project requires that we mirror the disks under SVM, and we have seven partitions available for data. This means I'm one partition short unless I use the backup partition, slice 2, for real data. I know that in the past this was not recommended, but is this true anymore? Can I just use slice 2 for data?

A As a general rule, you can repartition the disk and use slice 2 for data instead of the backup partition, but it's not conventional and will cause issues with some utilities. Unfortunately for you, one of the things that requires slice 2 to be the whole disk is SVM. Other commands that traditionally expect slice 2 to be the backup partition are prtvtoc and format. If you've already modified slice 2, you can change it back by running format and configuring slice 2 as follows:

Enter partition id tag[backup]:
Enter partition permission flags[wm]:
Enter new starting cyl[0]:
Enter partition size[35368272b, 7506c, 17269.66mb, 16.86gb]: 7506c
The above partition size should be the whole disk (you can specify cylinders, blocks, whatever).

There may be a solution to your problem, though. Depending on what kind of data you're putting on these partitions (OS data or non-OS data), you may be able to use soft partitions to create the extra space you need. Take a look at the docs.sun.com page on soft partitions to see whether it will meet your needs:

http://docs.sun.com/app/docs/doc/817-5776/6ml784a4c?a=view
If you set aside slice 6 on your two mirror disks for the non-OS data, you would create two soft partitions as follows:

metainit d81 -p c0t0d0s6 4g
metainit d82 -p c0t1d0s6 4g

metainit d91 -p c0t0d0s6 8g
metainit d92 -p c0t1d0s6 8g
Now you can mirror those metadevices like you would any others:

metainit d80 -m d81
metattach d80 d82

metainit d90 -m d91
metattach d90 d92
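From there, the mirrors behave like any other metadevices; the filesystem type and mount points below are just illustrations:

```shell
# Build filesystems on the mirrored soft partitions and mount them
newfs /dev/md/rdsk/d80
newfs /dev/md/rdsk/d90
mount /dev/md/dsk/d80 /export/data1
mount /dev/md/dsk/d90 /export/data2

# Verify the mirror and soft partition state
metastat d80 d90
```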
Q I'm trying to write a JumpStart finish script to put some disks into metasets. There are two issues to worry about: if a disk has no label, one needs to be written out; and if the machine already has a valid UFS partition on slice 0, then we don't want to modify the disk at all. I know how to do this by hand with format, but there doesn't seem to be a good, non-interactive way to accomplish this in a script.

A Let's assume that you're feeding your script the disk device but without the slice number, for example, c1t1d0. To begin, define a diskset name, get the path to the disk device, and write out a command file for format:

diskset='svmdisk'
diskdev="/dev/rdsk/$1"
echo "label" >> ${B}/tmp/formatcmds.$$
Next, determine whether the disk has a UFS partition on slice 0, and if not, label the disk:

diskstat=`fstyp "${diskdev}s0"`
if [ "X${diskstat}" = "Xufs" ]; then
  # the filesystem has already been initialized
  echo "Not modifying disk: ${diskdev}"
  exit 0
else
  # add a label to the disk
  echo "Labeling disk: ${diskdev}"
  format -d ${diskdev} -f ${B}/tmp/formatcmds.$$
fi
Double-check that you've successfully labeled the disk before you go any further:

if [ "X`prtvtoc -f ${diskdev}s0`" = "X" ]; then
  # the initialization did not work
  echo "ERROR: Initialization of ${diskdev} failed, exiting."
  exit 1
fi
At this point, you should either have bailed out of the script (because the disk already had a valid UFS filesystem, or because the labeling failed) or have a freshly labeled disk. From here, you can add the host to the metaset and then add your disks:

metaset -s ${diskset} -a -h `hostname`
metaset -s ${diskset} -a ${diskdev}
You can use ${SI_HOSTNAME} in place of "hostname" if you've defined the hostname in your rules file, but if you're doing a generic rule to match many hosts, SI_HOSTNAME won't be set.
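Afterward, you can verify the result; note that metaset normally repartitions a disk it takes over, placing a small state database replica in slice 7 (the diskset and device names here follow the examples above):

```shell
# Show diskset ownership and membership
metaset -s svmdisk

# Show the repartitioned label on the added disk
prtvtoc /dev/rdsk/c1t1d0s2
```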

Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, and author. She can be reached at: qna@oceanwave.com.