Questions and Answers
Amy Rich
Q We have a number of Netra T1 105
machines that we purchased from a company that was going out of
business. Unfortunately, the previous owner of these machines set
the eeprom password and put them into command-secure mode, and we
can't get past it. How can we reset the PROM password so that we
can reinstall the OS?
A Since security-mode is
only set to command, not full, and you have physical access to the
machine, you can access the eeprom from the running OS. If you have
the root password to the machine, simply boot it and use the eeprom
command to turn off security-mode and/or change the security-password.
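For example, as root on the booted machine, either of the following
standard eeprom invocations should do the trick (the second prompts
interactively for a new PROM password):
# turn off PROM security entirely
eeprom security-mode=none
# or just set a new security password (you will be prompted)
eeprom security-password=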
If the disk was wiped clean before it was sold to you, you can try
booting and hope that the diag-device is set to network (where
you can put an OS image -- especially easy if you already have a
jumpstart server set up) or the CDROM.
If there's an existing OS image but you don't have the root password,
or if the machine has no useful diag device set, you can relocate
the boot disk to a machine with no eeprom password. Then you can
either zero out or reset the root password on the Netra's boot disk,
or you can just use the spare machine to install an entirely new
OS image with a root password that you know.
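As a rough sketch of that first approach, with a hypothetical device
name and mount point that will differ on your spare machine:
# attach and mount the Netra's boot disk on the spare machine
mount /dev/dsk/c1t0d0s0 /mnt
# blank the encrypted password field (second field) of the root entry
vi /mnt/etc/shadow
umount /mnt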
If all else fails, you can replace the NVRAM chip with a new one,
but that's generally an unnecessary last resort.
Q I come from a Linux background,
but I've just been hired as a sys admin at a site that's primarily
HP-based. I'm looking for information on creating snapshot volumes
on HP-UX 11.00 like I would under Linux. Is there something similar
to the following Linux command for HP:
lvcreate -s -L 256m -n snap /dev/vg00/lvol1
The above creates a snapshot linked to the lvol1 volume in the vg00
group.
A If you want to create snapshots
with HP-UX 11.00, search http://docs.hp.com for the OnlineJFS
documentation and the documentation on using a snapshot filesystem
for backups. OnlineJFS is essentially an HP-licensed version of the
advanced VERITAS File System (VxFS) features. With OnlineJFS, a
snapshot filesystem is created with the
mount command:
mount -F vxfs -o snapof=special|mount_point,snapsize=snapshot_size \
snapshot_special snapshot_mount_point
Once the snapshot filesystem is unmounted, the snapshot ceases to
exist. If it is mounted again, a fresh snapshot is taken.
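For example, to snapshot /dev/vg00/lvol1 much as your Linux command
does, you might do something like the following; the volume name,
size, and mount point here are hypothetical, and you should check
the OnlineJFS documentation for the exact snapsize semantics:
# create a 256-MB logical volume to hold the snapshot's changed blocks
lvcreate -L 256 -n lvsnap /dev/vg00
# mount a snapshot of lvol1 backed by that volume
mkdir -p /snap
mount -F vxfs -o snapof=/dev/vg00/lvol1 /dev/vg00/lvsnap /snap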
Q I'm running OS X 10.3.5 on my
laptop and using the Fink Commander GUI to help me install/upgrade
third-party freeware. I've been doing this ever since the 10.2 days
with no problems, but recently when I've tried to do a self-update,
I've gotten errors and it dies. Here's the output of the self-update
command from within Fink Commander:
I will now run the cvs command to retrieve the latest package
descriptions. The 'su' command will be used to run the cvs command
as the user 'arr'. After that, the core packages will be updated
right away; you should then update the other packages using
commands like 'fink update-all'.
/usr/bin/su dsr -c 'cvs -q -z3 update -d -P'
? stamp-rel-0.6.2
? 10.2-gcc3.3/stable/main/finkinfo/base/dpkg-bootstrap.info
? 10.2-gcc3.3/stable/main/finkinfo/base/fink-0.16.0-1.info
.
.
.
? 10.3/stable/main/finkinfo/x11-system/xfree86-base.patch
? 10.3/stable/main/finkinfo/x11-wm/sawfish-1.1-16.info
C 10.3/stable/crypto/finkinfo/gaim-ssl.info
M 10.3/stable/crypto/finkinfo/kdebase3-ssl.info
M 10.3/stable/crypto/finkinfo/kdelibs3-ssl.info
### execution of /usr/bin/su failed, exit code 1
Failed: Updating using CVS failed. Check the error messages above.
No matter what cvs commands I try, or whether I try them from the
command line or from within the GUI, nothing seems to get past that
point. As far as I know, nothing changed between the last time I built
packages and now, so I'm not sure why it's failing. Do you have any
hints?
A Since you aren't getting any
specific errors from the cvs command, it's likely that there's a
problem with a file in the /sw/fink/dists tree. There are
three irregularities in your output, the lines:
C 10.3/stable/crypto/finkinfo/gaim-ssl.info
M 10.3/stable/crypto/finkinfo/kdebase3-ssl.info
M 10.3/stable/crypto/finkinfo/kdelibs3-ssl.info
In CVS, the leading tags have special meaning. From the man page:
M file The file is modified in your working directory. M can indicate
one of two states for a file you're working on: either there were
no modifications to the same file in the repository, so that your
file remains as you last saw it; or there were modifications in
the repository as well as in your copy, but they were merged successfully,
without conflict, in your working directory.
C file A conflict was detected while trying to merge your changes
to file with changes from the source repository. file (the copy
in your working directory) is now the result of merging the two
versions; an unmodified copy of your file is also in your working
directory, with the name .#file.version, where version is the revision
that your modified file started from. (Note that some systems automatically
purge files that begin with .# if they have not been accessed for
a few days. If you intend to keep a copy of your original file,
it is a very good idea to rename it.)
The most likely culprit is the file 10.3/stable/crypto/finkinfo/gaim-ssl.info.
Perhaps you've made a manual change to that file, and the maintainer
of the gaim-ssl package has since changed it as well, producing the
conflict. Remove (or move aside) your copy and run the update again
to see whether that fixes the issue; CVS will check out a fresh copy:
sudo rm /sw/fink/dists/10.3/stable/crypto/finkinfo/gaim-ssl.info
sudo fink selfupdate-cvs
You might also want to check the two kde info files to see why they're
tagged with an M and/or move them aside as well.
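To see why CVS flagged them, you can ask for their status directly;
this assumes the working copy lives under /sw/fink/dists, as the
paths above suggest:
cd /sw/fink/dists
# show the repository and working revisions plus any sticky state
cvs status 10.3/stable/crypto/finkinfo/kdebase3-ssl.info \
    10.3/stable/crypto/finkinfo/kdelibs3-ssl.info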
Q We're running sendmail 8.13.1
on Fedora Core. I've configured sendmail so that it should be limiting
the number of connections per second differently for specific domains
and IP ranges. Our base rate is set at 50, but we have some offsite
machines under the same corporate umbrella with which we exchange
a lot of email and therefore need to bump up the maximum connection
rate. Unfortunately, everyone seems to be getting throttled at 30.
Here's the access file:
ClientRate:other.corp.domain 100
...
ClientRate:192.168.100 100
...
ClientRate:127.0.0.1 0
ClientRate: 50
The following should be the only pertinent line in my mc file:
FEATURE(`ratecontrol')dnl
A You're using the wrong
syntax for the ClientRate statements. This snippet from the
sendmail 8.13 cf/README file explains the correct usage:
ratecontrol Enable simple ruleset to do connection rate control
checking. This requires entries in access_db of the form
ClientRate:IP.ADD.RE.SS LIMIT
Instead of specifying domain names, FQDNs, or IP blocks, you need
to specify individual IP addresses. With luck, the sites from which
you need to accept heavier mail loads funnel their mail through a
handful of relay servers, so you won't have to list a large number
of IPs in the access file.
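Assuming, purely for illustration, that the heavy-traffic sites
relay their mail through 10.1.2.3, 192.168.100.10, and
192.168.100.11, the access entries would become:
ClientRate:10.1.2.3          100
ClientRate:192.168.100.10    100
ClientRate:192.168.100.11    100
ClientRate:127.0.0.1         0
ClientRate:                  50
After editing the file, remember to rebuild the access database
with makemap.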
Q I'm running Solaris 8 on a SunFire
4800 and trying to apply some patches to the system. Every time
I try to run patchadd, regardless of patch, I get the following
message (with the full path to the specific checkinstall
program omitted since it happens with all of them):
checkinstall: cannot open
pkgadd: ERROR: checkinstall script did not complete successfully
Dryrun complete.
No changes were made to the system.
The checkinstall file is in fact there, so I'm not sure what
it's saying it can't open. Any clues what the problem might be?
A When you receive checkinstall
errors during a patchadd, it usually indicates that the patch files
cannot be read by the user install or the user nobody, or
that neither user exists. Patchadd attempts to execute the
checkinstall script as the user install, falling back to the
user nobody if the install user doesn't exist.
If you have an install user, make sure that all directories
leading up to and including all the patch files, as well as the
files themselves, are readable by install. If you're storing
patches in an unreadable directory, either change the permissions
as needed or move the patches to someplace like /tmp where
the permissions are relaxed. If you don't have an install
user, make sure the nobody user is in both the /etc/passwd
and /etc/shadow files. Also apply the same rules for directory
and file readability as described above for the install user.
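As a quick sanity check, with a hypothetical patch directory and
patch ID:
# make the whole patch tree readable and directories traversable
chmod -R a+r /var/tmp/patches
find /var/tmp/patches -type d -exec chmod a+rx {} \;
# verify that the nobody user can actually read a checkinstall script
su nobody -c 'cat /var/tmp/patches/112233-01/checkinstall > /dev/null'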
Q Recently our company's DNS server
started having issues resolving addresses on the first try. The
first attempted resolution results in a "host not found" error.
If the address is tried a second time, though, it appears to work
fine. This server is an OS X 10.3.5 machine running BIND 9.2.2,
and it's serving a variety of different machines. Might you have
any idea what the issue is and how I could fix it?
A You don't provide any debugging
output from named or dig, but based on the date your
question was submitted, I'd hazard a guess that you are being bitten
by an issue that's plaguing a number of OS X administrators, and
there are two likely causes. The first is that a firewall is
blocking large EDNS UDP packets; either allow those queries through
the firewall or, on newer versions of BIND, add the following to
the options block of named.conf:
edns-udp-size 512;
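That directive sits inside the options block of named.conf,
something like this (all other options elided):
options {
        // limit advertised EDNS UDP buffer size to 512 bytes
        edns-udp-size 512;
};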
The second suspected cause is a problem with the IPv6 implementation
on some BSD-based operating systems. To work around this one, upgrade
to BIND 9.3 and force IPv4-only resolution by invoking named
as:
named -4
Q We have a new Netra 440
server running Solaris 9 that we're using to handle Web services for
our department. Things go fine for a while, but then the machine mysteriously
reboots with no errors logged to syslog. To track down the issue,
I hooked up a console server and logged the output to that as well,
just in case the machine was dying before it had a chance to throw
an error and log it to disk. The only output it logged before each
crash was the following:
Fatal Error Reset
SC Alert: Host System has Reset
You probably can't tell what the issue is from this little information,
but I was hoping that you might point me in the right direction when
it comes to exploring this problem further.
A For starters, make sure that
crash dumps are actually being saved when the machine reboots. Take
a look at the file /etc/dumpadm.conf and make sure it says
something along the lines of:
DUMPADM_DEVICE=/dev/dsk/c0t1d0s1
DUMPADM_SAVDIR=/var/crash/your.host.name
DUMPADM_CONTENT=kernel
DUMPADM_ENABLE=yes
The last line shows that savecore is enabled. If you're using
Veritas Volume Manager or Solaris Volume Manager, you'll need to set
the dump device to one of the mirrors to make sure it doesn't get
overwritten when the filesystem drivers take over. For Solaris Volume
Manager, the DUMPADM_DEVICE line would look more like:
DUMPADM_DEVICE=/dev/md/dsk/d1
To modify the dumpadm.conf file, use the command dumpadm(1m)
instead of editing the file directly.
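For example, to point the dump device at the SVM metadevice and
savecore directory shown above (adjust the metadevice and hostname
for your own system):
# set the dump device and the directory savecore writes to
dumpadm -d /dev/md/dsk/d1 -s /var/crash/`hostname`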
If you get a core file, you can send it to Sun Support for analysis
once you open up a case, or you can try to analyze it yourself.
Because you specifically mention the hardware as a Netra 440,
though, there's a good chance that you're seeing the issue described
in Document 57618:
http://sunsolve8.sun.com/search/document.do?assetkey=1-26-57618-1
If you don't have a Sunsolve account, the document basically states
that there are problems with the ce0 or net0 interfaces
on those machines. The error message matches what you're seeing exactly.
To gather more diagnostic data to determine whether this is the issue,
they suggest setting the following from the OBP:
setenv diag-switch? true
setenv post-trigger none
setenv obdiag-trigger none
You're experiencing this issue if the reset reason includes PBM FATAL
with a PCI IO-BRIDGE register output similar to:
ha019 console login:
Fatal Error Reset
SC Alert: Host System has Reset
@(#)OBP 4.10.10 2003/08/29 06:25 Sun Fire V440
Clearing TLBs
Loading Configuration
Membase: 0000.0033.0000.0000
MemSize: 0000.0000.4000.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Block Scrubbing Done
Find dropin, Copying Done, Size 0000.0000.0000.5ca0
PC = 0000.07ff.f000.4c88
PC = 0000.0000.0000.4d28
Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.6700
ttya initialized
System Reset: (PBM FATAL)
JBUS-PCI bridge
JBUS-PCI bridge
slave Error Register: 8000000000001000
To work around the issue, they suggest disabling the ce0 (net0)
interface from the OBP and using only the ce1 (net1)
interface:
nvedit
0: probe-all install-console banner
1: " /pci@1c,600000/network@2" $delete-device drop
2:
^C
nvstore
setenv use-nvramrc? true
reset-all
If you definitely need another network interface, the document suggests
adding another PCI network card to the machine. A permanent fix for
the issue is still in the works according to the document.
Amy Rich, president of the Boston-based Oceanwave Consulting,
Inc. (http://www.oceanwave.com), has been a UNIX systems
administrator for more than 10 years. She received a BSCS at Worcester
Polytechnic Institute, and can be reached at: qna@oceanwave.com. |