Questions
and Answers
Amy Rich
Q I'm running Solaris 10 on
a V120, and it's acting as one of our DNS servers. Unfortunately,
every time the machine reboots, the DNS server is disabled. To check
diagnostic output, I ran:
svcs -x dns/server
which showed the following output:
svc:/network/dns/server:default (?)
State: disabled since Tue Jun 19 13:37:44 2005
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: named(1M)
Impact: This service is not running.
To fix it, I do:
svcadm enable dns/server
which brings up the service just fine, and everything works until
the machine is rebooted again. How do I permanently tell the service
manager to start BIND at boot?
A This is actually a known bug
with Solaris 10 and is described in docid 6226796, available to
support customers at:
http://sunsolve.sun.com/search/document.do?assetkey=1-1-6226796-1
There's a patch that's been rolled into Solaris Express
and should be making its way to the userbase as a Solaris 10 patch
as well.
Q I've decided to take the
plunge and start testing the fink unstable tree for OS X 10.3.9.
Our users want newer versions of software than are available in
the stable tree. I added unstable/main and unstable/crypto
to the Trees: line of /sw/etc/fink.conf and then used Fink
Commander 0.5.3 to do a selfupdate. Everything looked great,
and I updated a bunch of packages to the unstable versions. A week
later, though, I clicked Update in Fink Commander and received
the following output:
Err file: unstable/main Packages
File not found
Ign file: unstable/main Release
Err file: unstable/crypto Packages
File not found
Ign file: unstable/crypto Release
Hit http://bindist.finkmirrors.net 10.3/release/main Packages
Hit http://bindist.finkmirrors.net 10.3/release/main Release
Hit http://bindist.finkmirrors.net 10.3/release/crypto Packages
Hit http://bindist.finkmirrors.net 10.3/release/crypto Release
Hit http://bindist.finkmirrors.net 10.3/current/main Packages
Hit http://bindist.finkmirrors.net 10.3/current/main Release
Hit http://bindist.finkmirrors.net 10.3/current/crypto Packages
Hit http://bindist.finkmirrors.net 10.3/current/crypto Release
Failed to fetch
file:/sw/fink/dists/unstable/main/binary-darwin-powerpc/Packages \
File not found
Failed to fetch
file:/sw/fink/dists/unstable/crypto/ \
binary-darwin-powerpc/Packages File not found
Reading Package Lists...
Building Dependency Tree...
W: Couldn't stat source package list file: unstable/main Packages
(/sw/var/lib/apt/lists/ \
_sw_fink_dists_unstable_main_binary-darwin-powerpc_Packages)
- stat (2 No such file or directory)
W: Couldn't stat source package list file: unstable/crypto Packages
(/sw/var/lib/apt/lists/ \
_sw_fink_dists_unstable_crypto_binary-darwin-powerpc_Packages)
- stat (2 No such file or directory)
W: Couldn't stat source package list file: unstable/main Packages
(/sw/var/lib/apt/lists/ \
_sw_fink_dists_unstable_main_binary-darwin-powerpc_Packages)
- stat (2 No such file or directory)
W: Couldn't stat source package list file: unstable/crypto Packages
(/sw/var/lib/apt/lists/_sw_fink_dists_unstable \
_crypto_binary-darwin-powerpc_Packages)
- stat (2 No such file or directory)
W: You may want to run apt-get update to correct these problems
E: Some index files failed to download, they have been ignored, \
or old ones used instead.
I looked for the files that it complained about, and they weren't
there. I would have figured that the download would have picked them
up, but that appears to have failed as well. Where can I pick up these
files so that I can get updates without having to do a selfupdate
every time?
A You missed a step when making
your modifications. After you add the necessary lines to /sw/etc/fink.conf
and run selfupdate, you must also run scanpackages
to build the files that you're missing. If you want to do this
from Fink Commander, it's under the Source menu. You
can also do this from the command line by running:
sudo fink scanpackages
Q We're trying to JumpStart
a SunFire V100 with Solaris 8. The machine finds the boot server and
loads the kernel then tries to bring up the Ethernet interfaces. At
this point, the machine panics and then resets. Here's the output
from the boot sequence:
boot net1 - install
Boot device: /pci@1f,0/ethernet@5 File and args: - install
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Requesting Internet address for <the machine's ethernet addr>
SunOS Release 5.8 Version Generic_108528-29 64-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
ip_rput_dlpi(dmfe2): DL_ERROR_ACK for DL_ATTACH_REQ(11), errno 8, unix 0
ip_rput_dlpi(dmfe2): DL_ERROR_ACK for DL_BIND_REQ(1), errno 3, unix 0
ip_rput_dlpi(dmfe2): DL_ERROR_ACK for DL_PHYS_ADDR_REQ(49), errno 3, unix 0
ip_rput_dlpi(dmfe2): DL_ERROR_ACK for DL_UNBIND_REQ(2), errno 3, unix 0
ip_rput_dlpi(dmfe2): DL_ERROR_ACK for DL_DETACH_REQ(12), errno 3, unix 0
strplumb: setifname dmfe2 unit 2 IP failed, error 6
dl_attach: DL_ERROR_ACK bad PPA
revarp_myaddr: dl_attach failed: error -1
whoami: revarp_myaddr failed: error -1.
WARNING: nfsdyn_mountroot: NFS3 mount_root failed: error -1
Cannot mount root on /pci@1f,0/ethernet@5 fstype nfsdyn
panic[cpu0]/thread=10408000: vfs_mountroot: cannot mount root
0000000010407970 genunix:vfs_mountroot+70 (10436800, 0, 0, 104395c8, 10, 14)
%l0-3: 0000000010436800 0000000010439f20 00000000de000000 \
0000000010436a28
%l4-7: 0000000000000000 0000000010413868 00000000000beafd \
0000000000000afd 0000000010407a20 genunix:main+8c \
(104101f0, 2000, 10407ec0, 10408030, fff2, 1005 2a0c)
%l0-3: 0000000000000001 0000000000000001 0000000000000015 \
0000000000000f36
%l4-7: 0000000010429638 0000000010472c00 00000000000d7438 \
0000000000000540
skipping system dump - no dump device configured
rebooting...
I've run test net1 from the OBP, and everything checks
out okay. I've also run watch-net-all, and we're
definitely seeing packets going by. If we let the machine boot off
the internal disk, we can bring up both Ethernet interfaces just fine,
and we don't see any errors. I'm rather at a loss to determine
and correct the problem. If you have any suggestions, I would very
much appreciate them.
A The V100s come with two onboard
Ethernet devices, dmfe0 and dmfe1. If you look at
your error output, you'll notice that it is referencing dmfe2,
which doesn't exist. There is a problem with some revisions
of Solaris 8 where the getminor() routine returns the wrong
value and thereby makes other routines fail. For the Sunsolve bug
reports, see the following (for x86 and SPARC, respectively):
http://sunsolve.sun.com/search/document.do?assetkey=1-1-1146859-1
http://sunsolve.sun.com/search/document.do?assetkey=1-1-4482263-1
The workaround is to try network boots from only the first interface,
dmfe0. You can also try to use the boot image from a version
of Solaris 8 that did not have this problem, such as 05/03.
Q We run Solaris 9 and Sun Cluster
3.1u2 on a two-node cluster. We have an in-house failover application
that provides service to clients on the service name at a given
port. Sometimes we want to block service to certain users, but we
aren't certain how to go about this. Is there a way to handle
this in the cluster itself? I also see that Solaris 10 has a package
for ipfilter; can we just build the open source version for
Solaris 9 and use that?
A While ipfilter is now
available as a Solaris package under 10 and the non-Sun version
will work for most circumstances under earlier versions of Solaris,
you can not use it in conjunction with a cluster. A cluster works
by migrating the IP and associated services from one node to another
during a failure. ipfilter works by keeping the state of
packets on a per-machine basis.
Unfortunately, there's no way for one machine in a cluster
to share filtering state information with another machine in a cluster.
If you want to filter connections to a service IP, place a device
in front of the cluster (a firewall, or maybe a router, if you only
want to filter machines outside your segment) and filter there.
The most redundant solution is to place a pair of devices that are
capable of sharing filtering information in front of your cluster
nodes.
Q I'm writing some automated
installation scripts for some software we need to install on various
OS types. Unfortunately, this software requires input from a human
at the keyboard. I haven't really found a way around this,
since I don't know of a way that will make the binary install
program read from my script instead of STDIN. Do you have any suggestions?
A You don't really give a
lot of details about what the installer is looking for or what scripting
language you're trying to use. The answer to your problem could
be as simple as echoing the input to the installer script (this
assumes some sort of shell scripting language like bourne, bash,
ksh, csh, tcsh, zsh, etc.):
echo "answer1\nanswer2\nanswer3\nanswer4\n" | ./binary-installer
Most likely, though, you'll want to use something more complex,
like the program expect, which will look for certain strings
and then respond with answers. Expect, available from http://expect.nist.gov/,
is written in Tcl and can also be linked with Tk to
create X11-based applications. You can call expect from a shell script
or possibly write your entire install script in expect.
There's a utility called autoexpect that watches your
interaction with a program and then creates an expect script based
on your input. autoexpect does not always create a perfect
script, because it must do some guesswork. The autoexpect
man page lists some of the most common problems with scripts generated
this way:
- Timing. A surprisingly large number of programs (rn, ksh, zsh,
telnet, etc.) and devices (e.g., modems) ignore keystrokes that
arrive "too quickly" after prompts. If you find your
new script hanging up at one spot, try adding a short sleep just
before the previous send.
- Echoing. Many programs echo characters. Without specific knowledge
of the program, it is impossible to know if you are waiting to
see each character echoed before typing the next. If autoexpect
sees characters being echoed, it assumes that it can send them
all as a group rather than interleaving them the way they originally
appeared. This makes the script more pleasant to read. However,
it could conceivably be incorrect if you really had to wait to
see each character echoed.
- Change. Autoexpect records every character from the interaction
in the script. This is desirable because it gives you the ability
to make judgments about what is important and what can be replaced
with a pattern match. On the other hand, if you use commands whose
output differs from run to run, the generated scripts are not
going to be correct. For example, the "date" command
always produces different output. So, using the date command while
running autoexpect is a sure way to produce a script that will
require editing in order for it to work.
Workarounds for each of these issues are described in the
man page as well.
If you're using Perl instead of a shell-based language,
there are several modules/programs you might want to investigate:
http://search.cpan.org/~rgiersig/Expect-1.15/
http://search.cpan.org/~djerius/Expect-Simple-0.02/
http://search.cpan.org/~lbrocard/Test-Expect-0.29/
Q We have a SunFire V240 that's
been having problems booting up correctly. I want to get to the
OBP to run some diagnostics, but I can't seem to be able
to send the machine a break signal to get it there. I've
tried using the LOM to power the machine off and back on, but
it always boots back into multi-user mode. I've also tried
issuing a break once the machine is fully up and running, but
that also fails. It looks like it's getting the break sequence
and not doing anything with it. I see the following message when
trying to send the break:
SC Alert: SC Request to send Break to host.
So, why isn't the break working, and is there a way to force
it to the OBP?
A Without being able to see
the configuration of your machine, I'm guessing that you've
set the alternate break sequence in /etc/default/kbd.
Look for a line that says:
KEYBOARD_ABORT=alternate
This is a good idea for machines hooked up to a console server,
so that disconnecting the console cable (or the keyboard, if you're
not using a console) does not cause the machine to drop to the
OBP. This setting takes affect after the OS is loaded and takes
control of the machine.
To actually get to the OBP, you can do one of several things:
1. Send the break sequence before the OS loads and the alternate
break sequence takes affect.
2. From the running OS, issue an init 0 to bring the machine
back to the OBP.
3. From the running OS, issue the alternate break sequence,
the following commands in succession:
return
~
control-b
Once you get to the OBP, you'll want to set the auto-boot?
variable to false for the time being so you can do a reset and
not need to break out. Many diagnostics you'd want to run
from the OBP do not deal well with a machine that's been
brought to the OBP with a break. When you're done running
your diagnostics, set the "auto-boot?" variable back
to true, so your machine will automatically boot after a crash.
Q I've just installed Solaris
10 on an old Netra T1 200 we use for testing, and I'm trying
to get the DHCP client on it functioning. Our DHCP server is
an old BSD box that's not handing out hostnames, so the
Netra keeps setting its hostname as unknown. I googled
for this problem and found a suggestion that says to put the
hostname in /etc/nodename as a fallback. I've tried
this, and I have a blank /etc/hostname.eri0 and /etc/dhcp.eri0.
I can see the machine register itself with the DHCP daemon on
the BSD box and it gets the correct IP, but it's apparently
still trying to set its hostname with the information gained
from the DHCP server (i.e., nothing). The steps to make this
work aren't that difficult to understand, and I think I've
followed them all correctly. Did I miss something fundamental?
A It sounds like you have the
setup configured correctly, but since you didn't post your
config file or your nodename file, I can't be absolutely
sure. There was a series of posts in comp.unix.solaris
from someone who was having a similar problem. In his case,
it turned out that he was missing the newline character after
the hostname in the /etc/nodename file. The thread discussing
this can be found here:
http://groups-beta.google.com/group/comp.unix.solaris/browse_frm/ \
thread/12370dc8db852d7c
Amy Rich has more than a decade of Unix systems administration
experience in various types of environments. Her current roles
include that of Senior Systems Administrator for the University
Systems Group at Tufts University, Unix systems administration
consultant, and author. She can be reached at: qna@oceanwave.com. |