Cover V13, i14

Questions and Answers

Amy Rich

Q I have several files that I need to, for lack of a better word, shuffle together. Each file contains a column of data, and I need to wind up with one file that contains all columns, lined up by line number, and delimited by commas (to import into some database software). For example, say I have three files, foo, bar, and baz. Their contents are as follows:

foo:           bar:           baz:

  1              a              A
  2              b              B
  3              c              C
  4              d              D
  5              e              E
  6              f
  7              g
  8              h
  9              i
                 j
                 k
                 l
                 m
                 n
                 o
                 p
                 q
I want to combine the columns of foo, bar, and baz so I wind up with a file that looks like:

  1,a,A
  2,b,B
  3,c,C
  4,d,D
  5,e,E
  6,f,
  7,g,
  8,h,
  9,i,
  ,j,
  ,k,
  ,l,
  ,m,
  ,n,
  ,o,
  ,p,
  ,q,
I tried to write an awk script to do this, but it was more difficult than I anticipated because not all the columns are of the same length. Do you know of a simple way to handle a variable number of files and columns?

A Assuming that each file has only one column of data, the best tool for this job is paste, a utility that merges files by line position. It accepts any number of files, and when one file runs out of lines before the others, paste emits empty fields for it. By default, paste replaces the newline character at the end of every line, except lines from the last file, with a tab character.

You can change this to a comma by specifying the -d flag:

paste -d , foo bar baz > datafile
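
To see how paste treats files of unequal length, here is a minimal sketch that builds three short sample files in a scratch directory and merges them. The merge_columns function name and the sample data are made up for illustration and are not part of the original question.

```shell
# Hypothetical wrapper: merge any number of single-column files,
# separating the columns with commas instead of the default tab.
merge_columns() {
    paste -d , "$@"
}

# Build three sample files of different lengths in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1 2 3     > "$workdir/foo"
printf '%s\n' a b c d e > "$workdir/bar"
printf '%s\n' A B       > "$workdir/baz"

# Files that run out of lines contribute empty fields to the output.
merge_columns "$workdir/foo" "$workdir/bar" "$workdir/baz"

rm -rf "$workdir"
```

Because the function passes all of its arguments straight to paste, the same call works for two files or twenty.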

Q Because of the recent rise in spam that has only a random dollar figure in the subject line, I want to write a sendmail rule that does a regular expression match on the subject and bounces the message. I'm running sendmail 8.12.11, and the rule I came up with is:

LOCAL_CONFIG
Kmoneyspam regex -a@MATCH (\$[0-9][0-9]+)

LOCAL_RULESETS
HSubject:       $>CheckMoneySubject

SCheckMoneySubject
R$*             $: $(moneyspam $1 $: <OK> $)
R<OK>           $@ OK
R$+             $#error $: 550 5.7.1 Spam not accepted here
Unfortunately, I can't get the ruleset to work correctly. It seems never to match any of the spam subject lines. Could you check over my ruleset and offer me some pointers?

A Your regular expression match is not working because sendmail is performing a macro expansion on the $. You need to use \$$ to match a literal $ character. I would also suggest changing the regular expression so that it doesn't catch ANY subject with $NN in it, but only those subjects that have JUST a dollar figure in them:

Kmoneyspam regex -a@MATCH ^(\$$[0-9]+)?$
This would reject:

Subject: $1
But not either of:

Subject: a $1
Subject: $1 a
This way you don't accidentally reject important mail, say something from an eBay auction you won, with a dollar figure in the subject line.
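
Before loading a pattern like this into sendmail, it can help to try it against a few sample subjects from the shell. The sketch below assumes that sendmail's regex map uses POSIX extended regular expressions, so grep -E should behave similarly. The is_money_only function and the sample subjects are hypothetical. Note that outside sendmail.cf there is no macro expansion, so the doubled $$ is written as a single escaped \$.

```shell
# Hypothetical helper: succeed when a subject matches the anchored pattern.
# This is the same expression as in the K line, minus sendmail's $$ doubling.
is_money_only() {
    printf '%s\n' "$1" | grep -Eq '^(\$[0-9]+)?$'
}

# Try a few made-up subjects and report what the rule would do with each.
for subject in '$1' '$250' 'a $1' '$1 a'; do
    if is_money_only "$subject"; then
        echo "reject: $subject"
    else
        echo "accept: $subject"
    fi
done
```

One thing this kind of test makes visible: because the group is followed by ?, the expression as written also matches an empty string, so it is worth deciding whether you want empty subjects treated the same way.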

Also, the syntax is wrong in your ruleset's rejection line. It should be:

$#error $@ 5.7.1 $: 550 Spam not accepted here

Q We're running Solaris 9 on a variety of SPARC-based platforms. Occasionally, when we add a new package, our /usr/local symlink gets removed and the new package creates its own /usr/local directory. This is annoying and extremely disruptive on a live system. How can we prevent /usr/local from being removed and remade?

A You don't say whether you're building the packages yourself or getting them from a third-party source over which you have no control. If you have any say in the package creation process, then make sure you don't specify /usr/local as a directory in the prototype file. Make sure there are no lines that look like:

d none /usr/local 0755 root other
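
If you maintain many prototype files, a quick scan can catch such lines before a package is built. This is only a sketch: the find_usr_local_dirs function is hypothetical, and it assumes the simple prototype layout shown here (file type, class, path, mode, owner, group) with no leading part number.

```shell
# Hypothetical helper: report prototype lines that declare /usr/local
# itself as a directory (file type "d" with the path in the third field).
find_usr_local_dirs() {
    awk '$1 == "d" && $3 == "/usr/local" {
        print FILENAME ": line " FNR ": " $0
    }' "$@"
}

# Demonstrate on a throwaway prototype file with one offending line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
i pkginfo=pkginfo
d none /usr/local 0755 root other
d none /usr/local/sbin 0755 root other
EOF
find_usr_local_dirs "$sample"
rm -f "$sample"
```

Only the second line of the sample is reported, since subdirectories such as /usr/local/sbin do not replace the symlink itself.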
You can hard-code your prototype files to treat /usr/local as a symbolic link (this assumes that you're linking to /opt/local):

s none /usr/local=/opt/local
But the better solution is to set the BASEDIR to /usr/local and use relative paths in the prototype file. Here's an example prototype file, followed by the matching pkginfo file, for the md5 program:

i pkginfo=pkginfo
!search /usr/local
d none sbin 0755 root other
f none sbin/md5 0755 root other


BASEDIR=/usr/local
PATH=/sbin:/usr/sbin:/usr/bin:/usr/sadm/install/bin
PKG=md5
NAME=md5
VERSION=1.0
ARCH=sun4u
CATEGORY=application
VENDOR=Your Name Here
EMAIL=me@my.domain
PSTAMP=20040811
CLASSES=none
Create the package with the command:

pkgmk -b /usr/local -o -f prototype
If you don't have control over the creation of these packages or you don't want to deal with the hassle of using symlinks or relative paths, you can instead mount /usr/local as a loopback filesystem and not use a symlink. To do this, first use a tool like lsof or fuser to make sure that no one is accessing files within /usr/local (or just boot into single-user mode). Then remove the /usr/local symbolic link and create a directory in its place:

rm /usr/local
mkdir /usr/local
chmod 0755 /usr/local
Then create the following entry in /etc/vfstab:

/opt/local   -   /usr/local    lofs    -       yes     -
Finally, mount the loopback filesystem:

mount /usr/local
Now packages will see /usr/local as a directory, and you shouldn't have any more issues.

Q I use tcsh as my interactive shell on a variety of Unix platforms. I'd like to whip up an alias that does the equivalent of the following command line:

find . -exec grep EXP {} \;
I'd like to be able to specify EXP as the input to the alias, for example:

rgrep foo   should expand to   find . -exec grep foo {} \;
rgrep bar   should expand to   find . -exec grep bar {} \;
I'm not sure how to read the input from the command line, though; any suggestions?

A To do exactly what you asked, you could add the following alias to your .cshrc or .tcshrc:

alias rgrep '/usr/bin/find . -exec /usr/bin/grep \!:1 {} \;'
Unfortunately, what you asked for has a few shortcomings. This will not print the name of the file when you get a match. It will also try to grep on directories, special files, etc. To make this a bit better, it could be rewritten as:

alias rgrep '/usr/bin/find . -type f -exec /usr/bin/grep \!:1 {} \; -print'
This will produce output that has the matching lines from each file, followed by that file's name on the next line. With more massaging, you could get it to look exactly like a standard grep command. The best answer, though, is to forget about the alias altogether and use a tool like GNU grep, which has a recursive option like this:

grep -r foo .
GNU grep is available from:

http://www.gnu.org/software/grep/
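
If installing GNU grep is not an option, the "more massaging" mentioned above might look something like the following. Note the assumptions: this is a Bourne-style shell function rather than a tcsh alias, and it relies on a find that supports the POSIX "{} +" form and a grep that supports -e. Passing /dev/null as an extra file is a common trick to make grep prefix every match with its filename.

```shell
# Hypothetical sh function (not a tcsh alias): search regular files under
# the current directory and print matches as filename:line, like grep -r.
rgrep() {
    # /dev/null guarantees grep always sees more than one file, so it
    # prints the filename even when find hands it a single path.
    find . -type f -exec grep -e "$1" /dev/null {} +
}

# Demonstrate in a scratch directory so the search is predictable.
demo=$(mktemp -d)
mkdir "$demo/sub"
echo 'foo lives here' > "$demo/sub/a.txt"
echo 'nothing to see' > "$demo/b.txt"
(cd "$demo" && rgrep foo)
rm -rf "$demo"
```

If your find lacks "{} +", replacing the + with \; gives the same output at the cost of running grep once per file.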

Q We have a number of Windows users at our site who mount Samba shares from our OpenBSD server. When these folks go on the road, they want the same access to the shares, but we don't have any sort of VPN. Obviously, I don't want to pass NetBIOS traffic in the clear over a public network, and I don't really want to open us up for attack. Is there a secure way to do this without an actual VPN?

A You can tunnel Samba over ssh or use stunnel to encrypt and redirect the session with SSL. In both cases, you're going to need to modify your client machines, since Windows already binds NetBIOS to the local port 139. Each client must therefore install the Microsoft Loopback adapter as described at:

http://support.microsoft.com/default.aspx?scid=839013
Configure the loopback device with an unused IP such as 10.10.10.10 or 222.222.222.222 and disable file and print sharing and NetBIOS under the TCP/IP settings.

The client machines can now opt to go with ssh or stunnel as an encryption method. Setting up tunneling with PuTTY is described at:

http://lists.samba.org/archive/samba/2004-May/085358.html  
Setting up access with stunnel is described at:

http://research.lumeta.com/ches/cheap/stunnelsolution.html
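
To make the ssh approach a little more concrete, the sketch below prints the kind of port-forwarding command involved so you can review it before running anything. Everything specific here is an assumption: the host name samba.example.com, the user name, the use of a command-line OpenSSH client on the Windows machine (for example, under Cygwin) that accepts a bind address with -L, and the smb_tunnel_command function itself. The PuTTY instructions linked above set up the same forwarding through its GUI.

```shell
# Hypothetical values: the loopback adapter address configured on the
# client and the public name of the OpenBSD server running Samba.
LOOPBACK_IP=10.10.10.10
SAMBA_HOST=samba.example.com

# Print (rather than run) an ssh command that listens on port 139 of the
# loopback adapter and forwards it to port 139 on the server itself.
smb_tunnel_command() {
    # -N: do not run a remote command, just forward ports
    # -L: local bind address:port, then destination host:port as seen
    #     from the server
    printf 'ssh -N -L %s:139:127.0.0.1:139 %s@%s\n' \
        "$LOOPBACK_IP" "$1" "$SAMBA_HOST"
}

smb_tunnel_command alice
```

With such a tunnel up, the client would map shares against the loopback address (\\10.10.10.10\share in this example) instead of the server's real name.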

Q I'm running FreeBSD 4.10 on a generic PIII machine that I cobbled together from spare parts. The machine is up and running and acting as a development machine for a bunch of users, so I don't want to shut it down. The developers want to know some information about the BIOS and the hardware that's installed. Is there a way I can find that information without shutting down the machine?

A You don't say exactly what information you need to know, but you can get a lot from running dmesg or from the dmesg boot time snapshot, /var/run/dmesg.boot. Additionally, take a look at the tool dmidecode if you're running a machine that has DMI/SMBIOS. You can install it from the ports collection: /usr/ports/sysutils/dmidecode.

The dmidecode binary dumps the contents of the DMI table, which holds a lot of information, in human-readable format. Each record in the table has a handle, a type, a size, and the decoded values. The handle is a unique identifier that allows records to refer to one another. The type identifies the kind of element the record describes (processor, cache, memory module, and so on). The size is roughly the size of the record (not accounting for text strings). The contents of the decoded section depend on the type of element. As an example, an old PII Celeron processor might look like the following:

Handle 0x0004
   DMI type 4, 32 bytes.
   Processor Information
         Socket Designation: SLOT 1
         Type: Central Processor
         Family: Pentium II
         Manufacturer: Intel
         ID: 65 06 00 00 FF 39 43 11
         Signature: Type 0, Family 6, Model 6, Stepping 5
         Flags:
                 FPU (Floating-point unit on-chip)
                 VME (Virtual mode extension)
                 DE (Debugging extension)
                 PSE (Page size extension)
                 TSC (Time stamp counter)
                 MSR (Model specific registers)
                 PAE (Physical address extension)
                 MCE (Machine check exception)
                 CX8 (CMPXCHG8 instruction supported)
                 SEP (Fast system call)
                 MTRR (Memory type range registers)
                 PGE (Page global enable)
                 MCA (Machine check architecture)
                 CMOV (Conditional move instruction supported)
                 PAT (Page attribute table)
                 PSE-36 (36-bit page size extension)
                 MMX (MMX technology supported)
                 FXSR (Fast floating-point save and restore)
         Version: INTEL(R) CELERON(TM)
         Voltage: 3.3 V
         External Clock: 66 MHz
         Max Speed: 200 MHz
         Current Speed: 533 MHz
         Status: Populated, Enabled
         Upgrade: Slot 1
         L1 Cache Handle: 0x0000
         L2 Cache Handle: 0x0000
         L3 Cache Handle: No L3 Cache
Be forewarned that the amount of information that dmidecode supplies takes up a number of pages. You'll probably want to parse it with a script or at least pass it through your favorite paging program.
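
As a starting point for such a script, here is a minimal sketch that pulls a few fields out of every "Processor Information" record. The processor_summary function and the choice of fields are hypothetical, and it assumes output laid out like the example above. The demonstration feeds it a trimmed copy of that example so it runs without dmidecode installed.

```shell
# Hypothetical helper: read dmidecode output on stdin and print selected
# fields from each "Processor Information" record.
processor_summary() {
    awk '
        /^Handle /                            { in_cpu = 0 }       # new record
        /^[ \t]*Processor Information[ \t]*$/ { in_cpu = 1; next } # CPU record
        in_cpu && /^[ \t]*(Family|Version|Current Speed):/ {
            sub(/^[ \t]+/, "")   # strip the leading indentation
            print
        }
    '
}

# Trimmed sample taken from the output shown above.
processor_summary <<'EOF'
Handle 0x0004
   DMI type 4, 32 bytes.
   Processor Information
         Socket Designation: SLOT 1
         Family: Pentium II
         Version: INTEL(R) CELERON(TM)
         Current Speed: 533 MHz
EOF
```

On a live system you would run something like "dmidecode | processor_summary" as root, or simply "dmidecode | more" to page through everything.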

The FreeBSD dmidecode port also comes with three other programs:

biosdecode -- Prints all the BIOS info it can find.

ownership -- Prints the "ownership tag" sometimes set on Compaq machines.

vpddecode -- Prints the "vital product data" information found in most IBM machines.

Amy Rich, president of the Boston-based Oceanwave Consulting, Inc. (http://www.oceanwave.com), has been a UNIX systems administrator for more than 10 years. She received a BSCS at Worcester Polytechnic Institute, and can be reached at: qna@oceanwave.com.