Creating
Cross-Platform Solutions with Open Database Connectivity
D. Hageman
Systems administrators can often find themselves in a situation
where they are forced to support a product lacking in adequate documentation,
stability, and technical support. The stories vary, but the end
result is generally the same. You can probably hear your boss right
now, saying: "We can't get rid of this product, because
it is essential to our business."
The Problem
This story began when the company's administrative assistant
came into my office in a complete panic. This would not normally
be an issue for me as I don't take care of her workstation,
but our Windows administrator was away on vacation. The story unfolded
as she explained that she had been using a proprietary database
product on her workstation, and it suddenly stopped working after
she tried to insert a record. Note that I am the database administrator
for the company, but this was the first I had heard about this database
system.
Apparently, about a year ago, she convinced the Windows administrator
to install this database product on her system. She created a couple
of administrative databases and, as time progressed, she began sharing
the databases with several co-workers. Shortly afterward, it became
apparent that the sharing of the databases was slowing down her
workstation to the point where it was unusable. The Windows administrator
corrected this problem by purchasing and installing the server software
associated with this database system. This was the state of affairs
when I became involved.
My research into the proprietary database quickly showed that
the system suffered from security holes and bad design decisions.
Also, documentation for the product was essentially non-existent.
The final straw, however, came when I discovered the only solution
was to upgrade to the latest version of the software. After a bit
of a struggle, I convinced the boss to replace the product with
a new system, instead of continuing the vicious cycle of paying
for upgrades to fix bugs.
The new system needed all the features of the old system as well
as some new ones to convince the boss it was worthwhile. It had
to be able to easily create reports and customizable user interfaces.
The new system needed to be secure, because the old system was easily
compromised due to implementation flaws. The system also needed
to be easy to back up and restore and be networked among multiple
machines without difficulty. Finally, if I were going to maintain
this system, the core of it needed to run on a Linux box.
The Solution
The best solution I found for this situation was to attach Microsoft
Access to a PostgreSQL database. I am sure many people reading this
article would not necessarily agree with me, but Access can work
decently as a front-end to a relational database system. It also
integrates reasonably well with other Windows applications and has
the same feature set as the old proprietary database. The PostgreSQL
database on the back-end would fulfill the requirements of easy
backups and maintenance.
I will not attempt to show the complete design of the Access/PostgreSQL
solution. The details of most of it are beyond the scope of this
article. My main goal is for you to understand the process of connecting
the Windows system to a PostgreSQL database using a technology called
Open Database Connectivity (ODBC). This technique can be useful
in a wide variety of situations other than the one I will detail
here, making it an invaluable tool in dealing with cross-platform
issues.
Open Database Connectivity
Open Database Connectivity is an open application programming
interface (API) developed as a method for applications to generically
interact with a relational database management system. This technology
has existed long enough that I have yet to find a modern database
product without an ODBC driver. As with most ventures into providing
universal data access, however, ODBC is not without its faults.
The most common problem with ODBC is with data type conversions.
Applications usually do not utilize the same data types as the database.
Part of the ODBC architectures addresses this by providing a way
for data types to be negotiated upon and converted. It follows that
ability for a connection and successful data transfer to be made
depends entirely upon the quality of the ODBC driver and the application
requesting the data. It is safe to say that not all ODBC drivers
are created equal.
Another issue is that many applications claiming to support ODBC
do not implement it well. Also, some products claiming to support
ODBC lack good documentation on how to utilize the features. If
ODBC has such issues, why should you bother with it? When it works,
it works really well and is a great tool in your problem-solving
tool chest. The really good news is that I have seen a trend of
these problems being fixed in more and more products as time progresses.
Configuring the PostgreSQL Server
To begin, I'll describe how to configure the PostgreSQL server
for remote ODBC connections. This is a four-step process on most
Linux server installations. The first step is to modify the configuration
file named postgresql.conf. The goal here is to instruct the database
server to listen to the correct interface. In version 7.4 and lower
versions of PostgreSQL, you need to look for two options. The first
is "tcpip_socket", which should be set to "true",
and the next is "virtual_host", for which the default
is to listen to all interfaces on the machine. If this is not the
behavior you want for the "virtual_host" option, then
you can modify it with specific network addresses.
The PostgreSQL development team has simplified the configuration
in version 8.0 and above. The configuration option we are looking
for is still located in the postgresql.conf file but is named "listen_addresses".
It defaults to just listening to localhost and should be changed
to either a "*", which will make the server listen to
all interfaces, or to a specific address.
The second step is to configure PostgreSQL's host-based access
controls to allow our remote machine to connect to our system. The
file to edit is pg_hba.conf, and it should be located in the same
place you found the postgresql.conf file. It contains fairly detailed
instructions on how to properly form the access control lines, which
should be read carefully. The following line will get you up and
running quickly but is definitely not the best way to do it:
host all all IPADDRESS 255.255.255.255 md5
You will need to change the IPADDRESS in the above example to the
IP address of the workstation using the ODBC driver. If the PostgreSQL
server is already running at this point, then you need to instruct
it to re-read its configuration file by sending it a SIGHUP or by
restarting it. If it isn't running, now is the time to start
it.
The third step is to create your database and load the initial
database schema. If you have some initial data to put into the system
you can also do that at this time. I won't go into the exact
instructions for doing this step as it is very well detailed in
the PostgreSQL documentation. I will recommend you stick with the
basic ANSI SQL data types in your database schema as they will have
the highest rate of success in the ODBC type negotiations.
The final step in the configuration of the database system, if
you are running a host-based firewall, is to open a hole in it to
allow the remote machine to connect. Details on how to do this are
beyond the scope of this article because so many different products
exist to perform this function.
Installing the ODBC Driver
The next part of this setup is the installation of the ODBC driver.
We shall use the official driver from the PostgreSQL development
team called psqlodbc. The driver's homepage can be found at:
http://gborg.postgresql.org/project/psqlodbc/
The latest driver available at this time is psqlodbc-08_00_0101.zip.
You can choose from several different package formats, which can be
handy for your Windows administrator. I recommend the MSI version,
as it will ensure all the registry entries will be properly created.
All ODBC connections on a Windows box are managed through the
ODBC Data Source Administrator located in the Control Panel. The
Control Panel has stayed relatively the same with each Windows version,
but the following instructions will be based on Windows XP Pro.
Locate the Control Panel icon on the Start menu and select it. Make
sure the Control Panel is in the "Category View" mode,
then look for and select the option "Performance and Maintenance".
This will open a new window with more icons. The option we want
next is "Data Sources (ODBC)". This will open the ODBC
Data Source Administrator for Windows.
As you scan the ODBC administrator application, you will see it
has a collection of tabs at the top of the window. Selecting the
"Driver" tab will result in a list of all the drivers
currently installed on your system. Since you have already installed
the psqlodbc driver, it should be listed along with the other drivers
available. If you don't see something like "PostgreSQL"
listed, then the driver probably did not install correctly. In a
situation like this, your best troubleshooting resource is the FAQ
on the driver's homepage and the pgsql-odbc mailing list.
Configuring Connections
There are two types of ODBC data sources you can create for connections
to your database -- "User DSN" connections, and "System
DSN" connections. The "User DSN" connections are
local to the user on that machine. The "System DSN" connections
are available to all users and all Windows services on the machine.
Consider carefully the purpose of the connection and decide which
is the best option for your needs.
The good news is that the configuration procedure for each type
of connection is same after the first step, which is choosing the
type of data source. Once you have selected the appropriate type,
you should see a list of the current data sources of that type along
with some buttons on the right. Select the "Add ..." button
and locate the "PostgreSQL" driver in the driver list.
This will open a dialog box containing the basic options for creating
a database connection including username, password, database server,
and database name. I won't go into the explanation of all those
options, but I will touch briefly upon the "Data Source"
and "Description" options.
The "Data Source" option is an arbitrary name you give
to the connection to identify it locally. I recommend choosing a
good convention for naming connections and using it for all workstations,
as well as using a good description. This will help you deploy and
maintain the setup if you have to distribute it across multiple
workstations. While you are looking at this dialog, I recommend
you examine the "Options" box in the lower right corner.
The "Datasource" button contains various advanced options
you can modify to tweak the behavior of this data source. If you
have a recent version of the driver and a recent version of Microsoft
Access, you shouldn't have to twiddle with these options.
The "Global" button will take you to options that will
enable logs of connections and ODBC interactions. If you are setting
up a development workstation, I recommend that you enable these
logging capabilities. It will slow things down but will provide
good feedback for debugging. These options are also good for investigating
weird issues on production machines. When you are all finished,
select "Save", and you'll be done with this part
of the process. At this point, the Windows machine is capable of
an ODBC connection to the PostgreSQL server.
As I mentioned before, Microsoft Access is designed to be used
"as-is" or as a front-end to a Microsoft SQL Server. Even
though it utilizes special methods to connect to SQL Server, its
ability to act as a good front-end for SQL Server allows it to be
a decent front-end to any SQL database using ODBC. The knowledge
on how to achieve this task is unfortunately not detailed to a high
degree in the Access help files. So, I will take you through the
two operations needed to have the ability to work with ODBC linked
tables in Microsoft Access.
Linking Tables
The first operation is the ability to link a table into an Access
database. Once a table is linked into the database you can interact
with it like any other Access table. This makes it easy for developers
to abstract their user interface away from the data source. We start
by selecting the "Table" object located on the left side
of the database design form. At this point, you should see at least
three options for creating tables along with any other tables you
might have already created in your database.
The next step is to right-click in the whitespace of the window
containing these options, which should produce a pop-up menu. On
this menu, you should find and select an option called "Link
Tables ...". This action will produce a dialog very similar
to the standard Windows open file dialog widget. You want to change
the drop-down select box labeled "Files of type:" to select
"ODBC Data sources". This will cause the dialog to close
and the Windows ODBC Data Source Administrator to open. The administrator
is a simplified version of the one seen previously in the Windows
Control Panel, and it should contain just two tabs labeled "File
Data Source" and "Machine Data Source". The tab you
want to select here is the "Machine Data Source". Assuming
you correctly set up your ODBC database in the Control Panel, you
should see it now listed under the "Machine Data Source"
tab.
Once you select your data source, you should get a dialog listing
all of the tables currently in the database. Select the tables you
want to link into your Access database and click the "OK"
button. It will ask you what the identifying keys are for each table.
I don't think I have to remind you too sternly it is proper
database design to build in a primary key into each table you create.
You may choose to rely on the database's OIDs, but I do not
recommend it in this situation. This advice will save you or your
Access developer from problems later. Once you have chosen the columns
to use as the primary keys, the linked tables will appear under
the "Tables" object in your database. The final test to
ensure that it all worked properly is to just double-click on the
table to see if it will open without complaint.
And Re-Linking Tables
The next operation you need to understand to support ODBC tables
is the ability to refresh the ODBC linking. This operation is needed
in two particular situations. The first situation is when you change
the structure of a table because these changes will not automatically
propagate to your Access front-end. The other situation is when
you change the location of your database server. I found out the
hard way that just changing the connection information in the ODBC
Administrator is not enough to point the front-end to a new server.
It appears that some of this information is stored in the Access
file. I can rationalize this behavior to a certain degree as a security
measure, but in the end it might just be a design oversight. The
point is to be aware that this situation exists and to know how
to overcome it.
The process of re-linking the ODBC tables is very similar to the
process of the initial linking. You start as before by selecting
the "Tables" object from the design view of database.
Next, you will select and then perform a right mouse-click on any
ODBC-linked table. A pop-up menu should appear as before, but it
will contain slightly different options. The option we are looking
for is called "Link Table Manager". When you select this
option, a dialog will appear with a listing of all the current ODBC-linked
tables in your database. Select the check box next to any table
you need to refresh.
If you have changed the location of the database server, you will
also need to select the option at the bottom of the dialog called
"Always prompt for a new location". If you don't
check this option, it will try to refresh the table with the information
stored in your Access database rather than what is listed in your
data source link. When this option is selected, it will prompt you
for the new location of the data source, similar to the process
of initially linking the tables. Obviously, if you haven't
changed the location of the database server, then you do not need
to select this option and it will just update the table structure
information.
Securing the Connections
At this point, you should have the basic knowledge for leveraging
ODBC in your work environment; however, we also need to look at
ways of making these types of connections secure. Note that these
connections as configured in this article should not be encrypted.
PostgreSQL can natively support SSL-encrypted connections, but the
current psqlodbc driver does not yet support it. Several third-party
drivers for PostgreSQL do support SSL, and these products may be
worth looking at depending on your network setup and the sensitivity
of the data being transferred. I also recommend, if you create an
application like the one mentioned in this article, that you create
a unique account for each user. This can help you produce an audit
trail if you later need it. Finally, creating restrictive host-based
access rules in your pg_hba.conf is always good database administration.
Conclusion
In the end, I created a full-featured application to replace a
buggy old system for my company. The new database had all of the
features of the old system but none of the headaches. The boss was
happy that the company didn't have to shell out money for an
upgrade, and the administrative assistant was happy that she could
use her system again. Most importantly, I was happy I wouldn't
have to spend any more time in front of a Windows server dealing
with a buggy database. Now what should I tell the Windows administrator
when he gets back?
D. Hageman has been involved with open source software for
nearly a decade. His interests include artificial intelligence,
robotics, and embedded systems. He currently works as an independent
computer consultant and can be reached at: dhageman@dracken.com. |