Creating Cross-Platform Solutions with Open Database Connectivity

D. Hageman

Systems administrators can often find themselves in a situation where they are forced to support a product lacking in adequate documentation, stability, and technical support. The stories vary, but the end result is generally the same. You can probably hear your boss right now, saying: "We can't get rid of this product, because it is essential to our business."

The Problem

This story began when the company's administrative assistant came into my office in a complete panic. This would not normally be an issue for me as I don't take care of her workstation, but our Windows administrator was away on vacation. The story unfolded as she explained that she had been using a proprietary database product on her workstation, and it suddenly stopped working after she tried to insert a record. Note that I am the database administrator for the company, but this was the first I had heard about this database system.

Apparently, about a year ago, she convinced the Windows administrator to install this database product on her system. She created a couple of administrative databases and, as time progressed, she began sharing the databases with several co-workers. Shortly afterward, it became apparent that the sharing of the databases was slowing down her workstation to the point where it was unusable. The Windows administrator corrected this problem by purchasing and installing the server software associated with this database system. This was the state of affairs when I became involved.

My research into the proprietary database quickly showed that the system suffered from security holes and bad design decisions. Also, documentation for the product was essentially non-existent. The final straw, however, came when I discovered the only solution was to upgrade to the latest version of the software. After a bit of a struggle, I convinced the boss to replace the product with a new system, instead of continuing the vicious cycle of paying for upgrades to fix bugs.

The new system needed all the features of the old system as well as some new ones to convince the boss it was worthwhile. It had to be able to easily create reports and customizable user interfaces. The new system needed to be secure, because the old system was easily compromised due to implementation flaws. The system also needed to be easy to back up and restore and be networked among multiple machines without difficulty. Finally, if I were going to maintain this system, the core of it needed to run on a Linux box.

The Solution

The best solution I found for this situation was to attach Microsoft Access to a PostgreSQL database. I am sure many people reading this article would not necessarily agree with me, but Access can work decently as a front-end to a relational database system. It also integrates reasonably well with other Windows applications and has the same feature set as the old proprietary database. The PostgreSQL database on the back-end would fulfill the requirements of easy backups and maintenance.

I will not attempt to show the complete design of the Access/PostgreSQL solution. The details of most of it are beyond the scope of this article. My main goal is for you to understand the process of connecting the Windows system to a PostgreSQL database using a technology called Open Database Connectivity (ODBC). This technique can be useful in a wide variety of situations other than the one I will detail here, making it an invaluable tool in dealing with cross-platform issues.

Open Database Connectivity

Open Database Connectivity is an open application programming interface (API) developed as a method for applications to generically interact with a relational database management system. This technology has existed long enough that I have yet to find a modern database product without an ODBC driver. As with most ventures into providing universal data access, however, ODBC is not without its faults.

The most common problem with ODBC is with data type conversions. Applications usually do not utilize the same data types as the database. Part of the ODBC architectures addresses this by providing a way for data types to be negotiated upon and converted. It follows that ability for a connection and successful data transfer to be made depends entirely upon the quality of the ODBC driver and the application requesting the data. It is safe to say that not all ODBC drivers are created equal.

Another issue is that many applications claiming to support ODBC do not implement it well. Also, some products claiming to support ODBC lack good documentation on how to utilize the features. If ODBC has such issues, why should you bother with it? When it works, it works really well and is a great tool in your problem-solving tool chest. The really good news is that I have seen a trend of these problems being fixed in more and more products as time progresses.

Configuring the PostgreSQL Server

To begin, I'll describe how to configure the PostgreSQL server for remote ODBC connections. This is a four-step process on most Linux server installations. The first step is to modify the configuration file named postgresql.conf. The goal here is to instruct the database server to listen to the correct interface. In version 7.4 and lower versions of PostgreSQL, you need to look for two options. The first is "tcpip_socket", which should be set to "true", and the next is "virtual_host", for which the default is to listen to all interfaces on the machine. If this is not the behavior you want for the "virtual_host" option, then you can modify it with specific network addresses.

The PostgreSQL development team has simplified the configuration in version 8.0 and above. The configuration option we are looking for is still located in the postgresql.conf file but is named "listen_addresses". It defaults to just listening to localhost and should be changed to either a "*", which will make the server listen to all interfaces, or to a specific address.

The second step is to configure PostgreSQL's host-based access controls to allow our remote machine to connect to our system. The file to edit is pg_hba.conf, and it should be located in the same place you found the postgresql.conf file. It contains fairly detailed instructions on how to properly form the access control lines, which should be read carefully. The following line will get you up and running quickly but is definitely not the best way to do it:

host  all  all  IPADDRESS 255.255.255.255  md5

You will need to change the IPADDRESS in the above example to the IP address of the workstation using the ODBC driver. If the PostgreSQL server is already running at this point, then you need to instruct it to re-read its configuration file by sending it a SIGHUP or by restarting it. If it isn't running, now is the time to start it.

The third step is to create your database and load the initial database schema. If you have some initial data to put into the system you can also do that at this time. I won't go into the exact instructions for doing this step as it is very well detailed in the PostgreSQL documentation. I will recommend you stick with the basic ANSI SQL data types in your database schema as they will have the highest rate of success in the ODBC type negotiations.

The final step in the configuration of the database system, if you are running a host-based firewall, is to open a hole in it to allow the remote machine to connect. Details on how to do this are beyond the scope of this article because so many different products exist to perform this function.

Installing the ODBC Driver

The next part of this setup is the installation of the ODBC driver. We shall use the official driver from the PostgreSQL development team called psqlodbc. The driver's homepage can be found at:

http://gborg.postgresql.org/project/psqlodbc/

The latest driver available at this time is psqlodbc-08_00_0101.zip. You can choose from several different package formats, which can be handy for your Windows administrator. I recommend the MSI version, as it will ensure all the registry entries will be properly created.

All ODBC connections on a Windows box are managed through the ODBC Data Source Administrator located in the Control Panel. The Control Panel has stayed relatively the same with each Windows version, but the following instructions will be based on Windows XP Pro. Locate the Control Panel icon on the Start menu and select it. Make sure the Control Panel is in the "Category View" mode, then look for and select the option "Performance and Maintenance". This will open a new window with more icons. The option we want next is "Data Sources (ODBC)". This will open the ODBC Data Source Administrator for Windows.

As you scan the ODBC administrator application, you will see it has a collection of tabs at the top of the window. Selecting the "Driver" tab will result in a list of all the drivers currently installed on your system. Since you have already installed the psqlodbc driver, it should be listed along with the other drivers available. If you don't see something like "PostgreSQL" listed, then the driver probably did not install correctly. In a situation like this, your best troubleshooting resource is the FAQ on the driver's homepage and the pgsql-odbc mailing list.

Configuring Connections

There are two types of ODBC data sources you can create for connections to your database -- "User DSN" connections, and "System DSN" connections. The "User DSN" connections are local to the user on that machine. The "System DSN" connections are available to all users and all Windows services on the machine. Consider carefully the purpose of the connection and decide which is the best option for your needs.

The good news is that the configuration procedure for each type of connection is same after the first step, which is choosing the type of data source. Once you have selected the appropriate type, you should see a list of the current data sources of that type along with some buttons on the right. Select the "Add ..." button and locate the "PostgreSQL" driver in the driver list. This will open a dialog box containing the basic options for creating a database connection including username, password, database server, and database name. I won't go into the explanation of all those options, but I will touch briefly upon the "Data Source" and "Description" options.

The "Data Source" option is an arbitrary name you give to the connection to identify it locally. I recommend choosing a good convention for naming connections and using it for all workstations, as well as using a good description. This will help you deploy and maintain the setup if you have to distribute it across multiple workstations. While you are looking at this dialog, I recommend you examine the "Options" box in the lower right corner. The "Datasource" button contains various advanced options you can modify to tweak the behavior of this data source. If you have a recent version of the driver and a recent version of Microsoft Access, you shouldn't have to twiddle with these options.

The "Global" button will take you to options that will enable logs of connections and ODBC interactions. If you are setting up a development workstation, I recommend that you enable these logging capabilities. It will slow things down but will provide good feedback for debugging. These options are also good for investigating weird issues on production machines. When you are all finished, select "Save", and you'll be done with this part of the process. At this point, the Windows machine is capable of an ODBC connection to the PostgreSQL server.

As I mentioned before, Microsoft Access is designed to be used "as-is" or as a front-end to a Microsoft SQL Server. Even though it utilizes special methods to connect to SQL Server, its ability to act as a good front-end for SQL Server allows it to be a decent front-end to any SQL database using ODBC. The knowledge on how to achieve this task is unfortunately not detailed to a high degree in the Access help files. So, I will take you through the two operations needed to have the ability to work with ODBC linked tables in Microsoft Access.

Linking Tables

The first operation is the ability to link a table into an Access database. Once a table is linked into the database you can interact with it like any other Access table. This makes it easy for developers to abstract their user interface away from the data source. We start by selecting the "Table" object located on the left side of the database design form. At this point, you should see at least three options for creating tables along with any other tables you might have already created in your database.

The next step is to right-click in the whitespace of the window containing these options, which should produce a pop-up menu. On this menu, you should find and select an option called "Link Tables ...". This action will produce a dialog very similar to the standard Windows open file dialog widget. You want to change the drop-down select box labeled "Files of type:" to select "ODBC Data sources". This will cause the dialog to close and the Windows ODBC Data Source Administrator to open. The administrator is a simplified version of the one seen previously in the Windows Control Panel, and it should contain just two tabs labeled "File Data Source" and "Machine Data Source". The tab you want to select here is the "Machine Data Source". Assuming you correctly set up your ODBC database in the Control Panel, you should see it now listed under the "Machine Data Source" tab.

Once you select your data source, you should get a dialog listing all of the tables currently in the database. Select the tables you want to link into your Access database and click the "OK" button. It will ask you what the identifying keys are for each table. I don't think I have to remind you too sternly it is proper database design to build in a primary key into each table you create. You may choose to rely on the database's OIDs, but I do not recommend it in this situation. This advice will save you or your Access developer from problems later. Once you have chosen the columns to use as the primary keys, the linked tables will appear under the "Tables" object in your database. The final test to ensure that it all worked properly is to just double-click on the table to see if it will open without complaint.

And Re-Linking Tables

The next operation you need to understand to support ODBC tables is the ability to refresh the ODBC linking. This operation is needed in two particular situations. The first situation is when you change the structure of a table because these changes will not automatically propagate to your Access front-end. The other situation is when you change the location of your database server. I found out the hard way that just changing the connection information in the ODBC Administrator is not enough to point the front-end to a new server. It appears that some of this information is stored in the Access file. I can rationalize this behavior to a certain degree as a security measure, but in the end it might just be a design oversight. The point is to be aware that this situation exists and to know how to overcome it.

The process of re-linking the ODBC tables is very similar to the process of the initial linking. You start as before by selecting the "Tables" object from the design view of database. Next, you will select and then perform a right mouse-click on any ODBC-linked table. A pop-up menu should appear as before, but it will contain slightly different options. The option we are looking for is called "Link Table Manager". When you select this option, a dialog will appear with a listing of all the current ODBC-linked tables in your database. Select the check box next to any table you need to refresh.

If you have changed the location of the database server, you will also need to select the option at the bottom of the dialog called "Always prompt for a new location". If you don't check this option, it will try to refresh the table with the information stored in your Access database rather than what is listed in your data source link. When this option is selected, it will prompt you for the new location of the data source, similar to the process of initially linking the tables. Obviously, if you haven't changed the location of the database server, then you do not need to select this option and it will just update the table structure information.

Securing the Connections

At this point, you should have the basic knowledge for leveraging ODBC in your work environment; however, we also need to look at ways of making these types of connections secure. Note that these connections as configured in this article should not be encrypted. PostgreSQL can natively support SSL-encrypted connections, but the current psqlodbc driver does not yet support it. Several third-party drivers for PostgreSQL do support SSL, and these products may be worth looking at depending on your network setup and the sensitivity of the data being transferred. I also recommend, if you create an application like the one mentioned in this article, that you create a unique account for each user. This can help you produce an audit trail if you later need it. Finally, creating restrictive host-based access rules in your pg_hba.conf is always good database administration.

Conclusion

In the end, I created a full-featured application to replace a buggy old system for my company. The new database had all of the features of the old system but none of the headaches. The boss was happy that the company didn't have to shell out money for an upgrade, and the administrative assistant was happy that she could use her system again. Most importantly, I was happy I wouldn't have to spend any more time in front of a Windows server dealing with a buggy database. Now what should I tell the Windows administrator when he gets back?

D. Hageman has been involved with open source software for nearly a decade. His interests include artificial intelligence, robotics, and embedded systems. He currently works as an independent computer consultant and can be reached at: dhageman@dracken.com.