W2H: WWW Interface to GCG

Martin Senger, Industry Programme, EMBL Outstation, The European Bioinformatics Institute.

Introduction

W2H stands for the WWW interface to the GCG Sequence Analysis Software Package (Genetics Computer Group, Inc., Madison, Wisconsin) or to derived services (such as EGCG, the Extended GCG, Sanger Centre, Hinxton, UK, or HUSAR, the Heidelberg Unix Sequence Analysis Resources). W2H tries to cover as much of the package's functionality as possible and to be as user-friendly as we could make it. It gives you access to more than a hundred programs from any platform where Netscape runs.

The interface is being developed as a collaborative project between the European Bioinformatics Institute (EMBL-EBI), Hinxton, UK (within the Biostandards project in the Industry Support Programme) and the German Cancer Research Center (DKFZ), Heidelberg.

Description

Users who know the Wisconsin Package Interface (WPI) will soon recognize that W2H was designed to be as compatible with WPI as possible. On the other hand, W2H also supports classical usage, with no obligation to work with working lists. Both sequence-oriented and application-oriented approaches are therefore available.

The implementation, however, is quite different from WPI. W2H has a real client-server architecture. The user is a client running Netscape Navigator (a WWW browser) on her or his computer, and all requests are transferred over the network (using HTTP) to the server computer, where the GCG programs run and perform the analysis. We tried our best to minimize the number of round-trips between client and server.

A typical scenario starts with choosing one or more sequences you want to analyse. Then you select or type the name of the application you want to run. An application program window appears, displaying the selected sequences as input and allowing you to set the required and optional parameters before running the program. From the application window the program is started and the result window appears. Either you refresh the window by clicking a button, or you specify the client-pull method to poll the result buffer automatically.
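The two ways of refreshing the result window can be illustrated with a minimal sketch. It assumes that client pull is done with the browser's Refresh directive in the page head. The function names, the job object, the URL and the program name are hypothetical and are not taken from the W2H source.

```javascript
// Hypothetical sketch: build a result page that either asks the browser to
// reload it automatically (client pull) or leaves refreshing to the user.

// Escape text so that program output cannot be interpreted as HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// job:     { program, output, finished }       (illustrative shape)
// options: { resultUrl, autoRefreshSeconds }   (0 disables client pull)
function renderResultPage(job, options) {
  const head = ["<title>Result: " + escapeHtml(job.program) + "</title>"];

  // Client pull: while the program is still running, tell the browser to
  // request the result buffer again after the given number of seconds.
  if (!job.finished && options.autoRefreshSeconds > 0) {
    head.push(
      '<meta http-equiv="Refresh" content="' +
        options.autoRefreshSeconds + "; URL=" +
        escapeHtml(options.resultUrl) + '">'
    );
  }

  const body = ["<pre>" + escapeHtml(job.output) + "</pre>"];

  // Manual alternative: a link the user follows to fetch the buffer again.
  if (!job.finished) {
    body.push('<a href="' + escapeHtml(options.resultUrl) + '">Refresh</a>');
  }

  return (
    "<html><head>" + head.join("") + "</head><body>" +
    body.join("") + "</body></html>"
  );
}
```

Once the job is finished, the page contains neither the Refresh directive nor the link, so polling stops by itself.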

The interface is quite complex. Besides executing the GCG analysis programs, it covers features such as a sequence selector, a search set builder, a pattern chooser, access to the sequence databases, uploading of client files to the GCG server, and displaying and manipulating graphical output. Altogether it consists of more than 30 HTML frames; in addition, a specialized form with all mandatory and optional parameters is generated automatically for each application.
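As an illustration of how a per-application form could be generated from a parameter description, here is a minimal sketch. The shape of the description object, the field names and the action URL are assumptions made for this example. They do not reproduce the GCG configuration file format or the actual W2H generator.

```javascript
// Hypothetical sketch: generate an HTML form from a list of parameter
// descriptions, with mandatory parameters placed before optional ones.

function escapeMarkup(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Choose a form control according to the (illustrative) parameter type.
function renderControl(p) {
  const name = escapeMarkup(p.name);
  if (p.type === "choice") {
    const options = p.choices.map(function (choice) {
      const selected = choice === p.default ? " selected" : "";
      return "<option" + selected + ">" + escapeMarkup(choice) + "</option>";
    });
    return '<select name="' + name + '">' + options.join("") + "</select>";
  }
  if (p.type === "flag") {
    const checked = p.default ? " checked" : "";
    return '<input type="checkbox" name="' + name + '"' + checked + ">";
  }
  // Anything else becomes a text field pre-filled with the default value.
  const value = p.default === undefined ? "" : p.default;
  return '<input type="text" name="' + name + '" value="' +
    escapeMarkup(value) + '">';
}

function renderParameter(p) {
  const marker = p.required ? " *" : "";
  return "<p>" + escapeMarkup(p.label) + marker + ": " +
    renderControl(p) + "</p>";
}

function renderApplicationForm(app) {
  const required = app.parameters.filter(function (p) { return p.required; });
  const optional = app.parameters.filter(function (p) { return !p.required; });
  return (
    '<form method="POST" action="' + escapeMarkup(app.action) + '">' +
    required.map(renderParameter).join("") +
    optional.map(renderParameter).join("") +
    '<input type="submit" value="Run ' + escapeMarkup(app.name) + '">' +
    "</form>"
  );
}
```

Keeping the description as data means that adding an application needs only a new description, not a new hand-written form.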

For special environments, such as workshops, conferences and company intranets, there is a special mode (Intranet mode) that can be set up easily and used without UNIX accounts for all users on the server side.

Software implementation

W2H requires Netscape Navigator version 2.0 or later, or another browser capable of interpreting the JavaScript scripting language embedded in HTML documents.

The main advantage of a language embedded in HTML is that the whole user request is prepared on the client side, with no need for network round-trips. JavaScript can verify the user's input, suggest default values and provide sufficient help. Entering a single value in a form can produce derived values and make them appear in other places, not only in the same form but also in a quite separate window. This makes the user interface much more powerful and user-friendly.
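The kind of client-side checking described above can be sketched with two small functions. This is an illustration only: the parameter definition, its limits and the derived "length" value are hypothetical and are not taken from the W2H code.

```javascript
// Hypothetical sketch: verify a numeric input on the client, fall back to a
// suggested default, and compute a value derived from two other fields.

// def: { name, min, max, default }   (illustrative shape)
function checkNumber(def, raw) {
  const text = String(raw).trim();

  // Empty input: suggest the default instead of reporting an error.
  if (text === "") {
    return { ok: true, value: def.default,
             message: "using default " + def.default };
  }

  const value = Number(text);
  if (!Number.isFinite(value)) {
    return { ok: false, value: null,
             message: def.name + " must be a number" };
  }
  if (value < def.min || value > def.max) {
    return { ok: false, value: null,
             message: def.name + " must be between " + def.min +
                      " and " + def.max };
  }
  return { ok: true, value: value, message: "" };
}

// Derived value: the length of a sequence range, to be shown in another
// field (or another window) as soon as both ends are known.
function deriveRangeLength(begin, end) {
  return end >= begin ? end - begin + 1 : null;
}
```

In a page, functions like these would typically be called from a field's change handler, so the user sees the error or the derived value without any request being sent to the server.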

Security

Applications developed for use through a WWW interface should always carefully consider the possibility of security holes. WWW tools, especially CGI scripts, are very powerful and, used in the wrong way, can make the system vulnerable to intentional or unintentional attacks.

The W2H design has to take security issues even more seriously because it gives complete access to the server computer, although of course only to registered users. The interface considers how to protect the server machine against unauthorized users and how to protect user data against unauthorized access by other users.

Comparison with other GCG interfaces

There are other user interfaces to GCG as well. Here I present a short comparison with WPI, the Wisconsin Package Interface (Genetics Computer Group Inc., Madison, Wisconsin), and with www2gcg (Bioinformatics Unit, Universite Libre de Bruxelles). Of course, the table does not include all features but concentrates on the topics I consider important from both the user's and the developer's points of view.

Summary

Moving activities to the client side substantially reduces network traffic. W2H generally does so.

Taking advantage of the GCG/WPI configuration files is an important benefit. These .config files contain both parameter definitions and a layout description, including dependencies between parameters. They are also supported, and considerably improved, in GCG 9.

Using a platform-independent client browser is a general advantage of any WWW interface over other solutions. W2H depends on Netscape (because of the embedded JavaScript).

Both W2H and www2gcg implement a real client-server architecture, which gives users distributed processing and a better allocation of resources.

A more independent comparison would be welcome. In the table above, very important features such as performance, complexity, user learning curve and maintainability were not considered at all, or only partly.

Future directions

Besides extending the basic functions and features (above all implementing batch queue processing and dependency rules between parameters), W2H will in the future consider further parsing and processing of application output and linking it to other information sources (very probably by using SRS).

Another direction, already started, is the design and implementation of a CORBA-based interface to GCG and similar applications.

Availability

The W2H interface has its own homepage with links to the documentation, the latest news and FAQs.

The W2H interface is provided as free software (though it is useful only to licensed GCG users) and is available by anonymous ftp at EBI and DKFZ.

