Network Performance Monitoring in EMBnet

Jan H. Noordik, J.A.M. Leunissen and K. Cuelenaere

Dutch National EMBnet Node

CAOS/CAMM Center, University of Nijmegen, The Netherlands


Introduction

The European Molecular Biology network EMBnet was established in 1988 to link European laboratories where biocomputing and bioinformatics were used in molecular biology research. The initiators saw the network as a way of bringing a fast-growing stream of information to users throughout Europe, and of moving data to and from what was at that time the EMBL Data Library in Heidelberg. But EMBnet was also seen as much more.

Bioinformatics, as a rapidly developing science, required extensive user training and support, and on occasion researchers would require specialised hardware and software that could not be economically duplicated throughout a country or a group of nations. It was thought that such needs could best be handled by providing national-language help and regionally tailored services.

Thus EMBnet swiftly evolved into a series of collaborating national and specialised nodes, spread throughout Europe and cooperating for their own and their users' common good. Today EMBnet, as an "Institute without Walls", not only complements Europe's central facilities such as the EBI (European Bioinformatics Institute) in Hinxton (UK), but is also the de facto collaboration forum for bioinformatics worldwide. Currently EMBnet consists of over thirty partner institutes or EMBnet nodes. The network is organised as a "Stichting" under Dutch law. Funds are mainly obtained from node fees and from a Concerted Action Program grant from the European Commission (ERBBIO4-CT96-0030).


Network requirements

EMBnet nodes maintain daily updated DNA sequence databases and provide login and/or Web biocomputing services for hundreds of researchers throughout Europe who wish to use their national node's infrastructure and facilities for database searching and biocomputing. These operations involve the daily transport of hundreds of Mb of data among nodes and between nodes and their end users, so considerable bandwidth is a condition for smooth operation. On-line services, as provided by many EMBnet nodes, require fast response times. EMBnet's success is therefore critically dependent on both these network parameters, i.e. fast and reliable network connections with moderately high throughput capacity.

Spurred by the rapidly increasing amount of data to be handled in EMBnet, and the equally rapid increase in network traffic in general, EMBnet decided by the end of 1994 to start an ongoing network performance monitoring program. The data produced by this program were intended to provide individual EMBnet nodes with objective and reliable figures on their network accessibility. Many nodes expressed an urgent need for this information, to be used in discussions with their local network authorities. In addition, the data could be used to verify claims by international (e.g. DANTE) and national (e.g. SURFnet in the Netherlands) data network providers that (international) network performance and the quality of service (QoS), as experienced by the end-user in the network, are constantly improving and that bottlenecks in network traffic are gradually disappearing.


Methods

In the monitoring program described here, network performance data have been collected during 1995, 1996 and 1997, and measurements still continue. For the data collection, two simple tools for network testing were used: ping and traceroute. The IRIX man pages describe ping as follows:

"a tool for network testing, measurement and management. It utilizes the ICMP protocol's ECHO_REQUEST datagram to elicit an ICMP ECHO_RESPONSE from a host or gateway. ECHO_REQUEST datagrams (`pings') have an IP and ICMP header, followed by an 8-byte time stamp, and then an arbitrary number of `pad' bytes used to fill out the packet."

For our monitoring facility, we used packet sizes of 64 bits to mimic average traffic in EMBnet, which is usually a mix of telnet for on-line sessions and FTP for data transfer. Ping requests are sent to all EMBnet nodes several times a day, every day, from different locations within the network. At 01:00, 09:00, 12:00, 15:00 and 18:00 hrs. a fixed number of five requests is sent, for which round-trip times (RTTs) and packet loss statistics are collected. From these data, minimum, maximum and average values per day are calculated, and the daily average RTT values are averaged into an overall monthly indicator, the quality of network accessibility (QNA), for a particular EMBnet node. The variation of the QNA (the average RTT over a period of one month) shows general and node-specific improvement or deterioration of network performance as a function of time. Packet loss data give an indication of node reachability. A high monthly average percentage of packet loss means in practice that a node is almost cut off from the network. A 100% packet loss in the daily data indicates that the node could not be reached on that specific day. Traceroute data were used occasionally to track specific network hops causing extreme delays.
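
The aggregation described above can be sketched in Python as follows. This is only a minimal illustration, not the monitoring utility used in the project: the function names, the use of milliseconds, and the choice to leave days without any reply out of the monthly average are all assumptions.

```python
from statistics import mean


def daily_summary(rtts_ms):
    """Return (min, avg, max) of all RTTs collected on one day.

    Returns None when no reply was received that day (100% packet loss).
    """
    if not rtts_ms:
        return None
    return min(rtts_ms), mean(rtts_ms), max(rtts_ms)


def monthly_qna(daily_rtts):
    """Average the daily average RTTs into one monthly indicator (QNA).

    Assumption: days without any reply are skipped rather than
    counted with some penalty value.
    """
    daily_avgs = [daily_summary(day)[1] for day in daily_rtts if day]
    return mean(daily_avgs) if daily_avgs else None


# Hypothetical data: three days, the second one without any reply.
days = [[100.0, 120.0, 140.0], [], [200.0, 220.0]]
print(daily_summary(days[0]))
print(monthly_qna(days))
```

Averaging the daily averages, rather than pooling all individual RTTs of the month, gives every day the same weight regardless of how many replies arrived on that day.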


Results

With the Dutch node as originating node, data have been collected since November 1994 and QNA numbers have been calculated for all EMBnet nodes since then. For verification, data sets from Spain and Norway were also collected for several months in 1995 and 1996. The daily data for each month are graphically accessible from a QNA request (hypertext link at the bottom of this document), in gnuplot graphs as shown in the following picture fragment:


Fragment of a gnuplot

The graph shows three quantities. The first is the minimum, average and maximum round-trip time as measured on a specific day in five sets of five ping requests; round-trip times are given in seconds. The second is the relative packet loss, i.e. the number of `lost' packets divided by the number of packets sent, ranging from 0.0 to 1.0 (100% packet loss). Blue bars of 100% indicate that the remote node was down at the time of the measurement; in that case, obviously, no RTTs are reported. Packet-loss statistics can be used as a measure of node availability. The third is the average round-trip time level (QNA) in a given month, in seconds.
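
As a small worked example of the relative packet loss defined above, the following Python sketch computes the ratio for one day of measurements. The function names are hypothetical, and the total of 25 packets simply follows from five sets of five requests.

```python
def relative_packet_loss(sent, received):
    """Number of lost packets divided by the number of packets sent (0.0 to 1.0)."""
    if sent <= 0:
        raise ValueError("at least one packet must have been sent")
    return (sent - received) / sent


def node_unreachable(sent, received):
    """True when every packet was lost, i.e. the 100% bar in the graph."""
    return relative_packet_loss(sent, received) == 1.0


# Five sets of five requests give 25 packets per day.
print(relative_packet_loss(25, 20))  # 5 of 25 packets lost
print(node_unreachable(25, 0))       # no reply at all
```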

To interpret these data correctly, one should be aware that RTT delays are due to both network congestion and destination occupancy. Since both are generally strongly dependent on the time of day, RTT data show a time-of-day dependence, as is exemplified in the picture below. This example shows measurements to a German destination from the Netherlands. These daily fluctuations are visible only as the difference between minimum and maximum RTT in the representation of the daily measurements, and disappear completely after averaging to monthly QNA numbers.


Plot of round-trip times versus the time of the day
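
The time-of-day dependence can be made visible by grouping the measurements by the hour at which they were taken instead of by day. The Python sketch below illustrates this under stated assumptions: the function name, the `(hour, rtt)` input format and the sample values are hypothetical and are not taken from the project data.

```python
from collections import defaultdict
from statistics import mean


def mean_rtt_by_slot(samples):
    """Group (hour, rtt_ms) samples by measurement hour and average each group."""
    by_slot = defaultdict(list)
    for hour, rtt_ms in samples:
        by_slot[hour].append(rtt_ms)
    return {hour: mean(values) for hour, values in sorted(by_slot.items())}


# Hypothetical samples: quiet at 01:00, congested at 09:00.
samples = [(1, 40.0), (9, 150.0), (1, 60.0), (9, 250.0)]
print(mean_rtt_by_slot(samples))
```

A single average over all samples would hide the difference between the two time slots, which is the effect described above for the monthly QNA numbers.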

It must also be emphasised that the ping data collected in this project do not give an indication of specific bottlenecks along the data paths. We only try to collect and present factual data on the accessibility of the different EMBnet nodes, in order to determine whether network problems hamper node services. Because these measurements have been performed over longer periods (some years), improvement or deterioration for specific nodes can be traced. More detailed measurements of this kind would require knowledge of the network and routing topology, which would result in a project far beyond the scope of the current one. However, to help track down some extreme delays for specific nodes, traceroute measurements have been performed occasionally.


Conclusions

In 1995, a Network Usage and Quality Advisory Group of the Dutch network organization SURFnet defined "an upper RTT limit of 125 msec. without packet loss" as a minimum QoS level for interactive on-line work. For data transport only, a slightly less stringent criterion could be used. If one accepts this value of 125 msec. as an upper QNA limit for on-line work, the network performance results collected so far in this project are rather disappointing.
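
A check of a monthly QNA value against this criterion could look like the Python sketch below. It is an illustration only: the names are hypothetical, the QNA is assumed to be expressed in milliseconds, and "upper limit" is read as "less than or equal to".

```python
RTT_LIMIT_MS = 125.0  # upper RTT limit quoted above for interactive on-line work


def meets_interactive_qos(qna_ms, packet_loss):
    """True when the QNA is within the limit and no packets were lost.

    qna_ms may be None when no replies were received at all.
    """
    return qna_ms is not None and qna_ms <= RTT_LIMIT_MS and packet_loss == 0.0


print(meets_interactive_qos(110.0, 0.0))  # within the limit, no loss
print(meets_interactive_qos(110.0, 0.2))  # fails because of packet loss
print(meets_interactive_qos(300.0, 0.0))  # fails because of the RTT
```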

As a reader and/or EMBnet node manager you are invited to draw your own conclusions about the accessibility of your own node.

This document is intended to be permanently usable as a monitor of network performance to or from any specific EMBnet node. Therefore we, the Dutch node, are ready to expand the number of nodes taking part in the measurements. A copy of the monitoring protocol is available on request. Data resulting from these measurements will be sent to the Dutch node, where they will be added automatically to the files which can be queried from this publication. For further information and a local copy of the monitoring protocol, contact Koen Cuelenaere, CAOS/CAMM Center, Nijmegen.


Acknowledgement

We thank Hans Engelkamp and Wim Janssen of the CAOS/CAMM Center for their help in the development of the monitoring utility and of the data collection and data processing code. The Network Performance Monitoring Project was partially funded by the European Commission under grant ERBBIO4-CT96-0030.

