Parallel Tools Consortium Projects
Unfortunately, there are few mechanisms available for isolating message passing bugs. Those that do exist describe the status of an application's message operations in somewhat cryptic output. Many users simply add print statements to their application to create a program trace, which they then analyze to work out what happened.
The Ptools Message Queue Manager (MQM) project was formed in response to this need. Its goal is a parallel debugging interface for examining the state of an application's message passing. The interface is general enough to support a variety of hardware platforms and message passing systems, yet specific enough to give the programmer the information needed to find message passing bugs easily.
Main display of MQM
The main display shows the number of pending message
operations for each process.
Note that the controls are very simple: only two windows
and a few buttons. The display is scalable, since it grows as the
number of processes increases (256 are shown here).
The user can view the status of these operations from
many perspectives, using any combination of the following:
Supplementary displays allow the user to focus on a subset of the processes or
view the specific message operations associated with a particular process.
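
The relationship between the main display and the supplementary displays can be illustrated with a small Python sketch. It is not MQM code: the `PendingOp` record and its fields, and the helper names `pending_counts`, `ops_for_process`, and `focus`, are assumptions made for illustration only. The sketch shows a per-process count of pending operations, a per-process detail view, and a restriction to a subset of processes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingOp:
    """One outstanding message operation (hypothetical record layout)."""
    process: int  # rank of the process that posted the operation
    kind: str     # "send" or "recv"
    peer: int     # rank of the other endpoint
    tag: int      # message tag


def pending_counts(ops, num_processes):
    """Number of pending operations per process, as a main display would show."""
    counts = Counter(op.process for op in ops)
    return [counts[rank] for rank in range(num_processes)]


def ops_for_process(ops, rank):
    """Detail view: the pending operations posted by one process."""
    return [op for op in ops if op.process == rank]


def focus(ops, ranks):
    """Restrict the view to a subset of processes."""
    wanted = set(ranks)
    return [op for op in ops if op.process in wanted]


# Made-up sample data: processes 0 and 1 each wait to receive from the other.
ops = [
    PendingOp(process=0, kind="recv", peer=1, tag=7),
    PendingOp(process=1, kind="recv", peer=0, tag=7),
    PendingOp(process=1, kind="send", peer=2, tag=3),
]
print(pending_counts(ops, 4))  # [1, 2, 0, 0]
```

In this made-up data, processes 0 and 1 each have a receive pending on the other, which is the kind of pattern a per-process view of pending operations is meant to expose.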
Filter control window
The Query Manager is designed to interface with a group of daemons
in a cluster system, with the run-time system of a parallel machine, or with a
debugger or monitor that provides adequate functionality. Implementing
a query manager that interfaces to the IPD debugger on the Intel Paragon
was straightforward and took only a few days.
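
One way to picture this separation is a query manager that depends only on a narrow backend interface, so that daemons, a run-time system, or a debugger can each sit behind it. The Python sketch below is an assumption-laden illustration, not the MQM design: `QueryBackend`, `InMemoryBackend`, `QueryManager`, and all method names are hypothetical, and the in-memory backend merely stands in for a real data source.

```python
from abc import ABC, abstractmethod


class QueryBackend(ABC):
    """Hypothetical source of message-queue state (daemons, run-time, debugger)."""

    @abstractmethod
    def process_count(self):
        """Return the number of processes in the application."""

    @abstractmethod
    def pending_operations(self, rank):
        """Return descriptions of the pending operations of one process."""


class InMemoryBackend(QueryBackend):
    """Stand-in backend; a real one would query the target system."""

    def __init__(self, queues):
        self._queues = queues  # list indexed by rank, each a list of strings

    def process_count(self):
        return len(self._queues)

    def pending_operations(self, rank):
        return list(self._queues[rank])


class QueryManager:
    """Answers display queries using whichever backend it was given."""

    def __init__(self, backend):
        self._backend = backend

    def summary(self):
        """Map each rank to its number of pending operations."""
        return {
            rank: len(self._backend.pending_operations(rank))
            for rank in range(self._backend.process_count())
        }


manager = QueryManager(InMemoryBackend([["recv from 1"], [], ["send to 0", "recv from 0"]]))
print(manager.summary())  # {0: 1, 1: 0, 2: 2}
```

Under this arrangement, supporting a new platform means writing one new backend class, while the display code stays unchanged.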
Structure of MQM
The GUI is implemented as a set of Tcl/Tk scripts that are executed by an embedded Tcl interpreter. These scripts can be configured so that features not supported by a particular target platform can be removed from the interface.
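
The configuration idea can be sketched as filtering a list of interface features against what a target platform supports. The example is written in Python rather than Tcl, and the feature names and the `enabled_features` helper are hypothetical; the actual MQM scripts may organize this differently.

```python
# Hypothetical interface features, listed in display order.
ALL_FEATURES = ("process_grid", "filter_window", "operation_detail")


def enabled_features(supported):
    """Keep features in display order, dropping those the platform lacks."""
    return [name for name in ALL_FEATURES if name in supported]


# Assumed example: a platform that cannot supply data for the filter window.
platform_supports = {"process_grid", "operation_detail"}
print(enabled_features(platform_supports))  # ['process_grid', 'operation_detail']
```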
If you are a parallel computer user, you should be aware that significant user input helped to guide every phase of the design of MQM. Although this phase of the project has come to a close, we are still interested in hearing your views on how the design might be improved.
If you work for a parallel computer vendor, we need your help in providing the users with an interface for examining the status of an application's message operations on your company's platform(s). Ptools is willing to make the interface between MQM and your system available "anonymously" to the general community, if your company prefers not to assume responsibility for the accuracy or longevity of the mechanism.