Development and Evaluation of a Computational Aid to Facilitate Flightdeck Agenda Management

Development and Evaluation of an Aid to Facilitate Agenda Management

Ken Funk, Woo Chang Cha,

Robert Wilson, and Joachim Zaspel

Department of Industrial and Manufacturing Engineering

118 Covell Hall, Oregon State University

Corvallis, OR 97331-2407, USA

541-737-2357; funkk@engr.orst.edu

Rolf Braune

7250 Old Redmond Road, #140

Redmond, WA 98052, USA

204-885-1943; 71611.1126@compuserve.com

Commercial air transportation has an admirable safety record, yet each year hundreds of lives and hundreds of millions of dollars worth of property are lost in air crashes in the United States alone. About two-thirds of these aircraft accidents are caused, in part, by pilot error. Many of these errors are errors in performing flightdeck (or cockpit) functions; others are errors in managing flightdeck goals and the functions to achieve those goals. This paper describes the development and evaluation of a prototype computational aid to facilitate the management of flightdeck goals and functions.

Background: Agenda Management

The concept of Agenda Management is an extension of a theory of Cockpit Task Management proposed by Funk [5]. Informally speaking, an agenda is a list of things to be done. So, managing a flightdeck agenda can be described informally as managing the intentions of the flightcrew and flightdeck automation and managing their activities to fulfill those intentions.

More formally, Agenda Management is described in terms of actors, goals, functions, and resources. An actor is an entity that does something in that it can control or change the state of the aircraft and/or its subsystems. Pilots are human actors; machine actors include autoflight and flight management systems. A goal is a representation (mental, electronic, or even mechanical) of an actor's intent to change the state of the aircraft or one of its subsystems in some significant way, or to maintain or keep the aircraft or one of its subsystems in some state. For example, a pilot might have a goal to descend to an altitude of 9,000 ft, a goal to maintain the current heading of 270°, and a goal to crossfeed fuel to correct a fuel system imbalance. If configured properly, the autoflight system in this example would also have a goal to descend to 9,000 ft and a goal to hold 270°. Goals come about as a result of planning and decision making in the case of human actors, and computation or human input in the case of machine actors.

A function is an activity performed by an actor to achieve a goal. That activity may directly achieve the goal or it may produce sub-goals which, when achieved by performing sub-functions, satisfy the conditions of the original goal. Actors use resources to perform functions. Human actor resources include eyes, hands, memory, and attention; machine actor resources include input and output channels, memory, and processor cycles. Other machine resources include flight controls, electronic flight instrument system displays, and radios. In general, several goals might exist at any time, so several functions must be performed concurrently to achieve them. Actors must be assigned to perform those functions and resources must be allocated to enable them. An agenda then is a set of goals to be achieved and a set of functions to achieve those goals.

Agenda Management (AMgt) is a high-level flightdeck function performed cooperatively by flightdeck actors which involves two sub-functions:

Goal management is the process of

recognizing or inferring the goals of all flightdeck actors;
canceling goals that have been achieved or are no longer relevant;
identifying and resolving conflicts between goals; and
prioritizing goals consistently with safe and effective aircraft operation.

Function management is the process of

initiating functions to achieve goals;
assigning actors to perform functions;
assessing the status of each function (whether or not it is being performed satisfactorily and on time);
prioritizing those functions based on goal priority and function status; and
allocating resources to be used to perform functions based on function priority.

At any point in time, AMgt performance is satisfactory if and only if:

there are no goal conflicts;
all goals and functions are properly prioritized; and

either performance of all functions is satisfactory or, if that is not possible, actors are actively engaged in bringing the highest priority unsatisfactory functions up to a satisfactory level of performance.
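The satisfaction condition above can be read as a simple predicate over the agenda. The following Python sketch is only an illustration of that reading and is not the AMgr implementation (which was written in Smalltalk). The class names, the integer priority scale, and the `being_attended` flag are hypothetical, as is the interpretation that every unsatisfactory function at the highest priority level must be receiving attention.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    priority: int                 # larger means more important (hypothetical scale)
    satisfactory: bool            # performed satisfactorily and on time?
    being_attended: bool = False  # is an actor actively working to improve it?


@dataclass
class Agenda:
    functions: list
    goal_conflicts: int = 0
    properly_prioritized: bool = True


def amgt_satisfactory(agenda):
    """Return True if the agenda meets the three conditions in the text."""
    # Conditions 1 and 2: no goal conflicts, and proper prioritization.
    if agenda.goal_conflicts > 0 or not agenda.properly_prioritized:
        return False
    # Condition 3, first branch: every function is satisfactory.
    unsatisfactory = [f for f in agenda.functions if not f.satisfactory]
    if not unsatisfactory:
        return True
    # Condition 3, second branch (assumed reading): all unsatisfactory
    # functions at the highest priority level are being attended to.
    top = max(f.priority for f in unsatisfactory)
    return all(f.being_attended for f in unsatisfactory if f.priority == top)
```

For example, an agenda in which an unsatisfactory speed function (priority 3) is being worked on while an unsatisfactory fuel crossfeed function (priority 1) waits would count as satisfactory under this reading; the same agenda with a goal conflict would not.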

In an earlier study that considered only the management of functions performed by human actors (that is, task management [4]) we found strong evidence of function prioritization errors in 24 (7%) of 324 aircraft accidents investigated by the National Transportation Safety Board and 133 (28%) of 470 aircraft incidents reported to the Aviation Safety Reporting System. Two recent aircraft accidents illustrate human actor vs. machine actor goal conflicts. In 1994, the flightcrew of a China Airlines Airbus A300 on approach to Nagoya, Japan, inadvertently initiated an autoflight system go-around maneuver while trying to continue the landing [2]. The goal conflict between the flightcrew and the autoflight system caused an out-of-trim condition that resulted in a stall and crash which killed 264 persons. In 1995, the flightcrew of an American Airlines Boeing 757 on approach to Cali, Colombia, accepted an air traffic control clearance direct to a designated navigational fix [1]. They inadvertently configured the aircraft's flight management system to fly the airplane to a different fix. This goal conflict was not detected in time to prevent the aircraft from crashing into mountainous terrain, killing 159 persons.

Objectives

From these preliminary findings we have concluded that AMgt -- and specifically the failure to perform AMgt satisfactorily -- is a significant factor in flight safety. The objectives of our research were to develop and evaluate an experimental computational aid to facilitate AMgt. We call this aid the AgendaManager (AMgr).

The AgendaManager

Simulator Environment

Our part-task simulator models a generic, twin-engine transport aircraft. It is built from components developed at the NASA Langley and NASA Ames Research Centers and in our own lab. It runs on one or two Silicon Graphics Indigo 2 computers and provides a simplified aerodynamic model (Langley), autoflight system (Langley), Flight Management System (Langley), primary flight displays (Ames), Mode Control Panel (Ames), and system models and system synoptic displays (OSU). The software is written in C, FORTRAN, and Smalltalk (VisualWorks 2.5).

Analysis and Design

As a first step in designing the AgendaManager, we developed a formal, functional model of Agenda Management using IDEF0, a graphical modeling methodology useful for representing and decomposing complex activities. IDEF0 helps the analyst represent activities, inputs and outputs to and from those activities, controls or constraints on the activities, and mechanisms which perform the activities. From the IDEF0 model we generated a data dictionary consisting of the entities that are the inputs, outputs, and controls of the activities in the model. We used these to define the object-oriented architecture of the AMgr.

AMgr Architecture

Major AMgr objects include System Agents, Actor Agents, Goal Agents, Function Agents, an Agenda Agent, and an Agenda Manager Interface. Each agent is a simple knowledge-based object representing the corresponding elements of the cockpit environment. As a representative of such an element, the Agent's purpose is to maintain timely information about it and to perform processing that will facilitate AMgt. An Agent's declarative knowledge is represented using instance variables. Its procedural knowledge is represented using Smalltalk methods. The categories of Agents are described below and the overall architecture is illustrated in Figure 1.

The purpose of a System Agent (SA) is to help the pilot (and the AMgr itself) maintain situational awareness. Each SA represents a system in the simulated environment, such as the aircraft, the fuel system, or even a pilot, and receives information from that system via an inter-process connection called a socket. An SA's declarative knowledge includes the past, current, and projected future state of the corresponding system. Its procedural knowledge includes how to project future state and how to recognize system abnormalities. This means that an SA maintains not only current and past system state information, but can also be called upon by other agents (see below) to project future state information in order to anticipate future events. It can also recognize system faults and instantiate Goal Agents (see below) for goals to correct them.
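To make the System Agent's two kinds of procedural knowledge concrete, the following Python sketch shows one way such an agent could be organized. It is an assumption-laden illustration, not the AMgr's Smalltalk code: the class layout, the linear extrapolation used for projection, and the fixed low/high limits used to flag abnormalities are all hypothetical, and the socket connection to the simulator is replaced by direct calls to `update`.

```python
class SystemAgent:
    """Tracks one state variable of a simulated system (illustrative only)."""

    def __init__(self, name, low=None, high=None):
        self.name = name
        self.low = low      # hypothetical lower limit of the normal range
        self.high = high    # hypothetical upper limit of the normal range
        self.history = []   # past and current state as (time_s, value) samples

    def update(self, time_s, value):
        """Record a new sample (stands in for data arriving over the socket)."""
        self.history.append((time_s, value))

    def current(self):
        return self.history[-1][1]

    def project(self, future_time_s):
        """Project future state by linear extrapolation (an assumed method)."""
        if len(self.history) < 2:
            return self.current()
        (t0, v0), (t1, v1) = self.history[-2], self.history[-1]
        rate = (v1 - v0) / (t1 - t0)
        return v1 + rate * (future_time_s - t1)

    def is_abnormal(self):
        """Recognize an abnormality as a value outside the normal range."""
        value = self.current()
        too_low = self.low is not None and value < self.low
        too_high = self.high is not None and value > self.high
        return too_low or too_high


altitude = SystemAgent("altitude_ft")
altitude.update(0, 12000)
altitude.update(10, 11800)          # descending at 20 ft/s
print(altitude.project(60))         # projected altitude 50 s after the last sample
```

In this sketch a Function Agent could call `project` to anticipate whether a goal will be reached, and a true result from `is_abnormal` would be the trigger for instantiating a corrective Goal Agent.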

Actor Agents (AAs) recognize actors' goals, implicitly and explicitly, and make them known to the rest of the AMgr. An AA represents an actor, such as a pilot or an automation device. As declarative knowledge, each AA maintains information about the current state of the corresponding actor, including his/her/its agenda. AA procedural knowledge covers how to obtain state information.

A very important AA is the Flightcrew (or pilot) Agent. The Flightcrew Agent has a serial connection to a Verbex automatic speech recognition (ASR) system. This allows the pilot to declare his/her goals explicitly by short vocal utterances. The intent is to be able to recognize pilot goals primarily by monitoring air traffic control (ATC) clearance acknowledgements. That is, when a pilot acknowledges ATC clearances, he/she typically repeats the clearance back to the controller. The Flightcrew Agent, using the Verbex system, interprets these as pilot goals for the control of the aircraft. For example, heading, altitude, airspeed, and waypoint goals are declared as the pilot verbally acknowledges ATC clearances by repeating them back to the controller (the experimenter, in our study). The Verbex system "eavesdrops" on the pilot and sends a coded form of the utterance to the Flightcrew Agent which translates it and declares a goal by creating an instance of a Goal Agent.
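The translation step from a clearance readback to declared goals can be sketched as follows. The paper does not describe the coded form that the Verbex system sends, so this Python example assumes, purely for illustration, that the utterance arrives as plain text; the phrase patterns and the goal names (`altitude_ft`, `heading_deg`, `speed_kt`) are hypothetical.

```python
import re

# Hypothetical phrase patterns for three kinds of aviate goals.
PATTERNS = [
    ("altitude_ft", re.compile(r"\b(?:climb|descend)(?: and maintain| to)? (\d+)\b")),
    ("heading_deg", re.compile(r"\bheading (\d{3})\b")),
    ("speed_kt", re.compile(r"\bspeed (\d+)\b")),
]


def goals_from_readback(utterance):
    """Extract goal targets from a pilot's clearance acknowledgement."""
    text = utterance.lower().replace(",", "")
    goals = {}
    for name, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            goals[name] = int(match.group(1))
    return goals


readback = "Reduce speed 240, maintain heading 070, descend and maintain 9,000"
print(goals_from_readback(readback))
```

Each entry in the returned dictionary corresponds to one goal for which the Flightcrew Agent would create a Goal Agent instance.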

The purpose of Goal Agents (GAs) is to maintain information about all actors' goals. A GA represents an actor's goal, such as one to descend to and maintain an altitude of 9,000 ft or one to crossfeed fuel from one fuel tank to another to correct an imbalance. A GA has declarative knowledge about the state of the goal to be achieved (pending, active, or terminated) and whether or not it is achieved. A GA's procedural knowledge includes how to determine if the goal is achieved and how to determine whether or not its goal is consistent with the goals of other GAs. Each GA is associated with one Function Agent.
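A Goal Agent's two procedures, checking achievement and checking consistency with other goals, might look like the following in outline. This Python sketch is illustrative and makes several assumptions that are not in the original: goals are reduced to a numeric target for a named state variable, achievement means being within a tolerance of the target, and two active goals conflict when different actors hold targets for the same variable that differ by more than that tolerance.

```python
from dataclasses import dataclass


@dataclass
class GoalAgent:
    actor: str            # e.g. "pilot" or "autoflight"
    variable: str         # state variable the goal refers to
    target: float
    tolerance: float      # hypothetical acceptance band around the target
    state: str = "active"  # pending, active, or terminated

    def is_achieved(self, current_value):
        """The goal is achieved when the value is within tolerance of the target."""
        return abs(current_value - self.target) <= self.tolerance

    def conflicts_with(self, other):
        """Assumed rule: active goals of different actors disagree on one variable."""
        return (
            self.state == "active"
            and other.state == "active"
            and self.actor != other.actor
            and self.variable == other.variable
            and abs(self.target - other.target) > self.tolerance
        )


pilot_alt = GoalAgent("pilot", "altitude_ft", 9000, 100)
auto_alt = GoalAgent("autoflight", "altitude_ft", 8000, 100)
print(pilot_alt.conflicts_with(auto_alt))   # the two altitude targets disagree
```

The example pair mirrors the kind of pilot versus autoflight altitude disagreement that the AMgr is intended to detect.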

The purpose of a Function Agent (FA) is to monitor whether its goal is being pursued in a correct and timely manner. An FA represents a function, which is an activity performed to achieve a goal. Each FA has declarative knowledge about the state of its function (pending, active, or terminated, like the goal) and the status of its function (how well the function is being performed and whether or not goal achievement is likely). FA procedural knowledge includes how to assess function state and status and how to assess goal and function priority based on prevailing conditions. FAs not only assess the current status of functions, but also use the prediction capabilities of SAs to project future function status.
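As an illustration of status assessment that uses projection, the Python sketch below classifies a function as satisfactory when its goal variable is already within tolerance, or is moving toward the target fast enough to arrive within an allowed time. The rule, the parameter names, and the two status labels are assumptions made for this example; the actual assessment logic of the AMgr's Function Agents is not given in the paper.

```python
def assess_function(current, rate, target, tolerance, time_allowed_s):
    """Return 'satisfactory' or 'unsatisfactory' for one function.

    current, target: present and desired values of the goal variable
    rate:            present rate of change (units per second)
    tolerance:       hypothetical band within which the goal counts as met
    time_allowed_s:  hypothetical time limit for reaching the target
    """
    error = target - current
    if abs(error) <= tolerance:
        return "satisfactory"
    # Not changing, or moving away from the target.
    if rate == 0 or (error > 0) != (rate > 0):
        return "unsatisfactory"
    # Project the time needed at the current rate (linear assumption).
    time_needed_s = abs(error) / abs(rate)
    return "satisfactory" if time_needed_s <= time_allowed_s else "unsatisfactory"


# Speed too high and not decreasing.
print(assess_function(current=280, rate=0, target=240, tolerance=5, time_allowed_s=120))
# Descending toward 9,000 ft at 25 ft/s.
print(assess_function(current=12000, rate=-25, target=9000, tolerance=100, time_allowed_s=180))
```

The first call reports an unsatisfactory status because no progress is being made; the second reports a satisfactory one because the projected arrival time is within the assumed limit.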

The single Agenda Agent is the executive Agent which coordinates the activities of all other Agents. Its declarative knowledge consists of the current set of GAs and FAs. Its procedural knowledge includes what to do when a new GA is introduced (e.g., check it against other GAs for compatibility), what to do when a GA changes state (e.g., move it to another part of the Agenda), and how to develop overall priority ratings for the Goal/Function Agents based on importance and urgency.
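One simple way to combine importance and urgency into an overall priority rating is to multiply them and sort. The Python sketch below does exactly that; the product rule, the 1-to-5 scales, and the sample ratings are hypothetical and are not taken from the AMgr, whose actual rating scheme is not specified here.

```python
from dataclasses import dataclass


@dataclass
class AgendaItem:
    goal: str
    importance: int   # hypothetical 1 (low) to 5 (high) rating
    urgency: int      # hypothetical 1 (low) to 5 (high) rating


def prioritize(items):
    """Order agenda items by an assumed importance-times-urgency score."""
    return sorted(items, key=lambda item: item.importance * item.urgency, reverse=True)


agenda = [
    AgendaItem("balance fuel", 2, 2),
    AgendaItem("reduce speed to 240 kt", 4, 4),
    AgendaItem("extinguish engine fire", 5, 5),
    AgendaItem("restore hydraulic pressure", 3, 3),
]
for item in prioritize(agenda):
    print(item.goal)
```

A product is only one candidate; a weighted sum or a lexicographic ordering (importance first, urgency as a tie-breaker) would fit the same description.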

Operation

As the simulator runs it sends state data to the AMgr, whose SAs maintain a situation model of the simulated aircraft and its environment. AAs monitor real or simulated actors, detect or infer goals, and instantiate GAs. GAs look for conflicts with each other and monitor SAs to see if the goals are achieved. FAs monitor the progress -- if any -- made in achieving their associated goals. The Agenda Agent prioritizes GAs and FAs and keeps track of goal conflicts. The AgendaManager Interface presents this agenda information to the pilot.

Pilot Interface

The AgendaManager Interface (AMI) consists of display formats for presenting agenda information to the pilot. It is illustrated in Figure 2, which shows what the pilot would see in the possible (but hopefully, very unlikely) situation depicted in the diagram in Figure 1. Each line on the AMI is a message concerning a GA and FA pair, consisting of the name of the goal and a status comment if a problem exists or is anticipated.

In the situation underlying both figures, the Fuel System Agent has detected an out-of-balance condition between the left and right fuel tanks and has instantiated a GA for the goal to remedy it, and the pilot has correctly begun crossfeeding fuel. The corresponding FA has determined that this function is being performed satisfactorily, but will require attention later to terminate fuel crossfeeding, so the AMgr message for it is white, which denotes a satisfactory status.

The pilot has received an air traffic control clearance to reduce speed to 240 knots (kt), maintain the present heading of 070 degrees, and descend to an altitude of 9,000 ft. He/she has verbally acknowledged this clearance and the AMgr has recognized these aviate (aircraft control) goals and instantiated GAs and FAs. Speed is currently too high and is not decreasing, so the AMgr speed message is amber and its comment notes the problem. The airplane's current heading is 070 degrees, so the AMgr's message for this is gray, with no explanatory comments, so as not to distract.

Although the aircraft is correctly descending towards 9,000 ft, the pilot has inadvertently set the autoflight system to descend to 8,000 ft. This goal conflict has been detected by the two GAs and is signalled by an amber-colored message.

Two other system faults have occurred. There is a fire in the left engine and the pressure in the center hydraulic subsystem has dropped below an acceptable level, and corresponding SAs have detected them and instantiated GAs for goals to correct them. As the engine fire condition is critical, its message is displayed in red at the very top of the display. The hydraulic system fault is intermediate in priority between the flight control goals and the fuel balance goal, so it is displayed in amber between them.
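The color coding in this example can be summarized as a small decision rule. The Python sketch below encodes one reading of the description: red for critical conditions, amber for goal conflicts and unsatisfactory functions, white for functions being performed satisfactorily, and gray for goals that are already met. The function names, the status labels, and the numeric priorities are hypothetical and serve only to reproduce the ordering described in the text.

```python
def message_color(critical, conflict, status):
    """Map a goal/function pair to a display color (one reading of the text).

    status is 'unsatisfactory', 'satisfactory', or 'achieved' (assumed labels).
    """
    if critical:
        return "red"
    if conflict or status == "unsatisfactory":
        return "amber"
    if status == "satisfactory":
        return "white"
    return "gray"


def render(messages):
    """Return (goal, color) pairs ordered from highest to lowest priority."""
    ordered = sorted(messages, key=lambda m: m["priority"], reverse=True)
    return [
        (m["goal"], message_color(m["critical"], m["conflict"], m["status"]))
        for m in ordered
    ]


# Priorities are illustrative values chosen to match the described ordering.
situation = [
    {"goal": "balance fuel", "priority": 1, "critical": False, "conflict": False, "status": "satisfactory"},
    {"goal": "engine fire", "priority": 9, "critical": True, "conflict": False, "status": "unsatisfactory"},
    {"goal": "hydraulic pressure", "priority": 3, "critical": False, "conflict": False, "status": "unsatisfactory"},
    {"goal": "speed 240 kt", "priority": 6, "critical": False, "conflict": False, "status": "unsatisfactory"},
    {"goal": "altitude 9,000 ft", "priority": 5, "critical": False, "conflict": True, "status": "satisfactory"},
    {"goal": "heading 070", "priority": 4, "critical": False, "conflict": False, "status": "achieved"},
]
for goal, color in render(situation):
    print(f"{color:6s} {goal}")
```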

AgendaManager Evaluation

Objective

The purpose of the experiment was to determine any differences in AMgt performance between the use of the AMgr and the use of a model (developed in our lab) of a conventional monitoring and alerting system called the Engine Indication and Crew Alerting System (EICAS).

Method

A total of ten airline pilots participated in the experiment, with the first two being used to refine the scenarios and identify and correct problems with software and procedures.

Prior to the experiment each subject was given a brief introduction to the study, filled out a pre-experiment questionnaire, and read and signed an informed consent document. The following forty minutes were used to train the Verbex speech recognition system to recognize the subject's voice so that altitude, speed, and heading goals could be determined from ATC clearance acknowledgements. After a short break the subject learned how to fly the flight simulator using the Mode Control Panel (MCP -- the autoflight system interface), recognize and correct experimenter-induced goal conflicts and subsystem faults, interpret EICAS and AMgr displays, and alter programmed flightpaths. After a lunch break, the subject flew two 30-minute scenarios (one with EICAS, one with the AMgr), separated by a five-minute break. Upon completion of the experiment the subject answered a post-experiment questionnaire.

The primary factor investigated in the experiment was monitoring and alerting system condition (whether AMgr or EICAS was used). The experimental design was balanced in regard to the monitoring and alerting system used and the scenario (1 or 2).
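A balanced design of this kind can be generated by crossing the two possible system orders with the two possible scenario orders and cycling subjects through the resulting four combinations. The Python sketch below illustrates the idea; it assumes eight data-collection subjects (ten participants less the two used for refinement) and is not a record of the assignment actually used in the experiment.

```python
from itertools import product


def balanced_assignments(n_subjects):
    """Assign each subject two (system, scenario) runs in a balanced way."""
    system_orders = [("AMgr", "EICAS"), ("EICAS", "AMgr")]
    scenario_orders = [(1, 2), (2, 1)]
    cells = list(product(system_orders, scenario_orders))  # four combinations
    plan = []
    for subject in range(n_subjects):
        systems, scenarios = cells[subject % len(cells)]
        plan.append(list(zip(systems, scenarios)))
    return plan


for subject, runs in enumerate(balanced_assignments(8), start=1):
    print(subject, runs)
```

With eight subjects, each system is flown first by four subjects and each system is paired with each scenario four times, so neither order nor scenario is confounded with the monitoring and alerting system.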

We collected data for each subject on:

how correctly the subject prioritized within concurrent subsystem functions;
the average subsystem fault correction time;
the average time to properly program the autoflight system;
the percentage of goal conflicts detected and corrected;
the average time to resolve goal conflicts;
how correctly the subject prioritized concurrent subsystem and aviate functions;
the average number of unsatisfactory functions at any time;
the percentage of time all functions were satisfactory; and
the subject's rating of the effectiveness of each monitoring and alerting system: -5 (great hindrance) to +5 (great help).

Results

Table I shows the results obtained for each of these variables. The first three, within subsystem correct prioritization, subsystem fault correction time, and autoflight programming time, show no significant statistical differences (p-values > 0.05) across the AMgr/EICAS conditions. This is critical for the interpretation of the results in that it supports the hypothesis of the AMgr being the only cause of significant differences. For example, within subsystem prioritization performance does not differ between the two conditions. Also, once a subsystem fault is detected, the process of correcting it is identical between the two conditions. Programming the autoflight system is identical in both conditions. However, we did observe a minor practice effect for each subject between the two scenarios, i.e., they showed significant improvement in programming the autoflight system.

A key objective of the AMgr is to support the pilot in recognizing goal conflicts and to help resolve them in a timely manner. The next two variables, goal conflicts corrected percentage and goal conflict resolution time, directly reflect this objective, and the results favor the AMgr, although both differences fall just short of significance at the 0.05 level (p = .0572 and p = .0821; see Table I). With the AMgr, subjects identified and corrected every goal conflict (100%), whereas with EICAS they corrected only 70%. Subjects also resolved conflicts nearly 19 seconds faster with the AMgr (34.7 s vs. 53.6 s). This may have helped them achieve a lower average number of unsatisfactory functions (AMgr: 0.64; EICAS: 0.85) by making more time available to them.

It is crucial for the pilot to recognize that primary flight control functions (i.e., aviate functions) are usually more critical than subsystem-related functions. With the AMgr, pilots prioritized correctly in 72% of the cases; with EICAS they did so in only 46%. Last, but not least, with the AMgr the subjects achieved a significantly higher percentage of time in which all functions were performed satisfactorily (AMgr: 65%; EICAS: 52%).

Independent of how well an individual can perform under a given condition, it is also important that he or she subjectively finds this condition acceptable. The subjects' effectiveness ratings strongly favor the AMgr (4.8 vs. 2.5).

Discussion

The results of our investigation suggest that the concept of the AgendaManager can have a very significant impact on flight crew performance, helping them to successfully manage goals, functions, and resources. In this respect, the AMgr is a software tool with the potential to significantly reduce the probability of undetected flight crew errors. It directly builds on the success of existing crew monitoring and alerting systems (such as EICAS) by including pilot intent logic [6]. Given the industry's objective of significantly reducing the number of commercial transport accidents, the AMgr must be seen as one of the facilitating tools in this effort.

Further Research

Based on our results, we believe that there are several research paths to be explored. For example, the AMgr should be evaluated in more realistic scenarios in a full-mission simulator. This is necessary to be sure that the effects that we saw in this evaluation were not merely artifacts of the simplified part-task environment.

During AMgr development, we experimented with a goal communication method that integrated overt communication (via clearance acknowledgement) and covert communication (via script-based intent inferencing) [3]. Although we chose to include only overt goal communication in the current version of the AMgr, covert methods offer the potential of low pilot workload and should be further investigated.

An enhancement we are currently exploring is Fuzzy Function Agents (FFAs). Function Agents in the current version of the AMgr use conventional (crisp) logic to assess how well functions are being performed. In some cases (for example, aviate functions) fuzzy logic may be more appropriate, so we are developing FFAs to provide more human-like function assessments. Through interviews with pilots we extracted fuzzy if-then rules to model human function assessment. Then we fine-tuned the rules with the application of a genetic algorithm which minimized the discrepancy between human and machine assessments of sample scenarios. Although a preliminary evaluation of the FFAs has revealed performance comparable to that of human pilots, the method needs further development.
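The contrast between crisp and fuzzy assessment can be illustrated with a toy example. In the Python sketch below, a function's degree of satisfaction is a number between 0 and 1 rather than a yes/no value. The membership functions, the single rule ("satisfactory if the deviation is small OR the deviation is closing"), and all parameter values are hypothetical; they are not the rules extracted from pilot interviews, and no genetic-algorithm tuning is shown.

```python
def small(deviation, limit):
    """Membership in 'deviation is small': 1 at zero, falling to 0 at the limit."""
    return max(0.0, 1.0 - abs(deviation) / limit)


def converging(deviation, rate, full_rate):
    """Membership in 'deviation is closing': 1 at or above full_rate."""
    if deviation == 0:
        return 1.0
    # closing is positive when the rate of change is shrinking the deviation
    closing = -rate if deviation > 0 else rate
    return min(1.0, max(0.0, closing / full_rate))


def satisfaction(deviation, rate, limit, full_rate):
    """Assumed rule: small OR converging, with fuzzy OR taken as the maximum."""
    return max(small(deviation, limit), converging(deviation, rate, full_rate))


# Speed 40 kt above target and not decreasing: fully unsatisfactory.
print(satisfaction(deviation=40, rate=0.0, limit=20, full_rate=1.0))
# Speed 10 kt above target and steady: partly satisfactory.
print(satisfaction(deviation=10, rate=0.0, limit=20, full_rate=1.0))
# Speed 40 kt above target but decreasing at 0.5 kt/s: partly satisfactory.
print(satisfaction(deviation=40, rate=-0.5, limit=20, full_rate=1.0))
```

A crisp Function Agent would report the second and third cases as simply satisfactory or unsatisfactory, whereas the graded value leaves room for assessments closer to those a human observer might give.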

Although the AMgr has potential as an operational aid, its near-term benefits may be realized in other ways. For example, with suitable modifications, the AMgr could be embedded in a part-task trainer to facilitate AMgt training. Another possible role is as a research tool. With relatively minor changes the AMgr could be used to capture AMgt data on-line in full-mission simulator experiments. In fact, the greatest value of the AMgr may be in this capacity, helping us understand the phenomenon of Agenda Management better.

Acknowledgements

This research was performed under NASA Ames Research Center grant NAG 2-875. Kevin Corker and Barbara Kanki were our technical monitors. We greatly appreciate their support and encouragement. We also gratefully acknowledge the technical and moral support provided to us throughout the project by Greg Pisanich, of Sterling Software.

References

[1] Aeronautica Civil of the Republic of Colombia, Controlled Flight Into Terrain, American Airlines Flight 965, Boeing 757-223, N651AA, Near Cali, Colombia, December 20, 1995, Santafe de Bogota, DC, Colombia: Aeronautica Civil of the Republic of Colombia, September 1996.

[2] Aircraft Accident Investigation Commission - Ministry of Transport Japan, China Airlines Airbus Industrie A300B4-622R, B1816 Nagoya Airport April 26, 1994, Ministry of Transport, July 19, 1996.

[3] Cha, W.C. & Funk, K., "Communicating Pilot Goals to an Intelligent Cockpit Aiding System," Proceedings of the 1997 IEEE International Conference on Systems, Man and Cybernetics, 12 - 15 October 1997, Orlando, Florida.

[4] Chou, C.D., Madhavan, D., & Funk, K., "Studies of Cockpit Task Management Errors," International Journal of Aviation Psychology, 6(4), 1996, pp. 307-320.

[5] Funk, K., "Cockpit Task Management: Preliminary Definitions, Normative Theory, Error Taxonomy, and Design Recommendations," The International Journal of Aviation Psychology, Vol. 1, No. 4, 1991, pp. 271-285.

[6] Funk, K. & Braune, R., "Expanding the Functionality of Existing Airframe Systems Monitors: The AgendaManager," Proceedings of the Ninth International Symposium on Aviation Psychology, 27 April - 1 May 1997, Columbus, Ohio, in press.

Figure 1. AgendaManager architecture.

Figure 2. AgendaManager interface.

Table I

AgendaManager evaluation results, mean values (all times in seconds).

Response variable	AgendaManager	EICAS	p-value
within subsystem correct prioritization	100%	100%	NA
subsystem fault correction time	19.5	19.6	.9809
autoflight system programming time	7.0	5.9	.1399
goal conflicts corrected percentage	100%	70%	.0572
goal conflict resolution time	34.7	53.6	.0821
subsystem/aviate correct prioritization	72%	46%	.0308
average number of unsatisfactory functions	0.64	0.85	.0466
percentage of time all functions satisfactory	65%	52%	.0254
subject effectiveness rating (-5 to 5)	4.8	2.5	.0006