Problems in the Assessment of Mental Load
This paper indicates the reasons why it is difficult to find a correlation between objective and subjective mental work load.
In physical tasks there is often a clear correlation between measures of the amount of work done by a person and the amount of work achieved in the world.
Many attempts have been made to find a similar correlation between mental workload and performance achieved.
But in most complex real mental tasks there may be several strategies for doing the task, which get the same acceptable result but need different amounts of mental work. So the person doing the task can adapt their strategy according to the task demands and their mental capacity. The result is that there can be a non-monotonic relation between mental workload and task load achieved.
For performance prediction, I suspect there is still a big gap between pragmatic techniques which can be used in practice but only give a ball-park figure, and the sorts of complex considerations which theoreticians would ideally like to see in such a technique.
Topics :
I. Introduction
II. Work capacity and task strategy
A. Strategy and workload
Examples : mental processing as a function of strategy; strategy as a function of task demands.
Discussion : alternative strategies; strategy as a function of task demands; mental processing capacity and strategy; interaction of task, capacity, and strategy; capacity as a function of task demands; stress and strain - complexity of the function, difficulty with independent definitions.
B. Work capacity of the individual :
arousal, motivation, practice and experience, fatigue.
III. Techniques for workload prediction : specifying task demands; mental task operations - prediction, description, validity, types of mental operation; performance parameters - predictions for populations, pragmatic techniques, fast-time simulation; evaluating prediction against required performance.
Problems in the Assessment of Mental Load
Lisanne Bainbridge
Department of Psychology, University of Reading
Le Travail Humain, 1974, 37 (2), 279-302.
I. INTRODUCTION
In this paper we will discuss the problems of assessing how the pressures of mental work affect the behaviour of those doing it.
We will not begin with a brief definition of 'mental load', for reasons which will become obvious. Firstly, it is usual to distinguish between two aspects of mental work load : the task demands imposed on an operator/controller, and the amount of mental work they do in response to these demands. Many writers suggest that these aspects parallel the physical concepts of 'stress' and 'strain'. Actually the situation is more complicated, as four main factors are involved. The amount of work done by a particular individual, and the performance they achieve, will depend on the 'task demands' in relation to their 'work capacity'. These four factors interact in a complex dynamic way, and aspects of this interaction are not adequately described by numerical methods. As a result of this interaction it is difficult to give independent definitions of the terms 'stress' and 'strain' in relation to mental work. The following discussion suggests that the superficial similarity between physics or physical work and mental work is misleading, and it might be better not to use the terms 'stress' and 'strain' for mental work load.
Although we have not got a full understanding of the relation between these four factors, we must still meet the present need for techniques of assessing mental task difficulty. One of the basic problems in studying mental tasks is that the behaviour is not directly observable, and the people themselves are not conscious of much of it. This means that studies of mental work have to be made by using, not the behaviour itself, but synthetic descriptions of what is assumed to underlie the behaviour which can be observed. There are many problems in making these descriptions. We will discuss the limitations which must be accepted in any such techniques at the present time.
II. WORK CAPACITY AND TASK STRATEGY
Two types of factors complicate the relationship between the workload imposed by the task demands and the workload experienced by the individual. The operator may be able to vary the strategy they use to do the job; this may affect how 'difficult' the job is for them. In addition the 'load' from task demands or a particular strategy may affect not only the amount of work done in a particular time but also the individual's basic capacity for doing mental work. We will discuss first the interaction between strategy and workload, and then the factors affecting individual work capacity.
A) Strategy And Workload
Evidence will be quoted that the operator can use alternative methods for doing a task, and these may differ in the amount of mental work needed to give equivalent task achievement. Also, different strategies may be used by an operator at different levels of task demand (amount of task work to be done in a given time). We can suggest that these two findings interact : with increasing levels of task demand each component task is done using a strategy which is more economical in terms of mental work, so that the total task demand is still within the operator's mental work capacity. This interaction has been extensively discussed by Sperandio (1972). It shows that the relation between task demands, workload and mental work or information processing load is not simple but is discontinuous and nonlinear.
We will first give some examples of the way in which mental processing varies with strategy, and strategy varies with task demands. We will then make some general points about the ways in which strategies can vary; we will discuss the interaction between strategy, capacity and workload, and the implications of this interaction for the nature of 'stress' and 'strain'.
1) Examples
a) Mental processing as a function of strategy
Leplat and Bisseret (1965) have done an air-traffic control study in which area controllers had to find conflicts (collision risks) between aircraft. Each aircraft was described by variables such as height, speed, and position, written on a flight-strip. The controllers used two methods: some arranged the flight-strips into groups of aircraft in the same geographical sector, others arranged the strips by flight level. These two methods were compared theoretically, by working out the minimum number of comparisons and decisions necessary to do the task in each case. Grouping by flight level is more efficient, as flight level is a variable which immediately indicates a risk of conflict, and fewer repeated comparisons are needed to check other important variables. The methods were also compared experimentally : the flight level method was faster and there was evidence that future conflicts were foreseen earlier. This suggests that, although all the controllers were experienced and able to control adequately, the flight level strategy required fewer operations, took less time and allowed more sophisticated performance.
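As a rough illustration of this kind of theoretical comparison, the sketch below counts the pairs of aircraft that would need a detailed check when the strips are first grouped by one variable. The strip data, the field layout and the rule that only aircraft sharing a group need to be compared are assumptions invented for this example; they are not Leplat and Bisseret's procedure or data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical flight strips: (callsign, sector, flight_level). Invented for illustration.
STRIPS = [
    ("A1", "north", 310),
    ("A2", "north", 330),
    ("A3", "north", 350),
    ("A4", "south", 310),
    ("A5", "south", 330),
    ("A6", "south", 370),
]

def pairs_to_check(strips, key_index):
    """Count pairs of strips that fall in the same group and so need a detailed comparison."""
    groups = defaultdict(list)
    for strip in strips:
        groups[strip[key_index]].append(strip)
    return sum(len(list(combinations(group, 2))) for group in groups.values())

by_sector = pairs_to_check(STRIPS, 1)   # group on the sector field
by_level = pairs_to_check(STRIPS, 2)    # group on the flight-level field
all_pairs = len(list(combinations(STRIPS, 2)))
print(by_sector, by_level, all_pairs)
```

With these invented strips, grouping by flight level leaves fewer pairs to examine than grouping by sector. The counts depend entirely on the made-up data, so the sketch shows only how such a comparison can be set up, not what the original study found.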
In a task of power supply control to steel-melting furnaces Bainbridge (1972) found three main methods of defining the control error. These involved increasingly complex calculations and assessment, but did not make an observable difference to the adequacy of meeting the main control criterion (Bainbridge et al. 1968). The more complex methods were more elegant and more integrated with other aspects of the task.
In the same power control task the operators sometimes found quantitative data values by numerical calculations, sometimes by judgement using categories, e.g., large, medium, small (Bainbridge, 1971). Calculation takes longer and involves more operations and working storage, but gives a more accurate answer.
b) Strategy as a function of task demands
Sperandio (1971), in an air-traffic control study, found that with four aircraft to handle approach controllers used 'direct' routing, finding for each aircraft its shortest route to the runway, while with eight aircraft they used 'standard' routing, sending aircraft to a stacking point which had a standard runway approach. The methods were equally safe, but the latter required less knowledge about individual aircraft, and although it involved more verbal communication this was more routine. Sperandio also found, with increasing numbers of aircraft, increasing simplicity and decreasing redundancy of verbal messages, changes in working memory, and changes in task sharing in the team. Coeterier (1971) has also found changes in strategy with number of aircraft, in terminal controllers.
In a field study of approach controllers, Soede et al. (1971) measured both the duration of various stages in the task, and also such task parameters as number of communications and number of changes in traffic. They calculated correlations between the stage durations, and found that the sizes of these correlations were related to the levels of the parameters. This implies that the timing of each stage is a dynamic function of the overall pattern of activity and the task conditions.
Within an overall strategy, details of the methods used depend on the particular context, for example, not only on the number of aircraft but also on their relative positions in space. Leplat and Bisseret (1965) developed algorithmic flow diagrams to describe the decisions needed to resolve different types of conflict situation, suggesting that the length of a path in the diagram indicated the task complexity. Experimentally, the time taken to resolve a given type of conflict correlated with the length of this path. Coeterier (1971) found that the sequence in which aircraft were brought in to land depended on their present courses in relation to the runway and the wind direction.
In the power control task (Bainbridge, 1974), the operator's type of behaviour, whether he made a control action or surveyed the present and future states of the furnaces, depended on the size of the control error.
Details of the task context or the workplace design can also constrain the character of individual operations within a general strategy. Bainbridge (1971) has suggested that digital displays encourage numerical calculations and dial displays encourage judgement of variable values. Leplat and Bisseret (1965) discuss how the actual numbers in a calculation influence the operations needed, e.g. (1443-1422) is simpler than (1511-1422). (For some experiments on this, see Posner, 1965.) The complexity of the relation between displays and controls (S-R compatibility) can affect reaction time; the longer RTs required for some S-R relations have been explained by inferring that a more complex rule is needed to select the response (cf. Fitts and Deininger, 1954).
Many laboratory studies have shown the ways in which performance of a single task operation can vary. For example, dial design affects the speed and accuracy of reading (Grether, 1949). Stimulus-response code affects reaction time (Fitts and Posner, p. 105). Messages may become less redundant (Sperandio, 1969). Memory may be less precise, although mistakes are not completely random (Bisseret, 1970). Allocation of time between information sources depends on the probability and utility of the information (Senders, 1966; Kanarick and Peterson, 1969). Under time pressure low priority operations may be omitted (Conrad, 1954), or speed may be increased at the expense of accuracy (Pew, 1969).
2) Discussion
a) Alternative strategies
We have suggested that the operator varies their strategy according to task circumstances. These examples show that variations in strategy can occur at several levels and in different ways. These may be qualitative differences in the information processing by which the task is done; different variables may be considered in different sequences, either at the level of overall strategy or at the level of how particular items within the overall strategy are handled. Within any one processing operation there can be quantitative changes in performance parameters.
The alternative strategies may not give exactly the same task performance, but they may give performance which is equally acceptable in meeting task demands. Whether this is true or not depends on the stringency with which the task demands specify performance details. For instance, many different strategies can be used in air-traffic control because the chief criterion is 'safety'; so long as this global criterion is met it does not matter what method is used or how elegant it is.
The alternative strategies are not necessarily all available to all operators. Which strategies an operator can use will depend on their aptitude and experience (see section B). The strategies represent the degrees of freedom of action available to the operator/controller (Sperandio, 1971).
b) Strategy as a function of task demands
Figure 1 : Effect of changing working methods on relation between mental work and task work (part A from Sperandio, 1972). This figure is a simplification: in practice the use of methods overlaps, so there are not discontinuities.
[part A : by changing to strategies which need less mental effort, the controller can deal with increasing task demands within the same mental work load.
part B : using different working methods, amount of mental work done can remain constant while task achieved increases.]
If an operator/controller can use several different strategies which give equally acceptable task performance and these vary in the amount of mental processing required, then the same task demand can be met by several different levels of mental work. The more economical strategies may leave 'spare' working storage capacity, time etc. which can be used to handle further task items so that, inversely, different levels of task demand may be dealt with by the same amount of mental processing. Increasing task demands therefore do not necessarily lead to increased 'load' or to deterioration in performance. Similarly a change in amount of task work done does not necessarily imply or require a change in the operator's capacity for mental processing, only a change in how the available capacity is used.
Sperandio (1972) has graphically illustrated the relation between amount of mental processing, strategy, and increasing task demands. His figure is reproduced in the upper part of fig. 1. This shows clearly the discontinuous and nonlinear nature of the relation, although this is still oversimplified, as there are so many ways in which strategies can vary: as well as discontinuities related to qualitative changes in strategy there can be continuously graded quantitative changes in performance.
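The relation described here can be sketched as a toy model. Everything numerical below is an assumption made for illustration: the two strategy names echo Sperandio's routing example, but the work units per aircraft, the capacity value and the rule "use the most elaborate strategy that still fits within capacity" are invented, and the sharp switch between methods is a simplification.

```python
# Hypothetical strategies: mental work units per aircraft handled (values invented).
STRATEGIES = [("direct routing", 3.0), ("standard routing", 1.5)]
CAPACITY = 14.0  # assumed mental work available per unit time

def choose_strategy(n_aircraft, strategies=STRATEGIES, capacity=CAPACITY):
    """Return (name, mental_work) for the most elaborate strategy that fits capacity, or None if overloaded."""
    # Try strategies from most to least mental work per item.
    for name, cost_per_item in sorted(strategies, key=lambda s: -s[1]):
        work = cost_per_item * n_aircraft
        if work <= capacity:
            return name, work
    return None

profile = [(n, choose_strategy(n)) for n in range(1, 11)]
for n, choice in profile:
    print(n, choice if choice else "overload")
```

In this toy model the mental work done rises with task demand, drops when the operator changes to the more economical method, and then rises again, so it is neither continuous nor monotonic in task demand. Only when no available strategy fits within capacity does the model report overload.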
Figure 2 : Adaptation of performance to task demands, via choice of strategy
The feedback process by which strategy is adapted to task demands could be summarised as in Fig. 2. The operators themselves have to compare their performance with the task demands and to adjust their strategy on the basis of differences [so that task performance matches task demands].
We know very little about these processes of comparison and adjustment. As they interact with the effects of the operator's working capacity they are discussed more fully in d) below. Speed, accuracy, and appropriateness of task decisions and messages are example dimensions of task demands, but we do not know much about the best dimensions and criteria for assessing system performance, or how to measure them (see discussions by Parsons, 1972; Hopkin, 1971).
c) Mental processing capacity and strategy
A particular strategy involves certain mental operations, working storage requirements, etc. An operator will have a certain ability to carry out mental operations and use working storage, so these capacities will determine whether or not they can carry out a particular strategy. The strategies they know are also part of their capacity. We have discussed how an operator could adapt their strategy to the task demands so that they maintain a standard level of mental processing. The availability of these strategies depends on the operator's mental processing capacity.
The operator's potential capacity consists of the mental processing operations and strategies which they have available, considered independently of any particular task demands or time constraints. They will be able to do certain operations, such as mental arithmetic, and to do them at a certain level of performance in terms of, for instance, the speed-accuracy trade-off they can achieve. They will have certain strategies available, and a potential working storage capacity influenced by their knowledge of the task structuring. The operations and strategies which they have available at their best are their maximum potential. How much of this maximum can actually be realised at a particular time is their momentary potential capacity; this is related to their occupational capacity, the rate at which they can work continuously without fatigue. The complex factors influencing potential capacity are discussed more fully in e) and section II B.
When the operator uses this potential in doing a task they achieve a particular performance level, the amount of task work done in a given time; this is their performance capacity. As discussed above, if they can use alternative strategies they may be able to achieve different performance capacities for the same potential capacity and the same amount of mental processing. A given strategy may have a performance capacity limit, which depends both on the number of operations it requires and on the operator's capacity to carry out these operations, but a more economical strategy may have a higher performance limit.
Figure 3 : Adaptation of mental work done to mental capacity, via choice of strategy
The process by which the strategy used is adapted to the potential capacity available could be represented as in Fig. 3 [choice of strategy is adapted so mental work done matches mental capacity available]. The representation implies that the person adapts to maintain an optimal level of mental activity (Hebb, 1955). More economical strategies are used when task demands are high, and less economical ones when demands are low (Parkinson, 1958). The diagram also implies that the person uses all their potential capacity available. This of course requires optimal motivation, but motivation can be included as one of the many factors affecting momentary capacity.
It should be clear from this discussion that we are a long way from any simple notion that mental processing capacity can be measured adequately by the information 'bits' processed per second.
d) Interaction of demands, capacity and strategy
Figure 4 : Choice of optimum working method depends on task and personal factors
The two previous sections have discussed how strategy is adapted to task demands and to individual capacity. Whatever strategy is used it affects both the task performance achieved and the mental processing required. The two types of adaptation therefore interact: adaptation of strategy to capacity will affect achieved performance and therefore adaptation to task demands, and vice versa. This interaction of the four main factors, via choice of strategy, can be represented by the diagram in fig. 4. An operator has to maintain a dynamic equilibrium between the two demands on the strategy they use.
[For possible additional links between the loops via effects of task demands on capacity, see (e).]
We know next to nothing about the criteria an operator uses in assessing the situation and adapting their strategy. As these criteria are probably complex, adaptation in relation to them will be complex too. The choice between alternative behaviours may be made unconsciously. Sperandio (1972) notes that when task demands increase experienced operators adapt to a more economical strategy earlier than inexperienced operators do. Bartlett (1943) says that fatigued operators are not aware that their methods have changed and their performance has deteriorated. This suggests that their ability to assess their performance has also been affected by fatigue.
So long as a strategy is available which both produces performance which matches the task demands and is also within the operator's capacity, then all is well. The task becomes difficult or 'stressful' when it is no longer possible to produce the required performance, that is, when the amount of mental processing necessary for this performance is not possible within the time or other capacities available. It may be possible to work harder than normal (see discussion of arousal in section II B). If this increased response is inadequate, or is required for some time and so leads to fatigue (see II B), then performance deteriorates. Little is known about the way in which performance does break down. It is generally assumed that as a first stage the quantitative changes in individual task operations are taken too far : speed is increased so that too many mistakes are made, or messages become too cryptic to convey information. Bartlett (1943), however, in his study of fatigue, says that the overall organisation of behaviour breaks down first: the individual operations are carried out adequately but they are done in the wrong sequence or at the wrong time. The allocation of time between tasks breaks down so that infrequent tasks are omitted completely and concentration is centred on a few aspects (Davis, 1948). Working storage may become disorganised so that values are checked unnecessarily frequently (Welford et al. 1950). Of course, performance does not necessarily break down in the same way under conditions of fatigue and of overload.
e) Capacity as a function of task demands
The operator's potential capacity consists of the strategies and individual operations which they have available for use. This capacity can be affected by the task demands, workplace design and environmental conditions, in four ways. Task details and workplace design can constrain the strategy used, and the capacity for processing individual operations can also be affected.
i) The potential capacity for a particular operation, for instance the minimum time in which it can be done, can be affected by the workplace design. Environmental conditions can affect potential performance on such operations as signal-noise discrimination.
ii) The operator's basic capacity for performance, in general terms of speed-accuracy trade-offs which can be achieved, working storage capacity available, etc., can be affected by the level of task demands and the environmental conditions acting as stressors (see discussion of arousal in B).
iii) If several tasks must be done at the same time, then the capacity available for any one of them may be constrained by the mental operations required by the other tasks.
These concurrent tasks may be completely independent; the effect on behaviour may then depend on whether the same sense organs, effectors or mental processors are required. For example, Allport et al. (1972) found that their subjects could sight-read piano music and do an auditory shadowing task at the same time.
On the other hand, when all the tasks use visual displays or verbal response, attention must be divided between them. When independent tasks cannot be done at the same time, working storage capacity may be needed to keep track of the other tasks while one is attended to.
Alternatively the concurrent tasks may be interdependent, for example the different aircraft to be handled in air-traffic control, or the tasks may be sub-operations of a different type within a main strategy, for example predicting future events in process control as a part of choosing the best control action. In these tasks the common context can reduce the working storage needed : many of the items remembered may be common, for example the relative positions of aircraft in space, or there may be strong redundancies so that the value of one variable implies the possible range of others - this is particularly true in process control. Unfortunately very little is known about what tasks can be done simultaneously with adequate performance, or about the extent to which practice and aptitude affect this potential capacity.
As details of the task demands context can affect the operator's potential for both the strategies which can be used and the possible performance of individual operations, this adds a further interactive complexity to the relations represented in fig. 4.
f) Stress and strain
We can now return to the introductory statement that the amount of mental work done by an individual, and the performance they achieve, depend on the task demands in relation to their work capacity. Some investigators equate this with the simple relation found in many physical systems :
capacity = output (strain) / input (stress).
The notions of stress as imposed load, and strain as the effect of carrying that load, have certainly proved useful in work physiology, which has also contributed the distinction between maximum and occupational work capacities. The superficial relevance of these terms to the study of mental load is misleading however. For example, Luczak (1971) suggests the equation :
mental working capacity = f (mental strain) / g (mental stress),
and implies that if any two are measured then the third can be calculated. There are three reasons why this type of equation is inapplicable to mental work :
- the amount of mental work (processing) done is not necessarily equivalent to the output performance achieved, so at least four factors are involved,
- the relation between these four factors is not a simple linear one,
- it is difficult (perhaps impossible) to give independent definitions for the factors.
Complexity of the function. This has been discussed in detail above. The amount of mental processing work and the performance achieved are not simple continuous functions of the level of task demand, as there can be qualitative as well as quantitative changes in the strategies used before performance begins to deteriorate. The choice of strategy also depends on the individual's working capacity, which can itself be affected by task demands, so that all four factors interact, via strategy, in a complex adaptive way. This is made even more complex by the factors affecting individual capacity discussed in section B. In physics and physiology the relation between potential capacity and performance capacity is at least monotonic, even if nonlinear, while in mental work this is not true. In addition, when the change in mental task strategy is quantitative, then this change can be described by numerical methods, but if a qualitative change in information processing operations is involved, then a numerical description is inadequate. We can now see the difficulty with defining 'mental load', as it involves all these aspects.
Difficulty with independent definitions. It is quite difficult to describe the demands of a task in a way which does not include any account of the operator's response to these demands. The required input/output relations, and their time constraints, must be quoted without specifying how the transformation between input and output is achieved, or the definitions must be very global as in the safety criterion of air-traffic control. Any description of the details of particular task contexts or the workplace design immediately constrains the behaviour of the operator, in the methods they use or their capacity to perform individual operations, and so affects the mental processing work they must do and the performance achieved (see e above). This suggests that it would be artificial and misleading to attempt to define independently the four factors of task demand, potential capacity, amount of mental work done and performance achieved. Certainly no un-confounded measures can be obtained.
The standard way around this problem, of measuring the effects of a task on an operator in a way which is independent of the task demands, is to measure 'strain' by changes in either
- physiological variables such as sinus arrhythmia or eye-blink rate, or in
- performance on secondary concurrent tasks such as mental arithmetic.
While these do provide solutions if independence is necessary, these methods have their own difficulties. Physiological measures can be confounded by physical aspects such as body movement, while secondary tasks have to be chosen carefully so that they do not interact with the primary task. The measures obtained tend to measure effects beyond the point at which the task demands can no longer be met easily by the operator. For a full understanding of the effects of task demands on behaviour we need to know about the changes in strategy and processing operations which occur adaptively before performance breaks down, and physiological and secondary task measures give little clue to the nature of these aspects.
Inversely, this discussion of the adaptability of mental processing behaviour suggests a possible reason why it has been so difficult to validate physiological measures of mental 'strain'. Perhaps the variations in task demand used in these tests have not led to a continuous and monotonic change in task difficulty, so these supposed task demand changes have not produced a correlated change in the 'strain' measure used.
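This suggestion can be illustrated with a toy calculation. It assumes, purely for the sake of the example, that a 'strain' measure simply tracks the amount of mental work done, and that an operator always uses the most elaborate of three hypothetical strategies that still fits within a fixed capacity. The cost and capacity values are invented.

```python
import math

# Hypothetical mental work per task item for three strategies (values invented).
COSTS = (4.0, 2.0, 1.0)
CAPACITY = 8.0  # assumed mental work available per unit time

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

demands = list(range(1, 9))
# Fixed method: always the most elaborate strategy, regardless of capacity.
fixed_work = [COSTS[0] * n for n in demands]
# Adaptive method: the most elaborate strategy whose total work fits capacity.
adaptive_work = [max(c * n for c in COSTS if c * n <= CAPACITY) for n in demands]

r_fixed = pearson(demands, fixed_work)
r_adaptive = pearson(demands, adaptive_work)
print(round(r_fixed, 2), round(r_adaptive, 2))
```

Under these assumptions the fixed method gives a perfect correlation between task demand and mental work, while the adaptive method gives only a weak one, because the work done falls each time the operator changes to a more economical strategy.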
B) Work Capacity Of The Individual
We have seen how an individual's work capacity both affects the strategy they use and interacts with this strategy to affect the amount of mental work done and the task performance achieved (fig. 3). We have also seen how the adaptive interaction between task demands, individual capacity, mental work done and performance achieved (fig. 4) is complicated by the way in which details of the task demands can constrain the individual's capacity for strategies and operations (cf. e). We must now mention some more factors which affect the individual's capacity and further complicate the nature of their response to a particular situation. The task demands and amount of mental work done may affect momentary potential capacity by affecting arousal level and motivation. In the long term the amount of mental work done will also affect potential capacity via practice and fatigue. Although these factors will be discussed separately here they are not necessarily independent.
Arousal. Arousal has a very complex effect on capacity and performance, which is not yet well understood. We know that performance of simple task operations is related to task demands by an inverted U-function : as task demands increase performance increases but only up to a maximum; at higher demand levels performance decreases (Alluisi et al. 1957; Cumming and Croft, 1973). Task performance also varies with level of arousal in a similar function (e.g. Broadbent, 1971). Experimental studies of this function have measured changes in performance. I assume here that these changes occur because of changes in processing capacity; the majority of these experiments have studied simple single operation tasks, in which this would certainly be true. Whether these changes in capacity are basic changes in potential, or simply shifts in the parameters of performance, i.e. quantitative strategy changes, is an important point which has not been properly established. As both task demands and arousal are related to performance by a similar function this suggests that they might have their effect via a similar mechanism, for instance if task demands affect arousal which affects capacity. Welford (1973) and Mackworth (1970, ch. 2) have used signal-detection theory to account for the effects of arousal; this type of theory can easily be related to a discrimination capacity.
Some additional points complicate the arousal effect. Individuals vary in their intrinsic level of arousal. External stressors such as noise and sleeplessness alter an individual's level of arousal. Finally, the level of incentive also has an inverted-U effect on performance, and the optimal level of incentive depends on the difficulty of the task; it is lower for difficult than for easy tasks. This suggests that task demands, incentives and other external stressors combine together with the individual to affect performance capacity via some arousal mechanism. If this arousal mechanism does involve a basic capacity change then this adds another dimension on which the individual can adapt to the task demands.
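The inverted-U relation, and the lower optimum for difficult tasks, can be sketched numerically. The Gaussian shape, the scale of the level axis and the two optimum values are assumptions chosen only to give a curve with a single peak; none of them comes from the studies cited.

```python
import math

def performance(level, optimum, width=1.0):
    """Inverted-U: best at `optimum`, worse on either side (Gaussian shape is an assumption)."""
    return math.exp(-((level - optimum) ** 2) / (2 * width ** 2))

# Hypothetical optimum arousal or incentive levels: lower for a difficult task than for an easy one.
OPTIMUM = {"easy": 6.0, "difficult": 3.0}

for level in (2.0, 4.0, 6.0):
    scores = {task: round(performance(level, opt), 2) for task, opt in OPTIMUM.items()}
    print(level, scores)
```

In this sketch, raising the level from 4 to 6 improves performance on the easy task but reduces it on the difficult one, which is the pattern described above for incentive and task difficulty.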
Motivation. Arousal level is related to motivation, that is the operator's willingness to use their potential capacity optimally. This is not a factor which can be ignored by the ergonomist, as the nature of work can affect relations in the working-group, particularly communication and leadership, and group relations can affect job satisfaction and job attitudes, and so can affect task performance (Trist and Bamforth, 1951). Studies of job attitudes in industry have been almost entirely concerned with unit assembly type tasks, when a worker's concept of a reasonable day's work has a much greater effect on output than their potential capacity. The motivation of operators involved in continuous processes, with considerable responsibility for equipment, output and perhaps human lives, may be rather different, but we know very little about this. It is generally assumed that work with more opportunities for individual choice gives greater job satisfaction, or inversely, that if an operator has a greater potential capacity than is required in their job this can be a source of dissatisfaction, but the effects are not quite as simple as this. Coeterier (1971), for instance, has suggested that in a situation of high flexibility the good controllers perform better, while the poor controllers perform worse. This interacts with the problem of arousal and job difficulty, as for poor controllers the job is more difficult (see next section).
Practice and experience. We have already discussed the way in which, if an individual can use alternative strategies, this may give changes in performance capacity for the same potential capacity and amount of mental work done. An important further effect is that when an operator improves or learns new strategies through practice or experience they will not only increase their performance capacity but there will be a real increase in their potential capacity.
Practice can lead to improvement at all levels of strategy and processing operations. For example, practice can make quantitative changes in the performance of individual operations, speed-accuracy trade-offs can improve so that faster more accurate performance is possible (Rabbitt, 1970; Pew, 1969). Individual operations may become integrated into automatic sequences (Bryan and Harter, 1899; Pew, 1966). Working storage capacity can improve : codes can become more efficient (Miller, 1956), or more items can be remembered when task organisation is better understood. For example, inexperienced operators in the power control task (Bainbridge, 1974) did not realise that some items were common to different parts of the task. Bisseret (1970) found that when the number of aircraft to be handled increased from 5 to 11, experienced air-traffic controllers remembered more aircraft, while inexperienced controllers remembered fewer.
Experience can also lead to the development of entirely new strategies. In the power control task (Bainbridge, 1972) inexperienced operators used feedback control: they first noticed a control error and then decided what to do about it. By doing the task over a period of time they learned the effect which any action would have on the control error, and also learned about the autonomous furnace changes which could affect control error. They were then able to predict the future behaviour of the furnaces, to anticipate the need for control action and to act immediately when necessary, so using predictive control. Most of the operators using this predictive strategy assessed the control error by judgement, into categories of all right, above, action required, etc. The best controller, who had considerable process knowledge and was able to do rapid mental arithmetic, was able to calculate the rates of power usage exactly, thus removing the remaining uncertainty about the need for control action. These examples show how greater task knowledge and ability to do more complex operations can actually make the task easier: the process is more fully understood and so events are anticipated and more fully under control. Experience can therefore have a positive feedback effect on capacity: the better someone is the better they get.
The natural aptitude of an individual for particular types of operation has an effect on potential capacity which is similar to the effects of practice. This basic ability can be affected by such factors as age, physical fitness, illness, and diurnal and other physiological rhythms.
Fatigue. Continuous work at maximum capacity can be harmful, and leads rapidly to exhaustion. An operator usually works at some level below this, a level at which they can work continuously without fatigue; this is their occupational capacity (a term from work physiology). They can work at higher capacity for short periods in an emergency, though sudden changes in workload may not be responded to properly. Fatigue comes from the effects of working in non-optimal conditions over a period of time, whether this non-optimality comes from the amount of work to be done or the working conditions. The effect may vary with the individual. The mechanism of psychological fatigue is not well understood. From the above discussions of capacity and arousal it should be clear that we do not know enough about the nature of mental capacity to be able to specify the characteristics of a mental 'occupational' capacity.
In his classic study of fatigue and complex performance Bartlett (1943) says that it is the overall organisation of behaviour, the strategy aspects of performance, which break down first. The right thing is done at the wrong time, or vice versa. As aspects of the task which might otherwise have been done correctly are now not within their expected context, for instance the cues needed do not arrive at the right time, such timing and sequential disruptions can have a rapidly destructive effect on performance.
However complex and interacting the factors affecting the difficulty of mental tasks may be in theory, practical techniques are still needed for assessing and predicting mental task performance. These techniques will be discussed in the following section.
III. TECHNIQUES FOR WORKLOAD PREDICTION
[Note, 2021 - This paper was written nearly 50 years ago. Obviously the specific techniques referred to in this section are out-of-date. However I suggest that many of the general points made are still important - if nothing else this section shows how complex the issues are, with so many unknowns and interactions.]
We need to understand the factors which affect mental task behaviour for practical as well as for theoretical reasons. In practice we might be concerned with assessing the general difficulty level of a job for payment purposes, or with the performance level of an individual for bonus or promotion, or with predicting operator performance as a tool for assessing alternative man-machine interface designs in a complex system. We will go on to discuss this last requirement. Good techniques for obtaining the more general measures must be assessed by rather different criteria, as they must be applicable quickly and reliably by non-ergonomists. Kitchin and Graham's (1961) method meets many of the criteria discussed below for more complex techniques.
The human operator aspects of complex system designs can be assessed in two ways : by predicting task performance on the basis of theory and data, or by building an experimental mock-up of the task situation and measuring actual operator behaviour. Usually the mock-ups tested have been chosen on the basis of previous theoretical predictions. These two approaches involve very different problems and techniques. We will concentrate on the a priori prediction techniques. The mock-up test does at least have the advantage that real performance can be measured, whatever the problems of measurement, evaluation and experimental control. Parsons (1972) gives a full and sensible analysis of methodology for such large scale experiments.
In studying mental behaviour in an existing task, as the mental operations are unobservable, the best we can do is to make a synthetic description of the operations assumed to underlie behaviour which can be observed : eye movements, control actions or verbal reports. Many of the air-traffic control and process control studies mentioned above are of this type. When the task does not exist, but is in the predesign stage, we have the inverse problem of first predicting the underlying mental behaviour and then predicting the overt behaviour which this will produce. Obviously our methods for making these predictions will rely heavily on techniques used and processes identified in the studies of existing tasks. To make a complete prediction of mental behaviour in a given situation we would need complete knowledge of all the factors affecting it which have been discussed in Section II, in sufficient detail to account for a particular individual in a particular environment at a particular time. Even if we had this knowledge the task of complete prediction would probably be beyond the limitations of logic, quite apart from the question of whether complete predictions are desirable. This means that complete prediction is impossible now through lack of knowledge, and may be impossible in the future through too great complexity. This situation may be acceptable to the academic, but is not much help to the practitioner. Instead of giving up at this point we need to discuss what predictive techniques are acceptable given our present limited knowledge, and what special features are needed in these techniques because of our lack of knowledge. We can also comment on the limits to the techniques which will be worth using in practice even when our knowledge of mental behaviour has increased.
There is a classic procedure underlying most techniques for predicting complex system performance and designing the workplace. The first stage is to specify the task demands ('task description'). The operating method by which these demands are met is then specified ('task analysis'). The performance achieved by this method is predicted. The predicted performance is compared with the required performance, and if necessary the operating method is revised until performance is acceptable. The workplace is then designed to the method specified. This is a sequential procedure in which each stage is progressively expanded into more detail. We will use these stages as a framework for our discussion of acceptable techniques.
a) Specifying task demands
It is necessary to clarify the task context at two levels. Firstly, the required overall system output must be specified, for instance the permitted configurations of aircraft in the sky, or the product to be made. Methods of specifying overall system performance do not come within the issues discussed here. Secondly, the performance which must be achieved by the human operator in the system must be specified, i.e. the input/output transforms which must be made, the time-limits and tolerances on these transforms, etc. What the operator must do in detail depends on how tasks in the system have been allocated between person and machine, and on the workplace design. The best form of this allocation and design depends in turn on whether the task they present to an operator can be done by that person so that overall system output is acceptable. Consequently, it is not actually possible for the designer to consider human-machine allocation, task demands, operator methods, performance and workplace design in sequence. The interactions between these different aspects of system design must be considered at the same time.
In process control the product depends on physical-chemical conditions which must be produced by machine. The task presented to the operator can be described in terms of these process dynamics, with associated tolerances and bandwidths. These dynamics can be described by the control loops available to the operator (Beishon, 1967) or by the process states they can transform (Cuny and Deransart, 1972). If the operator is provided with decision making aids, either automatic controllers or a computer, then assessing the task demands facing the operator is more difficult. In any task involving decision making or supply of power, which could be done by either person or machine, the interplay of the different design aspects becomes important.
b) Mental task operations
Although developing a system design is a dynamic interactive process rather than a sequence of stages, the different aspects of the design process still need distinct techniques. We first need procedures for predicting and describing the methods by which the operator meets the task demands. In theory it is impossible either to predict the methods an operator uses to do a task, or to describe the mechanisms underlying these methods. No general specification of task demands is sufficient to predict a priori the heuristic methods a human being will use to do a task (see Bainbridge, 1973). These methods can only be identified empirically, and present techniques for doing this are inadequate. We do not know how knowledge of the external world is represented in the human head, or how this knowledge is used in behaviour. Task knowledge is not necessarily a simple mapping of the real world, it may be a function of many organisational processes which are not yet understood. Leplat and Pailhous (1972) discuss the dangers and difficulties of trying to identify an operator's mental model of the process.
Practical necessity however makes us look to see how these theoretical difficulties can be sidestepped. The standard procedure, for dealing with a situation which is not sufficiently well understood to generate a theoretical solution, is simply to use a technique which works. So long as this technique has been validated properly, then we know the range of situations in which it is useful and the probability of error. The little theoretical understanding we do have can be used to optimise the technique as far as possible.
The operator's strategies must be both predicted and described. A review of all the available techniques for describing mental tasks would fill a separate paper, so we will make only some general comments here.
Prediction. The task demands will partly constrain the methods or strategies the operator can use, the less the flexibility in the environment the greater the constraint. The discussion in Section II suggests that it is particularly important in flexible situations to identify alternative strategies which can give an equivalent result for different mental work. It is not possible to pre-specify all possible methods and sources of error (except perhaps in trivial tasks with a small number of different states), but we should be able to provide guidelines about alternatives; for instance in process control, feedback or predictive control can be used and quantitative assessments can be made by judgement or numerical calculation. Unfortunately we have not yet got sufficient data on complex tasks to make very adequate suggestions for optimising these predictive guidelines.
In many examples of predesign studies, only one sequence of operations has been prescribed for each task situation, and some even assume only one sequence of situations. The discussion of adaptation to task demands suggests that prescribing one sequence is inadequate (except in special cases, such as aircraft start-up procedures, where the prescribed sequence is so important that special methods for enforcing it have to be used). This prescriptive approach may have occurred for several reasons. In general there have been two distinct groups of people, one group working on predesign and making overall measures of system performance, the other group studying what people doing real tasks actually do. Perhaps only the second group realise that the operator never does what the designer intended! Workplace design does become much more difficult if one accepts that different strategies may be used; as Sperandio (1971) points out, the interface must then be designed to facilitate all the strategies.
In addition, much of the predesign work has been done on discrete sequential tasks, such as pilot start-up procedures, or responding to an enemy aircraft. The detailed studies of operator methods have been concerned with air-traffic control, maintenance, and process control. In these tasks there is a complex interacting situation which may never be repeated exactly, so no set sequence of behaviour is possible. There may be more or less standard routines for particular task components, and frequently occurring situations may have over-learned standard responses. The most important aspect of these tasks, however, is the choice of behaviour appropriate to the present context. To make this choice the operator needs an overall knowledge of the operations and effects available (Cuny and Deransart, 1972; Bainbridge, 1974).
Description. Different techniques are needed to describe different aspects of the operating methods : the standard sequences in the task, the process of identifying the present situation and choosing appropriate behaviour, and the choice between alternative ways of obtaining an equivalent result.
Standard information processing routines can be described by the algorithmic flow diagrams used in describing computer programs. Special versions have been developed for use in task analysis; these include the displays and controls used in each operation (cf. Kurke, 1961). This is useful in interface design; it is also important as the main task units must be defined by observable events (Baker, 1971), as these are the only points at which behaviour can be checked empirically.
The choice of behaviour appropriate to a context is described by Bond and Rigney (1966) as if this choice follows a Bayesian procedure. This is inadequate as it does not represent the deterministic aspects of the worker's knowledge of task constraints and set sequences. The process of choice can be described by conditional statements, again as in computer flow diagrams. The organisation of working storage is an important aspect of the effect of present context on choice of behaviour, this working storage can be incorporated in the conditional statements (Bainbridge, 1974). The alternative strategies available for obtaining an equivalent result can best be included by describing the task not as a sequence of operations but as a sequence of sub-goals, each with a list of alternative methods for reaching this goal (Rigney and Towne, 1969; Bainbridge, 1974). An additional technique is needed to describe the process of decision between these alternatives. Descriptions which include conditional statements and sub-goals avoid the problem of specifying individually all the possible behaviour sequences. Instead they give an overall description from which all the operating sequences could be generated (in theory, the number might actually be infinite). This description also includes the decisions at choice points, and shows each section of the behaviour in its context. The flexibility of the behaviour is therefore described by a technique which itself includes this flexibility.
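A sub-goal description of this kind can be sketched in a few lines of code. The following is only an illustration, not the representation used by Rigney and Towne or by Bainbridge: the sub-goal names, method names and operations are all hypothetical, and the decision process between alternatives is deliberately left out. It shows how a compact description (sub-goals, each with alternative methods) can generate every operating sequence without listing the sequences individually.

```python
from itertools import product

# Hypothetical task description: an ordered list of sub-goals, each with
# alternative methods, each method being a sequence of named operations.
TASK = [
    ("assess control error", {
        "judgement": ["read display", "categorise error"],
        "calculation": ["read display", "recall target", "compute rate"],
    }),
    ("choose action", {
        "feedback": ["wait for error", "select correction"],
        "predictive": ["predict furnace state", "select correction"],
    }),
]


def generate_sequences(task):
    """Yield (method chosen per sub-goal, flattened operation sequence)."""
    method_names = [list(methods) for _, methods in task]
    for choice in product(*method_names):
        operations = []
        for (_, methods), name in zip(task, choice):
            operations.extend(methods[name])
        yield choice, operations


sequences = list(generate_sequences(TASK))
for choice, operations in sequences:
    print(choice, "->", operations)
```

With two methods for each of two sub-goals the description generates four sequences; adding a sub-goal or a method extends the description by one entry while the number of generated sequences grows multiplicatively.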
Discussions of methods for describing mental tasks usually distinguish between discrete and continuous tasks. This has not been done here. Certainly, when considering the control of fast response systems such as aircraft and cars, muscle and nervous system dynamics are important aspects of the human behaviour and special descriptive techniques are required. In continuous control of slow response systems the control actions are often physically trivial, and the decisions to make these actions can be described by the tools used for discrete tasks. It is also necessary to account for behaviour which is frequently repeated, for instance the sampling of controlled variables, but this can be included as part of the choice between alternative activities.
Validity. We have now reached a stage where even a practical approach has led to quite a complex description of the mental task operations. There are two further important questions about these techniques; we need to know both how valid the techniques are as a simple description of mental behaviour and also how complex a description is sufficient for particular purposes. These questions do not seem to have been investigated much; instead there has been considerable complacency about the adequacy of flow diagrams. If the flow diagram technique is valid then it should predict operator behaviour to within a correct order of approximation. (The validation of more complex descriptive techniques which claim to account for the actual mental mechanism used, as in computer simulation of cognitive processes, is a much more difficult problem.) Leplat and Bisseret (1965) have found that the length of flow diagram needed to describe a strategy does correlate with the time taken to do the task in an experiment. They discuss the difficulties of controlling an experiment so that this correlation can be tested. Subsidiary to this overall validity is the question of how complex a description must be to be sufficiently valid for different purposes. We know nothing about this. The complexity needed probably depends on the stringency of the tolerances on operator performance. Under some circumstances a single sequence might be adequate. We would need guidelines for selecting the characteristic sequences to use in simplified descriptions.
Types of mental operation. It would be useful to have a taxonomy of types of mental operation, to use both in describing the task activities and in the next stage of predicting performance. This taxonomy should include the information processing operations and memory structures a human being can use, and their capacities and limitations. Many writers have realised the importance of this and suggested a set of operations, e.g., Boulding (1956), Cotterman (1959), Crossman (1965), Edwards (1967), Fogel (1961), Haggard (1963), Miller (1962, 1967). A few of these schemes have a theoretical basis: Crossman suggests operations which are compatible with the description of computer processing, and Edwards's scheme is related to decision theory. The other schemes are based on experience with complex tasks; they usually imply that the operations are named in increasing order of difficulty, but the mechanisms underlying the operations are usually not discussed. Most of these schemes have had little impact, perhaps because they do not have strong theoretical justification or tested empirical validity. A necessary and sufficient taxonomy should emerge from detailed study of the operations actually used in real tasks. This suggestion may appear to beg the question, as mental operations cannot be observed so any description of the behaviour must be made using assumed components. The way out of this circle is to find the operation types which best give an account of the behaviour, and which recur. Wortman (1966) for instance found that he needed four types of memory structure to describe the process of medical diagnosis : lists, list structures, paired associates, and chunks.
c) Performance parameters
Once the operating methods have been identified, the classic technique for predicting task performance is to predict the performance (typically the time taken) for each task operation, and to add these together. In predicting performance we are again faced with a problem of complexity. The particular task context, the environment and the individual, all interact to affect the performance achieved on individual operations, and the choice of strategy and use of working storage to integrate these operations in correctly timed meaningful sequences. We need to specify not only normal performance, but also the characteristics of performance breakdown, when and how this occurs. Again we must discuss what data would be necessary and sufficient for practical purposes, rather than attempting completeness.
Many existing techniques concentrate on predicting performance of individual task operations, basing the predictions on either experimental data or theoretical equations. The whole of the discussion in Section II shows that strategy changes are a major aspect of performance however, so that these must be considered in the prediction. It might seem difficult to predict performance if more than one sequence of operations might occur. This can be dealt with by making predictions for each of the sequences, so obtaining a sample distribution of performance predictions (see discussion of fast-time simulation below). Alternatively, one could suggest that a simple technique for assessing task difficulty would be to ignore the times of individual operations, and simply to count the number of operations in a strategy plus the number of items of working storage required. If the strategies used actually have a more powerful effect on performance than the factors affecting details of the individual operations, then this would give a simple ordinal measure of the difficulty of the alternatives.
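The counting measure suggested here is simple enough to state as code. The sketch below is an assumption-laden illustration: the strategy names, operations and working-storage items are invented, and operations and storage items are weighted equally, which the text does not specify. The result is meant only as an ordinal ranking of alternatives, not as a time prediction.

```python
def ordinal_difficulty(operations, working_storage_items):
    """Count operations plus items held in working storage (equal weights assumed)."""
    return len(operations) + len(working_storage_items)


# Hypothetical alternative strategies for the same task:
# each entry is (operations, working-storage items).
strategies = {
    "strategy A": (
        ["sample variable", "compare with limit", "select action"],
        ["current value"],
    ),
    "strategy B": (
        ["sample variable", "recall trend", "extrapolate", "compare with limit", "select action"],
        ["current value", "previous value", "rate of change"],
    ),
}

scores = {name: ordinal_difficulty(*parts) for name, parts in strategies.items()}
ranked = sorted(scores, key=scores.get)  # easiest first, by this measure only
print(scores, ranked)
```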
If we do want to include the individual operations then we can make the same simplifications as used in work study (e.g. Methods Time Measurement) and in work physiology. We can use data for populations, rather than predicting for individuals, and we can simply add predictions together ignoring sequential effects, as in almost all cases this will give a conservative (over large) estimate of the task requirements.
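As a minimal sketch of this additive simplification, the code below sums hypothetical population mean times for each operation. The operation names and figures are invented for illustration. The combined standard deviation is an extra assumption not made in the text: it treats the operation times as statistically independent, which is exactly the neglect of sequential effects described above.

```python
import math

# Hypothetical population data per operation: (mean time, standard deviation) in seconds.
OPERATIONS = {
    "read display": (1.2, 0.3),
    "compare with target": (0.8, 0.2),
    "select control": (1.5, 0.4),
}


def additive_prediction(operations):
    """Add population means; combine spreads assuming independent operations."""
    total_mean = sum(mean for mean, _ in operations.values())
    # Variances add only if the operation times are independent (an assumption).
    total_sd = math.sqrt(sum(sd ** 2 for _, sd in operations.values()))
    return total_mean, total_sd


total_mean, total_sd = additive_prediction(OPERATIONS)
print(f"predicted time {total_mean:.2f} s, sd {total_sd:.2f} s")
```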
Predictions for populations. Instead of attempting to predict individual performance we can use data on, or equations describing, the expected average and variability of performance in the population. Of course we can only use population data if the mean is a valid representation of the population, that is, if the data are normally distributed with small variability. Otherwise it is necessary to use data from sub-populations, and we need to identify the dimensions of the population on which a change makes a significant difference to the data. Both the personality type of the operator and the nature of the task may be important. It might be useful to distinguish between sources of performance variability which the ergonomist cannot control, such as minor illness and domestic worries, and those on which they could control the performance of the operator to within required limits - by selection, training or workplace design.
We know in a general way that extroversion, introversion and conservatism affect speed, accuracy, and tolerance of monotony, and that intelligence affects speed and capacity, but we do not really have much data, in a form which can be used by ergonomists, on which personality dimensions have a differential effect on which aspects of behaviour under which circumstances. We do have some knowledge about the effects of environment and task operation type on time taken and accuracy, but we know very little about the nature of strategies and how they break down. We can sidestep an important theoretical issue here - the extent to which measures such as time or errors correlate with subjective feelings of difficulty or concentration, as we are interested in predicting performance. Some example standardisations do exist. Cooperband and Alexander (1965) use statistical decision theory to predict limiting performance capacities in signal/noise discrimination tasks. Pew (1969) finds that reaction time varies with
log(p(correct) / p(error)) in the speed-accuracy trade-off. The sampling theorem can be used to predict sampling rates in a continuous task (Senders, 1966), although this cannot be applied to many real situations as the system dynamics are often not known.
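The two quantitative relations just mentioned can be written out as a short sketch. Both functions are illustrations under stated assumptions: the first takes p(error) = 1 - p(correct), so that only two outcomes are possible; the second applies the standard sampling theorem (a signal of bandwidth W must be sampled at a rate of at least 2W), with the bandwidth supplied by the user because, as noted, it is often not known for real systems. Neither function reproduces the fitted parameters of Pew or Senders.

```python
import math


def log_odds_correct(p_correct):
    """Return log(p(correct) / p(error)), assuming p(error) = 1 - p(correct)."""
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must lie strictly between 0 and 1")
    return math.log(p_correct / (1.0 - p_correct))


def minimum_sampling_rate(bandwidth_hz):
    """Sampling theorem: sample at least twice the highest signal frequency."""
    if bandwidth_hz < 0:
        raise ValueError("bandwidth must be non-negative")
    return 2.0 * bandwidth_hz


# Hypothetical values: accuracy levels and a slow display varying at up to 0.05 Hz.
for p in (0.6, 0.8, 0.95):
    print(p, round(log_odds_correct(p), 3))
print("samples per second:", minimum_sampling_rate(0.05))
```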
Obviously a great deal of work is needed to establish performance data for standard populations. This data should relate to the taxonomy of operation types mentioned above. In obtaining these data it is not at all adequate simply to suggest dimensions of task difficulty, or to suggest operations to measure these dimensions, or to suggest abstract laboratory tasks to reflect these dimensions, and so do an experiment to get standard data. All these suggestions must be very carefully validated and shown to give significant differences, or one can go on making yet more complex suggestions without ever showing that they make a useful contribution to measurement, or that the results can be extrapolated to a real situation. Probably, at this stage of our knowledge, detailed studies of real tasks are more likely to show the dimensions of difficulty than simple laboratory experiments without this basis.
Pragmatic techniques. As we have little data on standard populations, these populations do not provide an existing tool for performance prediction. Workers in this area who have had to devise a predictive technique have sensibly sidestepped the need for real data or an understanding of basic mechanisms by using pragmatic equations to make their predictions. Indeed, so long as the equations have been properly validated, they may actually be less misleading than some equations which do originate from currently available human performance data; these can have a spurious appearance of validity when they are really oversimple in a complex context.
Validation is a tool which can be used to increase the power and efficiency of a practical technique. A pragmatic technique is usually accepted as valid when its assessment of the difficulty of several tasks correlates with assessments made by human factors experts. It is insufficient to use a single correlation to judge the usefulness of a complex technique, as this does not indicate how parts of the technique contribute to its success. All the well developed techniques for establishing a valid and reliable measurement scale become applicable (Anastasi, 1968). Measures made by the different parts of the technique should be correlated individually with the external assessment; only the parts which correlate highly are then used in future. For instance, in the Siegel et al. (1964) DEI technique, the different equations included are weighted by factors to give maximum agreement with the specialist assessments. Measures made by the different parts of the technique should also be correlated with each other; if several parts of the technique give measures which correlate highly then only one of them is necessary to represent this aspect of the measurement.
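The two-step screening described in this paragraph can be sketched as follows. This is an illustration under assumptions, not the procedure of Siegel et al. or of Anastasi: the part names, the scores and the two thresholds (0.7 for agreement with the experts, 0.9 for redundancy between parts) are all hypothetical, and a plain Pearson correlation stands in for whatever statistic a real validation would use.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def select_parts(parts, expert, keep_above=0.7, redundant_above=0.9):
    """Keep parts that agree with the experts, then drop mutually redundant ones."""
    candidates = [n for n in parts if pearson(parts[n], expert) >= keep_above]
    candidates.sort(key=lambda n: -pearson(parts[n], expert))
    kept = []
    for name in candidates:
        if all(abs(pearson(parts[name], parts[k])) < redundant_above for k in kept):
            kept.append(name)
    return kept


# Hypothetical scores for five tasks: expert ratings and three parts of a technique.
expert = [1, 2, 3, 4, 5]
parts = {
    "part_a": [1.1, 2.0, 2.9, 4.2, 5.0],
    "part_b": [2.2, 4.1, 5.8, 8.3, 10.1],  # nearly a rescaling of part_a
    "part_c": [3, 1, 4, 1, 5],             # weak agreement with the experts
}
kept = select_parts(parts, expert)
print(kept)
```

In this invented data, part_c is dropped for weak agreement with the experts, and only one of part_a and part_b survives because they measure nearly the same thing.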
Fast-time simulation. Using a computer can considerably simplify the work involved in making complex predictions. A computer program can incorporate the large quantity of data on alternative strategies and parameters which are being used to predict the operator's performance. Both the operator/controller and the equipment they use can be simulated in a program which can be run to give a performance prediction. It is easy to make a large number of such runs to obtain a sample distribution of performance predictions. For example Rigney and Towne (1969) could use their sub-goal representation of task structure (see above) to generate all possible sequences of behaviour. The ability to make a large number of runs is also useful as the parameters in the representation of the person or the machine can be varied, to see what effects this has on performance. Siegel and Wolf (1961) simulated timing, probability of success and constraints on timing in a task sequence. They varied speed and stress thresholds to predict the frequency of task failure for different types of individual. They have extended this (1969) to include the effect of team relationships. Of course, the sample of predictions obtained in this way is only useful if the representation of the operator in the simulation is valid. When parameters such as 'fatigue' or 'social cohesiveness' in a team are varied, then the usefulness of the result depends on the validity of the way in which the parameter and its effect have been represented.
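A very small fast-time simulation of this general kind might look like the sketch below. It is not the Siegel and Wolf model: the strategies, the operation time distributions, the strategy probabilities and the single multiplicative 'stress factor' are all assumptions chosen for illustration, and the validity caveat above applies to each of them. Each run picks a strategy, draws a time for every operation, and records whether the total exceeds the time available; many runs give a sample distribution of predictions.

```python
import random
import statistics

# Hypothetical strategies: each operation has (mean, sd) time in seconds.
STRATEGIES = {
    "feedback": [(2.0, 0.4), (3.0, 0.6), (1.5, 0.3)],
    "predictive": [(1.0, 0.2), (2.0, 0.5)],
}


def simulate_run(rng, strategy_probs, stress_factor=1.0):
    """One run: choose a strategy, then draw and add its operation times."""
    names = list(strategy_probs)
    weights = [strategy_probs[n] for n in names]
    name = rng.choices(names, weights=weights)[0]
    total = sum(
        max(0.0, rng.gauss(mean, sd)) * stress_factor  # stress slows every operation
        for mean, sd in STRATEGIES[name]
    )
    return name, total


def failure_frequency(n_runs, time_available, strategy_probs, stress_factor=1.0, seed=0):
    """Return (fraction of runs over the time limit, mean task time)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    times = [simulate_run(rng, strategy_probs, stress_factor)[1] for _ in range(n_runs)]
    failures = sum(t > time_available for t in times)
    return failures / n_runs, statistics.mean(times)


probs = {"feedback": 0.6, "predictive": 0.4}
calm = failure_frequency(2000, time_available=7.0, strategy_probs=probs)
stressed = failure_frequency(2000, time_available=7.0, strategy_probs=probs, stress_factor=1.5)
print("calm:", calm, "stressed:", stressed)
```

Varying `stress_factor` or the strategy probabilities and rerunning is the programmatic equivalent of varying the parameters of the simulated operator to see what effect this has on the predicted frequency of failure.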
When we do have a better theoretical understanding of mental task behaviour, and more adequate performance data, these will presumably be used in simulation-type predictions. We need to assess the point at which adding further complexity to these simulations makes no difference to the discriminating power of the technique, or has no additional cost benefit. The money and time costs of more detailed techniques will be more worthwhile the more stringent the tolerances on acceptable performance.
d) Evaluating prediction against required performance
Detailed predictions are usually made in order to identify and resolve points in a task at which difficulty and overload occur, so that bottlenecks and slowing of performance can be removed. The classic method of identifying overload is to predict a time-line of task demands, and a parallel time-line of mental operations required, then to assess the time needed for each of these operations and add these times together. If the time needed is more than the time available, then this indicates overload.
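The time-line comparison reduces to simple arithmetic, sketched below in Python. The segment labels, times and the function name are hypothetical, and the sketch assumes that operations within a segment are carried out one after another, so that their times simply add.

```python
def find_overload(segments):
    """Compare time needed with time available for each task segment.

    segments: list of (label, time_available, [operation_times]).
    Returns a list of (label, time_needed, time_available, overloaded).
    """
    report = []
    for label, available, op_times in segments:
        needed = sum(op_times)  # operations assumed strictly sequential
        report.append((label, needed, available, needed > available))
    return report

# Hypothetical time-line in seconds (illustrative values only).
segments = [
    ("monitor display", 10.0, [3.0, 4.0]),
    ("respond to alarm", 5.0, [2.5, 2.0, 1.5]),
]
report = find_overload(segments)
for label, needed, available, overloaded in report:
    status = "OVERLOAD" if overloaded else "ok"
    print(f"{label}: needs {needed} s, has {available} s -> {status}")
```

Here the second segment needs 6.0 s against 5.0 s available, so it is flagged as a point of overload.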
This is a very simple but certainly practical approach. One objection is that other dimensions of performance assessed by other criteria, such as accuracy or relevance, might give a different assessment of the areas of task difficulty. We do not know to what extent such differences might arise, or how important they might be. More fundamentally, these predictions should not be made for a single prescribed working method, but must allow for the effects of strategy changes on workload, and for quantitative changes in task operations which may gradually develop into unacceptable performance.
In general conclusion, if we want our techniques to be based on theoretically acceptable notions, rather than using ad hoc methods which have been shown to work, it is obvious that a great deal of detailed empirical research is needed, following on from the studies described in II.A.1. We need to know more about mental strategies and types of information processing operation, and we need valid and sufficient generalisations about the performance that can be achieved when they are used.
©2021 Lisanne Bainbridge