Processes underlying human performance : I. using an interface, the bases of classic HF/E
This chapter was written in response to a request to cover the whole of cognitive ergonomics in 30 pages (I didn’t succeed !) for a handbook of aviation human factors.
In this first section the figures are fairly conventional, but the text is not.
Note on interface technologies : this was written in the early 90s, when graphical interfaces were quite primitive - lines and alphanumerics made from individual pixels large enough to be visible, and no touch screens.
The chapter is presented here in 3 sections, though they are meant to be read together.
I. Understanding the interface, the bases of classic human factors/ ergonomics.
II. Understanding, planning, and multi-tasking.
III. Mental workload, learning, and errors.
Topics in this section on interface design :
A. Detecting and discriminating
1. detecting
2. discriminating between stimuli
3. absolute judgement
4. sensory decision making
B. Visual Integration
1. movement, size and colour constancies
2. grouping processes
3. shape constancy
C. Naming, and simple action choices
1. interdependence of the functions
2. shape, colour and location codes for name and status
3. size : size codes
interface size : actual size ratio
making comparisons between sizes
direction of movement to meaning
4. reaction times
D. Action execution
1. acquisition movements
2. control or tracking movements
E. Summary and implications
Theory
Practical aspects
Processes underlying human performance : I. using an interface, cognitive processes underlying classic HF/E
Lisanne Bainbridge
Department of Psychology, University College London
August 1995
Published in Garland, D.J., Hopkin, V.D. and Wise, J.A. (eds) Aviation Human Factors, Erlbaum, November 1998.
INTRODUCTION
Two decades ago, a chapter on aviation with this title might have focused on physical aspects of human performance, on representing the control processes involved in flying. There has been such a fundamental change in our knowledge and techniques that this chapter will focus almost exclusively on cognitive processes. The main aims are to show that relatively few general principles underlie the huge amount of information relevant to interface design, and that context is a key concept in understanding human behaviour.
Classical interface human factors/ ergonomics consists of a collection of useful but mainly disparate facts and a simple model of the cognitive processes underlying behaviour - that these processes consist of independent information-decision-action or if-then units.
(I use the combined term human factors/ ergonomics, shortened to HF/E, because these terms have different shades of meaning in different countries.
'Cognitive' processing is the unobservable processing between [or before] arrival of stimuli at the senses and initiating an action.)
Classic HF/E tools are powerful aids for interface design, but they make an inadequate basis for designing to support complex tasks. Pilots and air traffic controllers are highly trained and able people. Their behaviour is organised and goal-directed, and they add knowledge to the information given on an interface in two main cognitive activities : understanding what is happening, and working out what to do about it.
As the simple models of cognitive processes used in classic HF/E do not contain reminders about all the cognitive aspects of complex tasks, they do not provide a sufficient basis for supporting HF/E for these tasks. The aim of this chapter is to present simple concepts which could account for behaviour in complex dynamic tasks and provide the basis for designing to support people doing these tasks. As the range of topics and data which could be covered is huge, the strategy will be to indicate key principles by giving typical examples, rather than attempting completeness. This chapter will not present a detailed model for the cognitive processes suggested, or survey HF/E techniques, and it does not discuss collective work. The chapter will be in three main sections on : simple use of interfaces; understanding, planning and multi-tasking; and learning, workload and errors. The conclusion will outline how the fundamental nature of human cognitive processes underlies the difficulties met by HF/E practitioners.
USING AN INTERFACE : CONTEXTUAL AND COGNITIVE PROCESSES UNDERLYING CLASSIC HF/E
This chapter distinguishes between cognitive functions or goals, what is to be done, and cognitive processes, how these are done. This section starts with simple cognitive functions and processes underlying the use of displays and controls, on the interface between a person and the device they are using. More complex functions of understanding and planning will be discussed in the next main section.
I take the view that simple operations are affected by the context within which they are done. Someone does not just press a button in isolation : for example, a pilot keys in a radio frequency as part of contacting air-traffic control, as part of navigation, which is multi-tasked with checking for aircraft safety, etc. From this point of view, an account of cognitive processes should start with complex tasks. However that is just too difficult. Here, I have started with the simple tasks involved in using an interface, and point out how even simple processes are affected by a wider context. The next main section builds up from this to discuss more complex tasks.
Five main cognitive functions are involved in using an interface :
* discriminating a stimulus from a background, or from other possible stimuli. The process usually used for this is decision making.
* perceiving 'wholes'. The main process here is integrating together parts of the sensory input.
* naming,
* choosing an action. The cognitive process by which these two functions are done (in simple tasks) is recoding, i.e. translating from one representation to another, such as (shape : name), or (display : related control).
* comparison, which may be done by a range of processes from simple to complex.
Because discriminating and integrating stimuli are usually done as the basis for naming or for choosing an action, it is often assumed that the processes for carrying out these functions are independent, input driven, and done in sequence. However, the discussion will show that these processes are not necessarily distinct, or done in sequence, and that they all involve use of context and knowledge.
This section will not discuss displays and controls separately, as both involve all the functions and processing types. Getting information may involve making a movement such as visual search or accessing a computer display format, while making a movement involves getting information about it. The four sub-sections here are on :
detecting and discriminating;
visual integration;
naming, and simple action choices;
action execution.
A. Detecting and Discriminating
It might be thought, because the sense organs are separate from the brain, that at least basic sensory effectiveness, the initial reception of signals by the sense organs, would be a simple starting point, before considering the complexities that the brain can introduce such as naming a stimulus or choosing an action. However sensing processes turn out not to be simple : there can be a large contribution of prior knowledge and present context.
This part of the chapter is in four sub-sections, on : detecting; discriminating between signals which are present together; distinguishing a signal from alternatives which are not present (absolute judgement); and sensory decision making. It is artificial to distinguish between sensory detection and discrimination, although they are discussed separately here, because they both involve (unconscious) decision making about what a stimulus is. In many real tasks, other factors have more effect on performance than any basic limits to sensory abilities. Nevertheless, it is useful to understand these sensory and perceptual processes, because they raise points which are general to all cognitive processing.
Figure 1 : The sensitivity of the eyes increases over a period of time in darkness. [After more time in darkness, people are able to detect dimmer lights.] There are two curves, corresponding to adaptation in colour vision [early] and in black-and-white vision at low light intensities [later].
1. Detecting
'Detection' is one of those words which may be used to refer to different things. In this section I use it to mean sensing the presence of a stimulus against a blank background. Detecting the presence of light is an example. A human eye has the ultimate sensitivity to detect a single photon of electromagnetic energy at visible wavelengths. However, we can only detect at this level of sensitivity if we have been in complete darkness for about half an hour (Figure 1). The eyes adapt so they are sensitive to a range of light intensities around the average (Figure 2), but this adaptation takes time. Adaptation allows the eyes to deal efficiently with a wide range of stimulus conditions, but it means that sensing is relative [to the context] rather than absolute.
Figure 2 : The relation between objective light level, and subjective experienced light level, at three different levels of background illumination. At any particular level of adaptation, the eye is good at discrimination over a narrow range of intensities around that level. So a light which appears bright at one level of illumination may not be seen at another, and vice versa.
The two curves on the dark adaptation graph (Figure 1) indicate that the eyes have two different sensing systems, one primarily for use at high, and the other for use at low, light intensities. These two systems have different properties. At higher levels of illumination the sensing cells are sensitive to colour. There is one small area of the retina (the sensory surface inside the eye) which is best able to discriminate between spatial positions, and best able to detect stationary objects. The rest of the sensory surface (the periphery) is better at detecting moving than stationary objects. At lower levels of illumination intensity, the eyes see mainly in black and white, and peripheral vision is more sensitive for detecting position.
Therefore it is not possible to make a simple statement that 'the sensitivity of the eyes is...'. The sensitivity of the eyes depends on the environment (e.g. the average level of illumination) and on the stimulus (e.g. its movement, relative position, or colour). The sensitivity of sense organs adapts to the environment and the task, so sensitivity does not have an absolute value independent of these influences. This means it is difficult to make numerical predictions about sensory performance in particular circumstances, without testing directly.
However, it is possible to draw practical implications from the general trends in sensitivity. For example, it is important to design to support both visual sensing systems in tasks which may be done in both high and low levels of illumination, such as flying. It is also sensible to design so that the most easily detected stimuli (the most 'salient') are used for the most important signals. Visual salience depends not only on intensity but also on the colour, movement, and position of the stimulus. Very salient stimuli attract attention : they over-ride the usual mechanism for directing attention (see Section III). This means that very salient signals can either be useful as warning signals, or a nuisance as irrelevant distractions which interrupt the main task thinking.
2. Discriminating between stimuli
In this section I use the word 'discrimination' to mean distinguishing between two (or more) stimuli. As with detection, the limits to our ability to discriminate between stimulus intensities are relative rather than absolute.
The just noticeable difference between two stimuli is a ratio of the stimulus intensities. (There is a sophisticated modern debate about this, but it is not important for most practical applications). This ratio is called the Weber fraction. Again, the size of this ratio depends on the environmental and task context. For example, in visual intensity discriminations, the amount of contrast needed to distinguish between two stimuli depends on the size of the object (more contrast is needed to see smaller objects) and on the level of background illumination (more contrast is needed to see objects in lower levels of background illumination).
The Weber fraction describes the difference between stimuli which can just be discriminated. When stimuli differ by larger amounts, the time needed to make the discrimination is affected by the same factors : finer discriminations take longer, and visual discriminations can be made more quickly in higher levels of background illumination.
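To make the ratio idea concrete, here is a minimal sketch (mine, not the chapter's) of the Weber fraction as a discrimination rule. The fraction value used is hypothetical : real values depend on the sense, the stimulus, and the context discussed above.

```python
# Illustrative sketch : the Weber fraction says the just noticeable
# difference (JND) is a constant proportion of stimulus intensity.
# The fraction 0.05 is a made-up value for illustration only.

def just_noticeable_difference(intensity: float, weber_fraction: float = 0.05) -> float:
    """Smallest intensity change that can just be discriminated."""
    return weber_fraction * intensity

def is_discriminable(i1: float, i2: float, weber_fraction: float = 0.05) -> bool:
    """Two intensities are discriminable if they differ by more than the JND."""
    return abs(i1 - i2) > just_noticeable_difference(min(i1, i2), weber_fraction)

# The same absolute difference is discriminable at a low intensity
# but not at a high one, because the limit is a ratio, not a constant.
print(is_discriminable(10.0, 11.0))    # True : a 10% difference
print(is_discriminable(100.0, 101.0))  # False : a 1% difference
```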
[Issues of sensory discrimination do not only apply to visual stimuli.]
Touch and feel (muscle and joint receptor) discriminations are made when using a control. For example, a person using a knob with tapered sides may make three times more positioning errors than when using a knob with parallel sides (Hunt & Warrick, 1957). Neither side of a tapered knob points in the direction the knob as a whole is pointing, so touch information from the sides is ambiguous.
Resistance in a control affects how easy it is to discriminate by feel between positions of the control. Performance in a tracking task, using controls with various types of resistance, shows that inertia makes performance worse, while elastic resistance can give the best results. This is because inertia is the same whatever the size of movement made, so it does not help in discriminating between movements. Elastic resistance, in contrast, varies with the size of movement, so gives additional information about the movements being made (Howland & Noble, 1955).
3. Absolute Judgement
The Weber fraction describes the limit to our abilities to discriminate between two stimuli when they are both present. When two stimuli are next to each other we can, at least visually, make very fine discriminations in the right circumstances. However, our ability to distinguish between stimuli when only one of them is present is much more limited. This process is called absolute judgement. The limits of absolute judgement are known in general, for many senses and dimensions (Miller, 1956). These limits can be affected by several aspects of the task situation, such as the range of possible stimuli which may occur (Helson, 1964).
When only one stimulus is present, distinguishing it from others must be done by comparing it with mental representations of the other possible stimuli. So absolute judgement must involve knowledge and/or working memory. This is an example of a sensory discrimination process which has some processing characteristics in common with what are usually considered much more complex cognitive functions. There is not always a clear distinction between 'simple' and 'complex' tasks in the aspects of processing involved.
Although our ability to make absolute judgements is limited, it can be useful. For example, we can discriminate between about 8 different positions within a linear interval. This means that visual clutter on scale-and-pointer displays can be reduced : a scale marker is only needed at every 5 units which need to be distinguished, as judging 5 positions between markers is within the absolute judgement limit. But our ability is not good enough to distinguish 10 scale units without the help of an explicit marker.
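The standard way of summarising these limits, following Miller (1956), is in information terms. A small sketch of the arithmetic (my illustration, using the capacity figures quoted in this section) :

```python
import math

# Hedged arithmetic sketch : distinguishing among N alternatives by
# absolute judgement requires log2(N) bits of channel capacity.
def bits_needed(n_alternatives: int) -> float:
    return math.log2(n_alternatives)

print(bits_needed(8))    # 3.0 bits : 8 scale positions, within the limit above
print(bits_needed(11))   # ~3.46 bits : about the limit for colour hues
print(bits_needed(16))   # 4.0 bits : beyond unidimensional absolute judgement
```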
In other cases, the limitations need to be taken into account in design. For example, we can only distinguish between 11 different colour hues by absolute judgement. As we are very good at distinguishing between colours when they are next to each other, it can be easy to forget that colour discrimination is limited when one colour is seen alone. For example, a colour display might use green-blue to represent one meaning (e.g. main water supply) and purple-blue to represent another meaning (e.g. emergency water supply). It might be possible to discriminate between these colours, and so use them as a basis for identifying meaning, when the colours are seen together, but not when they are seen alone. (For some discussion of meaning, see Section C below.)
Again discrimination is a process in which the task context, in this case whether or not the stimuli occur together for comparison, has a strong effect on the cognitive processes involved and on our ability to make the discriminations.
4. Sensory decision making
Detections and discriminations involve decisions, about whether the evidence reaching the brain is sufficient to justify deciding that a stimulus (difference) is present. For example, detection on a raw radar screen involves deciding whether a particular radar trace is a 'blip' representing an aircraft, or something else which reflects radar waves. A particular trace may only be more or less likely to indicate an aircraft, so a decision has to be made in conditions of uncertainty. This sort of decision can be modelled by signal detection or statistical decision theory. Different techniques are now used in psychology, but this approach is convenient here because it distinguishes between the quality of the evidence and the observer's prior biases about decision outcomes.
Figure 3 : Knowledge about the occurrence of screen intensities as evidence for different events, based on past experience.
Suppose that radar decisions are based on signal intensity, and that the frequencies with which different intensities have appeared on the radar screen when there was no aircraft present have been as shown in Figure 3.a.top, while the intensities which have appeared when an aircraft was present are shown in Figure 3.a.bottom. There is a range of intensities which occurred only when an aircraft was not present, a range of intensities which occurred only when an aircraft was present, and an intermediate range of intensities which occurred both when an aircraft was present and when it was not (Figure 3.b). How can someone make a decision when one of the intermediate intensities occurs ? The decision is made on the basis of signal likelihood. The height of the curve above a particular intensity indicates how likely that intensity was to occur when there was or was not an aircraft. At the mid point between the two frequency distributions, both possibilities are equally likely. Intensities less than this mid-point are more likely not to come from an aircraft; intensities greater than this mid-point are more likely to come from an aircraft.
Note that when a stimulus is in this intermediate range, it is not always possible to be right about a decision. A person can decide a trace is not an aircraft when it actually is (a 'miss'), or can decide it is an aircraft when it is not (a 'false alarm'). These ways of being wrong are not called 'errors', because it is not mathematically possible always to be right when making uncertain decisions. The number of wrong decisions, and the time to make the decision, increase when signals are more similar (overlap more).
Note that when the radar operator is making the decision, there is only one stimulus actually present, with one intensity. The two frequency distributions, against which this intensity is compared to make the decision, must be supplied from the operator's previous experience of radar signals, stored in their knowledge base. Decisions are made by comparing the input stimulus ('bottom up' [from the environment]) with stored knowledge about the possibilities ('top down' [from knowledge of prior events]).
In addition to the uncertainty due to similarity between possible interpretations of a stimulus, the second major factor in this type of decision making is the importance or costs of the alternative outcomes. Above, the person's decision criterion, the intensity at which they change from deciding 'yes' to deciding 'no', was the point at which both possibilities are equally likely. But suppose it is very important not to miss a signal, for instance when radar watch keeping in an early warning system. Then it might be sensible to use the decision criterion in Figure 4. This would increase the number of hits. It would also increase the number of false alarms, but this might be considered a small price to pay compared with the price of missing a detection. Alternatively, imagine someone doing a job in which when they detect a signal they have to do a lot of work, but they are feeling lazy and not committed to their job. Then they might move their decision criterion in the other direction, to minimise the number of hits.
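The following sketch is my own toy version of this kind of decision model, assuming (purely for illustration) that the two intensity distributions in Figure 3 are normal with invented means. It shows how moving the criterion trades hits against false alarms, as in Figure 4.

```python
# Minimal signal-detection sketch (my illustration, not the chapter's model).
# Assume intensities with no aircraft ~ Normal(10, 2) and with an
# aircraft ~ Normal(14, 2); all parameter values are hypothetical.
from statistics import NormalDist

noise = NormalDist(mu=10.0, sigma=2.0)    # 'no aircraft' intensity distribution
signal = NormalDist(mu=14.0, sigma=2.0)   # 'aircraft' intensity distribution

def decide(intensity: float, criterion: float) -> str:
    """Report 'aircraft' whenever the intensity exceeds the criterion."""
    return "aircraft" if intensity > criterion else "no aircraft"

def hit_rate(criterion: float) -> float:
    """Probability of saying 'aircraft' when one is present."""
    return 1 - signal.cdf(criterion)

def false_alarm_rate(criterion: float) -> float:
    """Probability of saying 'aircraft' when none is present."""
    return 1 - noise.cdf(criterion)

# Unbiased criterion : the mid-point between the means, where both
# interpretations of the intensity are equally likely.
neutral = (noise.mean + signal.mean) / 2
print(decide(13.0, neutral))                          # 'aircraft'
print(hit_rate(neutral), false_alarm_rate(neutral))   # ~0.84, ~0.16

# Shifting the criterion down (as in Figure 4, when misses are costly)
# raises hits but also raises false alarms.
lenient = neutral - 2.0
print(hit_rate(lenient), false_alarm_rate(lenient))   # ~0.98, ~0.50
```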
Figure 4 : An example of change in bias to account for different pay-offs. If rewarded for 'hits', the bias changes to increase hits, but 'false alarms' also increase.
This shift in decision criterion is called bias. Decision bias can be affected by probabilities and costs. The person's knowledge of the situation provides the task and personal expectations/ probabilities and costs which are used in setting the biases, so again top-down processing influences sensory decisions. There are limits to human ability to assess probabilities and set biases (Kahneman, Slovic & Tversky, 1982). At extreme probabilities we tend to substitute determinacy for probability. We may think something is sure to happen, when it is just highly likely. Some accidents happen because people see what they expect to see, rather than what is actually there (e.g. Davis, 1966). Conversely, we may think something will never happen, when it is objectively of very low probability. For example, when signals are very unlikely, then it is difficult for a human being to continue to direct attention to watching for them (the 'vigilance' effect).
B. Visual Integration
The effects of knowledge and context are even more evident in multi-dimensional aspects of visual perception, such as colour, shape, size, and movement, in which what is seen is an inference from combined evidence. This discussion is in sections on : movement, size, and colour; grouping processes; and shape. (There are also interesting auditory integrations, much involved in music perception, but these will not be discussed here.)
1. Movement, size and colour constancies
It is actually quite odd that we perceive a stable external world, given that we and other objects move, and the wavelength of the environmental light we see by changes, so the size, position, shape, and wavelength of light reflected from objects onto the retina all change. As we do perceive a stable world, this suggests our perception is relative rather than absolute : we do not see what is projected on the retina, but a construction based on this projection, made by combining evidence from different aspects of our sensory experience. The processes by which a wide variety of stimuli falling on the retina are perceived as the same are called 'constancies'.
When we turn our head the stimulation on the retina also moves. However, we do not see the world as moving, because information from the turning receptors in the ear is used to counteract the evidence of movement from the retina. The changes on the retina are perceived in the context of changes in the head rotation receptors. When the turning receptors are diseased, or the turning movements are too extreme for the receptors to be able to interpret quickly, then the person may perceive movement which is not actually occurring, as in some flying illusions.
There is also constancy in size perception. As someone walks away from us, we do not see them becoming smaller and smaller, although there are large changes in the size of the image of that person which falls on the retina. In interpreting the size of objects, we take into account all the objects which are at the same distance from the eye, and then perceive them according to their relative size. Size constancy is more difficult to account for than movement constancy, as it involves distance perception, itself a complex process (Gibson, 1950). Distance is perceived by combining evidence about texture, perspective, changes in colour of light with distance, and overlapping (itself a construct, see below). Information from the whole visual field is used in developing a percept which makes best overall sense of the combination of inputs. Cognitive psychology uses the concept that different aspects of stimulus processing are done simultaneously, unless an aspect is difficult and slows processing down. Each aspect of processing communicates its 'results so far' to the other aspects via a 'blackboard', and all aspects work together to produce a conclusion (Rumelhart, 1977).
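The blackboard idea can be caricatured in a few lines of code. This is only my reading of the concept, with invented processes and numbers : real perceptual integration is far more subtle.

```python
# Toy sketch of the 'blackboard' idea (Rumelhart, 1977) : several analyses
# of the same scene post partial results to a shared store, and the percept
# combines them. All names and values here are my own inventions.

blackboard = {}

def texture_analysis(scene):
    # Post a distance estimate based on texture gradient.
    blackboard["distance_from_texture"] = scene["texture_gradient"] * 10

def perspective_analysis(scene):
    # Post a distance estimate based on line convergence.
    blackboard["distance_from_perspective"] = scene["convergence"] * 12

def combine() -> float:
    # Each source contributes evidence; the percept is the best overall fit,
    # here caricatured as a simple average of the posted estimates.
    estimates = [v for k, v in blackboard.items() if k.startswith("distance")]
    return sum(estimates) / len(estimates)

scene = {"texture_gradient": 3.0, "convergence": 2.4}
for process in (texture_analysis, perspective_analysis):
    process(scene)          # in the brain these would run simultaneously
print(combine())            # a single distance percept from combined evidence
```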
Colour perception is also an integrative process which shows constancy. Research on the colour receptive cells in the retina suggests that there are only three types of cell, which respond to red, green and blue light wavelengths. The other colours we 'see' are constructed by the brain, based on combinations of stimulus intensities at these three receptors. The eyes are more sensitive to some colours, so if a person looks at two lights of the same physical intensity but different wavelengths, the lights may be of different experienced intensity (brightness). The effectiveness of the colour construction process is such that there are some visual demonstrations in which people see a range of colours, even though the display consists only of black and white plus one colour.
This constructive process also deals with colour constancy. The wavelength of ambient lighting can change quite considerably, so the light reflected from objects also changes in wavelength, but objects are perceived as having stable colour. The wavelengths of light from all the objects change in the same way, and colour is perceived from the relative combinations of wavelengths, not the actual wavelengths. This constancy process is useful for perceiving a stable world despite transient and irrelevant changes in stimuli, but it does make designing colour displays more difficult. As with our response to stimulus intensity, our perception of colour is not a fixed quantity which can easily be defined and predicted. Instead it depends on the interaction of several factors in the environment and task contexts, so it may be necessary to make colour perception tests for a particular situation.
2. Grouping processes
Another type of perceptual integration occurs when several constituents of a display are grouped together and perceived as a 'whole'. The Gestalt psychologists in the 1920s first described these grouping processes, which can be at several levels of complexity.
Figure 5 : An example of grouping processes in the interpretation of a display. A Head-Up predictor display for aircraft landing, proposed by Gallaher et al (1977). [Lines with the same qualities are grouped. In use, lines which move together are grouped.]
1. Separate elements can be seen as linked into a line or lines. There are four ways in which this can happen : when the elements are close together, similar, lie on a line, or define a contour. The grouping processes of proximity and similarity can be used in the layout of displays and controls on a conventional interface, to show which items go together.
2. When separate elements move together they are seen as making a whole. This grouping process is more effective if the elements are also similar. This is used in the design of Head Up Displays and predictor displays, as in Figure 5.
3. Something which has uniform colour or a connected contour is seen as a 'whole', e.g. the four sides of a square are seen as a single square, not as four separate elements.
4. The strongest grouping process occurs when the connected contour has a 'good' form, that is, a simple shape. For example, a pull-down menu on a computer screen is seen as a distinct unit in front of other material, because it is a simple shape, and the elements within the shape are similar and (usually) different from the elements on the rest of the screen. When the visual projections of two objects are touching, then the one with the simplest shape is usually seen as in front of (overlapping) the other.
The visual processes by which shapes and unities are formed suggest recommendations for the design of symbols and icons which are easy to see (Easterby, 1970).
3. Shape constancy
Visual integrative processes ensure that we see a unity when there is an area of the same colour, or a continuous contour. The shape we see depends on the angles of the contour lines (there are cells in the visual system which sense angle of line). Again there are constancy processes. The shape perceived is a construction, taking into account various aspects of the context, rather than a simple mapping of what is projected from the object onto the retina. Figure 6 shows a perspective drawing of a cube, with the same ellipse placed on each side. The ellipse on the front appears as an ellipse on a vertical surface. The ellipse on the top appears to be wider and sloping at the same angle as the top. The ellipse on the side is ambiguous - is it rotated, or not part of the cube at all ? The ellipse on the top illustrates shape 'constancy'. It is perceived according to knowledge about how shapes look narrower when they are parallel to the line of sight, so a flat narrow shape is inferred to be wider. Again, the constancy process shows that knowledge about the properties of the surrounding context (in this case the upper quadrilateral) affects how particular stimuli are seen.
Figure 6 : Shape and size 'constancy' : the same cube with the same ellipse in three different positions. The three ellipses are computer generated duplicates.
The Gestalt psychologists provided dramatic examples of the effects of these inference processes, in their reversible figures as in Figure 7. The overall interpretation which is given to this drawing affects how particular elements of it are grouped together and named, for example whether they are seen as parts of the body or pieces of clothing. It is not possible to see both interpretations at the same time, but it is possible to change quickly from one to the other. As the interpretation given to an object affects how parts of it are perceived, this can cause difficulty with the interpretation of low quality visual displays, for example from infrared cameras or on-board radar.
Figure 7 : The 'wife/mother-in-law' reversible figure.
C. Naming, and simple action choices
The next functions to consider are identifying name, status, or size, and choosing the nature and size of actions. These cognitive functions may be met by a process of recoding (association) from one form of representation to another, such as :
shape . . . . . . . . . . . . . . . . . . . . converted to . . . name
colour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . level of danger
spatial position of display . . . . . . . . . . . . . . . . name of variable
name of variable . . . . . . . . . . . . . . . . . . . . . . . spatial position of its control
length of line . . . . . . . . . . . . . . . . . . . . . . . . . . size of variable
display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . related control
size of distance from target . . . . . . . . . . . . . . . size of action needed
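In computational terms, recoding of this simple kind behaves like table look-up : one representation indexes another directly, with no intermediate reasoning. A hedged sketch, with invented vocabularies :

```python
# Sketch of simple recoding as direct table look-up (my illustration).
# The vocabulary entries are invented examples, not from the chapter.

shape_to_name = {"triangle": "fighter", "circle": "airliner"}
colour_to_danger = {"red": "danger", "amber": "caution", "green": "safe"}

def recode(code: str, translation_table: dict) -> str:
    """One-step translation from code to meaning; no reasoning involved."""
    return translation_table[code]

print(recode("red", colour_to_danger))    # 'danger'
print(recode("circle", shape_to_name))    # 'airliner'
```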
Identifications and action choices which involve more complex processing than this recoding will be discussed in the section on complex tasks. This section will discuss : interdependence of the processes and functions; identifying name and status : shape, colour, and location codes; size : size codes; and recoding/ reaction times. Computer displays have led to the increased use of alpha-numeric codes, which are not discussed here (see Bailey, 1989).
1. Interdependence of the functions
Perceiving a stimulus, naming it, and choosing an action are not necessarily independent. Figure 7 above shows that identification is interrelated with perception. This section gives three examples which illustrate other HF/E issues.
Figure 8 : Percent words heard correctly in different levels of noise, from vocabularies of various sizes, e.g. the top line presents the results from tests in which only two different words might occur (Miller, Heise and Lichten, 1951).
Naming difficulties can be based on discrimination difficulties. Figure 8 shows the signal/noise ratio needed to hear a word against background noise. The person listening has not only to detect a word against the noise background, but also to discriminate it from other possible words. The more alternatives there are to distinguish, the better the signal/noise ratio needs to be. This is the reason for using a minimum number of standard messages in speech communication systems, and for designing these messages to maximise the differences between them, as in the international radiotelephony spelling alphabet (Alfa, Bravo, Charlie...) and standard air-traffic control language (Bailey, 1989).
Figure 9 : Relative reading accuracy with three different digit designs (Atkinson et al , 1952). The differences between the digits are not very clear in this low quality figure.
An important aspect of maximising differences between signals can be illustrated by a visual example. Figure 9 shows some data on reading errors with different digit designs. Errors can be up to twice as high with design A as with design C. At a quick glance, these digit designs do not look very different, but each digit in C has been designed to maximise its difference from the others. Digit reading is a naming task based on a discrimination task, and the discriminations are based on differences between the straight and curved elements of the digits. It is not possible to design an '8' which can be read easily, without considering the need to discriminate it from 3, 5, 6 and 9, which have elements in common. As a general principle, design for discrimination depends on knowing the ensemble of alternatives to be discriminated, and maximising the differences between them.
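One way to make this principle concrete is to represent each character as a set of features and count how many features separate each pair : a discriminable design maximises the smallest of these differences across the whole ensemble. The sketch below uses standard 7-segment digit encodings purely as an illustration; it is not the coding used in the Atkinson et al study.

```python
from itertools import combinations

# Illustration of 'design for discrimination' : each digit is a set of
# features (here, the segments a-g of a standard 7-segment display).
SEGMENTS = {
    "0": "abcdef", "1": "bc", "2": "abdeg", "3": "abcdg", "4": "bcfg",
    "5": "acdfg", "6": "acdefg", "7": "abc", "8": "abcdefg", "9": "abcdfg",
}

def difference(d1: str, d2: str) -> int:
    """Number of segments present in one digit but not the other."""
    return len(set(SEGMENTS[d1]) ^ set(SEGMENTS[d2]))

# The most confusable pairs are those with the smallest difference -
# e.g. 8 differs from 6 and from 9 by only one segment.
pairs = sorted(combinations(SEGMENTS, 2), key=lambda p: difference(*p))
for d1, d2 in pairs[:3]:
    print(d1, d2, difference(d1, d2))
```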
Figure 10 : 'Iconic' display : Eight variables are displayed, measured outwards from the centre. When all eight variables are on target, the display has an octagon shape. It is easy to detect that there is a distortion in the shape, much more difficult to remember what problem is indicated by any given distorted shape.
However ease of detection/ discrimination does not necessarily make naming easy. Figure 10 shows an 'iconic' display. Each axis displays a different variable, and when all 8 variables are on target, the shape is symmetrical. It is easy to detect a distortion in the shape, that is, to detect that a variable is off target. However studies show that people have difficulty with discriminating one distorted pattern from another from memory, and with identifying which pattern is associated with which problem. This display supports detection, but not discrimination or naming. It is important in task analysis to note which of the cognitive functions are needed, and to check that the display design supports them.
2. Shape, colour, and location codes for name and status
Conventional interfaces all too often consist of a sea of displays or controls which are identical both to sight and touch. The only way of discriminating between and identifying them is to read the label [often dirty] or learn the position. Even if labels have well designed typeface, abbreviations, and position, they are not ideal. What is needed is an easily seen 'code' for the name or status, which is easy to recode into its meaning. The codes used most frequently are shape, colour, and location. (Felt texture can be an important code in the design of controls.) The codes need to be designed for ease of discrimination, and for ease of making the translation from code to meaning.
Figure 11 : Shapes used in discrimination tests (Smith and Thomas, 1964).
Shape codes
Good shape codes are 'good' figures in the Gestalt sense, and also have features which make the alternatives easy to discriminate. However, ease of discrimination is not the primary criterion in good shape code design. Figure 11 shows the materials used in discrimination tests between sets of colours, military look-alike shapes, geometric forms, and aircraft look-alike shapes [like the a/c, not like each other]. Colour discrimination is easiest; military symbols are easier to distinguish than aircraft symbols because they have more distinguishing features; and geometric forms are discriminated more easily than aircraft shapes. (Geometric forms are not necessarily easier to discriminate in general. For example the results would be different if the shapes included an octagon as well as a circle.)
The results from naming tests rather than discrimination tests would be different, if geometric shapes or colours had to be given a military or aircraft name. Naming tests favour look-alike shapes, as look-alike shapes can be more obvious in meaning.
Nevertheless, using a look-alike shape (symbol or icon) does not guarantee obviousness of meaning. That people make the correct link from shape to meaning needs to be tested carefully. People can be asked, for each possible shape : what they think it is a picture of; what further meaning, such as an action, they think it represents; and, given a list of possible meanings, which of these meanings they choose as the meaning of the shape.
To minimise confusions when using shape codes, it is important not to include in the coding vocabulary any shape which is assigned several meanings, or several shapes which could all be assigned the same meaning. Otherwise there could be high error rates in learning and using the shape codes. It is also important to test these meanings on the appropriate users, naive or expert people, or an international population. For example, in Britain a favoured symbol for 'delete' would be a picture of a space villain from a children's TV series, but this is not understood by people from other European countries !
As well as the potential obviousness of their meaning, look-alike shapes have other advantages over geometric shapes. They can act as a cue to a whole range of remembered knowledge about this type of object (see below on knowledge). Look-alike shapes can also vary widely, while the number of alternative geometric shapes which are easy to discriminate is small. An interface designer using geometric shape as a code runs out of different shapes quite quickly, and may have to use the same shape with several meanings. The result of this is that a person interpreting these shapes has to notice when the context has changed to one in which a different shape-meaning translation is used, and then to remember this different translation, before they can work out what a given shape means [see example below on the meaning of colour]. This multi-stage process can be error-prone, particularly under stress. Some computer based displays have the same shape used with different meanings in different areas of the same display. A person using such a display has to remember to change the coding translation they use, every time they make an eye movement.
Colour codes
Using colour as a code poses the same problems as using geometric shape. Except for certain culture based meanings, such as red = danger, the meanings of colours have to be learned specifically, rather than being obvious. And only a limited number of colours can be discriminated by absolute judgement. The result is that a designer, who thinks colour is easy to see, rapidly runs out of different colours, and has to use the same colour with several meanings. There are computer based displays on which colour is used simultaneously with many different types of meaning, such as :
colour . . means . . substance (steam, oil, etc.)
colour . . . . . . . . . . status of item (e.g. on, off)
colour . . . . . . . . . . function of item
colour . . . . . . . . . . sub-system item belongs to
colour . . . . . . . . . . level of danger
colour . . . . . . . . . . attend to this item
colour . . . . . . . . . . click here for more information
colour . . . . . . . . . . click here to make an action
A user has to remember which of these coding translations is relevant to a particular point on the screen, with a high probability of confusion errors.
Location codes
The location of an item can be used as a basis both for identifying an item, and for indicating its links with other items.
People can learn where a given item is located on an interface, and then look or reach to it automatically, without searching. This increases the efficiency of behaviour. [See also 'acquisition' movements, below.] But this learning is effective only if the location : identity mapping remains constant, otherwise there can be a high error rate. For example Fitts and Jones (1947a), in their study of pilot errors, found that 50% of errors in operating aircraft controls were choosing the wrong control. The layout of controls on three of the aircraft used at that time shows why it was easy to be confused :
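[The figure showing the layouts is missing here. As reported for these aircraft, the left-to-right orders of the engine controls were : B-25 : throttle, propeller, mixture; C-47 : propeller, throttle, mixture; C-82 : mixture, throttle, propeller.]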
Suppose a pilot had flown a B-25 sufficiently frequently to be able to reach to the correct control without thinking or looking. If he then transferred to a C-47, two thirds of his automatic reaches would be wrong, if to a C-82 all of them. As with other types of coding, location : identity translations need to be consistent and unambiguous. Locations will be easier to learn if related items are grouped together, such as items from the same part of the device, with the same function, or the same urgency of meaning.
Locations can sometimes have a realistic meaning, rather than an arbitrary learned one. Items on one side in the real world should be on the same side when represented on an interface. (Ambiguity about the location of left/ right displays could have contributed to the Kegworth air crash, Green, 1990). Another approach is to put items in meaningful relative positions. For example, in a mimic/ schematic diagram or an electrical wiring diagram, the links between items represent actual flows from one part of the device to another. On a cause-effect diagram, links between the nodes of the diagram represent causal links in the device. On such diagrams relative position is meaningful, and inferences can be drawn from the links portrayed (see below on knowledge).
Relative location can also be used to indicate which control goes with which display. When there is a one-to-one relation between displays and controls, then choice of control is a recoding which can be made more or less obvious, consistent, and unambiguous by the use of spatial layout. Gestalt proximity processes link items together if they are next to each other. But the link to make can be ambiguous, such as in the layout : o o o o x x x x. Which x goes with which o ? People bring expectations about code meanings to their use of an interface. If these expectations are consistent among a particular group of people, the expectations are called 'population stereotypes'. If an interface uses codings which are not compatible with a person's expectations, then the person is likely to make errors.
Figure 12 : Effect of relative spatial layout (same, reversed, random) of signal lights and response buttons on response time (Fitts and Deininger, 1954).
If two layouts to be linked together are not the same, then studies show that reversed but regular links are easier to deal with than random links (Figure 12). This suggests recoding may be done, not by learning individual pairings, but by having a general rule from which the person can work out the linkage.
In multiplexed computer based display systems, in which several alternative display formats may appear on the same screen, there are at least two problems with location coding. One is that each format may have a different layout of items. We do not know whether people can learn locations on more than one screen format sufficiently well to be able to find items on each format by automatic eye movements rather than by visual search. If people have to search a format for the item they need, studies suggest this could take at least 25 seconds. This means that every time the display format is changed, performance will be slowed down while this search process interrupts the thinking about the main task (see also below on short-term memory). It may not be possible to put items in the same absolute position on each display format, but one way of reducing the problems caused by inconsistent locations is to locate items in the same relative positions on different formats.
The second location problem in multiplexed display systems is that people need to know the search 'space' of alternative formats available, where they currently are in it, and how to get to other formats. It takes ingenuity to design so that the user of a computer based interface can use the same sort of 'automatic' search skills for obtaining information that are possible with a conventional interface.
In fact there can be problems with maximising the consistency and reducing the ambiguity of all types of coding used on multiple display formats (Bainbridge, 1991). Several of the coding vocabularies and coding translations used may change between and within each format (beware the codes used in Figures in this chapter). The cues a person uses to recognise which coding translations are relevant need to be learned, and are also often not consistent. A display format may have been designed so the codes are obvious in meaning for a particular sub-task, when the display format and the sub-task are tested in isolation. But when this display is used in the real task, before and after other formats used for other sub-tasks, each of which uses different coding translations, then a task-specific display may not reduce either the cognitive processing required or the error rates.
3. Size : size codes
Usually, on an analogue interface, length of line is used to represent the size of a variable. The following arguments apply both to display scales and to the way control settings are shown. There are three aspects : the ratio of the size on the interface to the size of the actual variable; the way comparisons between sizes are made; and the meaning of the direction of a change in size.
Interface size : actual size ratio
An example of the interface size to actual size ratio is that, when using an analogue control (such as a throttle), a given size of action has a given size of effect. Once people have learned this ratio, they can make actions without having to check their effect, which gives increased efficiency (see below).
The size ratio and direction of movement are again codes used with meanings which need to be consistent. Size ratios can cause display reading confusions if many displays are used, which all look the same but differ in the scaling ratio used. Similarly, if many controls which are similar in appearance and feel are used with different control ratios, then it may be difficult to learn automatic skills in using them to make actions of the correct size. This confusion could be increased by using one multi-purpose control, such as a mouse or tracker ball, for several different actions each with a different ratio.
Figure 13 : Speed and accuracy of reading different altimeter designs (Grether, 1949). The reading times and error rates are shown by horizontal bars.
A comparison of alternative altimeter designs is an example which also raises some general HF/E points. The designs were tested for reading speed and accuracy (Figure 13). The digital display gives the best performance, and the 3-pointer design (A) is one of the worst. The 3 pointer altimeter poses several coding problems for someone reading it. The three pointers are not clearly discriminable. Each pointer is read against the same scale using a different scale ratio, and the size of pointer and size of scale ratio are inversely related (the smallest pointer indicates the largest scale, 10,000s, the largest pointer 100s).
Despite these results, a purely digital display is not now used, because a static reading test is not a good reflection of the real flying task. In the real task, altitude changes rapidly, so a digital display would be unreadable. And the user also needs to identify rate of change, for which angle of line is an effective display. Unambiguous combination altimeter displays are now used, with a pointer for rapidly changing small numbers, and a digital display for slowly changing large numbers (D).
Before this change, many hundreds of deaths were attributed to misreadings of the three-pointer altimeter, yet the display design was not changed until these comparative tests were repeated two decades later.
This delay occurred for two reasons, which illustrate that HF/E decisions are made in several wider contexts. First the technology : in the 1940s, digital instrument mechanisms were very much less reliable than the pilots' readings of the existing instruments. Secondly, cultural factors influence the attribution of responsibility for error. There is a recurring swing in attitudes, between saying that a user can read the instrument correctly, so the user is responsible for incorrect readings, and saying that if a designer gives users an instrument which it is humanly impossible to read correctly reliably, then the responsibility for misreading errors lies with the designer.
Making comparisons between sizes
There are two important comparisons in control tasks : is the variable value acceptable/ within tolerance (a check reading) ? and if not, how big is the error ? These comparisons can both usually be done more easily on an analogue display. Check readings can be made automatically (i.e. without processing that uses cognitive capacity) if the pointer on a scale is in an easily recognisable position when the value is correct. And linking the size of error to the size of action needed to correct it can be done easily if both are coded by length of line.
An example shows why it is useful to distinguish cognitive functions from the cognitive processes used to meet them. Comparison is a cognitive function which may be done either by simple recoding or by a great deal of cognitive processing, depending on the display design. Consider the horizontal bars in Figure 13 above as a display from which an HF/E designer must get information about the relative effectiveness of the altimeter designs. The cognitive processes needed involve searching for the shortest performance bar by comparing each of the performance bar lines, probably using iconic (visual) memory, and storing the result in working memory, then repeating to find the next smallest, and so on. Visual and working memory are used as temporary working spaces while making the comparisons : working memory is also used to maintain the list of decision results.
This figure is not the most effective way of conveying a message about alternative designs, because most people do not bother to do all this mental work. The same results are presented in Figure 14. For a person who is familiar with graphs, the comparisons are inherent in this representation. A person looking at this does not have to do cognitive processing which uses processing capacity and is unrelated to and interrupts the main task of thinking about choice of displays. (See below for more on memory interruption, and on processing capacity.) This point applies in general to analogue and digital displays. For many comparison tasks, digital displays require more use of cognitive processing and working memory.
Figure 14 : The results from Figure 13, presented on a co-ordinate graph.
Direction of movement to meaning
The second aspect to be learned about interface sizes is the meaning of the direction of a change in size. Cultural learning is involved here, and can be quite context specific. For example, people in technological cultures know that clockwise movement on a display indicates increase, but on a tap or valve control means closure, therefore decrease. Again there can be population stereotypes in the expectations people bring to a situation, and if linkages are not compatible with these assumptions, error rates may be at least doubled.
Directions of movements are often paired. For example, making a control action to correct a displayed error involves two directions of movement, on the display and on the control. It can be straightforward to make the two movements compatible in direction if both are linear, or both are circular.
Figure 15 : Two possible designs for the aircraft attitude indicator, showing incompatible movements.
It is in combining three or more movements that it is easy to get into difficulties with compatibility. One classic example is the aircraft attitude indicator. In Fitts and Jones' (1947b) study of pilots' instrument reading errors, 22% of errors were either reversed spatial interpretations, or attitude illusions. In the design of the attitude indicator, four movements are involved : of the external world, of the display, of the control, and of the pilot's turning receptors, see Figure 15. The attitude instrument can show a moving aircraft, in which case the display movement is the same as the joystick control movement but opposite to the movement of the external world. Or the instrument can show a moving horizon, which is compatible with the view of the external world but not with the movement of the joystick. There is no solution in which all three movements are the same, so some performance errors or delays are inevitable. Similar problems arise in the design of moving scales and of remote control manipulation devices.
4. Reaction times
Evidence about recoding difficulties comes from both error rates and the time taken to translate from one code representation to another. Teichner and Krebs (1974) reviewed the results of reaction time studies. Figure 16 shows the effect of the number of alternative items and the nature of the recoding. The effect of spatial layout was illustrated in Figure 12. Teichner and Krebs also reviewed evidence that, although unpractised reaction times are affected by the number of alternatives to choose between, after large amounts of practice this effect disappears : all choices are made equally quickly. This suggests that response choice has become automatic, so it no longer requires processing capacity.
Figure 16 : Response times are affected by the number of alternatives to be responded to, the nature of the 'code' linking the signal and response, and the amount of practice [over-learned digit-voice translation is not affected by number of alternatives]. (Teichner and Krebs, 1974).
The results show the effect of different code translations : using spatial locations of signals and responses (light, key) or symbolic ones (visually presented digit, spoken digit i.e. voice). The time taken to make a digit to voice translation is constant, but this is already a highly practised response for the people tested. Otherwise, making a spatial link (light to key) is quickest. Making a link which involves a change of code type, between spatial and symbolic, (digit to key, or light to voice) takes longer. (So these data show it can be quicker to locate than to name.) This coding time difference may arise because spatial and symbolic processes are handled by different areas of the brain, and it takes time to transmit information from one part of the brain to another. The brain does a large number of different types of coding translation (e.g. Barnard, 1987).
Figure 17 : Effect of preview and type of material on response time. These data come from a study of expert typists, given more or less preview, and typing either random letters or prose (Shaffer, 1973).
The findings presented so far come from studies of reacting to signals which are independent and occur one at a time. Giving advance information about the responses which will be required, which allows people to anticipate and prepare their responses, reduces response times. There are two ways of doing this, illustrated in Figure 17. One is to give preview, allowing people to see in advance the responses needed. This can more than halve reaction time. The second method is to have sequential relations in the material to be responded to. Figure 16 showed that reaction time is affected by the number of alternatives : the general effect underlying this is that reaction time depends on the probabilities of the alternatives. Sequential effects change the probabilities of items. One way of introducing sequential relations is to have meaningful sequences in the items, such as prose rather than random letters.
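This probability effect is often summarised as the Hick-Hyman law (the chapter does not name it) : choice reaction time grows linearly with the information, in bits, conveyed by the set of alternatives. A sketch with hypothetical constants, consistent with the trends described for Figure 16 :

```python
import math

# Hedged sketch of the probability effect on reaction time.
# The constants a and b are invented for illustration.
def reaction_time(probabilities, a=0.2, b=0.15):
    """RT = a + b * H, where H is the entropy of the alternatives (bits)."""
    h = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return a + b * h

print(reaction_time([0.25] * 4))             # four equally likely alternatives
print(reaction_time([0.7, 0.1, 0.1, 0.1]))   # same four, but one very probable

# Unequal probabilities reduce entropy, so the predicted RT is shorter -
# this is how sequential structure (e.g. prose) speeds responses.
```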
Figure 18 : The speed-accuracy tradeoff : error rates at different response times (unpublished results, see also Rabbitt and Vyas, 1970).
Reaction time and error rate are interrelated. Figure 18 shows that when someone reacts very quickly, they choose a response at random. As they take a longer time, and can take in more information before initiating a response, there is a tradeoff between time and error rate. At longer reaction times there is a basic error rate which depends on the equipment used.
D. Action execution
This chapter does not focus on physical activity, but this section will make some points about cognitive aspects of action execution. The section will be in two parts, on acquisition movements, and on continuous control or tracking movements.
The speed, accuracy and power a person can exert in a movement depend on its direction relative to the body position. Human biomechanics and its effects on physical performance, and the implications for workplace design, are large topics which will not be reviewed here (Pheasant, 1991). Only one point will be made. Workplace design affects the amount of physical effort needed to make an action, and the amount of postural stress a person is under. These both affect whether a person is willing to make a particular action, or to do a particular job. So workplace design can affect performance in cognitive tasks. Factors which affect what a person is or is not willing to do are discussed more in the section on workload.
1. Acquisition movements
When someone reaches to something, or puts something in place, this is an 'acquisition' movement. Reaching a particular end point or target is more important than the process of getting there. The relation between the speed and accuracy of these movements can be described by Fitts' Law (Fitts, 1954), in which movement time depends on the ratio of movement length to target width. However, detailed studies show that movements with the same ratio are not all carried out in the same way.
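[Fitts' Law is usually written MT = a + b·log₂(2A/W), where A is the movement amplitude and W the target width. A minimal sketch, with illustrative constants a and b :]

```python
import math

def fitts_mt(amplitude, width, a=0.05, b=0.1):
    """Movement time in seconds from Fitts' Law, MT = a + b*log2(2A/W).
    The constants a and b are illustrative here; in practice they are
    fitted to the person and to the limb or device being used."""
    index_of_difficulty = math.log2(2.0 * amplitude / width)  # in bits
    return a + b * index_of_difficulty

print(fitts_mt(80, 10))    # 80 units to a 10-unit target  -> 0.45 s
print(fitts_mt(20, 2.5))   # 20 units to a 2.5-unit target -> 0.45 s
```

Because the prediction depends only on the amplitude-to-width ratio, both movements of Figure 19 (below) get the same predicted time - which is exactly why the detailed differences in how they are executed are informative.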
Figure 19 : Detailed evidence about the execution of two movements, both with the same movement length to target size ratio (8 : 1) and the same overall movement time (Crossman and Goodeve, 1963).
Figure 19 shows that an 80/10 movement is made with a single pulse of velocity. A 20/2.5 movement has a second velocity pulse, suggesting the person has sent a second instruction to their hand about how to move. Someone making a movement gives an initial instruction to their muscles about the direction, force and duration needed, then monitors how the movement is being carried out, by vision and/or feel. If necessary they send a corrected instruction to their muscles, to improve the performance, and so on. This monitoring and revision is called using feedback. A finer movement involves feedback to, and a new instruction from, the brain. A less accurate movement can be made with one instruction to the hand, without needing to revise it. An unrevised movement ('open-loop' or 'ballistic') probably involves feedback within the muscles and spinal cord, but not visual feedback to and a new instruction from the brain.
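[This stop-and-correct account can be caricatured in a few lines. The sketch below is a toy deterministic version of an iterative-corrections model, in the spirit of (but much cruder than) Crossman and Goodeve's analysis : each instruction covers an assumed fixed fraction of the remaining distance, so the number of pulses grows with the logarithm of the amplitude-to-width ratio - one classical route to Fitts' Law. Being scale-invariant, it cannot reproduce the difference between the two same-ratio movements in Figure 19, which depends on the absolute sizes involved.]

```python
def count_pulses(amplitude, width, gain=0.9):
    """Number of instruction pulses needed to land inside the target,
    assuming each instruction covers a fixed fraction (gain) of the
    remaining distance. A toy iterative-corrections sketch; the gain
    value is an assumption, not a measured quantity."""
    remaining = amplitude
    pulses = 0
    while remaining > width / 2.0:    # not yet inside the target
        remaining -= gain * remaining
        pulses += 1
    return pulses

print(count_pulses(80, 10))     # 8:1 ratio  -> 2 pulses
print(count_pulses(80, 1.25))   # 64:1 ratio -> 3 pulses
# Each pulse multiplies the remaining distance by (1 - gain), so pulse
# count - and hence movement time - grows with log(amplitude/width).
```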
Figure 20 : Double use of feedback in learning to make movements (Bainbridge, 1978). One feedback loop [lower in diagram] is used in adjusting the movement while it is being made, the other [upper] loop adjusts the way successive actions are made, as the result of learning.
Movements which are consistently made the same way can be done without visual feedback, once learned, as mentioned in the section on location coding. Figure 20 indicates the double use of feedback in this learning. A person chooses an action instruction which they expect will have the effect they want. If the result turns out not to be as intended, then the person needs to adjust their knowledge about the expected effect of an action. This revision continues each time they make an action, until the expected result is the same as the actual result. Then the person can make an action with minimal need to check that it is being carried out effectively. This reduces the amount of processing effort needed to make the movement. Knowledge about expected results is a type of meta-knowledge. Meta-knowledge is important in activity choice, and will be discussed again.
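[As one possible formalisation of the outer loop in Figure 20 - a gloss, not the chapter's own model - the toy below revises the expected effect of an action by a simple error-driven update until expected and actual results agree. All numbers are illustrative.]

```python
def learn_action_effect(true_gain=2.0, trials=8, rate=0.5):
    """The outer (learning) loop of Figure 20 as a toy error-driven rule :
    the expected effect of an action is revised after each attempt until
    the expected and actual results agree."""
    expected_gain = 1.0                    # initial, wrong, meta-knowledge
    target = 10.0
    for t in range(trials):
        command = target / expected_gain   # action chosen from expectation
        result = true_gain * command       # what the muscles actually achieve
        error = result - target            # intended vs actual outcome
        expected_gain += rate * expected_gain * error / target
        print(t, round(result, 2))         # outcomes : 20.0, 13.33, 11.43, ...

learn_action_effect()
# The outcome error shrinks each trial; once expectation matches reality
# the movement needs little on-line visual checking (the inner loop).
```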
2. Control or tracking movements
Control movements are ones in which someone makes frequent adjustments, with the aim of keeping some part of the external world within required limits. They might be controlling the output of an industrial process, or keeping an aircraft straight and level. In industrial processes, the time lag between making an action and its full effect in the process may be anything from minutes to hours, so there is usually time to think about what to do. By contrast in flying, events can happen very quickly and human reaction time plus neuromuscular lag, adding up to half a second or more, can have a considerable effect on performance. So different factors may be important in the two types of control task.
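[To make the cost of this lag concrete : the toy calculation below treats the controller as simply reproducing the target as it was half a second ago - the worst case, with no anticipation or preview - and measures the resulting error on an illustrative sinusoidal track.]

```python
import math

def lag_error(delay_s, duration=20.0, dt=0.05, freq=1.5):
    """Mean absolute tracking error when the controller reproduces the
    target as it was delay_s seconds ago. The sinusoidal test track
    (freq in rad/s) and the lag values are illustrative only."""
    steps = int(duration / dt)
    total = 0.0
    for k in range(steps):
        t = k * dt
        total += abs(math.sin(freq * t) - math.sin(freq * (t - delay_s)))
    return total / steps

print(lag_error(0.0))   # no lag : error is zero
print(lag_error(0.5))   # 0.5 s lag : mean error near half the track amplitude
```

Preview and anticipation, discussed next, remove exactly this component of the error.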
There are two ways of reducing the human response lag (cp. Figure 17). Preview allows someone to prepare actions in advance and therefore to overcome the effect of the lag. People can also learn something about the behaviour of the track they are following, and can then use this knowledge to anticipate what the track will do and so prepare their actions [soccer goal-keepers can have surprising ability to do this !].
There are two ways of displaying a tracking task. In a pursuit display, the moving target and the person's movements are displayed separately. A compensatory display system computes the difference between the target and the person's movements, and displays this difference relative to a fixed point. Many studies show human performance is better with a pursuit display, e.g. Figure 21. As mentioned above, people can learn about the effects of their actions, and about target movements, and both types of learning lead to improved performance. On the pursuit display, the target and human movements are displayed separately, so a person using this display can do both types of learning. In contrast, the compensatory display only shows the difference between the two movements. It is not possible for the viewer to tell which part of a displayed change is due to target movements and which is due to their own movements, so these are difficult to learn.
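[The confound can be stated in two lines of arithmetic. The schematic below is not a model of any real instrument; it simply shows why a compensatory reading is ambiguous about its source.]

```python
def displays(target, output):
    """Schematic of the two display types (not a real instrument model).
    Pursuit : target and the person's own output are shown separately.
    Compensatory : only their difference is shown."""
    pursuit = (target, output)        # two separately attributable signals
    compensatory = target - output    # one confounded signal
    return pursuit, compensatory

# Quite different situations produce the same compensatory reading :
print(displays(target=5.0, output=3.0))   # ((5.0, 3.0), 2.0)
print(displays(target=2.0, output=0.0))   # ((2.0, 0.0), 2.0)
# With only the difference visible, the viewer cannot separate target
# behaviour from the effect of their own actions, so neither can be learned.
```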
Figure 21 : Errors in tracking performance when using pursuit and compensatory displays (Briggs and Rockway, 1966).
[This comes from the original figure, and is itself a good example of confused location coding : in the figure key, pursuit is above; in the plotted results it is below. This makes the figure take longer to understand at first.]
A great deal is known about human fast tracking performance (Sheridan and Ferrell, 1974; Rouse, 1980). A person doing a tracking task is acting as a controller. Control theory provides tools for describing some aspects of the track to be followed and how a device responds to inputs. This has led to the development of a 'human transfer function' : a description of the human controller as if they were an engineered control device. The transfer function contains some components which describe human performance limits, and some which partially describe human ability to adapt to the properties of the device they are controlling. This function can be used to predict combined pilot-aircraft performance. It is a powerful technique with considerable economic benefits. However, it is not central to this chapter, as it describes performance rather than the underlying processes, and it only describes human performance in compensatory tracking tasks. It also focuses attention on an aspect of human performance in which people can be poorer than fairly simple control devices. This encourages the idea of removing the person from the system, rather than appreciating what people can actively contribute, and designing support systems to overcome their limitations.
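[The chapter does not commit to a particular transfer function. For concreteness, the best-known example is the 'crossover model' (McRuer and Jex, 1967) : whatever the dynamics of the controlled element, the trained human adapts their own gain, lead and lag so that the combined open-loop behaviour near the crossover frequency approximates an integrator with a time delay.]

```latex
% Crossover model (McRuer and Jex, 1967), offered as the best-known
% example of a human transfer function; the chapter itself names none.
% Y_p = human operator, Y_c = controlled element (e.g. the aircraft),
% \omega_c = crossover frequency, \tau_e = effective human time delay.
\[
  Y_p(j\omega)\, Y_c(j\omega) \;\approx\; \frac{\omega_c\, e^{-j\omega \tau_e}}{j\omega}
  \qquad \text{near } \omega = \omega_c .
\]
```

The fixed time-delay term is one of the human performance limits mentioned above; the adjustment of the human's own dynamics to fit whatever element is being controlled is an instance of the adaptability stressed throughout this chapter.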
E. Summary and implications
Theory
The cognitive processes underlying classic HF/E can be relatively simple, but not so simple that they can be ignored. Cognitive processing is carried out to meet cognitive functions. Five cognitive functions were discussed in this section : distinguishing between stimuli; building up a percept of an external world containing independent entities with stable properties; naming; choosing an action; and comparison.
Figure 22 : A contextual overview of cognitive processes in simple tasks. [Interpreting the meaning of external stimuli and choosing an appropriate action is done within the context of knowledge and expectations about the situation.]
This section suggests these functions could be met in simple tasks by three main cognitive processes. (What happens when these processes are not sufficient has been mentioned briefly and is discussed in the next main section.) The three processes are :
- deciding between alternative interpretations of the evidence;
- integrating data from all sensory sources, together with knowledge about the possibilities, into an inferred percept which makes best sense of all the information;
- recoding, i.e. translating from one type of code to another.
Five other key aspects of cognitive processing have been introduced :
1. Sensory processing is relative rather than absolute.
2. The cognitive functions are not necessarily met by processes in a clearly distinct sequence. Processes which are 'automated' may be done in parallel. The processes communicate with each other via a common 'blackboard', which provides the context within which each process works, as summarised in Figure 22. As processing is affected by the context in which it is done, behaviour is adaptive. However, for HF/E practitioners this has the disadvantage that the answer to any HF/E question is always 'it depends'.
3. The processing is not simply input driven : all types of processing involve the use of knowledge relevant in the context. (It can therefore be misleading to use the term 'knowledge based' to refer to one particular mode of processing.)
4. Preview and anticipation can improve performance.
5. Actions have associated meta-knowledge about their effects, which improves with learning.
Practical aspects
The primary aim of classic HF/E has been to minimise unnecessary physical effort. The points made here emphasise the need to minimise unnecessary cognitive effort.
Task analysis should not only note which displays and controls are needed, but might also ask such questions as : What cognitive functions need to be carried out ? by what processes ? Is the information used in these processes salient ?
In discrimination and integration : What is the ensemble of alternatives to be distinguished ? Are the items designed to maximise the differences between them ? What are the probabilities and costs of the alternatives ? How does the user learn these ?
In recoding : What coding vocabularies are used (shape, colour, location, size, direction, alpha-numeric):
- in each sub-task ?
- in the task as a whole ?
- are the translations unambiguous, unique, consistent, and if possible obvious ?
Do reaction times limit performance, and if so can preview or anticipation be provided ?
Further sections of this chapter :
III. Mental workload, learning, and errors.
©2021 Lisanne Bainbridge