A Computer HINTerface
Stuart Mealing
Faculty of Art, Design & Technology, Cardiff Institute.
Abstract
This paper considers visual devices which might be used to enable and enhance computer-based communication, both human/computer and human/human. It proposes their initial implementation within a potentially responsive iconic environment.
Introduction
A computer interface typically offers the user explicit information and options. Communication between humans, however, usually includes implicit information, perhaps in the form of intonation, gesture or expression, which serves to qualify the given message. Indeed, these value-added features of spoken language can often communicate more than the words themselves, particularly if the words are in a foreign language.
Computer-based communication would move closer to the richness and subtlety of direct human communication if it could offer similar supporting nuances: hints to clarify or reinforce the message. A proposed human/computer interface offering this additional level of communication with its user will be called a HINTerface. Devices used by the HINTerface could be abstract or mimetic, static or kinetic. They could offer clues, intimations, implications, imputations, insinuations or warnings.
The current computer interface is predominantly visual, and whilst any of the human senses could be involved in extracting meaningful information, it is intended here to consider primarily visual clues. It is also intended that the ideas developed here, which grew from previous work on computer-based iconic communication, will be employed initially in that context in order to create a value-added iconic environment. It is expected, however, that the implicit communication features will find wider application in a range of man/machine interfaces.
Even a simple desktop computer now offers dynamic display resources, as yet little tapped in this context. It has the ability to display static and moving images which can change in scale, colour, intensity, and position (both absolute and relative). Temporal changes can allow visual interaction between images and also between images and their ground (such as fades and wipes) and can accommodate a varying timebase. These resources have the potential for being mapped to the extralinguistic features of human/human communication. The embodiment of such features in computer-based communication suggests a fresh approach to syntax, perhaps pointing towards a free-form environment. (David Orcutt, the inventor of [the pictorial language] Worldsign, sees it as a link between people of different nations and languages, based on a kind of 'think-feeling' which stimulates interaction between the transmitter and receiver [Jones 86]. His language does not rely on formal syntax.)

Figure 1: Worldsign 'Pain'
Changes of colour, position and scale are trivial to achieve, as are affine transformations, but more comprehensive changes in the form of icons (or of elements within icons) have greater potential for mapping to the dynamic elements in human/human communication such as gesture. These changes could be pantomimic, as in Worldsign, or formalised, as in sign language for the deaf.
At its simplest, an emotion such as anger (the displayed presence of which would give strong inflexion to any communication) could be mapped to the desktop or background colour on an harmonic scale from red to blue. The whole context of the communication could thus be 'coloured' by the underlying emotion. Volume, a potential inflector in verbal communication, could be readily allocated a visual equivalent. Similarly, a scale of agreement could be mapped to a shape varying from an abrupt triangle at the negative pole to a soft rounded shape at the positive pole. Colour, shape, typeface, font style and layout can all be used to similar effect.
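The mappings just described can be sketched in a few lines of code. The following is a minimal illustration only, not a proposed implementation: the function names, the numeric ranges, the use of simple linear interpolation, and the choice of red as the 'angry' end of the scale are all assumptions made for the example.

```python
def clamp01(x):
    """Restrict a value to the range 0.0 to 1.0."""
    return max(0.0, min(1.0, x))


def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def anger_to_rgb(anger):
    """Map an anger level in [0, 1] to a background colour.

    Assumption: 0 maps to blue (calm) and 1 maps to red (angry).
    """
    t = clamp01(anger)
    calm = (0, 0, 255)
    angry = (255, 0, 0)
    return tuple(round(lerp(c, a, t)) for c, a in zip(calm, angry))


def agreement_to_roundness(agreement):
    """Map agreement in [-1, 1] to a corner roundness in [0, 1].

    Assumption: 0.0 stands for the abrupt triangle at the negative pole
    and 1.0 for the soft rounded shape at the positive pole.
    """
    return clamp01((agreement + 1.0) / 2.0)
```

A display layer could then poll such functions whenever the underlying emotional or attitudinal value changes, redrawing the desktop colour or the icon outline accordingly.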
Whilst it is possible to move from a thought to its expression without recourse to natural language, it is likely that grammatical elements will often be recognisable. When this arises it might be that verbs are represented by kinetic icons and nouns by static icons, though it is possible for the tense of a verb or number of a noun to be implicit in its context.
If familiarity with cinema and television is assumed, then the HINTerface can also call on the devices used by film and video to stretch and compress time, to move between past, present and future, and to establish relationships between consecutive images or image sequences. These media can also create diegetic space [Armes 88], the space on the screen together with what is implied off-screen, for instance when a character looks out of shot. Together, these devices offer further scope for developing visual communication. The sophistication of a contemporary Western audience in understanding the role of such devices should not, however, be automatically taken as present in all users world-wide.

Figure 2: Possible mappings of Agreement and Anger
Gesture and Sign
The kinetic accompaniments to verbal communication, the instinctive shrugs, waves, mimes, grimaces and body stance, have, in some cases, been absorbed into formalised sign languages. For instance, gestures can express defined concepts, such as direction by means of pointing, actions through demonstration, and past and future time by indicating backwards and forwards. It is possible too to give absent objects some kind of pseudo-visual form by drawing or sketching them in the air [Jones 86] (Fig 3).
Figure 3: Amerind 'Table'
Study of these sign languages gives clues about the suitability of gestural icons in the HINTerface. These languages are systematic, rule-governed and able to express abstract thought [Grove 82], but are understood intuitively only in limited cases. In both British Sign Language (BSL) and American Sign Language (ASL) there occurs a middle ground between sign and pantomime; signs can be made more mimetic to make description more accurate, or where an appropriate lexical sign is not available.
A communication continuum can be conceived [Carter 81] which moves from verbal, through prosodic (stress, rhythm, intonation), to paralinguistic (expression, gesture etc.) and nonlinguistic (reflexes such as laughter and crying which may convey meaning and emotion). Paralinguistic information is often redundant, but assumes more significance in situations where speech is felt to be ambiguous, or less than adequate for clear communication, [such as] in a conversation with someone who seems not to understand you [Grove 82]. It is interesting that Grove goes on to note that fluent signers in conversation watch one another's faces, not one another's hands; also that if you translate only the manual signs, vital information is lost.
Most sign languages do not have a high iconicity; their signs are rarely clear pictures of what they represent. Their meaning is sometimes accessible to the naive audience, a sign such as brushing teeth being easy to guess, and in this case being known as transparent. Translucent signs are not as immediately obvious, but an association can be readily made between sign and meaning once revealed. Both the semantic context and the positional context of the sign often improve iconicity and thus enable understanding. For instance, in BSL signs made near the head are often concerned with thinking and signs made near the chest or stomach with feeling. It is also possible for a single sign to express as much information as a complex sentence, and this serves as a reminder that we need not, in constructing our icons, think only of the agglutination of small units of meaning.
Icons
For our current purposes the term icon will be used to refer to a discrete visual unit of representation whatever its scope of meaning or degree of symbolism/pictorialism. This will therefore cover pictures, representational pictographs, associational pictographs and arbitrary symbols, both static and moving. The icon construction set can include forms which are already universally understood, such as scientific formulae which can communicate scientific ideas across language barriers. It could also use a restricted number of recognised modifying symbols which could function as qualifiers, quantifiers, restrictors, affirmers, confirmers and negators, such as the diagonal band used on road signs to indicate the end of a signed state.
The icon set thus embraces both hot and cold forms of image communication [Mealing 90] and it will be context and user familiarity which determine the appropriate temperature of icon form. Icons of mixed temperatures could, however, be presented simultaneously, for example with symbols overlaid onto a background of live video.
It is easy to conceive of the movement of anthropomorphic elements within an icon but it is also possible to personify the icon itself through animation. It is a stock-in-trade of animators to make shapes and inanimate objects move expressively, so that a cube or letter form can 'swagger' or 'sulk' as easily as Tom or Jerry. Applied even to an icon with a high level of abstraction, additional clarity of meaning could be given.

Figure 4: A taxonomy of image information content
Although the HINTerface will add subtlety to messages, there are clearly occasions when it is inappropriate for an icon to be subtle, for instance when a genus risks becoming an instance through the addition of superfluous information. This is a crucial issue in the design of icons and is clearly illustrated in the case of a pictographic icon by considering the potential interpretation of an image standing for man [Mealing 91].

Figure 5: Two 'Man' icons
It is possible, however, for an icon to contain several levels of meaning, so that specificity can be derived from a general case by interrogation, in the same way that a compound, hierarchical icon could be interrogated to provide its meaning [Mealing 90].
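One way of picturing such interrogation is as a walk down a tree of meanings, from the general case at the root to more specific readings below it. The sketch that follows is purely illustrative: the `IconNode` structure, the `interrogate` function and the example keys and meanings are hypothetical and are not taken from the earlier iconic language work.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IconNode:
    """One level of meaning within a hierarchical icon (hypothetical)."""
    meaning: str
    details: Dict[str, "IconNode"] = field(default_factory=dict)


def interrogate(icon: IconNode, path: List[str]) -> str:
    """Follow a sequence of queries from the general case to a specific one."""
    node = icon
    for key in path:
        node = node.details[key]  # raises KeyError if no such level exists
    return node.meaning


# Invented example: a general 'person' icon with optional specifics.
person = IconNode(
    "person",
    {"role": IconNode("worker", {"trade": IconNode("builder")})},
)
```

With no query the icon yields only its general meaning; each further query, such as `interrogate(person, ["role", "trade"])`, narrows it without that detail having to be drawn into the icon itself.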
In addition to being self-explaining, icons could also be self-adapting and self-constructing. If imbued with knowledge of their role and with behavioural rules they could change themselves according to the context in which they were placed, perhaps changing scale, colour or position or, more radically, amalgamating with their colleagues to form new icons. They could also contribute to the making of changes in their environment and might even change the temperature of their presentation as communication develops, adopting a more symbolic form once context is clearly established. Distributed artificial intelligence (DAI) might perhaps contribute to this issue which, in some ways, recalls intelligent agents interacting in a blackboard system [Bond 88].
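A self-adapting icon of this kind might be sketched as a small record of presentation attributes together with a list of behavioural rules, each of which inspects the current context and returns a possibly modified icon. Everything named below (the `Icon` fields, the context keys, the two rules and the crowding threshold) is an assumption made for illustration; amalgamation between icons and any DAI machinery are deliberately left out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Icon:
    """Hypothetical icon state: a name plus two presentation attributes."""
    name: str
    scale: float = 1.0
    form: str = "pictorial"  # or "symbolic"


# A rule takes an icon and a context and returns a (possibly changed) icon.
Rule = Callable[[Icon, Dict[str, object]], Icon]


def shrink_when_crowded(icon: Icon, context: Dict[str, object]) -> Icon:
    """Halve the scale when many icons share the screen (threshold assumed)."""
    if context.get("icon_count", 0) > 5:
        return replace(icon, scale=icon.scale * 0.5)
    return icon


def go_symbolic_when_context_established(icon: Icon, context: Dict[str, object]) -> Icon:
    """Adopt a more symbolic form once the context is clearly established."""
    if context.get("context_established", False):
        return replace(icon, form="symbolic")
    return icon


RULES: List[Rule] = [shrink_when_crowded, go_symbolic_when_context_established]


def adapt(icon: Icon, context: Dict[str, object], rules: List[Rule]) -> Icon:
    """Apply each behavioural rule in turn and return the adapted icon."""
    for rule in rules:
        icon = rule(icon, context)
    return icon
```

Keeping the rules as separate functions means that new behaviours can be added to an icon without altering the icon record itself.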
Whilst icons can readily be hardwired to deliver a message they are less amenable to the potentially diverse requirements of a reply. Selection from an iconic library could be adequate for the compilation of a response within a limited domain but it is unrealistic to expect it to permit unlimited dialogue. A response library could, however, be self-modifying in the light of context or elapsed dialogue; it could also limit response presentation to a layered, multiple choice format.
Context
The relationship between icons can be more expressive than in the standard (Western), left-to-right, top-to-bottom, static, linear layout. A computer facilitates the varying of many presentational factors such as the order and timing of the appearance of elements. The method in which the elements appear on the screen (i.e. fade in, snap in, zoom in, slide in, 'walk' in etc.) can be telling, as can the spatial relationship they adopt to other icons and the relative scale of various icons. The language time-base can also be varied from cumulative presentation over time (perhaps running at a speed matching that of reading, speaking or comprehension), to presentation all at once (like a page of text) which is subsequently translated over time, or sequential appearance (like spoken words).
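The three time-bases mentioned above can be contrasted in a short sketch which assigns each icon an appearance time and, where relevant, a disappearance time. The mode names, the fixed interval and the reading of 'cumulative' (icons accumulate on screen) as against 'sequential' (each icon gives way to the next, like spoken words) are assumptions made for the example.

```python
def schedule(icons, mode, interval=1.0):
    """Return a list of (icon, appear_time, disappear_time) tuples.

    A disappear_time of None means the icon stays on screen.
    """
    events = []
    for i, icon in enumerate(icons):
        if mode == "all_at_once":
            # Like a page of text: everything is present from the start.
            events.append((icon, 0.0, None))
        elif mode == "cumulative":
            # Icons arrive one by one and remain for contextual reference.
            events.append((icon, i * interval, None))
        elif mode == "sequential":
            # Like spoken words: each icon is replaced by the next.
            events.append((icon, i * interval, (i + 1) * interval))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return events
```

The interval could itself be varied to match a reading, speaking or comprehension speed, and the manner of entry (fade, zoom, 'walk' and so on) attached to each event as a further field.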
The 'backcloth' in this iconic message theatre can function as either another actor or as a domain clue, being anything from a blank screen through static picture to live video. A photographic image, for instance, offers a close, almost one-to-one relationship with its subject which is immediately accessible to most people, though making a photograph stand for a general case is not easy. A drawing derived from a photograph would often prove preferable, as a controlled degree of abstraction from the source can be achieved. Even if the subject of such an image is clear, however, its meaning in a given context might not be. A photograph of a car, for instance, might refer to the particular make of vehicle, might be intended to stand for 'transport', or might be intended to mean 'take to the road'.
Decoding
Easy decodability is crucial. The essence of the HINTerface is that it should be accessible to the naive user and should therefore require no pre-learning. Any explanation or hand-holding that is required would be provided by the system, for instance by the use of self-explaining icons.
Most logographical forms have evolved from whole-word pictorial representation through syllabic representation to symbolic alphabets. The former are highly understandable but inefficient, whilst the latter are highly efficient but difficult to decode without a lot of learning. We have noted that some signs in sign language have iconic features, i.e. they are to some extent representational, visually suggesting or resembling the meaning of the sign. As a general principle, however, signs (like words) are arbitrary and not universally understandable. It is interesting, however, to find that half or more of the total number of Amerind signals (a sign language derived from that of American Indians) can be understood by the uninstructed observer [Jones 86].
Conclusion
There is a range of static and kinetic visual forms which can be used in the construction of icons for use in computer-based communication. Additionally, computers allow us to combine the dynamic advantages of paralinguistic language with one great advantage of written language: that the element being read is surrounded by the preceding and following text, which is therefore always available for contextual reference. This contextual aid can be extended to background imagery. These elements can all be usefully combined in both iconic communication systems and in the HINTerface. The elements of the HINTerface have potential application in human/machine interfacing and computer-based human/human communication.
With growing international interaction and with electronic, global developments such as the much-hyped Information Superhighway, a means of communication which bypasses the need for natural language translation becomes increasingly valuable. It also offers the possibility of being understandable to groups with restricted or absent language such as the mentally handicapped. The very focused display space offered by a computer screen might also have some advantages to such groups over the background distractions of the real world.
References
Armes, R (1988) On Video, Routledge
Bond, A (1988) Readings in Distributed Artificial Intelligence, Orientation
Carter, M (1981) Some issues involved in a linguistic analysis of BSL in Research on Deafness, 1979 - 1980, Vol 1: School of Education Research Unit, Bristol University 1981
Grove, N (1982) Linguistics in Sign Language, Makaton Vocabulary Development Project - Research Information Service, Vol 2 Issue 1
Jones, P and A Cregan (1986) Sign and Symbol Communication for Mentally Handicapped People, Croom Helm, London
Mealing, S and M Yazdani (1990) A computer based iconic language, Intelligent Tutoring Media, Vol 1 No 3
Mealing, S (1991) Talking pictures, Intelligent Tutoring Media, Vol 2 No 2