November 30, 2005 —
How big a wow? New formula offers an answer.
Two Southern California engineers have
created a mathematical theory of human surprise, working from basic
principles of digital communications. Experiments recording eye
movements of volunteers watching video appear to provide confirmation
of the theory.
Laurent Itti of the University of Southern California's
Viterbi School of Engineering and Pierre Baldi of the University of
California, Irvine's Institute for Genomics and Bioinformatics
presented their results December 7 at the Neural Information Processing
Systems (NIPS) Conference in Vancouver, B.C.
Itti
is the principal investigator for a new grant awarded by the National
Science Foundation to carry on further research in the field with Baldi
and electrophysiologist Douglas Muñoz of Queen's University.
In developing their theory, Itti and Baldi went back to
fundamental principles developed by Claude Shannon in his classic 1948
paper, "A Mathematical Theory of
Communication." The pair's mathematical theory of surprise
proposes an alternative mode, a subjective one, for characterizing and
quantifying
information, distinct from Shannon's model.
Itti, who is an
assistant professor in the Viterbi School's Department of
Computer Science, says Shannon’s technique is not about a specific
observer, but about any observer
seeking to pick out a message from its noisy environment, or to send
one
with an assurance it will be read accurately.
Communicators package their messages to
survive in a noisy environmental buzz of activity that itself contains
crucial information — information that is not in message form. These
include potential threats or opportunities. To deal with the flood of
information bombarding their senses, individuals develop
mechanisms by which they devote attention to certain stimuli, while
ignoring others.
As Itti and Baldi write, “efficient and rapid attentional allocation is
key to predation, escape, and mating — in short, to survival.”
According to the researchers, previous computational work on the
problem has been phrased in the vocabulary of the stream of electronic
data making up a video image, as a proxy for the much more complex
mixture of sights, sounds, smells and other data found in a real
environment. Analyzing such a stream, researchers can isolate
stimuli with visual
attributes that are unique in the mix by breaking down the signal into
“feature channels,” each describing a particular attribute (e.g.,
color) in the mix. Such features are called “salient.”
Itti previously developed a measure of saliency.
A parallel analysis performs similar operations, but does so over time,
not space, by looking for new, suddenly appearing elements. This approach
is said to model “novelty.”
Finally, an analysis can be done purely in terms of Shannon’s original
equations, which can measure the level of organization or detail found
in the data flow, or its entropy.
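As a rough illustration of the entropy measure mentioned above, the following Python sketch computes Shannon entropy, in bits, for a small patch of pixel values. The function name and the sample patches are hypothetical and chosen only for illustration; this is not the procedure used in the study.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # H = -sum(p * log2(p)) over every distinct value that occurs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical pixel patches: a uniform one and a varied one
flat_patch = [7, 7, 7, 7]
mixed_patch = [0, 1, 2, 3]

print(shannon_entropy(flat_patch))   # no variety, so zero bits
print(shannon_entropy(mixed_patch))  # four equally likely values, two bits
```

A patch where every value is the same carries zero entropy, while a patch with many equally frequent values carries the most.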
Itti and Baldi say that in current research, the definitions of both
saliency and novelty are empirical, based on analysis of visual
streams, rather than predictions about them derived from basic principles.
Their theory boldly proposes to make just such predictions. The probability theory involved
is that known as “Bayesian,” which is a method for structuring
events observed over time in the past into predictions about the future.
The equation for making this guess is well known, having been developed from
the probability studies of the English mathematician Thomas Bayes
(1702-61). Itti and Baldi devised a way of applying it to the data in
a video stream, providing a measure of how observing new data will
affect the set of beliefs an observer has developed about the world on
the basis of data previously received.
"Data that does not change your
beliefs is not surprising," says Itti.
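The idea can be sketched in a few lines of Python. The article does not give the formula, so the sketch rests on an assumption: the change in beliefs is scored as the Kullback-Leibler divergence between the observer's beliefs after seeing the data (the posterior) and before (the prior). The two-hypothesis setup, the function names and the numbers are all hypothetical.

```python
import math

def bayes_update(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses.

    prior[i]      : belief in hypothesis i before seeing the data
    likelihood[i] : probability of the observed data under hypothesis i
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # overall probability of the data
    return [u / evidence for u in unnormalized]

def surprise_bits(prior, posterior):
    """Assumed surprise measure: KL divergence from prior to posterior, in bits.

    Every prior entry must be greater than zero.
    """
    return sum(q * math.log2(q / p)
               for q, p in zip(posterior, prior) if q > 0)

prior = [0.5, 0.5]  # two hypothetical hypotheses, equally believed

# Data that both hypotheses explain equally well leaves beliefs unchanged
boring = bayes_update(prior, [0.5, 0.5])
print(surprise_bits(prior, boring))    # zero: beliefs did not move

# Data that strongly favors the first hypothesis shifts beliefs
striking = bayes_update(prior, [0.9, 0.1])
print(surprise_bits(prior, striking))  # positive: beliefs moved
```

Under this assumed measure, data that leave the observer's beliefs where they were score zero, which matches Itti's remark that such data are not surprising.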
Their next step was to use this theory
to analyze a video stream and describe which streams had the
most “surprising” features. Finally,
having performed this analysis, they checked it by looking at the eye
movements of observers who were watching the images, to see if the eyes
followed
their measure of surprise.
The two researchers measured the success of their “surprise”
prediction against
two other analyses. The first was the version of saliency that Itti
co-developed as a graduate student studying under Christof Koch at
Caltech.
The second was a computation of Shannon entropy by C.M. Privitera and
L.W. Stark. Surprise, they say, outperformed entropy and
saliency, “exhibiting a
stronger human bias toward surprising locations than towards entropic
or salient regions.” The pair say they have confirmed these results
with a larger study.
The authors conclude: “At the foundation of our model is a simple
theory that describes a principled approach to computing surprise in
data streams. While surprise is not a new concept, it had lacked a
formal definition, broad enough to capture the intuitive meaning of the
term, yet quantitative and computable…. Beyond vision, computable
surprise could guide the development of data mining, as it can in
principle be applied to any type of data, including visual, auditory or
text.”
Calculating
Wow!: The experiment analysed a set of video images in
three modes: classic Shannon entropy (measuring how organized the data
appeared), "saliency," and surprise. The surprise result most closely
predicted the eye movements of human observers watching the video.