Questions and Summary
by Alexandra Horowitz and Adrian Robert
Brief, and sometimes over-pithy, summary of chs. 4 and 5, Fluid Concepts and Creative
Analogies
Alexandra Horowitz
Cognitive Science Department, UCSD
ahorowit@cogsci.ucsd.edu
In chapter 4, Hofstadter et al. quickly review the technological and ideological
work that preceded their creation of the analogy-forming "Copycat" program.
They are motivated both by their confidence that analogies are a core creation
of the human cognition we are interested in and by the failure of
previous attempts to capture this cognitive accomplishment. A few important
definitions and delineations channel their approach. First, no one in
cognitive science at UCSD will be alarmed to hear their stout insistence that
perception is importantly influenced by -- and influences -- other,
"higher-level," cognitive processes. Further, they want to distinguish both
low and high ("semantic:" where "concepts" come in) levels of perception, for
while they will claim that many analogy programs to date have neither, Copycat
will rely on the separation thereof. High level perception is characterized by
its flexibility, and thus can be seen as an important part of the creative
process: part of deciding what is relevant in input, and the ordering of the
information being considered. A mistake that most AI researchers have made,
the authors claim, is assuming that perception and cognition are separable
at all: that you can consider one without the other, and then later go back and
fill in the missing half.
A second important definition made in this chapter is that of analogy itself.
To Hofstadter, an analogy succeeds when it gets to the "essence" of two
situations or objects. The authors go on to divvy up the processes of analogy
into two (non-temporally separate) parts: the representation, and the mapping.
Specifically, salient attributes of the situations at hand must be chosen for a
kind of working-memory representation; next, mapping across corresponding
elements of the situations occurs. Ta-da, a lovely analogy. This distinction
informs the crux of their complaints about other computational models of
analogy. Most make _explicit_ the representation, by hand-coding in exactly
the bits of knowledge that are helpful to forming an analogy. Thus only the
mapping (for H, the less interesting -- and certainly easier -- element) is
left to the computer. A typical program might, e.g., consist of predicate
logical structures describing "Iran-Contra" that have _exactly the same_ shape
as those describing "Watergate"; all the program must do is draw the lines
between the corresponding shape-parts. The representations are rigid; as a
result, the program can only do exactly as much as its programmers code into
it, even if to the human eye there are other appealing and obvious
connections to be drawn. As another example, the success of an earlier program
in "discovering" Kepler's third law of planetary motion anew clearly seems to
lie in the fact that the program was provided with just the information about
planets and orbital periods that are constituents of the principle, with no
extra, irrelevant facts intervening. It is Hofstadter's strong contention that
it is, instead, the ability to filter through the range of facts and conceptual
frameworks available, to the _relevant_ ones, that gives a discovery such as
Kepler's power.
some idle questions: -- One of the reasons H rejects previous models of analogy
is that the representations are built in ready-to-cook -- i.e., the relevant
information is provided. A counter might be leveled that such programs could in
theory take large representations instead, in which the program must search
through the attribute shapes, and that this would solve H's problem. In fact,
many programmers have gone ahead and provided their programs with dummy
information that they must learn to ignore; this seems to show the ability to
match shapes of predicates, even over a non-trivial representation space. Is
this computationally plausible? If so, would it be a satisfactory model of
analogy?
-- Copycat ignores low-level perception. Can it, thus, amount to anything
substantively better than what other analogy-making programs do (i.e., ignore
all perception)?
_____________________________________________________________________________
Chapter 5, as told by Alix (for comparison, see Chapter 5, as told by Adrian)
Mitchell and Hofstadter created Copycat, a microdomain analogy maker (or, as
occasionally caveated, a "fluid concept" maker), in response to the sorts of
issues raised in chapter 4. A brief survey of its superficial output is in
order to get the gist of the program: if the string of letters "abc" is changed
to "abd", how would you change the string "ijk" in "the same way"?
M&H are interested in the fact that we quick-simile-makers will easily come
up with "ijl", and, if pushed, will maybe come up with other possibilities
(which we will likely judge to be less satisfactory, in some nebulous way).
Given strong intuitions about how this process takes place in humans, M&H
start to characterize the types of considerations that go into our making of
the analogy. It is these very complex and well-formulated intuitions that give
Copycat its interest; they, too, might be the source of its downfall, in
considering its applicability outside the tiny microdomain of strings of
letters.
A central concept in M&H's formulation is that there are pressures on the
analogy-making system -- be it human or computer -- that make us see features
of the input that _stand out_ to us. In "abc", it is clear that these are
successive alphabetic letters. The "succession" property, in particular, is
what allows us to start to connect its form to that of "ijk". M&H will
give Copycat a finite number of similar such properties, as well as
relationships and other attributes, to look out for. Very generally, the
strength of a property will vary according to what other concepts are nearby,
what has been activated so far, and the abstractness of a property (consider,
for example, that what we see as the "nicest" analogies often are correlated in
very deep ways -- not just on the mapping of letter to letter).
Interestingly, many possible attributes of strings-of-letters are _not_ given
to the system to consider as possibly relevant to the formation of analogies:
Copycat will never form an analogy such as "a:e as l:p" (where the correlation
is the number of letters that separate the objects). This would not be a
difficult analogy for a human, after a moment's thought.
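To make the missing relation concrete, here is a minimal Python sketch of a "letter distance" test. The function names are hypothetical, and nothing like this is part of Copycat, which (as noted above) is not given such a concept; the sketch only shows how small the absent piece of knowledge is.

```python
def letter_distance(first, second):
    """Number of alphabet steps from first to second (negative if backwards)."""
    return ord(second) - ord(first)


def same_distance(pair_a, pair_b):
    """True when both letter pairs are separated by the same number of steps."""
    return letter_distance(*pair_a) == letter_distance(*pair_b)


print(letter_distance("a", "e"))               # 4
print(same_distance(("a", "e"), ("l", "p")))   # True
```

The interesting question is not whether such a test can be coded, but how a system would come to consider it relevant without being handed it.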
The architecture of the program is threefold: (1) there are "concepts" -- nodes
that represent a kind of core idea, with a halo of more context-dependent ideas
surrounding it -- in the "Slipnet" memory-site. Concepts are: a, b, left,
sameness, etc. Links between concepts are variable in strength and duration,
and are what bring concept-halos to overlap. A good feature of this
architecture is that it accommodates closeness -- and thus co-activation of --
types of concepts that might make for good analogies but that are meaningfully
opposite: e.g., allowing the simile "abc:abd as xyz:wyz", where the concepts
"successor" and "predecessor", and "leftmost" and "rightmost" are linked. (2)
In the "Workspace," objects are considered by little "codelets", from (3), the
"Coderack" space, who either scout out the possible future object trails or
construct or destroy concepts or bonds. Activation is reciprocal between the
workspace and slipnet: just the sort of communication that might mirror a
perceptual-conceptual communication in humans.
A number of important features emerge from the Copycat set-up. The codelets
are chosen, on any given run, randomly. But over time the behavior of the
system as a whole starts to look deterministic, because it gauges its place by
measures of pressures, a "temperature" sensitive to goodness and to the amount of
randomness, and so on. Thus the system has feedback between levels, a (in
some sense) non-deterministic course of decision-making, an appreciation of
deep concepts over more superficial ones, and the emergence of "themes" in
analogy-forming: i.e., fluidity, just as a human mind does.
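One way to picture how a "temperature" could tie randomness to goodness is the Python sketch below. It is an illustration under loud assumptions: the function name, the 0-to-100 scale, and the exponent formula are all hypothetical and are not taken from the book. The only idea borrowed from the summary is that choices are random, but less so as the system becomes more satisfied with what it has built.

```python
def choice_probabilities(weights, temperature):
    """Turn positive preference weights into selection probabilities.

    temperature runs from 0 (confident, nearly deterministic) to 100
    (nothing good found yet, choices uniform). The exponent formula is
    an illustrative assumption, not Copycat's own.
    """
    exponent = (100.0 - temperature) / 25.0  # 0 at T=100, 4 at T=0
    sharpened = [w ** exponent for w in weights]
    total = sum(sharpened)
    return [s / total for s in sharpened]


hot = choice_probabilities([1.0, 3.0], temperature=100)
cold = choice_probabilities([1.0, 3.0], temperature=0)
print(hot)   # [0.5, 0.5]
print(cold)  # the second option gets 81/82 of the probability
```

With the same two options, a "hot" system picks either one equally often, while a "cold" system almost always picks the stronger one, which is one way randomly chosen steps could add up to behavior that looks deterministic.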
there are infinitely more details. but! on to Questions!
-- Although H starts with definite intuitive notions of how we form analogies,
in the end he sidesteps the comparison of Copycat's data to human data. Would
such a comparison hold up?
-- In formulating "concepts" as he did, H chooses a kind of prototype-based
model of concepts. Would a model like his work with some other equally-viable
notion of concept (say, exemplar-based)? If not, the model is perhaps
overly committed to the notion of concept chosen, and thus the conceptual form
is imbued with a kind of untested power.
-- How is the delineation of the features (symmetry, alphabetically-first, etc.)
that the program may attend to _not_ giving it exactly the pre-determined
knowledge -- the "Representation" -- that undermines the other analogy
programs? There are an infinite number of possible conceptual predicates;
isn't the most mysterious question how we sort through all of those and come up
with something coherent?
-- One of the reasons this program is of such interest, besides being a success
in its own tiny world, is that -- as alluded to several times -- it may be extendible
beyond that microdomain. Have any tests been done to see if the strategy
scales up (e.g., to large numbers of features)?
-- The method with which Copycat deals with a "snag" (as when presented with
the example "abc:abd as xyz:???", for the program does not see "a" as following
"z" in its helpful way) is portrayed as something of an epiphany; is it
substantially different than other strategy-changes?
Here is a little bit more on the Hofstadter readings, overlapping somewhat
with what Alix sent but hopefully still useful! The last chapter is not
really covered, so people will have to read that for themselves. :-)
Adrian Robert
Cognitive Science Department, UCSD
arobert@cogsci.ucsd.edu
----------------------
1) The Approach
Copycat is actually only one of a family of similar models that have been
built by Hofstadter's group (henceforth H) over the years, but it is one of
the most extensively developed ones and good for presenting their methods.
H's approach is shaped by two beliefs: first, that pattern-finding is at the
core of intelligence and that analogy-making is central to pattern-finding,
and, second, that cognition and perception are inseparably intertwined.
The first belief determines *what* they model: SeekWhence extrapolates
number sequences; Jumbo and Numbo solve anagrams and number puzzles by
finding and combining pieces based on known patterns; Copycat solves letter sequence
analogies; and Tabletop solves tabletop utensil arrangement analogies.
The second belief determines how they model it. How you cognize about
something is determined by how you represent it, and representation is founded
ultimately in perception. H argues that the essence of human problem solving
often -- if not always -- comes down to finding the (a) right way of looking
at the problem, as opposed to, say, doing some long combinatoric manipulation
on a given initial way of looking at it. This view will sound familiar to
followers of Ed Hutchins's work, and in fact it was also the view of the
Gestalt psychologists, particularly Max Wertheimer. However, while Ed focuses
on transformations of representation that occur outside individual minds, and
Wertheimer studied mainly one-step, "aha" transformations, H likes to look at
temporally-extended representation-building, of the sort that occurs when
solving, for example, anagrams or letter sequence analogies.
H identifies in these cases a kind of accretive process, in which
representations are gradually built up in a quasi-hierarchical fashion, and
which is plausibly employed by humans habitually in all sorts of more serious
domains (such as hacking together linguistic theories). For example, in
solving an anagram, we might first start by noticing combinations of letters
that form common suffixes, like 'ion' or 'ed' and then holding these in mind
while we check if the other letters form a stem. This can be hierarchical as
well -- we might add 'ot' to 'ion' and look for stems, then if this doesn't
work, break it up but still hold the part 'ion' for further research.
Another example is a typical "Copycat problem". Given the pair
"abc->abd", what does "wxy" map onto? Here, one way to go would be to
'see' "abc" as an ascending sequence and then "abd" as a sequence skipping
one at the end (giving solution "wxz"). If you had "xyz" to start with
instead of "wxy", you would be foiled here; but, holding onto this idea of
sequence, you might build a further representation of "abc" as a sequence
'away from one end of the alphabet', and so answer "wyz".
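The two readings of "abc->abd" can be written out as two tiny rules, as in the Python sketch below. The function names are hypothetical and the rules are hand-written here, which is exactly what Copycat is meant to avoid; the sketch only shows why the first reading hits a snag on "xyz" (the alphabet is taken as non-circular, so "z" has no successor) and how the mirrored reading escapes it.

```python
import string

ALPHABET = string.ascii_lowercase  # non-circular: nothing follows "z"


def successor(letter):
    i = ALPHABET.index(letter)
    return ALPHABET[i + 1] if i + 1 < len(ALPHABET) else None


def predecessor(letter):
    i = ALPHABET.index(letter)
    return ALPHABET[i - 1] if i > 0 else None


def rightmost_to_successor(s):
    """Literal reading of abc->abd: replace the last letter by its successor."""
    new = successor(s[-1])
    return None if new is None else s[:-1] + new  # None signals a snag


def leftmost_to_predecessor(s):
    """Mirrored reading: swap rightmost/leftmost and successor/predecessor."""
    new = predecessor(s[0])
    return None if new is None else new + s[1:]


print(rightmost_to_successor("wxy"))   # wxz
print(rightmost_to_successor("xyz"))   # None (the snag)
print(leftmost_to_predecessor("xyz"))  # wyz
```

What the sketch leaves out is the hard part: deciding, from the strings alone, that the mirrored reading is the one worth trying.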
So, introspectively, you have this process of gradually building up a way
of looking at something in some kind of goal directed manner. H's goal in
modeling is to try to make something like this and, by being forced to work
out details to get it to work, gain a better idea of what might be going on
beneath this introspective view.
This leads to the concept of using an artificial "microdomain" rather than
some part of a real domain: much of the action seems to lie in the
hierarchical building up of representations, so you want to be able to have
structures at several levels in your model, and you want each level to be
sufficiently rich to give play for flexibility in building combinations.
2) The Architecture
Copycat works by building up perceptual/conceptual representations in a
kind of working memory or 'current view' space called the "workspace". The
initial workspace contains the raw input, and the representation is built by
the action of operators called "codelets". Codelets are short
recognition-action pieces of code that evaluate a particular perceptual
grouping and possibly implement it by binding the elements together. For
example, a codelet might check whether two letters form an alphabetic
sequence and if so bind them into a new unit. Typically binding occurs in
several stages: early on, things are provisionally bound and their
subelements can still participate in other, possibly conflicting
combinations; later, they are strongly bound and the subelements are parts of
this group only. The set of codelets that can potentially be activated at
any given time sit on the "coderack" and are probabilistically chosen for
execution based on their urgency weightings. Certain codelets, if they
succeed in their groupings, will cause other ones to be placed on the rack.
Usually, for example, a codelet producing a weak binding will generate a
codelet that could produce a stronger binding of the same type. Importantly,
a bound set of elements acquires new interactive properties that change how
codelets can evaluate and operate on it, and also, a bound set can be broken
down again under certain conditions.
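The coderack cycle described above can be sketched in a few lines of Python. Everything here is a simplified assumption for illustration: the class and function names, the urgency values, and the two-stage "weak then strong" bond are hypothetical stand-ins, not Copycat's actual code. The sketch shows only urgency-weighted random choice and a successful codelet posting a follow-up codelet.

```python
import random


class Coderack:
    def __init__(self, seed=None):
        self._items = []                 # list of (urgency, codelet)
        self._rng = random.Random(seed)

    def post(self, codelet, urgency):
        self._items.append((urgency, codelet))

    def __len__(self):
        return len(self._items)

    def run_one(self, workspace):
        # Pick one codelet at random, with probability proportional to urgency.
        weights = [urgency for urgency, _ in self._items]
        index = self._rng.choices(range(len(self._items)), weights=weights, k=1)[0]
        _, codelet = self._items.pop(index)
        codelet(workspace, self)


def make_successor_scout(i):
    def scout(workspace, rack):
        letters = workspace["letters"]
        if ord(letters[i + 1]) - ord(letters[i]) == 1:
            workspace["weak_bonds"].add(i)                      # provisional bond
            rack.post(make_bond_strengthener(i), urgency=5.0)   # follow-up codelet
    return scout


def make_bond_strengthener(i):
    def strengthen(workspace, rack):
        if i in workspace["weak_bonds"]:
            workspace["weak_bonds"].discard(i)
            workspace["strong_bonds"].add(i)
    return strengthen


def run(letters, seed=0):
    workspace = {"letters": letters, "weak_bonds": set(), "strong_bonds": set()}
    rack = Coderack(seed)
    for i in range(len(letters) - 1):    # "bottom-up" scouts thrown in at the start
        rack.post(make_successor_scout(i), urgency=1.0)
    while len(rack):
        rack.run_one(workspace)
    return workspace


print(run("abc")["strong_bonds"])  # {0, 1}
print(run("abd")["strong_bonds"])  # {0}
```

The order in which codelets fire varies with the seed, but in this toy version the final bonds do not, since no codelet competes with or breaks another's work.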
Thus far this sounds a little like a production system, and even more like
a biochemical situation in which there are a number of chemical components
(the representations) and enzymes (the codelets) which act on them. But
added to this is a further twist, a network called the "slipnet", which
guides the generation of "top-down" codelets. (A bunch of "bottom-up"
codelets, which build and bind low-level structures, are automatically thrown
into the soup at the beginning.) The slipnet is a permanent structure with
nodes which stand for per/concepts like predecessor or successor which, if
highly activated, place particular codelets on the coderack. Links connect
the nodes, allowing spreading activation, and there are also connections from
nodes to links, which modulate their weights. Basically, the slipnet
represents permanent conceptual knowledge and the structure of associations
between concepts. For example, when we think of successor, we may also think
of predecessor, because it is a complementary relation. This would be
represented by a link between nodes for "predecessor" and "successor" in the
slipnet, and also a link to this link from a node for "complement".
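The "link to a link" idea can be illustrated with a small spreading-activation sketch in Python. The class, the numbers, and the particular way the "complement" node boosts the weight of the successor-predecessor link are assumptions made for the example; the summary says only that nodes can modulate link weights, not how.

```python
class Slipnet:
    def __init__(self):
        self.activation = {}   # node name -> activation in [0, 1]
        self.links = []        # (source, target, base_weight, label_node)

    def add_node(self, name, activation=0.0):
        self.activation[name] = activation

    def add_link(self, a, b, base_weight, label=None):
        # Associations here are symmetric, so store both directions.
        self.links.append((a, b, base_weight, label))
        self.links.append((b, a, base_weight, label))

    def weight(self, base_weight, label):
        # An active label node strengthens the link it points at (assumed rule).
        boost = self.activation[label] if label is not None else 0.0
        return min(1.0, base_weight * (1.0 + boost))

    def spread(self):
        incoming = {name: 0.0 for name in self.activation}
        for source, target, base_weight, label in self.links:
            incoming[target] += self.activation[source] * self.weight(base_weight, label)
        for name, extra in incoming.items():
            self.activation[name] = min(1.0, self.activation[name] + extra)


def predecessor_after_one_step(complement_activation):
    net = Slipnet()
    net.add_node("successor", activation=1.0)
    net.add_node("predecessor")
    net.add_node("complement", activation=complement_activation)
    net.add_link("successor", "predecessor", base_weight=0.3, label="complement")
    net.spread()
    return net.activation["predecessor"]


print(predecessor_after_one_step(0.0))  # about 0.3
print(predecessor_after_one_step(1.0))  # about 0.6
```

When "complement" is active, thinking of "successor" brings "predecessor" to mind more strongly, which is the kind of relation-by-association the slipnet is meant to capture.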
3) Evaluation
All of this just to solve letter analogies, you say! (And I have left out
a lot!) One of H's points is that you NEED this kind of complexity to really
capture what is interesting about human thought processes. For this reason
they criticize models like SME. It is not that SME is not valid for doing
what Forbus and colleagues say it is doing, modeling one aspect of human
thought mechanisms. But they question how much insight you get from just
looking at one component in isolation. ...
Q: How many levels do you need in a model to really do what H wants to
do? Maybe the microdomains are still too shallow and impoverished to lend any
insight other than into the activity of hacking up programs to solve a certain
class of problems... Also related to this, some modelers such as Edelman and
colleagues (Neural Darwinism) have argued forcefully for the need to include a
motor component in any model, otherwise important interactions are left out.
(They make the same argument for perc-motor as H does for perc-cog. In the
field of neuroscience, at least, evidence for the utility of this motor
"philosophy" has been building...)
Q: Regarding the slipnet, the idea of having modulating links from nodes
to other links seems to capture very well the idea that in thinking you can
trigger _relations_ by association, not just other entities (see example
above). But I wonder whether this mechanism is too powerful, in that it can
potentially generate useful associations far more readily than a human with
the same knowledge. Or maybe it is too weak, and humans are able to leap
intuitively to associations that the slipnet could never dream of. From both
intuition and discussions of the other models (SME, etc.), this seems to be a
delicate area.
Q: One of the most distasteful features of the slipnet to many is the fact
that it is hardwired, and some of this problem may come from the fact that
many of its nodes represent quite high level things that clearly have a
complex structure. Has the idea been explored of building the slipnet with a
kind of hierarchical feature-entity-proposition structure like the LISA
network or something similar? With more structure visible, it would begin to
seem more plausible as the kind of thing that humans could acquire through
experience, and perhaps it would open a way to add learning to the model.