Questions and Summary
by Mike Hayward and Ezra Van Everbroeck
MARKMAN SUMMARY
Mike Hayward
Dept. of Cognitive Science
This summary covers the higher-level aspects of Markman's work on analogy: the
constraints governing the mapping process (the "Constraints on analogical
inference" paper) and some of the issues he and his colleagues debate with
Hofstadter (the "Analogy just looks like..." paper). The lower-level details
of the MAC/FAC and SME models are described in the other summary.
Paper 1: Constraints on analogical inference
--------------------------------------------
This paper focuses on identifying the principles/constraints guiding the
following key processes: (1) creating a mapping between corresponding elements
of a base and target, and (2) carrying out a set of inference procedures which
copy over some subset of the structure of the base into the target.
Specifically, Markman is hoping to identify principles which determine *which*
information will be copied over.
Mappings are correspondences between structured mental representations, which
consist of entities (e.g. the sun), attributes (e.g. hot(sun)), relations
(e.g. orbit(planets,sun)), and functions (e.g. mass(sun)). All of these are
"elements" which may be mapped, copied, substituted, etc.
According to Markman, the most general description of the candidate inference
generation process is "copying with substitution and generation" (CWSG): for
any element in the base domain with a corresponding element in the target,
copy over all of the representational structure "attached" to that element. In
the copy, substitute target-domain entities whenever a base-domain entity with
a known mapping occurs. If no such mapping exists, copy the base-domain entity
unchanged.
Here's a simple example. Given:
base: English Department
facts: causes( obtain( Eng_faculty, grant ), hire( Eng_faculty, RAs ) )
target: Computer Science Department
facts: obtain( CS_faculty, grant )
we would get the inference
causes( obtain( CS_faculty, grant ), hire( CS_faculty, RAs ) )
because the obtain() predicate matched, so the entire structure it was
embedded in (the "causes" predicate) was copied. "Eng_faculty" and "grant"
had mappings into target domain objects (by virtue of their being arguments of
the shared predicate "obtain"), so they were substituted. "RAs" had no such
mapping, and so was copied unchanged.
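
The example above can be traced with a minimal Python sketch. It assumes a
representation that is not in the paper: a term is a nested tuple whose first
item is the predicate name, and an entity is a plain string. The function
names (subterms, align, substitute, cwsg) are made up for illustration, and
nothing in the sketch restricts which matches are allowed to trigger copying.

```python
# Illustrative sketch only. A term is a nested tuple ("predicate", arg, ...);
# an entity is a plain string. All function names here are hypothetical.

def subterms(term):
    """Yield a term and every term nested inside it (entities are skipped)."""
    if isinstance(term, str):
        return
    yield term
    for arg in term[1:]:
        yield from subterms(arg)


def align(base, target, mapping):
    """Try to match base against target, recording entity correspondences.

    Two terms match when predicate and arity agree and the arguments match
    pairwise. `mapping` (base entity -> target entity) is updated in place,
    so callers should pass a scratch copy.
    """
    if isinstance(base, str) and isinstance(target, str):
        return mapping.setdefault(base, target) == target
    if isinstance(base, str) or isinstance(target, str):
        return False
    if base[0] != target[0] or len(base) != len(target):
        return False
    return all(align(b, t, mapping) for b, t in zip(base[1:], target[1:]))


def substitute(term, mapping):
    """Copy a term, swapping in target entities where a mapping is known."""
    if isinstance(term, str):
        return mapping.get(term, term)  # unmapped entities are copied unchanged
    return (term[0],) + tuple(substitute(arg, mapping) for arg in term[1:])


def cwsg(base_facts, target_facts):
    """Return candidate inferences for the target, in order of discovery."""
    mapping, triggered = {}, []
    for fact in base_facts:
        for sub in subterms(fact):
            for target in target_facts:
                scratch = dict(mapping)
                if align(sub, target, scratch):
                    mapping = scratch
                    if fact not in triggered:
                        triggered.append(fact)
    inferences = []
    for fact in triggered:
        candidate = substitute(fact, mapping)
        if candidate not in target_facts and candidate not in inferences:
            inferences.append(candidate)
    return inferences


base = [("causes", ("obtain", "Eng_faculty", "grant"),
                   ("hire", "Eng_faculty", "RAs"))]
target = [("obtain", "CS_faculty", "grant")]
inferences = cwsg(base, target)
```

Here the only match is between the two obtain() terms, which yields the
correspondences Eng_faculty -> CS_faculty and grant -> grant; the whole
"causes" fact is then copied with those substitutions, and "RAs" is carried
over as is.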
But CWSG on its own is not very particular about what gets copied: even the
most insignificant correspondence between entities (like "grant" in each
domain) might cause huge structures to be copied over. Unwarranted inferences
may result.
So, what additional constraints are needed? Markman focuses on two:
(1) Systematicity. Matching on minor entities (like "grant") is not good
enough; we need correspondences between relational structures. Matching
predicate structures are referred to as "shared system facts", and only
material connected to shared system facts will be copied over. The bigger the
shared system of interrelated facts, the better.
(2) One-to-one mappings. Sometimes correspondences can occur from many base
elements to one target element (or vice-versa). This is problematic for CWSG,
because there may be more than one possible substitution for an element. The
key question here is: do we build a single many-to-one homomorphism, or many
one-to-one isomorphisms? Markman suggests that humans do the latter,
generating several candidate interpretations each of which is fully specified
and internally consistent.
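
To make the "many one-to-one isomorphisms" option concrete, here is a small
Python sketch. It takes a list of candidate correspondences that conflict
with each other and returns every maximal subset that is one-to-one. The
brute-force enumeration and the department labels are assumptions made for
illustration; this is not how SME or any of the other models builds its
interpretations.

```python
from itertools import combinations

# Illustrative sketch only; the names and the brute-force search are
# hypothetical and only practical for a handful of correspondences.

def is_one_to_one(pairs):
    """True if no base element and no target element is used twice."""
    bases = [b for b, _ in pairs]
    targets = [t for _, t in pairs]
    return len(set(bases)) == len(bases) and len(set(targets)) == len(targets)


def interpretations(candidate_pairs):
    """Return every maximal one-to-one subset of the candidate pairs."""
    consistent = [
        frozenset(subset)
        for size in range(1, len(candidate_pairs) + 1)
        for subset in combinations(candidate_pairs, size)
        if is_one_to_one(subset)
    ]
    # Keep a subset only if no larger consistent subset contains it.
    return [s for s in consistent if not any(s < other for other in consistent)]


candidates = [
    ("base_CS", "target_Music"),
    ("base_English", "target_Music"),    # conflicts with the pair above
    ("base_English", "target_Physics"),
]
result = interpretations(candidates)
```

With these three candidates there are two interpretations: one pairing
base_CS with target_Music and base_English with target_Physics, and one
containing only base_English with target_Music. Each is internally
consistent, and neither substitutes two different elements for the same one.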
The paper then goes on to describe a series of human experiments exploring the
behavioral evidence for these constraints.
The Experiments
---------------
The basic design is as follows: subjects are given descriptions of three
departments at a base school, and three (different) departments at a target
school. The base school descriptions contain 2 key conditional statements of
the form "X causes Y" for each department. The target school descriptions
contain 2 key facts per department, which correspond to antecedents of the
conditionals found in the base descriptions. So, if "X causes Y" in the
Computer Science department of the base school, and "X" is true of the Music
department in the target school, then an appropriate analogical inference
would be that "Y" is true of the Music department in the target school (with
appropriate substitutions made into Y).
The studies mix up the correspondences, so that a single target department
might have facts that correspond to antecedents in two different base
departments, or vice-versa.
Subjects were then asked which departments were in correspondence with each
other (with explicit instructions to allow many-to-one mappings), as well as
what outcomes they might predict given the key facts in the target domain.
Results
-------
On the question of systematicity:
Many more of the subjects' inferences were based on shared system facts than
would be expected by chance. That is, they commonly matched key predicates in
the target domain to the identical antecedents in the base domain. This is
taken as evidence that systematicity plays an important role in determining
which inferences are made. It should be noted, however, that in many (most?)
cases shared system facts accounted for only about half of the inferences
made.
On the question of the one-to-one mapping: In all of the studies, there was
not a single instance of an inconsistent object substitution (of the kind CWSG
would be prone to if it allowed many-to-one relations). Markman takes this
as evidence that people were in fact strictly conforming to the one-to-one
mapping constraint. Note that, as I understand it, such an inconsistent
substitution would require a subject to respond with something as implausible
as: "If the CS department received the grant, I predict the Music department
would hire more RAs".
Markman concludes with a brief review of models, summarized here:
SME  (Forbus, Gentner, etc.) - provides both of these key constraints
IAM  (Keane et al.)          - provides both of these key constraints
ACME (Holyoak & Thagard)     - doesn't enforce the strict one-to-one
                               mapping; therefore subject to inconsistent
                               substitutions
LISA (Hummel & Holyoak)      - unconstrained in current design,
                               but these constraints could be integrated
Paper 2: Analogy just looks like high level perception:
Why a domain-general approach to analogical mapping is right
------------------------------------------------------------
Doug Hofstadter has levelled many criticisms against structure-mapping and
related approaches that Markman & many others use to model analogy. This
paper is presented as a reply to those criticisms, though much of it is spent
actively attacking Hofstadter's model, Copycat.
The structure-mapping approach decomposes analogy into representation, access,
mapping, evaluation, adaptation, verification, and schema-abstraction. SME
implements this theoretical framework, applying it to structured
representations composed of formal logic predicates reminiscent of symbolic
AI. Ezra's summary should describe SME in some detail.
For the rest of this summary, I will preface comments by Hofstadter's camp with
"CFH:" (Chalmers, French, & Hofstadter), and by Markman's camp with "FGMF:"
(Forbus, Gentner, Markman, & Ferguson).
CFH:
* analogy is a form of high-level perception;
* the mapping process cannot be separated from the perceptual process;
FGMF Reply:
* cognitive and perceptual processes are mutually dependent but not
inseparable.
* perceptual processing *can* be modularized (see example on pg. 16)
CFH:
* the key to analogy is gist extraction
FGMF:
* the key to analogy is mapping & inference
CFH:
* SME uses very small representations with the key information already
handpicked & brought to the forefront; this isn't remotely as rich as
real human experience & most of the hard work is already done.
FGMF Reply:
* SME *is* given some irrelevant information that it must ignore
* SME *has* been used on some "large" representations (Markman lists some
examples, such as Phineas, which learns physical theories by analogy).
* SME does not necessarily require hand-constructed data; it has been run
on the output of other programs (again, Phineas is an example of this).
CFH:
* SME is not psychologically realistic
FGMF Reply:
* SME has been used to explain the observed developmental shift from object
matching to relational matching
* SME predicts the confusion observed when two equally attractive mappings
are possible
* SME has led to a new framework for understanding "dissimilarity" (as
discussed in Markman's talk of 02/25)
CFH:
* SME is too limited to model complete "discovery" of an analogy.
FGMF Reply:
* those simulations were not intended to model the full discovery process,
just the mapping subprocess.
CFH:
* analogies can flexibly & gradually evolve in a situation; Copycat's
simulated annealing system deals with this. SME, on the other hand, has
to match predicates exactly & inflexibly.
FGMF Reply:
* SME can create more than one interpretation (e.g., one that matches
objects and one that matches relations).
* SME allows non-identical function mappings in the case where higher-order
structures are being mapped.
* SME uses domain-general constraints (like structural consistency), as
compared to Copycat's domain-specific ones
CFH:
* research task is to get output that matches human behavior, as observed
in "casual discussions with a handful of friends".
* aims to model the analogy-generation of brilliant minds, not typical ones.
FGMF:
* must model the *processes*, not just the output.
* a range of methods are needed, and traditional techniques from cognitive
psychology apply.
* a general model applicable to *all* humans is the goal
FGMF on the offensive:
* Copycat *can* be decomposed cleanly into perceptual/representational and
cognitive/comparative components.
* Copycat has a restricted input format that doesn't much resemble
real-world perceptual stimuli.
* Copycat, with its dependence on perception, has left out memory access
as a source of input.
* Copycat lacks schema-abstraction/learning mechanisms
* Copycat can only make classes of correspondences that were foreseen &
hard-coded by its designers.
* As a direct consequence of its unification of perception and comparison,
CFH's model predicts that there should be *no* domain-independent theories
of analogy.
Some Additional Issues to Consider
----------------------------------
* CFH argue that SME has no "depth" to its concepts; they're just labels. The
FGMF reply is basically: "neither does Copycat". For instance, Copycat has no
representation of geometric similarities of letters, only identity and
sequencing. Is the argument that Copycat could *not* capture geometric
similarities, or that it simply doesn't yet have that component? Could it
completely characterize the conceptual space of such a small microworld, or is
it the case that even the microworld approach cannot escape the combinatorial
explosion?
* In a related issue, structured representations typically cannot represent
similarity between predicates intrinsically, like a distributed representation
can. Does this make the handling of "fuzzy" matches (like "big()" to "tall()")
inelegant/impossible?
* The paper briefly alludes to a process of re-representation which might help
with the n-ary restriction (the fact that the unary predicates tall(A) &
short(B) can't match the binary predicate taller_than(C,D) the way humans
might). What principles might govern such a general process as
re-representation?
* Both groups recognize the need to scale the models up to larger, more
complex (and thus more psychologically realistic) knowledge bases. What might
the major hurdles be, when trying to scale up SME?
* When Holyoak visited, he commented on SME's "utter disregard for the
capacity limits of working memory and attention"; can this model work in some
serial manner that wouldn't overstress working memory with a giant "copy with
substitution"?
Summary and questions by Ezra Van Everbroeck
Dept. of Linguistics
ezra@ling.ucsd.edu
Here's the complement to Mike's summary. I'll be dealing with the lower-
level, implementational aspects of the models described in the MAC/FAC paper,
and the 'Analogy just looks like ...' paper. I'll start with the former,
because it's older (and presupposed in the latter), and because it provides a
much more detailed description of a working model.
MAC/FAC: A Model of Similarity-based Retrieval
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Forbus, Gentner & Law (henceforth FGL) set out to build a comprehensive
computational model of analogy. This is somewhat harder than one might at
first expect, as they point out that there are actually three kinds of
analogy:
* Analogies just based on surface similarities between two items: e.g.
someone's round glasses (i.e. the target) may remind you of the wheels of a
bicycle (i.e. the base), two round objects which are also connected by a metal
frame.
* Analogies just based on structural similarities between base and target:
e.g. visualizations of the structures of molecules always remind me of the
linguistic analysis of sentences; in both cases, you have independent elements
(atoms/words) which are grouped into more complex structures.
* Analogies based on both structural and surface similarities: e.g. seeing
one computer may remind you of another one. In this kind of 'literal
similarity' match, base and target both tend to appear alike for our senses,
and also share many other functional and structural properties.
Any model of human analogy will have to be able to come up with these three
kinds of analogies. Moreover, psychological experiments have also shown that
surface similarities dominate in retrieval processes, but that structural
similarities are considered more important for higher-level inference
processes. So, despite the roundness of both bicycle wheels and glasses, you
are unlikely (I hope) to infer that glasses need to be inflated once in a
while. On the other hand, knowing that the words in sentences are bound
together by certain forces like semantic restrictions, syntactic
subcategorization frames, and pragmatic factors is much more likely to suggest
to me that the atoms in a molecule are also kept together because of some
other forces. Thus, the inferencing process works when you have the target
item in working memory, you are reminded of the base item (stored in LTM), and
you transfer knowledge you have about the base to the target. In the framework
of FGL, such transfer is subject to two constraints, however:
* One-to-one mapping: one and the same item in the base cannot be linked to
more than one item in the target - so, trivially, there is much less chance of
a monocle reminding you of a bicycle than someone's glasses.
* Parallel connectivity: if you map function(A, B) in the base onto
function(C, D) in the target, then the arguments must be mapped as well (A
onto C and B onto D), so that what holds of A and B can carry over to C and
D.
According to FGL, the two older generations of analogy models fail to
account for the known facts about human analogy: (symbolic) case-based
reasoning models are very good at working with structured representations of
memory items, but are not scalable, and they cannot account for the fact that
people pay more attention to surface similarities in retrieval;
(connectionist) feature-based models, on the other hand, are much more
scalable and can pay attention to surface similarities, but have serious
problems modeling the structural relations which are so important for
inferencing. Hence, the need for MAC/FAC (Many Are Called/but Few Are Chosen)
to combine the best of both worlds.
The MAC/FAC model works in two stages: in the first one (MAC), a
computationally cheap process looks for similarities between the item in
working memory and the items in LTM, and then selects the one which matches
best, plus any other items which match closely (within 10% of the winner), for
further processing by the FAC. The similarity matching at this stage is
implemented by comparing normalized feature vectors: each item in LTM has a
vector associated with it which specifies which predicates (roughly, concepts) play
a role in the definition of the item, and how often each predicate occurs in
the definition. Crucially, though, these vectors do not contain structured
representations, just dumb sums. It is not hard to see that it is much faster
to compare such feature vectors than to compare intricate structures: the
latter inherently requires serial processing, while the former can be done in
a parallel fashion. The speed of MAC does come at the price of loss of
accuracy, because there is no guarantee that the items being compared share
any complex representation whatsoever, just that the same predicates are used
somewhere in their definitions.
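
The MAC stage as described here can be sketched in a few lines of Python.
The sketch assumes that each item is given as a flat list of predicate names,
that similarity is the dot product of the normalized count vectors, and that
"within 10% of the winner" means a score of at least 90% of the best score.
The function names and the toy memory are made up for illustration.

```python
import math
from collections import Counter

# Illustrative sketch only; names, data, and the reading of the 10% window
# are assumptions, not details taken from the MAC/FAC paper.

def content_vector(predicates):
    """Count predicate occurrences and scale the counts to unit length."""
    counts = Counter(predicates)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {pred: c / norm for pred, c in counts.items()}


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(weight * v.get(pred, 0.0) for pred, weight in u.items())


def mac(probe, memory, window=0.10):
    """Return the best-matching item names plus any within the window."""
    probe_vec = content_vector(probe)
    scores = {name: dot(probe_vec, content_vector(preds))
              for name, preds in memory.items()}
    best = max(scores.values())
    return sorted(name for name, s in scores.items()
                  if s >= (1.0 - window) * best)


probe = ["cause", "obtain", "hire"]
memory = {
    "story_a": ["cause", "obtain", "hire"],
    "story_b": ["cause", "cause", "obtain", "hire"],
    "story_c": ["orbit", "attract"],
}
survivors = mac(probe, memory)
```

In this toy memory, story_a and story_b pass on to the next stage and story_c
is dropped. Note that the vectors only record which predicates occur and how
often; how the predicates are nested inside one another plays no role.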
Once MAC has found the best overall matches in memory for the target item,
the FAC module takes a more critical look at its output. Using parallel SME
(Structure Mapping Engine) processors, a full-blown structure comparison is
made between the target and each of the items which were output by MAC. The
aim of this comparison is to find the best global match, as determined by the
number of correspondences which can be made between the target and the base,
their soundness (i.e. whether the individual correspondences lead to motivated
structural correspondences), their depth (i.e. elaborate structural mappings
are more important than a collection of more superficial correspondences), and
by the number of candidate inferences which the base suggests for the target.
It is at this stage that the constraints of one-to-one mapping and parallel
connectivity mentioned above play an important role in making sure that the
mappings remain sound. In this way, FAC looks for literal similarity between
target and base, acting as a structural filter for the more superficial
similarity-oriented output of MAC. Just like its companion, FAC outputs the
best match it has found, along with any other items whose match scores are
within 10% of the winner.
With all this in mind, it is reassuring to find that FGL found that MAC/FAC
scored qualitatively similar to human subjects on a story retrieval and
inference task. The setup was that the subjects had to read a large number of
stories, and were later asked which of these stories they were reminded of
when they read new stories. Roughly, the following retrievability hierarchy
was observed:
    +Surface       +Surface       -Surface       -Surface
    +Structure  >= -Structure  >= +Structure  >= -Structure
However, when these same subjects were asked to judge how good the
inferences between stories were, they rated the ones which shared structural
features as much more useful than the ones which just shared surface
similarities. FGL are happy to report that MAC/FAC was able to simulate these
two results: the MAC part was somewhat more sensitive to surface similarities
and output a number of candidates; the FAC part then further reduced the
number of candidates by selecting the ones with the best structural match. (No
comparisons of numbers between human subjects and MAC/FAC, though, because FGL
don't provide them.)
In the second half of the paper, FGL present three sensitivity analyses to
give some insight into how MAC/FAC works. First, they show that the output
selection percentages (i.e. the 10% window for MAC and FAC) are important,
because the model stops working if the MAC percentage goes above 20% (so it's
not the case that the model is so powerful that it could learn anything
anyhow). Second, they demonstrate that the feature vectors of MAC can be
changed considerably without truly affecting the qualitative behavior of the
model (so nothing crucially depends on the input representation). Third, they
illustrate that both surface and structural information is needed in the
representations for the model to work correctly (so it's like humans in
needing both).
Then they move on to a comparison of MAC/FAC with the ARCS model of Thagard
et al. Unsurprisingly, we find that MAC/FAC outperforms ARCS in many different
ways (speed, accuracy, similarity to psychological data).
Finally, the discussion section presents a number of weaknesses of MAC/FAC
and suggestions as to how the model could be extended to deal with them. In
brief, the issues raised here are:
* Retrieval failure: sometimes people are not reminded of anything at all,
so MAC/FAC should be able to fail in a similar manner;
* Goal awareness: there is some evidence which suggests that people's goals
will influence (perhaps even drive) their cognitive behavior, so MAC/FAC could
benefit from a type of representation which takes overall goals into account;
* Vector size: in their current incarnation, the MAC representations would
grow without bounds as the number of predicates increases, so these vectors
may need to either become more abstract, or factorized into a small set of
primitives to remain computationally cheap and efficient;
* Inter-item effects: there is some psychological evidence that different
possible analogies for a given target interact, so MAC/FAC should probably not
be processing different candidates in complete isolation of each other;
* Iterative access: psychological evidence suggests that people can use one
analogy to find a better one, so MAC/FAC may need an output-to-input
connection, so that it can reconsider its own previous output at a later
stage;
* Performance-oriented models: MAC/FAC only models analogy, so it should be
part of something bigger and more comprehensive in terms of human cognition
and behavior;
* Expert behavior: psychological studies show that intensive training and
encoding of certain information makes it easier to access and more likely to
be used for structural similarities, so MAC/FAC needs a way of distinguishing
expert knowledge from other types of knowledge.
So, it seems that there is still some work left for FGL.
Comments
^^^^^^^^
* What is the relationship between analogy and metaphor? The latter too is
usually described as 'understanding in terms of', with knowledge about one
domain helping you out in another one. Presumably, metaphor is just a type of
analogy, but then it would be interesting to see whether MAC/FAC can work with
poetic materials.
* The one-to-one mapping constraint may be overly strict: it could prevent
us from mapping the three people making up a Roman triumvirate onto a single
dictator.
* The choice of the vocabulary for the different representations is of
crucial importance. FGL point at re-representations as one way of dealing with
this (e.g. ORANGE(X) and RED(X) are unrelated, but COLOR(X, RED) and COLOR(X,
ORANGE) are not), but this brings us to COLOR as a primitive term.
* The same issue arises with the potential complexity of the MAC content
vectors: they should contain all the information associated with an item
(including zeros for the non-relevant predicates), but that would be an
enormous amount for a psychologically plausible model. Either using more
abstract terms or more primitive ones (or both) makes it even more important
to know how we can know to choose which terms to use.
* The content vector mechanism would also rank an item defined with 10
statements and an item defined with the negation of these 10 statements as
highly similar - they share all features, and the second one has some extra
ones. Do we want opposites to be this salient?
* MAC and FAC work on different representations of the same items. To what
extent is this dissociation desirable?
* With a MAC selection window of 10%, the FAC window can go up to 100% and
the model data still conform to the human retrievability hierarchy. Hence, we
can get rid of the FAC part?
* Far out: should the weird associations one experiences in dreams also be
explained by a theory of analogy?
Analogy just looks like high level perception
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mike's summary of this paper already covers most of the material in this
paper, and a lot of the technical details are repetitions of the ones given in
the MAC/FAC paper. So, I'll just describe Phineas here, a model developed for
learning physical theories by analogy.
Phineas has a memory of physical systems, with qualitative descriptions of
their behaviors (e.g. what happens to an object in terms of volume or
temperature when it is heated or cooled). When it is presented with a similar
description of new objects, it first checks whether it can straightforwardly
apply one of its existing models. If Phineas can't, it tries analogy. Using
the SME mechanism, it compares the structure of the new phenomenon with
those of the phenomena it already knows about, again looking for
correspondences between both surface and structural properties, trying to
develop sound and deep structural correspondences, and checking how fruitful
the overall match is with respect to the number of candidate inferences. Such
inferences can lead Phineas to take another look at the new object and see
whether they apply or not. If they do, Phineas stores what it has learnt about
the new object for future use. To make it somewhat more concrete: the example
in the paper has Phineas map heat flow onto the flow of liquids.
Hofstadter's Copycat is a completely different beast. It lives in a world
populated with alphabetic strings and faces problems like: "If the string
'abc' is transformed into 'abd', then what happens to 'aabbcc'?". In order to
find the answer, it has to use its blackboard architecture in which multiple
parallel and probabilistic rules exchange hypotheses and conclusions. These
rules are also only useful for the domain of alphabetic strings: e.g. that 'd'
follows 'c', and that 'z' is the last letter in the alphabet. In Copycat, all
the concepts employed by the rules are connected in a Slipnet, which specifies
the likelihood of a concept 'slipping into' another one; e.g. 'successor'
becoming 'predecessor'. When it's running, Copycat uses a form of simulated
annealing to guide its processes: at the start, it has a high temperature,
which makes it easier for concepts to slip; as the system cools down, Copycat
will settle on a solution.
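
The interaction between temperature and slippage can be illustrated with a
toy Python function. The formula below is an assumption chosen only because
it has the right qualitative shape (slipping is easier when the system is
hot and harder as it cools); it is not Copycat's actual formula, and the
0-100 temperature scale and the name slip_probability are likewise made up
for this sketch.

```python
# Hypothetical illustration; this is NOT Copycat's actual formula.

def slip_probability(base_likelihood, temperature, max_temperature=100.0):
    """Chance that a slippage is accepted at a given temperature.

    At max_temperature the Slipnet likelihood is used as is; as the system
    cools, the exponent grows and the probability shrinks toward zero.
    """
    if not 0.0 < temperature <= max_temperature:
        raise ValueError("temperature must be in (0, max_temperature]")
    return base_likelihood ** (max_temperature / temperature)


# A made-up cooling schedule, applied to a slippage with likelihood 0.5.
schedule = [100.0, 50.0, 25.0, 10.0]
probabilities = [slip_probability(0.5, t) for t in schedule]
```

Under this assumed formula the probability of a given slippage falls at
every step of the cooling schedule, which is the sense in which the system
"settles": late in a run, established correspondences are unlikely to be
disturbed.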
As its critics point out, the weights on the connections of the Slipnet are
hand-coded, the concepts have been determined at the start, and Copycat cannot
learn at all. Its domain-specificity also prevents it from being applicable
for any other purpose (unlike SME), and raises the specter of the unscalable
toy model. And even in terms of model complexity, Phineas is already capable
of dealing with larger representations than Copycat is. Taken together with
the higher-level issues, this suggests that Copycat is not the panacea
Hofstadter proclaims it to be.
Comments
^^^^^^^^
* What exactly is the difference between a single module doing perception
and cognition, and two modules which work interleaved? Bandwidth?
* Sometimes, the discussion slips from theoretical issues into
implementational issues: demonstrating that Copycat is no good either does
not make Phineas a better model.
* Phineas has actually published three papers already! (See page 20.)
* To argue for the superiority of SME, the authors use Phineas. But Phineas
has many more components than SME, so it's unclear which components are doing
the (good) work.
* The authors mention (top of p.7) that not all information about a concept
will feed into SME, only the part relevant to the task at hand. But how do we
determine this without first checking the relevance of each and every bit of
information? If it's a general mechanism of attention, then we would like to
see a model of that too.
Ezra Van Everbroeck - - - ezra@ling.ucsd.edu - - - Linguistics
Computing Support - - +1 (619) 534-8239 - - UC San Diego