About
I am an Assistant Professor at Michigan Technological University. My research focuses on applied cognitive science, including decision making, memory, and computational/mathematical modeling. This includes studying crossword puzzle players,
aerial search and rescue, methods for modeling shared knowledge, and techniques for measuring and modeling the impact of physiological stressors on cognitive abilities.
Background
Prior to coming to MTU, I was a Senior Research Scientist at Applied Research Associates of ARA, Inc. Before that, I was a postdoctoral researcher
with Prof. Richard
Shiffrin in the Memory and Perception
Lab at Indiana University's Department of Psychology. I obtained my doctoral degree
in the Cognition and Perception program of the University of Michigan's Psychology
Department. While at Michigan, I worked in the Brain, Cognition, and Action
Laboratory under the direction of David
E. Meyer. I graduated summa cum laude from Drew University with degrees in Mathematics and Psychology.
Research Interests
I study the human cognitive, perceptual, and memory systems using
empirical, computational, mathematical, and statistical techniques. My
primary research interest is in developing models of how human memory
systems represent knowledge, and how people use that knowledge to
accomplish tasks. This ranges from low-level representations of the
perceptual systems to high-level decisions made on the basis of
expert knowledge.
My C.V.
Decision Theory
Recognition-based Decision Making. Much research on decision theory is about choice and option
comparison. But in naturalistic situations, developing a course
of action is often intuitive and based on recognizing flexible
plans that have worked in the past. In order to better understand
these processes, I have developed
the Bayesian Recognitional
Decision Model (BRDM), a computational model
of recognition-primed
decision making that combines some of my past work on
recognition memory with theories of naturalistic decision making.
BRDM is used to account for how experts make decisions based on
cues in the environment and their episodic memory and semantic
knowledge.
Decision Noise. Standard implementations
of signal
detection theory attribute all noise to perceptual processes. I
have developed the Decision
Noise Model (DNM), a modification of Signal Detection Theory that
incorporates noise into the response process as well. The DNM
illustrates how inconsistencies in the response can produce ROC
functions that violate the assumptions of SDT, violations that others
have argued reveal a fundamental inadequacy of that theory. This claim
led to a published critique of our paper (Balakrishnan and MacDonald, 2008),
to which we responded (Weidemann & Mueller, 2008). The original paper also introduces and discusses a novel ROC
analysis we called the DS-ROC function that can help measure
the influence of perceptual processes in signal detection tasks.
DS-ROC can be a useful way to make an ROC function when you have
measurable noise in the stimulus, but want to avoid the problems and complications with confidence ratings.
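The effect of response-stage noise can be illustrated with a minimal sketch. This is not the DNM itself: it assumes the textbook equal-variance Gaussian signal detection model and simply adds zero-mean Gaussian noise to the decision criterion (the function names and parameters are hypothetical, chosen for illustration). Under those assumptions, criterion noise with standard deviation s lowers the measured sensitivity from d' to d'/sqrt(1 + s^2), even though the perceptual process is unchanged.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def observed_rates(d_prime, criterion, criterion_sd):
    """Hit and false-alarm rates when the criterion itself is noisy.

    Illustrative assumptions (not the published DNM):
    noise trials ~ N(0, 1), signal trials ~ N(d_prime, 1), and the
    criterion used on each trial ~ N(criterion, criterion_sd**2).
    """
    # Perceptual noise (variance 1) and criterion noise add in variance.
    total_sd = sqrt(1.0 + criterion_sd ** 2)
    hit = STD_NORMAL.cdf((d_prime - criterion) / total_sd)
    false_alarm = STD_NORMAL.cdf((0.0 - criterion) / total_sd)
    return hit, false_alarm


def measured_d_prime(hit, false_alarm):
    """Standard estimate d' = z(H) - z(F)."""
    return STD_NORMAL.inv_cdf(hit) - STD_NORMAL.inv_cdf(false_alarm)


for sd in (0.0, 1.0, sqrt(3.0)):
    h, f = observed_rates(d_prime=2.0, criterion=1.0, criterion_sd=sd)
    print(f"criterion sd={sd:.2f}  H={h:.3f}  F={f:.3f}  "
          f"measured d'={measured_d_prime(h, f):.3f}")
```

The true perceptual separation is 2.0 in every row; only the consistency of the response rule changes, yet the measured d' falls as the criterion becomes noisier.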
Non-parametric measures of decision
sensitivity. Standard measures of sensitivity and bias in
signal detection theory rely on assumptions about signal and noise distributions. As an alternative, researchers have sometimes used the 'non-parametric' measure A' (a-prime), which does not rely on the same types of distribution-based assumptions. In practice, it is probably used most often because it remains defined even when hit or false-alarm rates are 100% or 0%. Yet even though A' is commonly used, it does not measure what most people using it have claimed it measures (the average of the areas under the minimum and maximum ROC functions that can pass through a point). I
have helped publish a correction to the formula for A' (called A), as well as a related measure of response bias. Some further
results have been published separately. A spreadsheet that can be used to compute A is also available.
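As a rough sketch of the difference, the code below computes the classic A' and a piecewise form of the corrected measure A from a single hit rate H and false-alarm rate F. The piecewise formula is written from memory of how the correction is commonly stated and should be checked against the published paper before use; the function names are illustrative, and only the case H >= F is handled.

```python
from statistics import NormalDist, StatisticsError


def a_prime(h, f):
    """Classic A' for h >= f:
    A' = 1/2 + (h - f)(1 + h - f) / (4 h (1 - f))."""
    if h < f:
        raise ValueError("this sketch only handles h >= f")
    if h == f:
        return 0.5  # chance performance; also avoids 0/0 at (0, 0) and (1, 1)
    return 0.5 + (h - f) * (1 + h - f) / (4 * h * (1 - f))


def a_corrected(h, f):
    """Piecewise corrected measure A for h >= f (assumed form; verify
    against the published correction)."""
    if h < f:
        raise ValueError("this sketch only handles h >= f")
    if h == f:
        return 0.5  # points on the diagonal are at chance
    base = 0.75 + (h - f) / 4
    if f <= 0.5 <= h:
        return base - f * (1 - h)
    if h < 0.5:  # f < h < 0.5
        return base - f / (4 * h)
    return base - (1 - h) / (4 * (1 - f))  # 0.5 < f < h


def d_prime(h, f):
    """Parametric d' = z(H) - z(F); undefined when a rate is 0 or 1."""
    z = NormalDist().inv_cdf
    return z(h) - z(f)


for h, f in [(0.8, 0.2), (0.4, 0.2), (1.0, 0.0)]:
    try:
        d = f"{d_prime(h, f):.3f}"
    except StatisticsError:
        d = "undefined"
    print(f"H={h} F={f}  A'={a_prime(h, f):.3f}  "
          f"A={a_corrected(h, f):.3f}  d'={d}")
```

The last row shows the practical appeal noted above: with perfect hit and false-alarm rates, d' cannot be computed, while A' and A still return a value.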
Choice models incorporating
perceivability, bias, and similarity. Luce's choice theory
offers a popular alternative to Signal Detection Theory, but unlike
SDT which incorporates bias and sensitivity, when choice theory is
applied to confusion data, it typically incorporates only bias and
similarity, ignoring sensitivity or "perceivability" as a distinct
concept. This is somewhat remarkable, and probably stems mostly from
the inability of the estimation models to fit independent measures of
similarity, bias, and perceivability. I have
an in-progress paper examining
this issue, which uses variable selection methods to estimate the
three conceptual variables for the same data set, in the context of
visual letter similarity. The graph on the
right shows data from that paper, which illustrates the need to consider
perceivability and similarity separately.
Cultural Modeling
Identifying Cultural Consensus. One prominent view of culture is that it is
the set of shared beliefs, attitudes, and mental models that people
living in a shared context develop. The question then becomes how you
identify which beliefs are shared, in order to identify whether there
is indeed shared culture. This problem has been addressed previously
in the form of Batchelder's Cultural Consensus Theory, which is a true
innovation in anthropological methods, but not without its
limitations. In response to some of its limitations, I have
developed Cultural Mixture
Modeling (CMM), a method using finite mixture models and E-M
optimization to identify cultural consensus, and to identify whether
multiple cultures of belief exist within a single population. The
image on the right shows one application of CMM in the context of the
Afrobarometer survey, which identified three basic pan-African
cultural group views on democracy, with different distributions across nations.
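The general idea of fitting a finite mixture model with E-M can be sketched as follows. This is a generic illustration, not the CMM implementation: it assumes yes/no survey answers, models each group as a set of independent per-item agreement probabilities, and fixes the number of groups in advance (choosing that number, which matters in practice, is not shown). All names are hypothetical.

```python
import math
import random


def fit_bernoulli_mixture(responses, n_groups, n_iter=100, seed=0):
    """E-M for a mixture of independent Bernoulli items.

    responses: list of equal-length lists of 0/1 answers, one per respondent.
    Returns (weights, thetas, resp): thetas[k][j] is the estimated probability
    that a member of group k answers 1 on item j, and resp[i][k] is the
    posterior probability that respondent i belongs to group k.
    """
    rng = random.Random(seed)
    n, m = len(responses), len(responses[0])
    weights = [1.0 / n_groups] * n_groups
    # Random starting beliefs, kept away from 0 and 1.
    thetas = [[rng.uniform(0.25, 0.75) for _ in range(m)]
              for _ in range(n_groups)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior group membership, computed in log space.
        resp = []
        for x in responses:
            logp = []
            for k in range(n_groups):
                ll = math.log(weights[k])
                for xj, t in zip(x, thetas[k]):
                    ll += math.log(t if xj else 1.0 - t)
                logp.append(ll)
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate group sizes and item probabilities.
        for k in range(n_groups):
            nk = sum(r[k] for r in resp)
            # Small pseudo-counts keep every estimate strictly inside (0, 1).
            weights[k] = (nk + 1e-3) / (n + 1e-3 * n_groups)
            for j in range(m):
                ones = sum(r[k] * x[j] for r, x in zip(resp, responses))
                thetas[k][j] = (ones + 0.01) / (nk + 0.02)
    return weights, thetas, resp


# Toy data: two groups of respondents with opposite views on six items.
group_a = [[1, 1, 1, 0, 0, 0]] * 20
group_b = [[0, 0, 0, 1, 1, 1]] * 20
weights, thetas, resp = fit_bernoulli_mixture(group_a + group_b, n_groups=2)
labels = [max(range(2), key=lambda k: r[k]) for r in resp]
print(labels)
print([[round(t, 2) for t in row] for row in thetas])
```

Each row of `thetas` can be read as one group's consensus belief pattern, and `weights` as the estimated share of the population holding it.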
The role of consistent knowledge in
cultural transmission. Much research on the transmission of
culture takes the form of agent-based simulations. A common
assumption made by these models, which I refer to as the "bounded
influence conjecture", is that people do not listen to views that are
too different from their own. This assumption is often used to create
groups with distinct sets of opposing opinions (as can be identified
with CMM, above). Yet this conjecture has a certain circularity to
it, and so I have developed a
simulation model to illustrate how certain assumptions about the
representation of the knowledge that is maintained and passed between
agents can produce the same behavior as the bounded influence
conjecture. As a side effect, the model produces a novel account of
the group polarization effect. The image on the right illustrates how,
through interaction, agents holding many different belief states tend
to converge to a small number of states that are both stable and
polarized.
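For readers unfamiliar with the bounded influence conjecture itself, the sketch below shows how it is typically built into an agent-based simulation. It is not the knowledge-based model described above: it is a generic bounded-confidence averaging rule in the style of Hegselmann and Krause, with opinions reduced to single numbers between 0 and 1, and the function names and threshold values are illustrative assumptions.

```python
def bounded_influence_step(opinions, threshold):
    """One synchronous update: each agent adopts the mean opinion of all
    agents (itself included) whose opinion is within `threshold` of its own.
    Views further away than the threshold are simply ignored."""
    updated = []
    for mine in opinions:
        heard = [other for other in opinions if abs(other - mine) <= threshold]
        updated.append(sum(heard) / len(heard))
    return updated


def simulate(opinions, threshold, n_steps=50):
    """Repeat the update rule for a fixed number of steps."""
    for _ in range(n_steps):
        opinions = bounded_influence_step(opinions, threshold)
    return opinions


def count_groups(opinions, tol=1e-6):
    """Number of distinct opinion clusters, merging values closer than tol."""
    groups = []
    for value in sorted(opinions):
        if not groups or value - groups[-1] > tol:
            groups.append(value)
    return len(groups)


start = [i / 10 for i in range(11)]  # evenly spread opinions from 0.0 to 1.0
for threshold in (0.05, 0.25, 1.0):
    final = simulate(start, threshold)
    print(f"threshold={threshold}: {count_groups(final)} opinion group(s)")
```

With a threshold wide enough to hear everyone, the population collapses to a single consensus; with a very narrow one, nobody influences anybody. Distinct opinion groups in such models therefore follow directly from the assumed threshold, which is the circularity discussed above.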
Measurement and Metrics
Psychology Experiment Building Language (PEBL).
I am the originator and primary developer of PEBL
(the Psychology Experiment
Building Language), and the PEBL test battery, which distributes
a suite of about 50 free commonly-used psychological tests. PEBL is
used by researchers, instructors, and clinicians around the world
to conduct experiments, to allow students to get first-hand experience
with standard psychological paradigms, and to test patients. I've
written a detailed user manual for PEBL that you can download or
buy, and I blog about PEBL roughly twice a month
at peblblog.blogspot.com. The image on the right shows a screenshot of one of PEBL's tests, the Tower of London. I have a paper showing how many tasks in the Cognitive Decathlon (see below) have been implemented using PEBL.
I wrote a user's manual for PEBL. It can be downloaded
for free at pebl.sf.net, but it is
also available for purchase
in bound format via Lulu
Press. It can also be purchased via amazon.com,
but the small cut I get of the low purchase price is even smaller
there.
The Cognitive Decathlon. For DARPA's BICA
(Biologically-Inspired Cognitive Architecture) program, I developed
what we called the "Cognitive Decathlon", a set of select behavioral measures that would have been used to
assess the capability and validity of the artificially-intelligent agents being developed for that program.
I have argued that the Cognitive Decathlon was a form of
the Turing test which was intended to measure the embodied
intelligence of these agents. As discussed above, I also have a
paper showing how parts of the Decathlon have
been implemented using PEBL.
Subsequently, I have helped organize several symposia related to
the BICA concept. The image on
the right depicts a high-level organization of the tasks in the
Cognitive Decathlon. This work was featured
in Wired's
Danger Room column, (apparently) the Russian Popular Mechanics, and Wikipedia.
Predicting human performance under stress. I am
currently conducting research to characterize and model how the
physical insults produced by protective gear (as worn by our
military, firefighters, police, astronauts, farmers, etc.) impact
cognitive functioning. As part of this work, we conduct detailed
behavioral experiments to assess the decline in performance from
aspects of the gear (such as heat strain), and we have developed
statistical models using the "Task-Taxon-Task" method to
characterize these
decrements. A paper I wrote
on this won a "Best of conference" award at the BRiMS
conference. The figure on the right illustrates one of the points of
that paper, which is that aspects of protective gear impact the
speed of performance in many different ways, including how you
do the task. In this case, people stopped overlapping their
movements because they could no longer pick things up by feel alone.
Human Memory
Representation and development of knowledge in
episodic and semantic memory. I have done some research
looking at ways to represent knowledge in memory. One concept I
explored was the way in which the meanings of concepts are biased
by the context you see them in. To understand
these Contextual Semantic representations, I developed the
REM-II model of human memory. This Bayesian model of human
memory accounts for how experiences form meaningful memories
through semantic knowledge, and how semantic knowledge forms
through the accrual of meaningful episodic memories. Essentially,
REM-II learns through experience and develops representations
akin to what LSA produces through matrix algebra and SVD.
Furthermore, knowledge in REM-II is represented in ways that
allow biased representations to maintain separate meanings in
different contexts. I have
several conference
papers on this work.
Phonological Similarity. To measure the phonological similarity
that affects performance in immediate serial recall, I developed
software tools and techniques called PSIMETRICA (Phonological
SIMilarity METRIC Analysis), published in the 2003 article Theoretical
implications of articulatory duration, phonological similarity, and
phonological complexity on verbal working memory. This paper
describes techniques and methods for measuring phonological similarity
and the articulatory duration of words, two important factors in
predicting immediate serial recall accuracy. The article has been
called "The most careful and sophisticated analysis of the roles of
spoken duration and phonological similarity in verbal STM" (Baddeley,
2003). The Lisp code for the PSIMETRICA software is available below, and slides from a talk I have given on PSIMETRICA
are available as well.
Strategy in Verbal Working Memory. My dissertation investigated recall strategies in immediate serial
recall using the EPIC Computational Architecture.
Results showed that the recency effect, whose magnitude differs
greatly across experiments, is modulated by the strategic goals of
the research participants. The research on strategy involves both
experimental research and computational modeling using the EPIC cognitive
architecture.
Duration of the verbal short-term memory trace.
Standard textbook accounts of short-term memory typically claim that
"verbal short-term memory lasts about two seconds". This turns out to
have little grounding in the literature, and is probably just a confused
interpretation of a simple model. I've shown this
conjecture cannot be true,
and that a conservative estimate (assuming that memory loss stems only
from decay) is at least 4-5 seconds. The image on the right shows
inferred decay distributions from a number of different conditions,
along with the standard 2-second decay distribution commonly
discussed.
Other Research
The Experiential User Guide
Intelligent software tools. One of the big problems for users of
intelligent software tools is that they do not understand the complex
reasoning the system is doing, and so do not know when to trust the
system and when not to. The Experiential User Guide is a way to help
people learn about software that uses intelligent processes. It does
so by focusing on lessons that illustrate the boundaries of proper and
improper tool use, and so gives the ultimate user a better
understanding of the underlying system.
Alphabetic Letter Similarity. This project investigates the
visual representation of letter stimuli, and collects a
number of data sets that have reported the visual similarity of the letters of the English
alphabet, using a number of different methods. These are available in
the Letter Similarity Data Set Archive, and are also summarized in a manuscript that is under revision.
Tree Distance Measures. One way to represent the
similarity structure of a set of stimuli is using a
hierarchical clustering tree. To help evaluate such trees,
I have created an R (or S-Plus) implementation
of the tree distance metrics described by Boorman & Olivier
[Boorman, S. A., and Olivier, D. C. (1973) Metrics on Finite
Trees. Journal of Mathematical Psychology, 10, 26-59.] The
C-distance metrics replicate those found in the above paper;
the D-metrics do not--this may be an error in the program or
in the original paper. R software is available here.
Select Papers, Posters and Slides from Talks
Below are links to published articles, as well as slides, posters, and conference papers from different presentation venues. See also my curriculum vitae.
Decision Making:
- Mueller, S. T. (2009). A Bayesian
Recognitional Decision Model. To appear in Journal of Cognitive
Engineering and Decision Making.
- Rodriguez, J., McClelland, G., Grome, B., Crandall, B., & Mueller,
S. T. (2008). Modeling human factors involved in chemical/biological
warning and reporting. Chemical Biological Defense (CBD) Physical
Science and Technology Conference, New Orleans, LA, November
2008. Winner, Best Research: Information Systems Technology
- Veinott, E. & Mueller, S. T. (2008). Indecision in the pocket: An
analysis of the relative success of fast and slow quarterback passing
decisions. Northern California Symposium on Statistics and Operations
Research in Sports, Menlo Park, CA, Oct., 2008.
Semantic Representation in Episodic Memory:
- Mueller, S. T. (2007).
Inferring contextual semantics from text using a model of human
episodic memory and conceptual knowledge formation. Paper read at
the 4th Midwest Computational Linguistics Consortium, West Lafayette,
IN.
- Mueller, S. T. (2006). REM-II: A
Bayesian model of the organization of semantic and episodic memory
systems. Poster presented at the annual meeting of the Cognitive
Neuroscience Society, April 2006.
- Mueller, S. T. & Shiffrin,
R. M. (2006) REM II: A
Model of the Developmental Co-Evolution of Episodic Memory and
Semantic Knowledge. Paper presented at the International
Conference on Learning and Development (ICDL), Bloomington, IN, June,
2006. poster.
- Mueller, S. T. (2006). Examining
representations formed by the co-evolution of episodic and semantic
memory. Paper presented at
the Hoosier Mental Life Conference, March 31-April 2, 2006, Bloomington, IN.
Measurement and Metrics
- McClelland, G., Mueller, S. T., Cox, D., & Anno,
G. (2008). Cognitive performance prediction with the T3
methodology. Chemical Biological Defense (CBD) Physical Science and
Technology Conference, New Orleans, LA, November 2008.
- Samsonovich, A., & Mueller, S. T. (2008). Toward a growing
computational replica of the human mind. Preface to the Papers from
the AAAI Fall Symposium, Biologically Inspired Cognitive
Architectures, Menlo Park, AAAI Press.
- Mueller, S. T., & Minnery, B. (2008). Adapting the Turing Test
for Embodied Neurocognitive Evaluation of Biologically-Inspired
cognitive agents. Keynote address at AAAI Fall Symposium on
Biologically Inspired Cognitive Architectures, November, 2008,
Arlington, Virginia.
- Mueller, S. T. & Veinott,
E. S. (2008). Cultural
mixture modeling: Identifying cultural consensus (and disagreement)
using finite mixture modeling. In B. C. Love, K. McRae, &
V. M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of
the Cognitive Science Society (pp. 64-70). Austin, TX: Cognitive
Science Society.
- Mueller, S. T. (2008). Is the
Turing Test Still Relevant? A plan for developing the Cognitive
Decathlon to test intelligent embodied behavior. Proceedings of
the Nineteenth Midwest Artificial Intelligence and Cognitive Science
Conference (MAICS 2008), Cincinnati, OH, April, 2008.
- Mueller, S. T., & Veinott, E. S. (2008). Cultural Mixture Modeling: A
method for identifying cultural consensus. ARA Technology Review, 4,
39-45.
- Mueller, S. T., Sieck, W. R., & Veinott,
E. S. (2008). The Culture of
Teams: Methods and Metrics. Poster presented at the ARL Advanced
Decision Architectures Collaborative Technology Alliance Technical
Exchange Meeting, Boulder, CO, Feb., 2008.
- Mueller, S. T., Jones, M., Minnery, B. S., & Hiland,
J. M. H. (2007). The BICA Cognitive Decathlon: A test suite for
biologically-inspired cognitive agents. Paper read at the Behavior
Representation in Modeling and Simulation (BRIMS) conference, Norfolk,
VA, March,
2007. Slides. paper.
- Mueller, S. T. (2007). The PEBL Manual, Version 0.08. Available through the Lulu Press, or via free download from http://pebl.sf.net.
- Mueller, S. T. (2004). An
introduction to PEBL: The Psychology Experiment Building Language.
Annual meeting of the Society for Computers in Psychology (SCiP),
Minneapolis, MN, November, 2004.
Signal Detection Theory
- Weidemann, C. T. & Mueller,
S. T. (2008). Decision noise may mask criterion shifts: Reply to
Balakrishnan and MacDonald (2008). Psychonomic Bulletin and Review,
15, 1031-1034.
- Mueller, S. T., & Weidemann,
C. T. (2008). Decision
Noise: An explanation for observed violations of Signal Detection
Theory. Psychonomic Bulletin and Review, 15, 465-494.
- Mueller, S. T. & Zhang, J. (2006). A non-parametric ROC-based
measure of sensitivity. Appearing in the Third workshop on ROC
analysis in Machine Learning, Pittsburgh, USA, July 2006.
- Mueller, S. T., & Weidemann,
C. T. (2005). Is the use of confidence
ratings in signal detection tasks fundamentally flawed? Poster
presented at the Annual Meeting for Judgment and Decision Making,
Toronto, ON, CA
Visual Representations of words and letters
- Mueller, S. T., & Weidemann,
C. T. (2008). Alphabetic
letter perceivability, similarity, and bias. Manuscript under review.
- Mueller, S. T., & Shiffrin, R. M. (2005). A transformation based model of letter
string identification. Paper read at the Annual Meeting of the
Society for Mathematical Psychology, Memphis, TN.
- Mueller, S. T., Weidemann,
C. T., & Shiffrin, R. M. (2004). Alphabetic Letter Similarity
Matrices. Talk given at the IU Psychology Department colloquium
series.
Phonological Similarity and Verbal Working Memory
- Mueller, S. T., & Krawitz, A. (2009). Reconsidering the two-second decay hypothesis in verbal working memory. Journal of Mathematical Psychology, 53, 14–25.
- Mueller, S. T. (2006). PSIMETRICA: A
Tool for measuring phonological similarity. Presentation to the
Indiana University Speech Laboratory.
- Krawitz, A., Mueller, S. T., Kieras, D. E., & Meyer,
D. E. (2004). Executive control
operations for updating of verbal working memory. Poster presented
at the Cognitive Neuroscience Summer School on Working Memory, Bled,
Slovenia, June 2004.
Software
I have created a number of software tools to assist in
creating, designing and running experiments. They are
described below with instructions on how to acquire them.
PSIMETRICA: Phonological SIMilarity METRIC Analysis
This software has been
under development and refinement since 1998, and is a series of Lisp
routines that allow words to be represented according
to their phonological content, and then compared and evaluated
according to numerous dimensions of phonological similarity. This
software is described in the paper above. The
software includes definitions and measures on many of the classic data
sets used to demonstrate the phonological similarity effect in
short-term memory.
Nonword and Word Evaluation and Creation Software
This software, available for
Linux and other unix platforms, allows for the creation and evaluation
of nonwords according to the conditional probabilities of letters in
the written text. Included are utilities to extract these conditional
probabilities from text, routines to generate new nonwords, a list of
the words from the CMU machine-readable dictionary, pre-computed
conditional probability databases based on the dictionary and the
Kucera/Francis corpus, and routines to evaluate these nonwords (or
actual words) based on these conditional probabilities to determine
their 'wordliness', or regularity according to English letter
combination probabilities. Additionally, routines are included that
allow Levenshtein (edit) distance to be computed between words, as
well as an experimental 'Partial' Levenshtein distance. These are
used by a final set of utilities that determine the neighborhood
distribution of a word or nonword: the distribution of Levenshtein distances
between a word and all other words in the dictionary. For those who
are unable to run the program, it contains 1000 sample non-words of
each length 4 through 10, their wordliness scores and their
neighborhood distributions. Additionally, it contains the same values
for all the words in the CMU dictionary.
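The two core ideas here, scoring a letter string by conditional letter probabilities and summarizing its neighborhood by edit distance, can be sketched briefly. This is not the distributed software: the toy lexicon, the first-order (previous letter only) probability model, the probability floor for unseen letter pairs, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def levenshtein(a, b):
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # delete ca
                               current[j - 1] + 1,             # insert cb
                               previous[j - 1] + (ca != cb)))  # substitute
        previous = current
    return previous[-1]


def neighborhood_distribution(word, lexicon):
    """How many lexicon entries lie at each edit distance from `word`."""
    return Counter(levenshtein(word, other)
                   for other in lexicon if other != word)


def letter_model(lexicon):
    """Conditional probabilities P(next letter | previous letter), with '^'
    and '$' marking the start and end of a word."""
    counts = defaultdict(Counter)
    for w in lexicon:
        padded = "^" + w + "$"
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return {prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for prev, nxts in counts.items()}


def wordliness_score(word, model, floor=1e-6):
    """Mean log conditional probability per letter transition (higher means
    more regular). Unseen transitions receive a small floor probability."""
    padded = "^" + word + "$"
    logs = [math.log(model.get(p, {}).get(n, floor))
            for p, n in zip(padded, padded[1:])]
    return sum(logs) / len(logs)


toy_lexicon = ["cat", "cot", "cut", "coat", "dog", "dot"]
model = letter_model(toy_lexicon)
print(sorted(neighborhood_distribution("cat", toy_lexicon).items()))
print(wordliness_score("cot", model), wordliness_score("xqz", model))
```

A real application would estimate the probabilities from a large dictionary or corpus rather than a six-word list, and could condition on more than one preceding letter.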
Similarity-Based List Generation and Stimulus Selection
This software contains the
seven-dimension similarity matrices for the words of the Toronto Noun
Pool. These include five measures of phonological similarity
developed using PSIMETRICA: Onset, nucleus, coda, initial phoneme,
and stress similarity. Additionally, a measure of semantic similarity
based on LSA is included, as well as a measure of graphemic similarity (edit
or Levenshtein distance). Together, these dimensions can be used to
select from the noun pool subsets of a specified size that are either
similar or different on the different dimensions. For example,
one can vary onset similarity while holding nucleus similarity constant.
Included are a bunch of pre-selected lists, along with their measures
on the relevant dimensions. Many of these lists have the virtue of
being similar without being 'obvious'; they may not rhyme or share
many phonemes, but they are still similar in subtle and useful ways.
Contact
Email:
smueller at obereed dot net
shanem at mtu dot edu