Shane T Mueller, Ph.D.


I am an Assistant Professor at Michigan Technological University. My research focuses on applied cognitive science, including decision making, memory, and computational/mathematical modeling. This includes studying crossword puzzle players, aerial search and rescue, methods for modeling shared knowledge, and techniques for measuring and modeling the impact of physiological stressors on cognitive abilities.


Prior to coming to MTU, I was a Senior Research Scientist at Applied Research Associates (ARA, Inc.). Before that, I was a postdoctoral researcher with Prof. Richard Shiffrin in the Memory and Perception Lab at Indiana University's Department of Psychology. I obtained my doctoral degree in the Cognition and Perception program of the University of Michigan's Psychology Department. While at Michigan, I worked in the Brain, Cognition, and Action Laboratory under the direction of David E. Meyer. I graduated summa cum laude from Drew University with degrees in Mathematics and Psychology.

Research Interests

I study the human cognitive, perceptual, and memory systems using empirical, computational, mathematical, and statistical techniques. My primary research interest is in developing models of how human memory systems represent knowledge, and how people use that knowledge to accomplish tasks. This ranges from low-level representations of the perceptual systems to high-level decisions made on the basis of expert knowledge. My C.V.

Decision Theory

  • Recognition-based Decision Making. Much research on decision theory concerns choice and option comparison. But in naturalistic situations, developing a course of action is often intuitive and based on recognizing flexible plans that have worked in the past. To better understand these processes, I have developed the Bayesian Recognitional Decision Model (BRDM), a computational model of recognition-primed decision making that combines some of my past work on recognition memory with theories of naturalistic decision making. BRDM accounts for how experts make decisions based on cues in the environment, their episodic memory, and their semantic knowledge.
  • Decision Noise. Standard implementations of signal detection theory attribute all noise to perceptual processes. I have developed the Decision Noise Model (DNM), a modification of Signal Detection Theory that incorporates noise into the response process as well. The DNM illustrates how inconsistencies in the response can produce ROC functions that violate the assumptions of SDT, and that others have argued point to a fundamental inadequacy of that theory. This claim led to a published critique of our paper, to which we responded here. The paper also introduces and discusses a novel ROC analysis we called the DS-ROC function, which can help measure the influence of perceptual processes in signal detection tasks. The DS-ROC can be a useful way to construct an ROC function when you have measurable noise in the stimulus but want to avoid the problems and complications of confidence ratings.
  • Non-parametric measures of decision sensitivity. Standard measures of sensitivity and bias in signal detection theory rely on assumptions about signal and noise distributions. As an alternative, researchers have sometimes used the 'non-parametric' measure A' (a-prime), which does not rely on the same types of distribution-based assumptions. In truth, it is probably most often used because it remains defined even when hit or false-alarm rates are 100% or 0%. Yet even though A' is commonly used, it does not measure what most people using it have claimed it measures (the average of the minimum and maximum ROC functions that can pass through a point). I have helped publish a correction to the formula for A' (called A), as well as a related measure of response bias. Some further results were published here. A spreadsheet that can be used to compute A is also available.
  • Choice models incorporating perceivability, bias, and similarity. Luce's choice theory offers a popular alternative to Signal Detection Theory, but unlike SDT, which incorporates bias and sensitivity, when choice theory is applied to confusion data it typically incorporates only bias and similarity, ignoring sensitivity or "perceivability" as a distinct concept. This is somewhat remarkable, and probably stems mostly from the inability of the estimation models to fit independent measures of similarity, bias, and perceivability. I have an in-progress paper examining this issue, which uses variable selection methods to estimate the three conceptual variables for the same data set, in the context of visual letter similarity. The graph on the right shows data from that paper, which illustrates the need to consider perceivability and similarity separately.
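To illustrate why A' remains popular despite its flaws, here is a minimal Python sketch of the classic Pollack & Norman A' formula. Note that this is the traditional (flawed) formula, not the corrected measure A from my paper, whose piecewise form differs; the function name is my own.

```python
def a_prime(h, f):
    """Classic 'non-parametric' sensitivity measure A' (Pollack & Norman, 1964).

    h: hit rate, f: false-alarm rate. Unlike d', A' stays defined even
    when h or f is 0 or 1, one practical reason for its popularity.
    """
    if h == f:
        return 0.5  # chance performance
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance: mirror the formula.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

print(a_prime(0.8, 0.2))  # 0.875
print(a_prime(1.0, 0.0))  # 1.0 (perfect performance, still defined)
```

Chance performance gives 0.5 and perfect performance gives 1.0, which is why the measure is so often treated as an "area under the ROC" statistic, even though that interpretation is what the corrected A addresses.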
Cultural Modeling

  • Identifying Cultural Consensus. One prominent view of culture is that it is the set of shared beliefs, attitudes, and mental models that people living in a shared context develop. The question then becomes how to identify which beliefs are shared, in order to determine whether there is indeed a shared culture. This problem has been addressed previously in the form of Batchelder's Cultural Consensus Theory, which is a true innovation in anthropological methods, but not without its limitations. In response to some of those limitations, I have developed Cultural Mixture Modeling (CMM), a method using finite mixture models and E-M optimization to identify cultural consensus, and to determine whether multiple cultures of belief exist within a single population. The image on the right shows one application of CMM in the context of the Afrobarometer survey, which identified three basic pan-African cultural group views on democracy, with different distributions across nations.
  • The role of consistent knowledge in cultural transmission. Much research on the transmission of culture takes the form of agent-based simulations. A common assumption made by these models, which I refer to as the "bounded influence conjecture", states that people do not listen to views that are too different from their own. This assumption is often used to create groups with distinct sets of opposing opinions (as can be identified with CMM, above). Yet this conjecture has a certain circularity to it, and so I have developed a simulation model to illustrate how certain assumptions about the representation of the knowledge that is maintained and passed between agents can produce the same behavior as the bounded influence conjecture. As a side effect, the model produces a novel account of the group polarization effect. The image on the right illustrates how, through interaction, agents holding many different belief states tend to converge to a small number of states that are both stable and polarized.
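The finite-mixture-plus-E-M machinery behind an approach like CMM can be sketched in Python: fit a k-component mixture of independent Bernoulli item profiles to binary survey responses, so that each component plays the role of one "culture" of belief. This is a generic illustration under my own simplifying assumptions (independent binary items, a fixed k), not the actual CMM implementation.

```python
import numpy as np

def bernoulli_mixture_em(X, k, n_iter=200, seed=0):
    """Fit a k-component mixture of independent Bernoulli item profiles
    to binary response data X (respondents x items) via E-M.
    Returns mixing weights, per-component item probabilities,
    and soft component assignments for each respondent."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    pi = np.full(k, 1.0 / k)                      # mixing weights
    theta = rng.uniform(0.25, 0.75, size=(k, m))  # item endorsement probs
    for _ in range(n_iter):
        # E-step: log-likelihood of each respondent under each component
        log_p = (X[:, None, :] * np.log(theta) +
                 (1 - X[:, None, :]) * np.log(1 - theta)).sum(axis=2)
        log_p += np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)  # for numerical stability
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)          # responsibilities
        # M-step: re-estimate weights and profiles from responsibilities
        pi = r.mean(axis=0)
        theta = (r.T @ X + 1e-6) / (r.sum(axis=0)[:, None] + 2e-6)
        theta = np.clip(theta, 1e-6, 1 - 1e-6)
    return pi, theta, r

# Two simulated "cultures" with opposite endorsement patterns:
X = np.array([[1, 1, 1, 0, 0, 0]] * 10 + [[0, 0, 0, 1, 1, 1]] * 10, float)
pi, theta, r = bernoulli_mixture_em(X, k=2)
```

With cleanly separable data like this, the two recovered components align with the two simulated groups; with real survey data, comparing fits at different k is what suggests whether one consensus or several coexisting cultures best describes the population.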
Measurement and Metrics

  • Psychology Experiment Building Language (PEBL). I am the originator and primary developer of PEBL (the Psychology Experiment Building Language) and the PEBL test battery, which distributes a suite of about 50 free, commonly used psychological tests. PEBL is used by researchers, instructors, and clinicians around the world to conduct experiments, to allow students to get first-hand experience with standard psychological paradigms, and to test patients. I've written a detailed user manual for PEBL that you can download or buy, and I blog about PEBL roughly twice a month. The image on the right shows a screenshot of one of PEBL's tests, the Tower of London. I have a paper showing how many tasks in the Cognitive Decathlon (see below) have been implemented using PEBL.
  • PEBL User's Manual. I wrote a user's manual for PEBL. It can be downloaded for free, and it is also available for purchase in bound format via Lulu Press. It can also be purchased elsewhere, but the small cut I get of the low purchase price is even smaller there.
  • The Cognitive Decathlon. For DARPA's BICA (Biologically-Inspired Cognitive Architecture) program, I developed what we called the "Cognitive Decathlon", a set of select behavioral measures that would have been used to assess the capability and validity of the artificially intelligent agents being developed for that program. I have argued that the Cognitive Decathlon was a form of the Turing test, intended to measure the embodied intelligence of these agents. As discussed above, I also have a paper showing how parts of the Decathlon have been implemented using PEBL. Subsequently, I have helped organize several symposia related to the BICA concept. The image on the right depicts a high-level organization of the tasks in the Cognitive Decathlon. This work was featured in Wired's Danger Room column, (apparently) the Russian Popular Mechanics, and Wikipedia.
  • Predicting human performance under stress. I am currently conducting research to characterize and model how the physical insults produced by protective gear (as worn by our military, firefighters, police, astronauts, farmers, etc.) impact cognitive functioning. As part of this work, we conduct detailed behavioral experiments to assess the decline in performance caused by aspects of the gear (such as heat strain), and we have developed statistical models using the "Task-Taxon-Task" method to characterize these decrements. A paper I wrote on this won a "Best of conference" award at the BRiMS conference. The figure on the right illustrates one of the points of that paper: aspects of protective gear impact the speed of performance in many different ways, including how you do the task. In this case, people stopped overlapping their movements because they could no longer pick up things by feel alone.
Human Memory

  • Representation and development of knowledge in episodic and semantic memory. I have done research looking at ways to represent knowledge in memory. One concept I explored was the way in which the meanings of concepts are biased by the context in which you see them. To understand these Contextual Semantic representations, I developed the REM-II model of human memory. This Bayesian model of human memory accounts for how experiences form meaningful memories through semantic knowledge, and how semantic knowledge forms through the accrual of meaningful episodic memories. Essentially, REM-II learns through experience and develops representations akin to what LSA produces through matrix algebra and SVD. Furthermore, knowledge in REM-II is represented in ways that allow biased representations to maintain separate meanings in different contexts. I have several conference papers on this work.
  • Phonological Similarity. To measure the phonological similarity that affects performance in immediate serial recall, I developed software tools and techniques called PSIMETRICA (Phonological SIMilarity METRIC Analysis), published in the 2003 article Theoretical implications of articulatory duration, phonological similarity, and phonological complexity on verbal working memory. This paper describes techniques and methods for measuring phonological similarity and the articulatory duration of words, two important factors in predicting immediate serial recall accuracy. The article has been called "The most careful and sophisticated analysis of the roles of spoken duration and phonological similarity in verbal STM" (Baddeley, 2003). The Lisp code for the PSIMETRICA software is available below, and slides from a talk I have given on PSIMETRICA are available as well.
  • Strategy in Verbal Working Memory. My dissertation investigated recall strategies in immediate serial recall using the EPIC Computational Architecture. Results showed that the recency effect, whose magnitude differs greatly across experiments, is modulated by the strategic goals of the research participants. The research on strategy involves both experimental research and computational modeling using the EPIC cognitive architecture.
  • Duration of the verbal short-term memory trace. Standard textbook accounts of short-term memory typically claim that "verbal short-term memory lasts about two seconds". This turns out to have little grounding in the literature, and is probably just a confused interpretation of a simple model. I've shown this conjecture cannot be true, and that a conservative estimate (assuming that memory loss stems only from decay) is at least 4-5 seconds. The image on the right shows inferred decay distributions from a number of different conditions, along with the standard 2-second decay distribution commonly discussed.
Other Research

    The Experiential User Guide

    Intelligent software tools. One of the big problems for users of intelligent software tools is that they do not understand the complex reasoning the system is doing, and so they do not know when to trust the system and when not to. The Experiential User Guide is a way to help people learn about software that uses intelligent processes. It does so by focusing on lessons that illustrate the boundaries of proper and improper tool use, giving the ultimate user a better understanding of the underlying system.

    Alphabetic Letter Similarity

    This project investigates the visual representation of letter stimuli, collecting a number of data sets that report the visual similarity of the English alphabet using a number of different methods. These are available in the Letter Similarity Data Set Archive, and are also summarized in a manuscript that is under revision.

    Tree Distance Measures

    One way to represent the similarity structure of a set of stimuli is with a hierarchical clustering tree. To help evaluate such trees, I have created an R (or S-Plus) implementation of the tree distance metrics described by Boorman & Olivier [Boorman, S. A., and Olivier, D. C. (1973). Metrics on finite trees. Journal of Mathematical Psychology, 10, 26-59.] The C-distance metrics replicate those found in the above paper; the D-metrics do not--this may be an error in the program or in the original paper. The R software is available here.
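The general idea behind such metrics can be illustrated with a short Python sketch that compares two clustering trees by the symmetric difference of the clusters they induce. This is a simple partition-style distance in the spirit of the Boorman & Olivier metrics, not a reimplementation of their C- or D-metrics; the nested-tuple tree representation is my own assumption for illustration.

```python
def clusters(tree):
    """Collect the leaf set of every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):  # a leaf label
            return frozenset([node])
        s = frozenset().union(*(leaves(child) for child in node))
        found.add(s)
        return s

    leaves(tree)
    return found

def partition_distance(t1, t2):
    """Count the clusters present in one tree but not the other."""
    return len(clusters(t1) ^ clusters(t2))

# Two trees over {a, b, c, d} that agree only on the root cluster:
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(partition_distance(t1, t2))  # 4
print(partition_distance(t1, t1))  # 0
```

Identical trees get distance 0, and each disagreeing internal cluster adds to the count, which captures the basic intuition of distances between clustering trees.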

    Select Papers, Posters and Slides from Talks

    Below are links to published articles, as well as slides, posters, and conference papers from different presentation venues. See also my curriculum vitae.

    Decision Making:

    • Mueller, S. T. (2009). A Bayesian Recognitional Decision Model. To appear in Journal of Cognitive Engineering and Decision Making.
    • Rodriguez, J., McClelland, G., Grome, B., Crandall, B., & Mueller, S. T. (2008). Modeling human factors involved in chemical/biological warning and reporting. Chemical Biological Defense (CBD) Physical Science and Technology Conference, New Orleans, LA, November 2008. Winner, Best Research: Information Systems Technology
    • Veinott, E. & Mueller, S. T. (2008). Indecision in the pocket: An analysis of the relative success of fast and slow quarterback passing decisions. Northern California Symposium on Statistics and Operations Research in Sports, Menlo Park, CA, Oct., 2008.

    Semantic Representation in Episodic Memory:

    Measurement and Metrics

    Signal Detection Theory

    Visual Representations of words and letters

    Phonological Similarity and Verbal Working Memory


    I have created a number of software tools to assist in creating, designing and running experiments. They are described below with instructions on how to acquire them.

    PSIMETRICA: Phonological SImilarity METRIC Analysis

    The PSIMETRICA software has been under development and refinement since 1998. It is a series of Lisp routines that allow words to be represented according to their phonological content, and then compared and evaluated along numerous dimensions of phonological similarity. This software is described in the paper above. It includes definitions and measures for many of the classic data sets used to demonstrate the phonological similarity effect in short-term memory.

    Nonword and Word Evaluation and Creation Software

    This software, available for Linux and other Unix platforms, allows for the creation and evaluation of nonwords according to the conditional probabilities of letters in written text. Included are utilities to extract these conditional probabilities from text, routines to generate new nonwords, a list of the words from the CMU machine-readable dictionary, pre-computed conditional probability databases based on the dictionary and the Kucera/Francis corpus, and routines to evaluate these nonwords (or actual words) based on these conditional probabilities to determine their 'wordliness', or regularity according to English letter combination probabilities. Additionally, routines are included that allow Levenshtein (edit) distance to be computed between words, as well as an experimental 'partial' Levenshtein distance. These are used by a final set of utilities that determine the neighborhood distribution of a word or nonword: the distribution of Levenshtein distances between a word and all other words in the dictionary. For those who are unable to run the programs, the distribution contains 1000 sample nonwords of each length from 4 through 10, their wordliness scores, and their neighborhood distributions. Additionally, it contains the same values for all the words in the CMU dictionary.
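The two core computations here, Levenshtein distance and a word's neighborhood distribution, can be sketched in a few lines of Python. This is an illustration only: the released software is a set of Unix utilities built around the full CMU dictionary, and the toy lexicon below stands in for that dictionary.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions
    needed to turn string a into string b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def neighborhood(word, lexicon):
    """Distribution of edit distances from `word` to each word in `lexicon`."""
    dist = {}
    for other in lexicon:
        d = levenshtein(word, other)
        dist[d] = dist.get(d, 0) + 1
    return dist

# Toy lexicon standing in for the CMU dictionary:
words = ["hat", "cot", "dog", "cart"]
print(levenshtein("kitten", "sitting"))  # 3
print(neighborhood("cat", words))        # {1: 3, 3: 1}
```

A word with many close neighbors (many small distances in the histogram) lives in a dense lexical neighborhood, which is exactly the property these utilities report for words and generated nonwords.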

    Similarity-Based List Generation and Stimulus Selection

    This software contains the seven-dimension similarity matrices for the words of the Toronto Noun Pool. These include five measures of phonological similarity developed using PSIMETRICA: onset, nucleus, coda, initial phoneme, and stress similarity. Additionally, a measure of semantic similarity based on LSA is included, as well as a measure of graphemic similarity (edit or Levenshtein distance). Together, these dimensions can be used to select from the noun pool subsets of a specified size that are either similar or different on the various dimensions--for example, varying onset similarity while holding nucleus similarity constant. Included are a number of pre-selected lists, along with their measures on the relevant dimensions. Many of these lists have the virtue of being similar without being 'obvious'; they may not rhyme or share many phonemes, but they are still similar in subtle and useful ways.



    smueller at obereed dot net
    shanem at mtu dot edu

    Personal Information