In this opinionated review, I draw attention to some of the contributions reinforcement learning can make to questions in the philosophy of mind. In particular, I highlight reinforcement learning's foundational emphasis on the role of reward in agent learning, and canvass two ways in which the framework may advance our understanding of perception and motivation.
There has recently been much interest in the role of attention in controlling action. The role has been mischaracterized as an element in necessary and sufficient conditions on agential control. In this paper I attempt a new characterization of the role. I argue that we need to understand attentional control in order to fully understand agential control. To fully understand agential control we must understand paradigm exercises of agential control. Three important accounts of agential control—intentional, reflective, and goal-represented control—do not fully explain such exercises. I argue that understanding them requires understanding how deployments of visual attention implement flexible occurrent control, or a capacity to flexibly adjust the degree of control that individuals exercise over their actions. While such deployments of attention are neither necessary nor sufficient for exercising agential control, they constitute an attentional skill for controlling action, understanding which is central to fully understanding agential control. We can appreciate its centrality if we appreciate that this attentional skill for controlling action is plausibly crucial to acting non-negligently.
I propose that the successes and contributions of reinforcement learning urge us to see the mind in a new light, namely, to recognise that the mind is fundamentally evaluative in nature.
Inspired by, and in close relation with, the contributions of this special issue, Kuperberg elegantly links event comprehension, production, and learning. She proposes an overarching hierarchical generative framework for processing events that enables us to make sense of the world around us and to interact with it in a competent manner.
Hohwy et al.’s (2008) model of binocular rivalry (BR) is taken as a classic illustration of predictive coding’s explanatory power. I revisit the account and show that it cannot explain the role of reward in BR. I then consider a more recent version of Bayesian model averaging, which recasts the role of reward in BR in terms of optimism bias. If we accept this account, however, then we must reconsider our conception of perception. On this latter view, I argue, organisms engage in what amounts to policy-driven, motivated perception.
Skilled action typically requires that individuals guide their activities toward some goal. In skilled action, individuals do so excellently. We do not understand well what this capacity to guide consists in. In this paper I provide a case study of how individuals shift visual attention. Their capacity to guide visual attention toward some goal (partly) consists in an empirically discovered sub-system – the executive system. I argue that we can explain how individuals guide by appealing to the operation of this sub-system. Understanding skill and skilled action thus requires appreciating the role of the executive system.
The purpose of this paper is to defend what I call the action-oriented coding theory (ACT) of spatially contentful visual experience. Integral to ACT is the view that conscious visual experience and visually guided action make use of a common subject-relative or 'egocentric' frame of reference. Proponents of the influential two visual systems hypothesis (TVSH), however, have maintained on empirical grounds that this view is false (Milner & Goodale, 1995/2006; Clark, 1999; 2001; Campbell, 2002; Jacob & Jeannerod, 2003; Goodale & Milner, 2004). One main source of evidence for TVSH comes from behavioral studies of the comparative effects of size-contrast illusions on visual awareness and visuomotor action. This paper shows that not only is the evidence from illusion studies inconclusive, but there is a better, ACT-friendly interpretation of the evidence that avoids serious theoretical difficulties faced by TVSH.
This paper aims to conceptualize the phenomenology of attentional experience as ‘embodied attention.’ Current psychological research, in describing attentional experiences, tends to apply the so-called spotlight metaphor, according to which attention is characterized as the illumination of certain surrounding objects or events. In this framework, attention is not seen as involving our bodily attitudes or modifying the way we experience those objects and events. It is primarily conceived as a purely mental and volitional activity of the cognizing subject. Against this view, the phenomenology of Maurice Merleau-Ponty shows that attention is a creative activity deeply linked with bodily movements. This paper clarifies and systematizes this view and brings it into dialogue with current empirical findings as well as with current theoretical research on embodied cognition. By doing this, I spell out three main claims about embodied attention: the transcendentalism of embodiment for attention, the bodily subjectivity of attention, and the creativity of embodied attention.
A theory of the structure and cognitive function of the human imagination that attempts to do justice to traditional intuitions about its psychological centrality is developed, largely through a detailed critique of the theory propounded by Colin McGinn. Like McGinn, I eschew the highly deflationary views of imagination, common amongst analytical philosophers, that treat it either as a conceptually incoherent notion, or as psychologically trivial. However, McGinn fails to develop his alternative account satisfactorily because (following Reid, Wittgenstein and Sartre) he draws an excessively sharp, qualitative distinction between imagination and perception, and because of his flawed, empirically ungrounded conception of hallucination. His arguments in defense of these views are rebutted in detail, and the traditional, passive, Cartesian view of visual perception, upon which several of them implicitly rely, is criticized in the light of findings from recent cognitive science and neuroscience. It is also argued that the apparent intuitiveness of the passive view of visual perception is a result of mere historical contingency. An understanding of perception (informed by modern visual science) as an inherently active process enables us to unify our accounts of perception, mental imagery, dreaming, hallucination, creativity, and other aspects of imagination within a single coherent theoretical framework.
The temporal structure of behavior contains a rich source of information about its dynamic organization, origins, and development. Today, advances in sensing and data storage allow researchers to collect multiple dimensions of behavioral data at a fine temporal scale both in and out of the laboratory, leading to the curation of massive multimodal corpora of behavior. However, along with these new opportunities come new challenges. Theories are often underspecified as to the exact nature of these unfolding interactions, and psychologists have limited ready-to-use methods and training for quantifying structures and patterns in behavioral time series. In this paper, we will introduce four techniques to interpret and analyze high-density multimodal behavior data, namely, to: (1) visualize the raw time series, (2) describe the overall distributional structure of temporal events (burstiness calculation), (3) characterize the nonlinear dynamics over multiple timescales with Chromatic and Anisotropic Cross-Recurrence Quantification Analysis (CRQA), and (4) quantify the directional relations among a set of interdependent multimodal behavioral variables with Granger Causality. Each technique is introduced in a module with conceptual background, sample data drawn from empirical studies, and ready-to-use Matlab scripts. The code modules showcase each technique’s application with detailed documentation to allow more advanced users to adapt them to their own datasets. Additionally, to make our modules more accessible to beginner programmers, we provide a “Programming Basics” module that introduces common functions for working with behavioral time-series data in Matlab. Together, the materials provide a practical introduction to a range of analyses that psychologists can use to discover temporal structure in high-density behavioral data.
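The burstiness calculation mentioned in step (2) is not spelled out in the abstract. As a rough illustration (the paper ships its own Matlab scripts, which this does not reproduce), the standard Goh-Barabási burstiness statistic over inter-event intervals can be sketched in Python:

```python
import statistics

def burstiness(event_times):
    """Goh-Barabasi burstiness of a point process.

    B = (sigma - mu) / (sigma + mu) over inter-event intervals:
    B is -1 for a perfectly regular event train, near 0 for a
    Poisson process, and approaches 1 for highly bursty timing.
    """
    intervals = [t1 - t0 for t0, t1 in zip(event_times, event_times[1:])]
    mu = statistics.mean(intervals)
    sigma = statistics.pstdev(intervals)
    return (sigma - mu) / (sigma + mu)

# A perfectly regular train of events has burstiness -1.
print(burstiness([0, 1, 2, 3, 4, 5]))  # -1.0
# Clustered events followed by a long gap score above 0.
print(burstiness([0, 0.1, 0.2, 10]))
```

A single scalar like this summarizes whether behavioral events (e.g., fixations or manual actions) cluster in time or arrive at a steady rate, which is why it is useful as a first-pass description of a behavioral time series.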
Joint attention has been extensively studied in the developmental literature because of overwhelming evidence that the ability to socially coordinate visual attention to an object is essential to healthy developmental outcomes, including language learning. The goal of this study was to understand the complex system of sensory-motor behaviors that may underlie the establishment of joint attention between parents and toddlers. In an experimental task, parents and toddlers played together with multiple toys. We objectively measured joint attention—and the sensory-motor behaviors that underlie it—using a dual head-mounted eye-tracking system and frame-by-frame coding of manual actions. By tracking the momentary visual fixations and hand actions of each participant, we precisely determined just how often they fixated on the same object at the same time, the visual behaviors that preceded joint attention and manual behaviors that preceded and co-occurred with joint attention. We found that multiple sequential sensory-motor patterns lead to joint attention. In addition, there are developmental changes in this multi-pathway system evidenced as variations in strength among multiple routes. We propose that coordinated visual attention between parents and toddlers is primarily a sensory-motor behavior. Skill in achieving coordinated visual attention in social settings—like skills in other sensory-motor domains—emerges from multiple pathways to the same functional end.
This paper contrasts two enactive theories of visual experience: the sensorimotor theory (O’Regan and Noë, Behav Brain Sci 24(5):939–1031, 2001; Noë and O’Regan, Vision and mind, 2002; Noë, Action in perception, 2004) and Susan Hurley’s (Consciousness in action, 1998, Synthese 129:3–40, 2001) theory of active perception. We criticise the sensorimotor theory for its commitment to a distinction between mere sensorimotor behaviour and cognition. This is a distinction that is firmly rejected by Hurley. Hurley argues that personal level cognitive abilities emerge out of a complex dynamic feedback system at the subpersonal level. Moreover, reflection on the role of eye movements in visual perception establishes a further sense in which a distinction between sensorimotor behaviour and cognition cannot be sustained. The sensorimotor theory has recently come under critical fire (see e.g. Block, J Philos CII(5):259–272, 2005; Prinz, Psyche, 12(1):1–19, 2006; Aizawa, J Philos CIV(1), 2007) for mistaking a merely causal contribution of action to perception for a constitutive contribution. We further argue that the sensorimotor theory is particularly vulnerable to this objection in a way that Hurley’s active perception theory is not. This presents an additional reason for preferring Hurley’s theory as providing a conceptual framework for the enactive programme.
Displays of eye movements may convey information about cognitive processes but require interpretation. We investigated whether participants were able to interpret displays of their own or others' eye movements. In Experiments 1 and 2, participants observed an image under three different viewing instructions. Then they were shown static or dynamic gaze displays and had to judge whether the display showed their own or someone else's eye movements and which instruction it reflected. Participants were capable of recognizing the instruction reflected in their own and someone else's gaze display. Instruction recognition was better for dynamic displays, and only this condition yielded above-chance performance in recognizing the display as one's own or another person's. Experiment 3 revealed that order information in the gaze displays facilitated instruction recognition when transitions between fixated regions distinguish one viewing instruction from another. Implications of these findings are discussed.
Decision making in any brain is imperfect and costly in terms of time and energy. Operating under such constraints, an organism could be in a position to improve performance if an opportunity arose to exploit informative patterns in the environment being searched. Such an improvement of performance could entail both faster and more accurate (i.e., reward-maximizing) decisions. The present study investigated the extent to which human participants could learn to take advantage of immediate patterns in the spatial arrangement of serially presented foods such that a region of space would consistently be associated with greater subjective value. Eye movements leading up to choices demonstrated rapidly induced biases in the selective allocation of visual fixation and attention that were accompanied by both faster and more accurate choices of desired goods as implicit learning occurred. However, for the control condition with its spatially balanced reward environment, these subjects exhibited preexisting lateralized biases for eye and hand movements (i.e., leftward and rightward, respectively) that could act in opposition not only to each other but also to the orienting biases elicited by the experimental manipulation, producing an asymmetry between the left and right hemifields with respect to performance. Potentially owing at least in part to learned cultural conventions (e.g., reading from left to right), the findings herein particularly revealed an intrinsic leftward bias underlying initial saccades in the midst of more immediate feedback-directed processes for which spatial biases can be learned flexibly to optimize oculomotor and manual control in value-based decision making. The present study thus replicates general findings of learned attentional biases in a novel context with inherently rewarding stimuli and goes on to further elucidate the interactions between endogenous and exogenous biases.
This study investigated an evolutionary-adaptive explanation for the cultural ubiquity of choreographed synchronous dance: that it evolved to increase interpersonal aesthetic appreciation and/or attractiveness. In turn, it is assumed that this may have facilitated social bonding and therefore procreation between individuals within larger groups. In this dual-dancer study, individuals performed fast or slow hip-hop choreography to fast-, medium-, or slow-tempo music; when paired laterally, this gave rise to split-screen video stimuli in which there were four basic categories of dancer and music synchrony: synchronous dancers, synchronous music; synchronous dancers, asynchronous music; asynchronous dancers, one dancer synchronous with music; and asynchronous dancers, asynchronous music. Participants’ pupil dilations and aesthetic appreciation of the dancing were recorded for each video, with the expectation that these measures would covary with levels of synchronization. While results were largely consistent with the hypothesis, the findings also indicated that interpersonal aesthetic appreciation was driven by a hierarchy of synchrony between the dancers: stimuli in which only one dancer was synchronous with the music were rated lower than stimuli in which the dancers were asynchronous with each other and with the music; i.e., stimuli in which the dancers were unequal were judged less favorably than those in which the dancers were equal, albeit asynchronously. Stimuli in which all elements were synchronous, dancers and music, were rated highest and, in general, elicited greater pupil dilations.
Embodiment matters to perception and action. Beyond the triviality that, under normal circumstances, we need a body in order to perceive the world and act in it, our particular embodiment, right here, right now, both enables and constrains our perception of possibilities for action. In this chapter, we provide empirical support for the idea that the structural and morphological features of the body can narrow the set of our possible interactions with the environment by shaping the way we perceive the possibilities for action provided. We argue that this narrowing holds true in the perception of what we call strongly embodied affordances, that is, relevant micro-affordances that have a genuinely demanding characteristic, as well as in the perception of actions performed by others. In particular, we show that perceptual contents are shaped by fine-grained morphological features of the body, such as specific hand-shapes, and that they change according to our possibility to act upon them with this body, in this situation, at this moment. We argue that these considerations provide insight into distinguishing a variety of experienced affordance relations that can aid us in better understanding the relevance of embodiment for perception and experience.
Human performance in natural environments is deeply impressive, and still much beyond current AI. Experimental techniques, such as eye tracking, may be useful to understand the cognitive basis of this performance, and “the human advantage.” Driving is a domain where these techniques may be deployed, in tasks ranging from rigorously controlled laboratory settings through high-fidelity simulations to naturalistic experiments in the wild. This research has revealed robust patterns that can be reliably identified and replicated in the field and reproduced in the lab. The purpose of this review is to cover the basics of what is known about these gaze behaviors, and some of their implications for understanding visually guided steering. The phenomena reviewed will be of interest to those working on any domain where visual guidance and control with similar task demands is involved. The paper is intended to be accessible to the non-specialist, without oversimplifying the complexity of real-world visual behavior. The literature reviewed will provide an information base useful for researchers working on oculomotor behaviors and physiology in the lab who wish to extend their research into more naturalistic locomotor tasks, or researchers in more applied fields who wish to bring aspects of the real-world ecology under experimental scrutiny. As part of a Research Topic on Gaze Strategies in Closed Self-paced Tasks, this aspect of the driving task is discussed. It is in particular emphasized why it is important to carefully separate the visual strategies of driving from visual behaviors relevant to other forms of driver behavior. There is always a balance to strike between ecological complexity and experimental control. One way to reconcile these demands is to look for natural, real-world tasks and behavior that are rich enough to be interesting yet sufficiently constrained and well-understood to be replicated in simulators and the lab.
This ecological approach to driving as a model behavior and the way the connection between “lab” and “real world” can be spanned in this research is of interest to anyone keen to develop more ecologically representative designs for studying human gaze behavior.
In contrast to Constructivist Views, which construe perceptual cognition as an essentially reconstructive process, this article recommends the Deictic View, which grounds perception in perceptual-demonstrative reference and the use of deictic tracking strategies for acquiring and updating knowledge about individuals. The view raises the problem of how sensory-motor tracking connects to epistemic and integrated forms of tracking. To study the strategies used to solve this problem, we report a study of the ability to track distal individuals when only their directions can be perceived and not their locations. We introduce a new experimental paradigm named the 'Modified Traveling Salesman Problem' (MTSP), which requires subjects to visit n invisible targets in a 2D display once each. Surprisingly, subjects are competent at this task for up to 10 targets. We consider two types of tracking strategies that subjects might use: 'location-based' strategies and 'deictic direction-based' strategies. A number of observations suggest that subjects used the latter, at least for larger numbers of targets. We hypothesize that subjects used perceptual-demonstrative reference and deictic strategies (i) to perform the sensory-motor tracking of directional segments, (ii) to bind the segments with their updated status in the task, and (iii) to perform the epistemic tracking of invisible targets by means of perception-based inferences.
In recent years, a multitude of datasets of human–human conversations has been released for the main purpose of training conversational agents based on data‐hungry artificial neural networks. In this paper, we argue that datasets of this sort represent a useful and underexplored source to validate, complement, and enhance cognitive studies on human behavior and language use. We present a method that leverages the recent development of powerful computational models to obtain the fine‐grained annotation required to apply metrics and techniques from Cognitive Science to large datasets. Previous work in Cognitive Science has investigated the question‐asking strategies of human participants by employing different variants of the so‐called 20‐question‐game setting and proposing several evaluation methods. In our work, we focus on GuessWhat, a task proposed within the Computer Vision and Natural Language Processing communities that is similar in structure to the 20‐question‐game setting. Crucially, the GuessWhat dataset contains tens of thousands of dialogues based on real‐world images, making it a suitable setting to investigate the question‐asking strategies of human players on a large scale and in a natural setting. Our results demonstrate the effectiveness of computational tools to automatically code how the hypothesis space changes throughout the dialogue in complex visual scenes. On the one hand, we confirm findings from previous work on smaller and more controlled settings. On the other hand, our analyses allow us to highlight the presence of “uninformative” questions (in terms of Expected Information Gain) at specific rounds of the dialogue. We hypothesize that these questions fulfill pragmatic constraints that are exploited by human players to solve visual tasks in complex scenes successfully.
Our work illustrates a method that brings together efforts and findings from different disciplines to gain a better understanding of human question‐asking strategies on large‐scale datasets, while at the same time posing new questions about the development of conversational systems.
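The Expected Information Gain metric used above to flag “uninformative” questions has a standard definition: the entropy of the current distribution over candidate referents minus the expected entropy after hearing the answer. As a minimal sketch (assuming, for illustration only, a uniform prior over equally likely candidates and yes/no answers, not the paper's actual annotation pipeline):

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def expected_information_gain(n_hypotheses, n_yes):
    """EIG of a yes/no question that is true of n_yes out of
    n_hypotheses equally likely candidate referents.

    EIG = H(prior) - E[H(posterior)]; with a uniform prior the
    posterior after each answer is uniform over the consistent set.
    """
    p_yes = n_yes / n_hypotheses
    prior = [1 / n_hypotheses] * n_hypotheses
    h_yes = math.log2(n_yes) if n_yes else 0.0
    h_no = math.log2(n_hypotheses - n_yes) if n_hypotheses > n_yes else 0.0
    return entropy(prior) - (p_yes * h_yes + (1 - p_yes) * h_no)

# A question that halves an 8-candidate space gains the maximal 1 bit;
# a question true of every candidate is uninformative (0 bits).
print(expected_information_gain(8, 4))  # 1.0
print(expected_information_gain(8, 8))  # 0.0
```

Under this definition, an "uninformative" question at a given round is simply one whose expected answer leaves the hypothesis-space entropy essentially unchanged.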
We study the claim that multisensory environments are useful for visual learning because nonvisual percepts can be processed to produce error signals that people can use to adapt their visual systems. This hypothesis is motivated by a Bayesian network framework. The framework is useful because it ties together three observations that have appeared in the literature: (a) signals from nonvisual modalities can “teach” the visual system; (b) signals from nonvisual modalities can facilitate learning in the visual system; and (c) visual signals can become associated with (or be predicted by) signals from nonvisual modalities. Experimental data consistent with each of these observations are reviewed.
New technologies and new ways of thinking have recently led to rapid expansions in the study of perceptual learning. We describe three themes shared by many of the nine articles included in this topic on Integrated Approaches to Perceptual Learning. First, perceptual learning cannot be studied on its own because it is closely linked to other aspects of cognition, such as attention, working memory, decision making, and conceptual knowledge. Second, perceptual learning is sensitive to both the stimulus properties of the environment in which an observer exists and to the properties of the tasks that the observer needs to perform. Moreover, the environmental and task properties can be characterized through their statistical regularities. Finally, the study of perceptual learning has important implications for society, including implications for science education and medical rehabilitation. Contributed articles relevant to each theme are summarized.
Visual learning has been intensively studied in higher mammals, both during development and in adulthood. Less clear are the extent and properties of the plasticity that remains after permanent damage to the adult visual system. Answering this question is important. Aside from improving our understanding of visual processing in the absence of an intact visual circuitry, such knowledge is essential for the development of effective therapies to rehabilitate the increasing number of people who suffer the functional consequences of damage at different levels of their visual cortical hierarchy. This review summarizes the known characteristics of visual learning after adult visual cortex damage and begins to dissect some of the neural correlates of this process.
The efficient coding hypothesis posits that sensory systems are tuned to the regularities of their natural input. The statistics of natural image databases have been the topic of many studies, which have revealed biases in the distribution of orientations that are related to neural representations as well as behavior in psychophysical tasks. However, commonly used natural image databases contain images taken with a camera with a planar image sensor and limited field of view. Thus, these images do not incorporate the physical properties of the visual system and its active use reflecting body and eye movements. Here, we investigate quantitatively whether the active use of the visual system influences image statistics across the visual field by simulating visual behaviors in an avatar in a naturalistic virtual environment. Images with a field of view of 120° were generated during exploration of a virtual forest environment both for a human and cat avatar. The physical properties of the visual system were taken into account by projecting the images onto idealized retinas according to models of the eyes' geometrical optics. Crucially, different active gaze behaviors were simulated to obtain image ensembles that allow investigating the consequences of active visual behaviors on the statistics of the input to the visual system. In the central visual field, the statistics of the virtual images matched photographic images regarding their power spectra and a bias in edge orientations toward cardinal directions. At larger eccentricities, the cardinal bias was superimposed with a gradually increasing radial bias. The strength of this effect depends on the active visual behavior and the physical properties of the eye. There were also significant differences between the upper and lower visual field, which became stronger depending on how the environment was actively sampled.
Taken together, the results show that quantitatively relating natural image statistics to neural representations and psychophysical behavior requires taking into account not only the structure of the environment but also the physical properties of the visual system and its active use in behavior.