Until now, we have primarily focused on sensing sound and music as auditory experiences. While hearing is central, it is equally important to recognise that we also perceive sound and music through our bodies. This week, we will explore how the body shapes music perception and cognition. We will begin by examining the body’s role in musical experience, then review key aspects of anatomy and biomechanics, and finally consider different methods for analysing human movement.
Embodied music cognition¶
Recall that you were introduced to theories of embodied music cognition at the beginning of this course. An embodied approach to music cognition emphasises the role of the body in producing, perceiving, and understanding music. Rather than viewing musical experience as only an auditory phenomenon, or as purely mental or abstract, this approach highlights how bodily sensations, movements, and actions are integral to musical meaning.
Some key ideas in embodied music cognition include:
Sensorimotor coupling: Listening to music frequently elicits spontaneous bodily responses—such as tapping your foot, nodding your head, or swaying—demonstrating the tight connection between perception and action. This coupling underpins the action–perception loop, where perceiving music can trigger movement, and movement, in turn, shapes how music is experienced. Such interactions are fundamental to musical engagement, learning, and expression.
Multimodal integration: Musical experience is inherently multimodal, involving the simultaneous processing and integration of auditory (sound), visual (sight), and kinesthetic (movement and bodily sensation) information. The body acts as a central hub for these sensory streams, allowing us to coordinate what we hear, see, and feel. For example, watching a performer’s gestures can enhance our understanding of musical phrasing, while feeling vibrations or movement can deepen our sense of rhythm and timing.
Gesture and imagery: Expressive body movements—ranging from large gestures to subtle motion—are essential in music performance, aiding in communication, phrasing, and emotional expression. These gestures not only shape the sound produced but also influence how performers and listeners mentally represent music. Even in the absence of overt movement, musicians and listeners often engage in motor imagery: mentally simulating gestures or actions associated with music, which can support learning, memory, and interpretation.
The Belgian systematic musicology professor Marc Leman popularised the term embodied music cognition in the early 2000s. In his book, Embodied Music Cognition and Mediation Technology, he illustrates the different processes in the bodies of both performers and perceivers, and describes how musical intentionality is based on sonic and visual communication between performer and perceiver:

An illustration of Marc Leman’s model of embodied music cognition. (Illustration from Sound Actions).
Embodied music cognition is explored through multiple disciplines, including musicology, psychology, neuroscience, and human movement science. As a result, the literature presents a variety of perspectives and methodologies. Most research in this area relies on empirical studies—systematically collecting and analysing data about musical experiences and bodily responses. This interdisciplinary approach enriches our understanding of how the body shapes musical perception, performance, and meaning.
4E Cognition¶
In recent years, the concept of 4E cognition has gained prominence, building on the “embodied turn” in cognitive science and philosophy of mind. The 4E framework—standing for Embodied, Embedded, Enactive, and Extended cognition—challenges traditional views that treat cognition as a process confined solely to the brain.
Pioneering researchers such as Francisco Varela, Evan Thompson, Alva Noë, and Andy Clark have argued that understanding cognition requires considering the body, the environment, and the dynamic interactions between them. According to the 4E perspective, cognitive processes are:
Embodied: Our perception and understanding of music are deeply rooted in the body. The way we move, breathe, and physically interact with instruments or our environment shapes how we experience and interpret sound. For example, tapping your foot to a beat or feeling vibrations through your body are embodied ways of sensing music.
Embedded: Musical experience is shaped by the context in which it occurs. Our interactions with sound are influenced by the physical and social environment — such as the acoustics of a concert hall, the presence of other listeners, or the cultural context. These factors embed our musical cognition within a broader context.
Enactive: We actively participate in creating musical meaning through our actions. Listening is not a passive act; we anticipate, move, and respond to music, co-constructing the experience. For instance, a performer’s gestures or a listener’s dance movements are examples of enactive engagement with music.
Extended: Tools and technologies can extend our cognitive processes. Musical instruments, recording devices, and even smartphones become part of the system through which we sense, produce, and understand music. Motion capture systems, for example, extend our ability to analyse and reflect on musical movement.
The 4E cognition perspective encourages us to study music as a holistic, interactive process—one that involves the whole person, situated in a specific context, engaging with both physical and digital tools.
Despite its influence, the 4E cognition framework has faced several criticisms and ongoing debates, which are also discussed in The Oxford Handbook of 4E Cognition (2018). Critics point out that the definitions of “embodied,” “embedded,” “enactive,” and “extended” can be vague or overlapping, making it challenging to distinguish 4E cognition from traditional cognitive science in practice. Some question whether there is sufficient empirical evidence for all aspects of the framework, particularly the claim that cognition can be genuinely “extended” into tools or the environment. Others warn against overextending the concept by attributing cognitive status to objects or processes (such as smartphones or musical instruments) that may not truly participate in cognition. Additionally, some argue that 4E approaches can underplay the central role of neural mechanisms emphasised in traditional neuroscience, and philosophical disagreements persist about whether 4E cognition represents a radical shift or merely reframes existing ideas about mind, body, and environment. Nevertheless, the 4E perspective has stimulated valuable interdisciplinary discussion and inspired new research directions in music cognition and related fields.
The body in music performance¶
When it comes to music-related body motion, we can generally distinguish between performers and perceivers. Their roles are distinct, yet both rely on bodily processes to engage with music. Let us begin with performers.
Music performance is inherently physical. Musicians use coordinated actions—goal-directed and time-limited motion sequences—to produce sound, shape musical phrases, and communicate with others. The body acts as both the source and the interpreter of musical ideas, translating intention into audible and visible actions. This physicality is present in all forms of music-making, including singing, playing instruments, conducting, and dancing.
We can categorise music-related body motion in performers into four main types:
Sound-Producing Actions: Movements that directly generate sound on an instrument or interface. These include selection (choosing notes or sounds), excitation (initiating sound, such as striking, plucking, or bowing), and modification (altering sound qualities like pitch or timbre). Examples: pressing piano keys, bowing a violin, or turning a synthesiser knob.
Sound-Facilitating Actions: Movements that support or enhance sound production but do not themselves create sound. These include maintaining posture, shaping phrasing, and entraining the body to rhythm. Examples: a pianist’s arm movement for dynamics, a clarinettist’s breath support, or tapping a foot to keep time.
Sound-Accompanying Actions: Motion that reflects or mimics musical features without producing sound. These include tracing sound contours in the air, mimicking instrumental gestures, or engaging in air performance. Examples: moving a hand upward with rising pitch, playing air guitar, or dancing to music.
Communicative Gestures: Gestures intended to convey meaning, emotion, or instructions to other performers or the audience. These may be expressive, regulatory, or linguistic. Examples: a conductor’s baton movements, a nod to cue an entrance, or expressive hand gestures to convey emotion.
These categories of action and motion can be understood as existing along a continuum of connection to musical sound, as illustrated below:

Relationship between motion and sound (Illustration: Jensenius 2022)
While we distinguish these categories for clarity, in practice, they frequently overlap and interact. For example, sound-facilitating actions are often inseparable from sound-producing actions, and performers may use communicative gestures simultaneously with playing. Recognising this interplay is essential for a holistic understanding of music-related body motion.
The body in music perception¶
Many of the same types of actions can be found in perceivers, that is, people “listening” to music. We put “listening” in quotation marks to emphasise that we experience music with our whole body, not only with our ears, which is at the core of embodied music cognition. Perceivers also make sounds during a performance, whether involuntary (body sounds like breathing) or voluntary (clapping, singing along, etc.).
Music-related perceiver motion can be both voluntary and involuntary, and it plays a significant role in how we experience and understand music. Examples include:
Dancing: Engaging the whole body in rhythmic movement, often in response to the beat, melody, or emotional content of the music. Dance can be highly structured (as in ballroom or folk dance) or spontaneous and improvised. Dancing is a direct way for listeners to embody musical structure, rhythm, and emotion physically, and is a universal aspect of musical cultures.
Air performance: Imitating the actions of playing an instrument or singing, such as “air guitar,” “air drumming,” or lip-syncing. These gestures reflect an embodied connection to the music and can enhance engagement and enjoyment. Air performance allows listeners to simulate the experience of being a performer, reinforcing their understanding of musical gestures and techniques.
Finger-tapping or foot-stamping: Involuntarily or deliberately tapping fingers on a surface, or stamping feet to the beat or rhythm of the music. These actions are widespread, often automatic responses that help listeners synchronise with musical timing and structure. Finger-tapping, in particular, is widely used in research as a behavioural measure of beat perception and sensorimotor synchronisation, providing insight into how the brain and body interact during music listening.
Involuntary swaying or nodding: Subtle body motion, such as swaying or head-nodding, that occurs without conscious intent. These responses are often linked to the brain’s sensorimotor coupling with musical rhythm and can be observed across cultures and age groups. Such movements usually arise spontaneously and reflect the deep integration of auditory and motor systems.
These bodily responses are not merely byproducts of listening—they are integral to musical perception and cognition. Moving to music can enhance memory, emotional response, and even social connection, illustrating the profound connection between the body and sound in musical experiences.
Research shows that bodily engagement while listening to music can aid in rhythm and beat perception by helping listeners internalise and predict rhythmic patterns, making it easier to follow complex or syncopated music. Movement also enhances emotional engagement, intensifying responses such as joy, excitement, or nostalgia. Additionally, gestures and movement support learning and memory of melodies, lyrics, and rhythms, which is why movement is often incorporated into music education. Finally, group movement—such as dancing or clapping together—facilitates social bonding by fostering a sense of unity and shared experience among listeners.
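Finger-tapping data of the kind mentioned above are often summarised as asynchronies, meaning the signed time difference between each tap and the nearest beat. The following Python sketch shows one simple way such a measure could be computed. The function name and the example times are hypothetical, and actual studies typically use more careful tap-to-beat matching and outlier handling.

```python
from statistics import mean, stdev


def tap_asynchronies(tap_times, beat_times):
    """Return the signed offset (in seconds) from each tap to its nearest beat.

    Negative values mean the tap came before the beat.
    """
    asynchronies = []
    for tap in tap_times:
        nearest_beat = min(beat_times, key=lambda beat: abs(beat - tap))
        asynchronies.append(tap - nearest_beat)
    return asynchronies


# Hypothetical data: a metronome at 120 BPM (one beat every 0.5 s) and five taps.
beats = [0.0, 0.5, 1.0, 1.5, 2.0]
taps = [-0.03, 0.46, 0.98, 1.47, 1.96]

offsets = tap_asynchronies(taps, beats)
print("Mean asynchrony (s):", round(mean(offsets), 3))
print("Variability (s):", round(stdev(offsets), 3))
```

In this made-up example every tap falls slightly before its beat, so the mean asynchrony is negative. The standard deviation gives a rough indication of how stable the synchronisation is.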
Anatomy and Biomechanics¶
Before introducing different approaches to motion capture, it is essential to understand the structure of the human body and how it moves. This section provides an overview of anatomy and biomechanics relevant to motion analysis.
Anatomical position and planes¶
Anatomy is the study of the structure of the body. To ensure consistency when describing locations and actions, we use the anatomical position: standing upright, head and eyes forward, arms at the sides with palms facing forward, and feet parallel and pointing ahead.
The human body in the anatomical position, with labelled regions.
The body is divided into regions: head, neck, trunk, upper limbs, and lower limbs, each with further subdivisions. For movement analysis, distinguishing between areas such as the arm and forearm, or the thigh and leg, is essential.
To describe positions and movements in three dimensions, we use anatomical planes—imaginary divisions of the body that provide standard reference points for anatomical terminology and motion analysis. The sagittal plane divides the body into left and right sections (with the median plane being exactly in the middle), the frontal (coronal) plane separates the body into front (anterior) and back (posterior) portions, and the transverse plane divides the body horizontally into upper (superior) and lower (inferior) parts. These planes are essential for accurately describing the direction and type of movement in both clinical and research contexts.

The three main anatomical planes: sagittal (divides left and right), frontal/coronal (divides front and back), and transverse (divides upper and lower parts of the body).
To describe movement directions, the belly button (navel) is often used as a reference point for the whole body, although other anatomical landmarks may be chosen for specific body segments. Movements are typically characterised along three primary axes: medial–lateral (side-to-side or left–right), anterior–posterior (front-to-back), and superior–inferior (up–down). Each axis is perpendicular to one of the anatomical planes (sagittal, frontal, and transverse, respectively), which helps standardise descriptions of motion in both research and clinical contexts.

Common directions of human body motion, illustrated with arrows.
The Muscular System¶
The musculoskeletal system is fundamental for producing and controlling movement, comprising two main components: the muscular system and the skeletal system. The muscular system consists of muscles that act on the skeleton to move or position body parts. In contrast, the skeletal system includes bones, joints, and cartilage that provide structure and protection. In the following section, we will begin by looking at the muscular system.
There are three types of muscle tissue: cardiac (heart), smooth (organs), and skeletal (attached to bones). Skeletal muscles are responsible for voluntary movement.
A skeletal muscle consists of a thick, red muscle belly and narrow, white tendons at each end, which anchor the muscle to bones. When a muscle contracts, it pulls on the tendons, causing the attached bone to move.

When a skeletal muscle contracts, it pulls on the attached bone via tendons, resulting in movement at the joint.
Muscles can only pull, not push, so movement typically involves several muscles working together in coordinated roles. The agonist is the main muscle responsible for generating a specific movement. Synergists assist the agonist by adding force or by stabilising the origin bone; synergists in this stabilising role are sometimes called fixators. In contrast, the antagonist produces the opposite action, allowing for controlled and smooth movement by balancing or resisting the agonist’s force.
With over 600 skeletal muscles, only the major superficial muscles are highlighted here:

Major muscle groups of the human body, shown from the front (left) and back (right).
The Skeletal System¶
The adult skeleton consists of approximately 206 bones, which form the body’s framework. Bones serve as levers for movement and provide attachment points for muscles. Many bones have distinct landmarks—features that serve as sites for muscle attachment and can often be felt on your own body.

Major bones and bone groups of the human body, shown from the front (left) and back (right).
Joints are the connections between bones that allow the skeleton to move. The structure of each joint determines the possible directions and range of motion. Understanding joint movement is essential for analysing how the body produces complex actions, such as those involved in music performance.
Joint movements are commonly described as pairs of opposite actions, always referenced from the anatomical position. These include flexion and extension, which occur in the sagittal plane; abduction and adduction, which take place in the frontal plane; and internal (medial) rotation and external (lateral) rotation, which are movements in the transverse plane.

Major movement types at the joints, including flexion/extension, abduction/adduction, and internal/external rotation. Movements are always described relative to the anatomical position.
Degrees of freedom (DoF) refer to the number of independent directions in which a joint can move. Each DoF represents a specific type of movement (e.g., flexion/extension, abduction/adduction, rotation). Range of motion (RoM) describes how far a joint can move within each DoF, typically measured in degrees. Understanding DoF and RoM is essential for analysing joint function, movement capabilities, and limitations in both everyday activities and specialised tasks, such as music performance.
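To make the idea of range of motion more concrete, the sketch below estimates the angle at a joint from three points (for example shoulder, elbow, and wrist) and takes the difference between the largest and smallest angle as the RoM for that single degree of freedom. The function names and coordinates are hypothetical, and the angle reported is the included angle between the two segments, which is only one of several conventions in use.

```python
import math


def joint_angle(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments towards `proximal` and `distal`."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def range_of_motion(angles):
    """Difference between the largest and smallest observed joint angle."""
    return max(angles) - min(angles)


# Hypothetical positions in metres: the wrist moves while shoulder and elbow stay put.
shoulder = (0.0, 0.0, 1.4)
elbow = (0.0, 0.0, 1.1)
wrist_positions = [(0.0, 0.0, 0.8), (0.0, 0.2, 0.9), (0.0, 0.3, 1.1)]

elbow_angles = [joint_angle(shoulder, elbow, wrist) for wrist in wrist_positions]
print("Elbow angles (degrees):", [round(a, 1) for a in elbow_angles])
print("Range of motion (degrees):", round(range_of_motion(elbow_angles), 1))
```

Here the arm goes from fully extended (the segments form a straight line) to a right angle at the elbow, so the observed RoM in this recording is about 90 degrees.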
Biomechanics: Principles of Human Movement¶
Biomechanics is the study of how mechanical principles apply to living organisms, particularly the human body. Key areas and concepts in biomechanics include:
Statics: Examines bodies at rest or in equilibrium (e.g., standing, holding a posture).
Dynamics: Focuses on bodies in motion (e.g., walking, playing an instrument). It can be subdivided into kinematics and kinetics.
To analyse movement, we often refer to a reference frame, which provides a coordinate system for describing positions and motions. A global reference frame is fixed relative to the environment, such as the laboratory or stage, and serves as an external standard for measuring movement. In contrast, a local reference frame is attached to a specific body segment, such as the hand relative to the forearm, allowing for the analysis of motion in relation to other parts of the body. Using both global and local reference frames enables the creation of precise and context-sensitive descriptions of human movement.
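The relationship between global and local reference frames can be illustrated with a small coordinate transformation. The sketch below is deliberately simplified to two dimensions and assumes that the local frame is described by its origin and a rotation angle in the global frame; the function name and the example values are hypothetical.

```python
import math


def global_to_local(point, origin, angle_deg):
    """Express a 2D point given in global coordinates in a local frame.

    The local frame has its origin at `origin` (global coordinates) and its
    x-axis rotated by `angle_deg` counter-clockwise from the global x-axis.
    """
    theta = math.radians(angle_deg)
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    # Project the offset onto the local x- and y-axes.
    local_x = dx * math.cos(theta) + dy * math.sin(theta)
    local_y = -dx * math.sin(theta) + dy * math.cos(theta)
    return (local_x, local_y)


# Hypothetical example: a forearm frame placed at the elbow, pointing straight up.
elbow_global = (1.0, 1.0)
hand_global = (1.0, 1.3)

hand_local = global_to_local(hand_global, elbow_global, angle_deg=90)
print("Hand in forearm frame:", tuple(round(c, 3) for c in hand_local))
```

In the global frame the hand position depends on where the performer stands in the room. In the local forearm frame it only says that the hand is 0.3 m along the forearm axis, which is often the more useful description when comparing performers or takes.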
Kinematics¶
Kinematics focuses on describing motion—how body parts move—without considering the forces that cause the movement. It answers questions about what moves, where, and how fast.
Key concepts in kinematics include:
Position: The specific location of a point or segment in space, typically given as X, Y, Z coordinates.
Displacement: The straight-line change in position from the starting point to the endpoint (a vector quantity).
Distance: The total length of the path travelled, regardless of direction (a scalar quantity).
Speed: The rate at which an object moves, without regard to direction (scalar).
Velocity: Speed in a particular direction (vector), indicating both how fast and in which direction something moves.
Acceleration: The rate at which velocity changes over time, describing how quickly an object speeds up or slows down.

The figure above shows the difference between displacement (the shortest path from start to end) and distance (the total path travelled).
Kinematic analysis is necessary for understanding movement patterns in music performance, such as tracking the trajectory of a violinist’s bow or the hand movements of a pianist.
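The kinematic quantities listed above can be derived from a series of sampled positions, which is what most motion capture systems provide. The following sketch uses simple finite differences; the function name, the sampling interval, and the path are hypothetical, and real data would normally be filtered before differentiation because differencing amplifies noise.

```python
import math


def kinematics(positions, dt):
    """Derive basic kinematic quantities from evenly sampled 3D positions."""
    pairs = list(zip(positions, positions[1:]))
    # Displacement: straight-line vector from the first to the last position.
    displacement = tuple(end - start for start, end in zip(positions[0], positions[-1]))
    # Distance: total length of the path actually travelled.
    distance = sum(math.dist(p0, p1) for p0, p1 in pairs)
    # Velocity: change in position per time step (a vector for each step).
    velocities = [tuple((b - a) / dt for a, b in zip(p0, p1)) for p0, p1 in pairs]
    # Speed: magnitude of each velocity vector (a scalar).
    speeds = [math.hypot(*v) for v in velocities]
    # Acceleration: change in velocity per time step.
    accelerations = [
        tuple((b - a) / dt for a, b in zip(v0, v1))
        for v0, v1 in zip(velocities, velocities[1:])
    ]
    return {
        "displacement": displacement,
        "distance": distance,
        "velocities": velocities,
        "speeds": speeds,
        "accelerations": accelerations,
    }


# Hypothetical hand path in metres, sampled every 0.1 s: right first, then forward.
hand_path = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.4, 0.0)]
result = kinematics(hand_path, dt=0.1)

print("Distance travelled (m):", round(result["distance"], 3))
print("Displacement magnitude (m):", round(math.hypot(*result["displacement"]), 3))
print("Speeds (m/s):", [round(s, 2) for s in result["speeds"]])
```

Because the hand turns a corner, the distance travelled is longer than the magnitude of the displacement, which mirrors the distinction made in the list above.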
Kinetics¶
Kinetics examines the forces and torques that produce or result from movement, focusing on why and how motion occurs.
Key concepts in kinetics include:
Force: Any push or pull that can alter the motion of a body. Muscles generate internal forces, while external forces include gravity, ground reaction, and contact with objects or instruments.
Torque (Moment): A rotational force that causes a body segment to rotate around a joint or axis. Muscles create torque to move limbs and control posture.
Power: The rate at which work is done or energy is transferred during movement. In music performance, power is evident in actions like striking a drum or bowing a string.
Balance: The ability to maintain the body’s centre of mass over its base of support, both when stationary and during movement. Good balance is crucial for stable and controlled performance.
Centre of Gravity (CoG): The point around which the body’s mass is evenly distributed in all directions, and through which gravity can be considered to act. The position of the CoG changes with posture and movement.
Base of Support: The area beneath the body that provides stability, typically defined by the space between the feet or points of contact with the ground.

A person remains balanced as long as the line of gravity from their CoG falls within their base of support.
Understanding kinetics helps us analyse how musicians generate, control, and coordinate movement, and it also supports injury prevention and performance optimisation.
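Two of the concepts above lend themselves to a small numerical illustration: the balance condition (the line of gravity from the CoG must fall within the base of support) and torque. The sketch below assumes a rectangular base of support and a force acting perpendicular to the lever arm. Both are simplifications, and all names and values are hypothetical.

```python
def is_balanced(cog, base_of_support):
    """Check whether the vertical projection of the CoG lies inside the base of support.

    `cog` is an (x, y, height) tuple in metres; `base_of_support` is given as
    ((x_min, x_max), (y_min, y_max)) on the ground plane.
    """
    x, y, _height = cog
    (x_min, x_max), (y_min, y_max) = base_of_support
    return x_min <= x <= x_max and y_min <= y <= y_max


def torque(force_newtons, lever_arm_metres):
    """Torque magnitude (Nm) for a force acting perpendicular to the lever arm."""
    return force_newtons * lever_arm_metres


# Hypothetical base of support spanned by the feet (metres, ground plane).
feet_area = ((-0.15, 0.15), (-0.10, 0.15))

standing_upright = (0.0, 0.02, 1.0)
leaning_forward = (0.0, 0.30, 0.95)

print("Upright balanced:", is_balanced(standing_upright, feet_area))
print("Leaning balanced:", is_balanced(leaning_forward, feet_area))

# Hypothetical load: holding a 2 kg object 0.3 m from the elbow joint.
weight = 2.0 * 9.81  # force due to gravity in newtons
print("Torque at the elbow (Nm):", round(torque(weight, 0.3), 3))
```

The balance check ignores the height of the CoG because only its projection onto the ground matters for this condition.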
Motion capture¶
There are various ways to study human body motion. While many associate “motion capture” with suits, markers, or sensors, the term can be interpreted more broadly to include any method that systematically records human movement. This encompasses both qualitative and quantitative approaches. In practice, qualitative and quantitative methods are often combined. For example, researchers may use both video and sensors, and analyses may include both interpretive and numerical components. For clarity, this course distinguishes between qualitative and quantitative methods, while acknowledging that mixed-method approaches are also common.
Qualitative approaches¶
Qualitative motion analysis focuses on understanding movement through observation, reflection, and descriptive frameworks rather than numerical measurement.
Introspection: Involves self-reflection on one’s own movement experiences. Musicians and researchers may evaluate their own performance, notice sensations of effort or discomfort, or describe how specific movements feel during the music-making process.
Observation: Entails systematically watching others move—either live or via video—and annotating features of their motion. This can include noting posture, gesture, timing, and expressiveness. While some may not consider observation “proper” motion capture, it is a structured and repeatable way to document movement without specialised technology.
Observation-based methods are widely used in clinical, educational, sports, and artistic settings. The use of video recordings allows for repeated viewing, slow-motion analysis, and collaborative review, making it easier to identify subtle details and patterns.
Music researchers have been inspired by the qualitative analysis methods developed by the dancer and choreographer Rudolf Laban (1879–1958). He created two influential systems in the early to mid-20th century:
Labanotation: A symbolic notation system for recording and analysing human movement, especially in dance. It uses standardised symbols to represent body parts, directions, levels, and timing, enabling detailed documentation of movement sequences.
Laban Movement Analysis (LMA): A comprehensive framework for describing the qualitative aspects of movement. LMA focuses on four main components: body (what moves), effort (how it moves), shape (the form the body takes), and space (where it moves). The “effort” component is particularly relevant in music, describing motion in terms of space (direct/indirect), time (quick/sustained), weight (strong/light), and flow (bound/free).
Qualitative approaches are valuable for capturing the expressive, communicative, and contextual dimensions of movement. They often complement quantitative methods by providing insights into aspects of motion that are difficult to measure numerically, such as emotion, intention, and style.
Quantitative approaches¶
Quantitative methods rely on numerical representations of motion. For example, video can serve as a quantitative tool if features are extracted and measured, rather than just observed. Quantitative analysis often involves plotting measurements and applying statistical or machine learning techniques. We can differentiate between two main types of motion capture: camera-based and sensor-based motion capture. Both approaches have their strengths and limitations, and the choice between them depends on the research context, required precision, and practical considerations.
Camera-based motion capture¶
This approach uses cameras—either standard video cameras or specialised systems (such as infrared, stereo, or depth cameras)—to record and analyse movement. Markers may be placed on the body to help track specific points, or markerless systems can use computer vision algorithms to estimate body positions. Camera-based systems are widely used in biomechanics, animation, and music research because they can capture detailed, full-body motion in three dimensions. However, they often require controlled environments, careful calibration, and can be sensitive to lighting and occlusion.

An example of an infrared, marker-based motion capture system, allowing for precise measurements of the body.
At the University of Oslo, we have multiple camera-based systems available, both at RITMO and at the Department of Musicology.
Sensor-based motion capture¶
This method relies on wearable sensors attached directly to the body. Common sensor types include accelerometers, gyroscopes, and magnetometers, which are often combined in inertial measurement units (IMUs), and sometimes physiological sensors (such as EMG for muscle activity). Sensor-based systems are generally more portable and less dependent on the environment, making them suitable for field studies or situations where cameras are impractical. They can provide precise data on joint angles, acceleration, and orientation, but may require careful placement and calibration, and can be affected by sensor drift or interference.

An example of a sensor-based motion capture suit used in a performance with the Stavanger Symphony Orchestra in 2023.
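The sensor drift mentioned above can be illustrated with a toy simulation. The sketch assumes a perfectly stationary accelerometer whose readings contain a small constant bias, and shows what happens when these readings are integrated twice to estimate position. The bias value, sample rate, and function name are hypothetical, and actual systems typically combine several sensors to limit this effect.

```python
def integrate_twice(accelerations, dt):
    """Estimate position over time by integrating acceleration twice (simple Euler steps)."""
    velocity = 0.0
    position = 0.0
    positions = []
    for acceleration in accelerations:
        velocity += acceleration * dt
        position += velocity * dt
        positions.append(position)
    return positions


# Hypothetical stationary sensor with a constant bias of 0.05 m/s^2, sampled at 100 Hz.
bias = 0.05
sample_rate = 100
dt = 1 / sample_rate
readings = [bias] * (10 * sample_rate)  # ten seconds of data

drift = integrate_twice(readings, dt)
print("Position error after 1 s (m):", round(drift[sample_rate - 1], 3))
print("Position error after 10 s (m):", round(drift[-1], 3))
```

Although the sensor never moves, the estimated position wanders off, and the error grows roughly with the square of the elapsed time. This is one reason why purely inertial position estimates tend to become unreliable over longer recordings.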
Video visualisation¶
Video visualisation can be seen as a middle ground between qualitative and quantitative approaches. It is based on regular video recordings, but aims to extract relevant features from the video stream. Instead of simply watching and describing movement, video visualisation techniques use computational tools to analyse and represent motion data visually.
For example, software can track the position of specific body parts or objects frame by frame, generating plots of movement trajectories, velocity, or acceleration over time. Other techniques include motion history images, which overlay multiple frames to highlight areas of frequent movement, or heatmaps that show where most activity occurs. These visualisations help reveal patterns, timing, and coordination in musical performance or listening that may not be obvious through observation alone.
Video visualisation is particularly useful for identifying subtle or complex movement features, comparing performances, or communicating findings to others. It also allows for the integration of both subjective interpretation and objective measurement, making it a valuable tool in interdisciplinary research on music-related movement.
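A basic building block for many video visualisations is frame differencing: subtracting one video frame from the next so that only the pixels that changed remain. The sketch below works on tiny grayscale frames stored as nested lists so that it runs without external libraries. The function names, threshold, and frames are hypothetical, and practical tools would operate on real video with an image-processing library. Motion history images extend the same idea by accumulating such differences over several frames.

```python
def motion_image(previous_frame, current_frame, threshold=10):
    """Absolute per-pixel difference between two grayscale frames.

    Differences below `threshold` are set to 0 to suppress small noise.
    """
    return [
        [abs(cur - prev) if abs(cur - prev) >= threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous_frame, current_frame)
    ]


def quantity_of_motion(image):
    """Fraction of pixels that changed between the two frames."""
    pixels = [value for row in image for value in row]
    return sum(1 for value in pixels if value > 0) / len(pixels)


# Hypothetical 3x4 frames: a bright "hand" pixel moves one step to the right,
# and one corner pixel flickers slightly due to noise.
frame_a = [
    [0, 0, 0, 0],
    [0, 200, 0, 0],
    [0, 0, 0, 0],
]
frame_b = [
    [3, 0, 0, 0],
    [0, 0, 200, 0],
    [0, 0, 0, 0],
]

difference = motion_image(frame_a, frame_b)
for row in difference:
    print(row)
print("Quantity of motion:", round(quantity_of_motion(difference), 3))
```

Plotting such a quantity of motion value for every frame pair gives a simple curve of overall activity over time, which can then be compared with the music or with the researcher's own observations.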
Summary¶
In summary, the body plays a fundamental role in shaping how we perform, perceive, and analyse music. By considering embodied and 4E cognition frameworks, we recognise that musical experience is not confined to the mind or ears alone, but emerges from dynamic interactions between the body, environment, and technology. Understanding anatomy and biomechanics provides essential context for analysing movement, while both qualitative and quantitative motion capture methods offer complementary insights into the complexities of music-related actions. Integrating these perspectives allows for a richer, more holistic understanding of the interplay between sound, movement, and meaning in musical practice.
Questions¶
What are the four main types of music-related body motion in performers, and how do they differ?
How does the 4E cognition framework expand our understanding of music perception and performance?
What is the difference between kinematics and kinetics in the context of biomechanics?
Describe the advantages and limitations of camera-based versus sensor-based motion capture systems.
Why are both qualitative and quantitative approaches critical in the study of music-related movement?
- Leman, M. (2007). Embodied Music Cognition and Mediation Technology. The MIT Press. doi:10.7551/mitpress/7476.001.0001
- Jensenius, A. R. (2022). Sound Actions: Conceptualizing Musical Instruments. The MIT Press. doi:10.7551/mitpress/14220.001.0001
- Newen, A., De Bruin, L., & Gallagher, S. (Eds.). (2018). The Oxford Handbook of 4E Cognition. Oxford University Press. doi:10.1093/oxfordhb/9780198735410.001.0001