The Big Question

The big question is: "What is Music?".

It seems an easy question to answer, because we are all so familiar with music. But being familiar with something is not the same thing as knowing what it is.

When I ask myself this question about music, I ask it as a scientific question. So I want an answer which relates music to our scientific understanding of the world. And I want an answer that makes predictions about music (and for bonus points, it should make predictions about other things besides music).

I don't know for sure that I've got any closer to answering this question yet. Over the years (and the decades), I've had some promising ideas. Some of those ideas still seem promising, and others now seem less promising.

I'm fairly sure that no one else has yet answered the question. The production of new music that people want to listen to continues to be more of an art than a science – which means that even the people making music don't really know what it is.

So, let me introduce myself. My name is Philip Dorrell, and I am a music science addict.

The Daydream Hypothesis: Music Creates an Altered State of Mind

My current major hypothesis about the nature of music is that music perception is not actually the perception of anything at all, and that music is a trigger for an altered state of mind, where the altered state of mind is the only thing that matters.

And how does music alter the mind?

When you listen to music that you like, your brain forms a strong connection to the music. The music has no intrinsic meaning, but, nevertheless, you connect to it. As a result of this connection to the music, your brain partially disconnects from immediate reality. And as a result of this partial disconnection, your brain more fully experiences the emotions of daydreams.

This hypothesis implies that the specific nature of music itself is somewhat arbitrary – that the brain's criterion for deciding what is and what isn't musical is arbitrary.

It does seem odd that something as apparently complex as music has evolved as a purely arbitrary signal to cause the brain to enter a temporarily altered state. However, the complexity of music may result from some relatively simple set of rules that define that music. Also, the criterion for music may have evolved as some simple alteration to an existing criterion for something else that the brain responds to – this is the hypothesis of the musical precursor.

Given the many similarities between music and speech, the most likely candidate for such a precursor is a component of speech perception. That is, our response to music is a mutated copy of some component of our response to speech. (For a more detailed analysis of what the precursor might be, see Music: The Third Reality.)

Important Ideas about Music

The study of music is part of biology.

Music exists because people create it, perform it and listen to it. People are living organisms, and biology is the study of living organisms.

Any theory of music that claims to be complete must be able to pass the Luxury Yacht Test.

If you succeeded in developing a complete theory of music, you would be able to use that theory to compose strong original music, which you could then sell, and use the proceeds to purchase a luxury yacht. Be suspicious of anyone claiming to completely understand what music is who does not own a luxury yacht. (And no, I do not own a luxury yacht. It follows that I do not yet have a full theory of music. I can claim only that my various hypotheses are plausible and that they perhaps explain more about music than anyone else's theories.)

The human brain is an information processing system.

An information processing system has four basic components: input, output, calculation and storage. Applying this framework to the analysis of music, music appears to represent the input. What kind of information is the output, and what does it mean? How is it calculated?

(If, as the Daydream Hypothesis implies, music is a somewhat arbitrary signal, then we cannot determine that the content of music has any intrinsic meaning, beyond the fact that it constitutes a signal.)

Music has multiple aspects, somewhat analogous to the multiple aspects of speech perception.

Music has melody and rhythm. Speech also has "melody" and "rhythm", but the melody and rhythm of speech lack the discreteness and regularity of the melody and rhythm of music.

Hypothesis: Every aspect of music is analogous to some aspect of speech, and the analogy is the same analogy in each case.

By investigating each aspect of music, we can make an intelligent guess as to the nature of the cortical map which responds to the musical aspect, and then we can determine what the response of that same cortical map would be to speech, and finally we can determine what role the cortical map plays in the perception of ordinary speech.

"Normal" stimuli for specific aspects of speech perception may lack properties of corresponding aspects of music.

Musical harmony consists of simultaneous pitch values, yet perception of simultaneous pitch values from multiple melodies has no relevance to speech perception (i.e. we almost always only listen to one person speaking at a time). The normal function of the cortical map that responds to consonant relationships between different notes occurring at the same time within harmonies and chords must be the perception of consonant relationships between pitch values occurring at different times within the same speech melody.

Normal speech melody is not constructed from musical notes selected from a musical scale. The normal function of the cortical map that responds to discontinuous musical melodies constructed from musical scales must be the perception of continuous speech melody.

The rhythm of speech is not as regular and structured as the rhythms of music. The normal function of the cortical maps that respond to the regular rhythms of music must be the perception of irregular speech rhythm.

Dance is an aspect of music.

In other words, dance is not just something which accompanies music; dance actually is music. Music perception is analogous in some way to speech perception, but speech perception is not just the perception of sounds: it also includes perception of the speaker's movements such as facial expressions, body language and hand gestures. Dance, and especially our response to watching other people dance, can be identified as being analogous in the same way to this component of speech perception.

There are at least five and possibly six symmetries of music.

These are:

  • Pitch translation invariance
  • Time translation invariance
  • Time scaling invariance
  • Amplitude scaling invariance
  • Octave translation invariance
  • Pitch reflection invariance

Each of these symmetries represents an invariance of some aspect of the perceived quality of music under the corresponding set of transformations.

For each symmetry we can ask "Why?" and "How?".

The first four symmetries are functional symmetries in that they satisfy a requirement for invariance of perception, i.e. for each symmetry in this group our perception of speech should be invariant under the set of transformations that define the symmetry. For example, perception of speech melody is invariant under pitch translation so that people with different frequency ranges can speak the same speech melodies, and have those melodies perceived as being the same. The last two are implementation symmetries which play an internal role in the perception of music. (For example, see the next item on octave translation invariance.)
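
As a rough illustration of two of the functional symmetries, the sketch below represents a melody as a list of (pitch, duration) pairs and checks that the sequence of pitch intervals is unchanged by pitch translation, and that the sequence of duration ratios is unchanged by time scaling. The representation (pitch in semitones, duration in beats) and all function names are assumptions made for this example; it is a toy model of what "invariant" means here, not a model of perception.

```python
# Toy melody: (pitch in semitones, duration in beats) pairs.
# The representation and the function names are illustrative assumptions.

def intervals(melody):
    """Pitch differences between successive notes, in semitones."""
    pitches = [p for p, _ in melody]
    return [b - a for a, b in zip(pitches, pitches[1:])]

def duration_ratios(melody):
    """Ratios between successive note durations."""
    durations = [d for _, d in melody]
    return [b / a for a, b in zip(durations, durations[1:])]

def transpose(melody, semitones):
    """Pitch translation: shift every pitch by the same amount."""
    return [(p + semitones, d) for p, d in melody]

def scale_time(melody, factor):
    """Time scaling: multiply every duration by the same factor."""
    return [(p, d * factor) for p, d in melody]

melody = [(60, 1.0), (62, 1.0), (64, 2.0), (60, 0.5)]
higher = transpose(melody, 5)     # same tune, sung higher
slower = scale_time(melody, 2.0)  # same tune, played at half speed

print(intervals(melody) == intervals(higher))              # True
print(duration_ratios(melody) == duration_ratios(slower))  # True
```

Any description of a melody built only from intervals and duration ratios is automatically invariant under these two transformations, which is one way of reading the "how" question for these symmetries.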

In some cases the "how" part of the question has an answer less trivial than one might assume at first. It turns out that we can identify components of speech perception from hypotheses about the existence of cortical maps that respond to aspects of music, and these components reflect the need to achieve perception of speech melody invariant under pitch translation and time scaling.

Octave translation invariance is an implementation symmetry which facilitates the efficient subtraction of pitch values.

Octave translation invariance is the result of splitting the representation of pitch into a precise value modulo octaves and an imprecise absolute value. This split enables the more efficient representation and processing of pitch values, particularly when one pitch value must be "subtracted" from another to calculate interval size.
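
The split described above can be sketched in a few lines. This example assumes that pitch is measured in equal-tempered semitones (so that an octave is 12 units) and models the "imprecise absolute value" simply as a whole-octave number; both choices, and the function names, are assumptions for illustration only. The point it shows is that an interval can be calculated from the precise modulo-octave parts alone, whatever octaves the two pitches lie in.

```python
SEMITONES_PER_OCTAVE = 12  # assumption: pitch measured in equal-tempered semitones

def split_pitch(pitch):
    """Split a pitch into (coarse octave number, precise position within the octave)."""
    return divmod(pitch, SEMITONES_PER_OCTAVE)

def interval_mod_octave(pitch_a, pitch_b):
    """'Subtract' two pitches using only their positions within the octave."""
    _, position_a = split_pitch(pitch_a)
    _, position_b = split_pitch(pitch_b)
    return (position_b - position_a) % SEMITONES_PER_OCTAVE

# The same interval (7 semitones modulo octaves) is found
# regardless of which octave either pitch is in.
print(interval_mod_octave(60, 67))  # 7
print(interval_mod_octave(60, 79))  # 7
print(interval_mod_octave(48, 67))  # 7
```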

Our perception of relative pitch must be calibrated somehow.

This explains the importance of consonant intervals in music perception. Consonant intervals correspond to the intervals between the harmonic components of voiced sounds in human speech, and they provide a natural standard for calibrating the comparison of pitch intervals between different pairs of pitch values. Our accurate ability to calculate and compare pitch intervals enables the pitch translation invariant perception of speech melody.
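
To make the connection between harmonics and consonant intervals concrete, the following sketch lists the intervals between adjacent harmonic components of a voiced sound. It relies only on the standard facts that the nth harmonic has n times the fundamental frequency and that a frequency ratio r spans 12 × log2(r) equal-tempered semitones; the function names are hypothetical.

```python
import math
from fractions import Fraction

def semitones(ratio):
    """Size of a frequency ratio in equal-tempered semitones."""
    return 12 * math.log2(float(ratio))

def adjacent_harmonic_intervals(highest_harmonic):
    """Frequency ratios between adjacent harmonics 1..highest_harmonic."""
    return [Fraction(n + 1, n) for n in range(1, highest_harmonic)]

for ratio in adjacent_harmonic_intervals(6):
    print(f"{ratio.numerator}:{ratio.denominator} "
          f"is about {semitones(ratio):.2f} semitones")
```

The ratios produced are 2:1, 3:2, 4:3, 5:4 and 6:5, which are the octave, perfect fifth, perfect fourth, major third and minor third in just intonation. The fifth, for example, comes out at about 7.02 semitones, close to the 7 semitones of the equal-tempered fifth.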

Musicality corresponds to the occurrence of constant activity patterns in cortical maps.

The regularities of time, pitch and repetition in music cause the cortical maps responding to music to become divided up into active and inactive zones, where the division remains constant for all or part of a tune.

This hypothesis answers two important questions about music.

Firstly it tells us what the analogy is between each aspect of music perception and a corresponding aspect of speech perception, where that analogy is the same in each case.

Secondly, it accounts for the initial development of music perception as a mutated copy of some component of speech perception, where the primary effect of the mutation was to create a response to these patterns of activity and inactivity.

(Even if it is assumed that a single mutation defined the original distinction between speech and music, we cannot rule out the possibility that music perception has evolved other differences over time, and this could account for some of the differences between music and speech which are not readily explained in terms of the Constant Activity Patterns hypothesis.)
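
The idea of a map divided into active and inactive zones can be pictured with a very crude toy model, covering pitch only. The sketch assumes a hypothetical one-dimensional map of pitch modulo octaves, cut into 12 semitone-wide zones, where a zone counts as "active" if any pitch in the sequence falls inside it. None of this is claimed to describe real cortical maps; the zone count, the binning and the function names are all assumptions for illustration.

```python
ZONES = 12  # assumption: one zone per semitone of a pitch-modulo-octave map

def active_zones(pitches):
    """Zones of the hypothetical map touched by a sequence of pitch values."""
    return {int(p % ZONES) for p in pitches}

def inactive_zones(pitches):
    """Zones of the hypothetical map never touched by the sequence."""
    return set(range(ZONES)) - active_zones(pitches)

# A tune built from a scale (C major here) only ever uses certain zones.
tune = [60, 62, 64, 65, 67, 69, 71, 72, 67, 64, 60]

# A speech-like contour glides continuously through all pitch values.
glide = [60 + 0.25 * i for i in range(48)]

print(sorted(active_zones(tune)))     # [0, 2, 4, 5, 7, 9, 11]
print(sorted(inactive_zones(tune)))   # [1, 3, 6, 8, 10]
print(sorted(inactive_zones(glide)))  # []
```

In this toy picture the scale-based tune leaves a fixed set of zones inactive for its whole length, while the gliding contour leaves none, which is the kind of contrast between music and ordinary speech that the hypothesis relies on.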

The question, "What is Music?", and its many answers ...

Historical Interest Only: The Superstimulus Hypothesis

My initial "big idea" about music was the Superstimulus Theory, that music is a super-stimulus for some aspect of speech perception.

This theory drove much of my original investigation, and resulted in the development of many specific hypotheses about specific aspects of music.

I wrote a whole book about it.

Yet I no longer believe it to be the correct answer, and it has effectively been replaced by the Daydream Hypothesis.

But many of the more specific hypotheses remain relevant.

What is Music? (front cover)
(Paperback, 324 pages, 6" by 9")

(Note: this historical section has not been edited – it contains the original blurb for my book about the super-stimulus theory.)

What is Music?: Solving a Scientific Mystery is a book by Philip Dorrell which explains a new scientific theory about music: the super-stimulus theory.

The main idea of the theory is that music is a super-stimulus for the perception of musicality, where "musicality" is actually a perceived property of speech. "Musicality" refers to the property of music that determines how "good" it is, how strong an emotional effect it has, and how much we enjoy listening to it.

The theory implies that ordinary speech also has this property, in a manner which may vary as a person speaks. The musicality of speech is much more subtle than that of music, but it provides important information which the listener's brain processes (without conscious awareness of the processing), in order to derive some information about the internal mental state of the speaker. This information is applied to modulate the listener's emotional response to speech, and this accounts for the emotional effect of music.

What distinguishes the super-stimulus theory from all other serious attempts to explain music scientifically is that it starts from a simple assumption that music perception must be an information processing function, and this assumption results in quite specific explanations of how major aspects of music such as scales, regular beat and harmony are processed in the brain. It is the first theory to explain the perception of musical scales without a priori assuming the existence of musical scales. (The theory has to do this, because it is a theory of music perception as an aspect of speech perception, and musical scales do not occur in normal speech.)

The book is now available as a free download.

 Copyright © 2006-2015 Philip Dorrell