What is Music?: Solving a Scientific Mystery is a book by Philip Dorrell which explains a new scientific theory about music: the super-stimulus theory.
The main idea of the theory is that music is a super-stimulus for the perception of musicality, where "musicality" is actually a perceived property of speech. "Musicality" refers to the property of music that determines how "good" it is, how strong an emotional effect it has, and how much we enjoy listening to it.
The theory implies that ordinary speech also has this property, in a manner which may vary as a person speaks. The musicality of speech is much more subtle than that of music, but it provides important information which the listener's brain processes (without conscious awareness of the processing), in order to derive some information about the internal mental state of the speaker. This information is applied to modulate the listener's emotional response to speech, and this accounts for the emotional effect of music.
What distinguishes the super-stimulus theory from all other serious attempts to explain music scientifically is that it starts from a simple assumption that music perception must be an information processing function, and this assumption results in quite specific explanations of how major aspects of music such as scales, regular beat and harmony are processed in the brain. It is the first theory to explain the perception of musical scales without a priori assuming the existence of musical scales. (The theory has to do this, because it is a theory of music perception as an aspect of speech perception, and musical scales do not occur in normal speech.)
The book is now available as a free download.
An alternative hypothesis to the super-stimulus theory is that music perception is not actually the perception of anything at all, and that music is a trigger for an altered state of mind, where the altered state of mind is the only thing that matters.
Under this hypothesis, music 'perception' has evolved from a precursor, which was a component of speech perception (and which may possibly co-exist with music perception). The evolution of music perception from this precursor accounts for the similarities between music and speech, and indeed accounts for the 'constant activity patterns' hypothesis. But the content of music, per se, has no intrinsic meaning, and music does not serve any communicative purpose, even though sometimes it feels that way.
The study of music is part of biology.
Music exists because people create it, perform it and listen to it. People are living organisms, and biology is the study of living organisms.
Any theory of music that claims to be complete must be able to pass the Luxury Yacht Test.
If you succeeded in developing a complete theory of music, you would be able to use that theory to compose strong original music, which you could then sell, and use the proceeds to purchase a luxury yacht. Be suspicious of anyone claiming to completely understand what music is who does not own a luxury yacht. (And no, I do not own a luxury yacht. It follows that the theory revealed in my book is not complete. I claim only that it is plausible and that it explains more about music than anyone else's theories.)
The human brain is an information processing system.
An information processing system has four basic components: input, output, calculation and storage. Applying this framework to the analysis of music, music appears to represent the input. What kind of information is the output, and what does it mean? How is it calculated?
Music is a super-stimulus for the perception of musicality in speech.
Musicality is a perceived attribute of speech, which tells the listener important information about the speaker and the speech. Music is a super-stimulus for this perceived musicality, i.e. music is "speech" that has been contrived to have an unnaturally high level of musicality.
Each aspect of music is a super-stimulus for a corresponding aspect of the perception of musicality of speech.
By investigating each aspect of music, we can make an intelligent guess as to the nature of the cortical map for which the musical aspect is a super-stimulus, and then we can determine what the response of that same cortical map would be to speech, and finally we can determine what role the cortical map plays in the perception of ordinary speech.
"Normal" stimuli for specific aspects of speech perception may lack properties of corresponding musical super-stimuli.
Musical harmony consists of simultaneous pitch values, yet perception of simultaneous pitch values from multiple melodies has no relevance to speech perception (i.e. we almost always listen to only one person speaking at a time). The normal function of the cortical map that responds to consonant relationships between different notes occurring at the same time within harmonies and chords must therefore be the perception of consonant relationships between pitch values occurring at different times within the same speech melody.
Normal speech melody is not constructed from musical notes selected from a musical scale. The normal function of the cortical map that responds to discontinuous musical melodies constructed from musical scales must be the perception of continuous speech melody.
The rhythm of speech is not as regular and structured as the rhythms of music. The normal function of the cortical maps that respond to the regular rhythms of music must be the perception of irregular speech rhythm.
Dance is an aspect of music.
In other words, dance is not just something which accompanies music: dance actually is music. Music is a super-stimulus for aspects of speech perception, but speech perception is not just the perception of sounds: it also includes perception of the speaker's movements, such as facial expressions, body language and hand gestures. Dance can be identified as the super-stimulus for this component of speech perception.
There are at least five and possibly six symmetries of music.
For each symmetry we can ask "Why?" and "How?".
The first four symmetries are functional symmetries in that they satisfy a requirement for invariance of perception, i.e. for each symmetry in this group our perception of speech should be invariant under the set of transformations that define the symmetry. For example, perception of speech melody is invariant under pitch translation so that people with different frequency ranges can speak the same speech melodies, and have those melodies perceived as being the same. The last two are implementation symmetries which play an internal role in the perception of music. (For example see the next item on octave translation invariance.)
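The pitch translation example above can be illustrated with a minimal sketch. It assumes that pitch is measured on a logarithmic (semitone) scale, so that a "translation" corresponds to multiplying every frequency by the same factor. The function names, the 440 Hz reference and the sample contour are hypothetical and chosen only for illustration; they are not taken from the book.

```python
import math

def hz_to_semitones(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hz to semitones relative to ref_hz (log scale)."""
    return 12.0 * math.log2(freq_hz / ref_hz)

def interval_contour(freqs_hz):
    """Return the successive pitch intervals (in semitones) of a melody."""
    pitches = [hz_to_semitones(f) for f in freqs_hz]
    return [b - a for a, b in zip(pitches, pitches[1:])]

# Hypothetical speech-like contour, and the same contour from a higher voice
# (every frequency multiplied by the same factor, i.e. a pitch translation).
low_voice = [110.0, 130.0, 120.0, 165.0]
high_voice = [f * 1.8 for f in low_voice]

contour_low = interval_contour(low_voice)
contour_high = interval_contour(high_voice)

# The absolute pitches differ, but the interval contours agree.
same_melody = all(
    math.isclose(x, y, abs_tol=1e-9) for x, y in zip(contour_low, contour_high)
)
print(same_melody)
```

A representation built from intervals rather than absolute pitch values is unchanged by the translation, which is one simple way in which two speakers with different frequency ranges could be perceived as producing the same melody.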
In some cases the "how" part of the question has a less trivial answer than one might at first assume. It turns out that we can identify components of speech perception from hypotheses about the existence of cortical maps that respond to aspects of music, and these components reflect the need to achieve perception of speech melody that is invariant under pitch translation and time scaling.
Octave translation invariance is an implementation symmetry which facilitates the efficient subtraction of pitch values.
Octave translation invariance is the result of splitting the representation of pitch into a precise value modulo octaves and an imprecise absolute value. This split enables the more efficient representation and processing of pitch values, particularly when one pitch value must be "subtracted" from another to calculate interval size.
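As a rough sketch of this split, the code below separates a pitch into an octave-wrapped value (precise, modulo 12 semitones) and a coarse octave index, and then "subtracts" two pitches using only the wrapped values. The function names and the 440 Hz reference are assumptions made for illustration; the sketch says nothing about how a brain would actually implement the calculation.

```python
import math

def split_pitch(freq_hz, ref_hz=440.0):
    """Split a pitch into (value modulo one octave, coarse octave index).

    The first value lies in the range 0 to 12 semitones; the second says
    which octave (relative to ref_hz) the pitch falls in.
    """
    semitones = 12.0 * math.log2(freq_hz / ref_hz)
    octave = math.floor(semitones / 12.0)
    wrapped = semitones - 12.0 * octave
    return wrapped, octave

def interval_mod_octave(freq_a, freq_b):
    """Interval from freq_a up to freq_b, in semitones, modulo one octave."""
    wrapped_a, _ = split_pitch(freq_a)
    wrapped_b, _ = split_pitch(freq_b)
    return (wrapped_b - wrapped_a) % 12.0

# A fifth (ratio 3:2) and a twelfth (ratio 3:1) differ by exactly one octave,
# so the octave-wrapped subtraction gives the same interval for both.
fifth = interval_mod_octave(220.0, 330.0)
twelfth = interval_mod_octave(220.0, 660.0)
print(fifth, twelfth)
```

Only the wrapped values take part in the subtraction, so the subtraction never needs to handle more than one octave's worth of range; the octave index can be kept separately at lower precision.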
Our perception of relative pitch must be calibrated somehow.
This explains the importance of consonant intervals in music perception. Consonant intervals correspond to the intervals between the harmonic components of voiced sounds in human speech, and they provide a natural standard for calibrating the comparison of pitch intervals between different pairs of pitch values. Our accurate ability to calculate and compare pitch intervals enables the pitch translation invariant perception of speech melody.
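The link between harmonic components and consonant intervals can be checked with simple arithmetic. The sketch below assumes an idealised voiced sound whose harmonics are exact integer multiples of the fundamental frequency, and lists the interval between every pair of the first six harmonics. The function name and the cut-off at six harmonics are arbitrary choices for illustration.

```python
import math
from fractions import Fraction

def harmonic_intervals(max_harmonic=6):
    """Map each distinct frequency ratio between two harmonics to semitones."""
    intervals = {}
    for low in range(1, max_harmonic + 1):
        for high in range(low + 1, max_harmonic + 1):
            ratio = Fraction(high, low)  # reduced automatically, e.g. 6/4 -> 3/2
            intervals[ratio] = 12.0 * math.log2(high / low)
    return intervals

intervals = harmonic_intervals()
for ratio, semitones in sorted(intervals.items()):
    print(f"{ratio.numerator}:{ratio.denominator}  {semitones:6.2f} semitones")
```

The ratios that appear include 2:1 (the octave), 3:2 (the perfect fifth, about 7.02 semitones), 4:3 (the perfect fourth) and 5:4 (the major third), which are intervals traditionally regarded as consonant.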
Musicality corresponds to the occurrence of constant activity patterns in cortical maps.
The regularities of time, pitch and repetition in music cause the cortical maps responding to music to become divided up into active and inactive zones, where the division remains constant for all or part of a tune.
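A toy illustration of the pitch case follows. It is not the book's model: it simply assumes a hypothetical map with twelve octave-wrapped zones, one per semitone, and marks a zone as active whenever a pitch falls in it. The note numbers follow the MIDI convention (60 is middle C), and the tune and glide are invented examples.

```python
def active_zones(pitches, zones_per_octave=12):
    """Return the set of octave-wrapped zones touched by a pitch sequence."""
    return {round(p) % zones_per_octave for p in pitches}

C_MAJOR_ZONES = {0, 2, 4, 5, 7, 9, 11}

# A short invented tune that uses only notes of the C major scale.
tune = [60, 62, 64, 65, 67, 65, 64, 62, 60, 67, 72]

# A continuous glide across one octave, loosely standing in for speech melody.
glide = [60 + 0.5 * i for i in range(25)]

tune_zones = active_zones(tune)
glide_zones = active_zones(glide)

print(sorted(tune_zones))
print(sorted(glide_zones))
```

In this toy model the tune never activates a zone outside the scale, so the division into active and inactive zones stays fixed for as long as the tune remains in that scale, whereas the glide eventually touches every zone and leaves no stable division.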
Constant activity patterns in the speaker's brain are echoed by similar constant activity patterns in the listener's brain.
Occurrence of constant activity patterns in the speaker's brain represents information about the internal mental state of the speaker. One consequence of the perception by the listener of constant activity patterns in the speaker's brain is a reinforcement of the listener's emotional reaction to what the speaker is saying. This accounts for the emotional effect of music.