The Characteristics of Musical Perception and Musical Pleasure

15 August, 2019
A list of characteristics of our perception of music, and our enjoyment of music, against which music may be compared to all other things that we perceive and/or enjoy.

Motivation for Making this List

This article presents a list of characteristics of the perception and enjoyment of music.

The big mystery of music is about what and why: what actually are we perceiving when we perceive music, and why do we enjoy music?

In the context of evolutionary biology, the "why" question amounts to asking what the biological function of music is, ie how does perceiving and enjoying music increase the expected long-term reproductive success of the music listener?

Many attempts have been made to explain music within an evolutionary framework. But, as far as I know, no theory has been proposed that explains all the things in the list I have given here.

If you think you have a theory that explains music, please read this list first.

And then make sure that your theory accounts for every single item in the list.

Thank you.

Some of these are personal and subjective ...

Some of the items in this list are the result of my own personal subjective observations, and I have marked these with a "*".

It may be that some of these subjectively observed characteristics of music are unique to me, or perhaps I have misinterpreted my own self-observations. But to me they are too strong and too important to be ignored, so I have included them in the list that I present here – if it turns out that some of these personal observations do not correspond to other people's experience of music, then so be it.

The Characteristics

Music is pleasurable

Music is something that we enjoy listening to.

Most things that give us pleasure can be related to obvious biological functions that support survival, growth and long-term reproductive success.

It is not at all obvious that music has any specific biological function.

A strong item of music has its own unique perceived quality (quale)*

("Quale" is the singular of the more commonly used word "qualia".) To say that a particular item of music has a quale, means that we can talk about "what it feels like" to perceive that item of music.

The concept of "qualia" is a contentious topic – some people use it to argue that there exist aspects of human consciousness which can not be scientifically explained. Others take the view that the human brain is a physical information processing system (which can be scientifically explained), but that qualia are a real thing, and certain types of information constitute qualia, perhaps because of how they are represented in the brain, or perhaps because of the special role they play in the processing of information in the brain.

Examples of things usually considered to have definite qualia are smells and colours.

Both of these are examples of things where conscious awareness of them is likely, in many situations, to be directly relevant to making conscious decisions about the current circumstances of the individual. Unusual colours and unusual smells are often an indicator that something significant is present that needs to be dealt with urgently.

If immediate relevance is the basic criterion defining qualia, then it is not at all obvious why musical items need to have qualia. It's hard to imagine any uncontrived situation where the identity of a musical item indicates the presence of something that requires urgent conscious attention.

The whole is greater than the sum of its parts*

This follows to some degree from the previous item. The perceived quality of a strong item of music is peculiar to that item of music, and this quality is not constructed in any obvious manner from the perceived qualities of the components of the musical item. Indeed our perception of the parts is strongly infected by the perceived quality of the whole – we perceive the notes and phrases of a strong item of music as having the quality of being the components of that item of music (and not, for example, just being notes from a scale).

The perceived quality of a strong musical item gives that item of music a very strongly perceived identity*

The identity of a strong musical item is derived from its unique perceived quality.

There are two things besides music that I know of where the whole has a strong identity, separate from our perception of the parts: faces and words.

In the case of faces, our ability to reliably identify who a particular face belongs to is a very important biological function. It is not possible to survive in human society if you cannot reliably identify every person you encounter, whether they be a friend, a relative, an enemy or even a complete stranger you just met 5 minutes ago.

In the case of words, the words are units of meaning in language (or at least they are to a first approximation – sometimes the units of meaning are parts of words), and being able to instantly identify words is critical to being able to use language and communicate with other people.

In the case of music, neither of these explanations applies.

The weakness of our perception of the parts

With both faces and words we can consider to what extent the identities of the components are hidden from conscious awareness.

In the case of faces, our conscious awareness of the components of the face, and the relationships between those components, is rather weak, as shown by the general difficulty most people would have when attempting to draw a recognisable picture of a known individual.

On the other hand, in the case of words, there is no particular difficulty in determining the identities of the component consonant and vowel sounds that make up a word.

(There may be some analogy between the difficulty of drawing faces, and the difficulty of learning to perform music well – in both cases our awareness of the wrongness of the whole does not help us to correct our performance of individual components.)

Music has an emotional effect

It is difficult to pin down the precise emotional affect or meaning attached to any particular item of music, but subjectively we are very aware that music somehow either invokes or interacts with our emotions.

For almost anything else that causes an emotional response, the emotion is either due to the actual circumstances we are in (eg I'm hungry and I am unexpectedly given some food, which makes me happy), or in some cases, due to our imagination of some circumstances (eg I'm being told a story, and the emotions I feel relate to what is happening in the story).

Music is unique in that there is no obvious relationship between the emotions invoked or reinforced by a particular item of music and the type of real-world situation that would cause those emotions.

In this respect, music is more like a mind-altering drug than anything else.

To be "strong", a musical item must satisfy a certain set of rules

Music does appear to be constrained by some set of rules, because it is very hard to construct new original strong items of music by a naive process of trial and error. If these rules exist, our knowledge of them is very incomplete, although we do have some partial knowledge, as contained within basic music theory of notes, chords, rhythm etc.

Anyone who casually tries to compose original strong music will soon find out how difficult it is to satisfy those rules, whatever they might be. This applies even to most people who are competent and practised performers of music.

The rules of music are a triple mystery:

  1. We don't know what the biological function of music is.
  2. We don't know what the rules of music are.
  3. We don't know how the unknown rules of music allow music to provide the unknown biological function that music has.

For almost all other things perceived or enjoyed, the associated "rules" usually relate in some comprehensible way to the biological function of the thing in question (or the perception thereof).

Sometimes there are rules that have a certain circular logic in their rationale, in the sense that the rules have to be followed because they are the rules.

Two known examples of this are sexual selection, and languages.

With sexual selection, a male or female is motivated to choose a partner who is the most physically attractive female or male respectively, to maximise the chances that their own children will be the most attractive possible. Some aspects of attractiveness relate directly to other elements of biological fitness, but other aspects arise from this purely circular logic. (The circularity of sexual selection is more obvious in other species, such as, for example, brightly coloured birds, where the bright colouration serves no purpose other than to attract the opposite sex, and is even counter-adaptive, since it makes one more visible to predators.)

In the case of languages, any particular language is defined by a set of rules that describe which constructions are possible within that language, and what the rules are for associating meanings with those constructions. Each individual has to follow the correct rules of the language spoken by those around them, in order to effectively communicate with those other people.

It has been suggested that music exists entirely as a result of sexual selection, and there is indeed some association between music and sexual or romantic activity. However, this association is too weak to explain the total amount of effort that people put into creating and listening to music.

And comparing music to language, music is not a language. Other than in a few special cases, most musical items do not have any well-defined "meaning" that is shared between different people in a manner that allows music to be a system of communication in the way that spoken language is.

Music bears a lot of repetition

Music may be considered a form of entertainment. The major benefit of entertainment is that it prevents us from being bored, and this usually means that we don't want to experience the same content more than once.

Music is quite unusual in how much we are happy to enjoy exactly the same music that we have heard many times before.

We can compare this to jokes and stories, for example, where once is usually enough, and ten times would definitely be too much.

It is easier for a group to produce "good enough" music than an individual

Most people, if they attempt to sing alone, without any technological assistance, will not produce a very musical result.

In contrast, if a group of people is large enough, and they sing together, it is not hard for them to produce something that is worth listening to.

This is mostly the result of the averaging of pitch and time that happens when a group sings, and the fact that acceptable musical quality depends on a level of precision which is very difficult for an individual to achieve.
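The averaging effect can be illustrated with a toy simulation. This is only a sketch under assumptions that are not part of the observations above: each singer's pitch error is taken to be independent and normally distributed, the pitch heard from the group is taken to be the plain average of the singers' pitches, and the 30-cent error size and the function names are made up for the example.

```python
import random
import statistics

def group_pitch_error(n_singers, sigma_cents, rng):
    """Average pitch error (in cents) of one note sung by n_singers together.

    Assumption: each singer's error is an independent normal draw.
    """
    errors = [rng.gauss(0.0, sigma_cents) for _ in range(n_singers)]
    return statistics.fmean(errors)

def spread(n_singers, sigma_cents=30.0, trials=2000, seed=0):
    """Typical size of the group's pitch error, estimated over many notes."""
    rng = random.Random(seed)
    samples = [group_pitch_error(n_singers, sigma_cents, rng) for _ in range(trials)]
    return statistics.pstdev(samples)

solo = spread(1)    # one person singing alone
choir = spread(25)  # twenty-five people singing the same note
print(f"solo spread:  {solo:.1f} cents")
print(f"choir spread: {choir:.1f} cents")
```

Under these assumptions the group's error shrinks roughly in proportion to the square root of the number of singers, so a group of 25 would be about five times more precise than any one of its members. The same reasoning would apply to timing errors.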

The situation with music can be contrasted with that of speech: most of the speech that people want to listen to, and all conversational speech, is produced by one individual speaking by themselves, not by a group.

Also in contrast to the case with music, the quality of speech produced by the average person is quite good enough most of the time, for example to conduct a useful or enjoyable conversation.

Speech-related Characteristics of Music

Some of the more peculiar characteristics of music have to do with how music relates to speech.

Adding speech enhances music. Music does not enhance conversational speech.

The preferred form of popular music is song, which consists of spoken language embedded in music. However music can exist entirely without any lyrics or singing. (Singing can also consist of nonsense speech not made up of real words. And we can enjoy listening to songs where the lyrics are in a language that we do not ourselves understand.)

Music does not enhance normal spoken language. We do not communicate with other people by singing to them. The situations where music is used to enhance or accompany speech, such as religious sermons or ceremonies, or political rallies, are usually very unlike normal conversation. (As it happens, both of those are situations where a certain amount of suspension of disbelief is expected from the audience.)

Music has certain continuous symmetries

Music is invariant under both pitch translation (raising or lowering pitch) and time scaling (faster or slower).

It shares both of these invariances with spoken language (in as much as spoken language has rhythm and "melody").

For both music and spoken language, pitch-translation invariance allows different speakers to utter the same "melodies".
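To make the two invariances concrete, here is a minimal sketch. It assumes, purely for illustration, that a melody can be written as a list of (pitch, duration) pairs, with pitch as a semitone number and duration in beats. The representation and the function names are assumptions of this example, not a claim about how the brain represents music.

```python
def transpose(melody, semitones):
    """Pitch translation: shift every note by the same number of semitones."""
    return [(pitch + semitones, duration) for pitch, duration in melody]

def time_scale(melody, factor):
    """Time scaling: multiply every duration by the same factor."""
    return [(pitch, duration * factor) for pitch, duration in melody]

def shape(melody):
    """The quantities left unchanged by both transformations:
    pitch intervals between consecutive notes, and durations
    relative to the duration of the first note."""
    pitches = [pitch for pitch, _ in melody]
    durations = [duration for _, duration in melody]
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    ratios = [d / durations[0] for d in durations]
    return intervals, ratios

melody = [(60, 1.0), (62, 1.0), (64, 2.0), (60, 0.5)]

# The same tune, five semitones higher and at half the speed.
changed = time_scale(transpose(melody, 5), 2.0)

print(shape(melody) == shape(changed))
```

In this sketch the absolute pitches and durations all change, but the intervals and relative durations do not, which is one way of stating what it means for the tune to remain "the same" item of music.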

Spoken language encodes information in those characteristics not relevant to the identity of musical items

Most of the information in spoken language is encoded by consonant choices and vowel choices.

This choice of consonants and vowels is one thing that does not have any noticeable effect on the perceived musical quality and identity of a musical item.

In other words, we can change the lyrics of a song, but it's still perceived as being the same item of music, with the same musical quality. (This is not to deny that there is a preference for lyrics where the meaning of the lyrics is consistent with the emotional tone of the music.)

Song lyrics are expected to rhyme

Rhyming is one minor exception to my previous statement that choice of vowels and consonants has no effect on the musicality of a song.

It can be noted that with rhyming, the actual choice of specific vowels and consonants doesn't matter – the only thing that matters is that certain vowels and consonants in certain positions in the tune match each other, and that this matching isn't just caused by the words being the same.

Film and Video

Music is often combined with film or video.

These combinations can be divided into two major categories:

  1. Music is added to a film, for example a film that tells a story, to enhance the audience's appreciation of the film.
  2. A video is created to accompany a musical item, to enhance the audience's appreciation of the music.

In the first case, the music is only added to some parts of the film – there are certain parts of films which are strongly enhanced by the addition of music, and there are other parts where music is of little or no benefit.

In the second case, the video fully accompanies the whole musical item. Typically such a video can not be a full narrative film (not even a short one), because, as just stated, only some parts of a narrative film are improved by the addition of music.

Music makes us interested in the thoughts and emotions of a character in a film*

Stronger items of music typically accompany parts of a film where a character is not doing that much, but we know that the person is having thoughts of emotional significance about something that has previously happened, or perhaps about something that is going to happen.

Oddly this effect does not happen in social situations, even if we happen to be hearing a strong item of music at the same time. It is almost as if it is a hidden aspect of our response to music, which is only brought about by the unnatural scenario of a film where we (the audience) feel free to stare at the characters and ponder what is going on inside their heads. The effect is presumably suppressed when we are in the actual physical presence of a person, even though we might be interested in knowing the inner thoughts and emotions of that person.


There is an interaction between music and daydreaming*

There are people, so-called "maladaptive daydreamers", who suffer/enjoy a compulsive addiction to daydreaming, to the extent that their daydreaming interferes with their ability to live what most people would consider a normal life, and where their addiction typically includes, or is triggered by, listening to music.

(I have added a "*" to the heading, because this is something I do subjectively experience myself – a tendency to enjoy certain types of daydreaming while listening to very strong music. As far as I know my own daydreaming does not qualify to be classified as maladaptive, and I suspect that "maladaptive" daydreaming is an exaggerated version of what is actually a normal tendency experienced by many or maybe even most people.)