Music Is Anti-Conversation

1 July, 2020

Music is the opposite of spontaneous "in the moment" conversation.

When a listener is engaged in spoken conversation, certain processes in their brain are specially activated. When they listen to music, those same processes are fully de-activated.

Some Observations About Conversation

Conversation is when two people talk to each other in an informal unplanned manner. (It can be more than two, but I will stick to the case of two people to keep things simple.)

A conversation involves a mutual exchange of information.

Both parties to a conversation will take part in the conversation on the assumption that there is a mutual benefit.

Of course the interests of different parties are never perfectly aligned, so there is no guarantee that mutual benefit will occur in any particular conversation.

We can assume that a transaction is more likely to be mutually beneficial in those cases where the interests of the parties are naturally aligned, such as conversations between family members, or long-term friends.

But informal conversations can occur between people with less closely aligned interests, such as work colleagues, or people involved in one-off transactions, or even complete strangers who just happen to be in the same place with nothing better to do.

The Happy Case

An idealised view of conversation is that every spoken utterance corresponds to the perfect transfer of information.

Unfortunately, much of the time, a listener cannot trust the truth of what the speaker says.

The speaker may be intentionally lying. Or they may be saying something false without meaning to. They may have been lazy about verifying the truth of what they say. For any number of reasons they could be wrong.

If the listener is not to suffer a disadvantage from listening to what a speaker says, it is quite essential for the listener to evaluate the truth of what was said, ie to form their own opinion of the likely or possible truth of what the speaker says.

Truth Evaluation, and Feedback

The most important function of this truth evaluation is for the listener to protect themselves from believing something to be true just because the speaker said so.

But there is a second important function, which has to do with the interests of the speaker, and which is quite essential to the success of any on-going conversation.

As I already observed, there can be a number of reasons why the speaker might have said something that is not certainly true, and it is not necessarily because the speaker has a malicious intention to deceive.

In most cases, the speaker wants to know to what extent the listener believes what they, the speaker, have just said.

In a normal conversation, the listener does more than just privately evaluate the truth of what has been said.

The listener responds to the speaker, sometimes by giving just a basic acknowledgement, sometimes by elaborating on whatever the speaker said. But in either case, the listener's response provides information to the speaker about the listener's truth evaluation of the speaker's speech.

(Of course the listener can have a malicious intention to deceive, and their response may include a deliberate concealment of the listener's actual beliefs about the truth of what the speaker said. But, in the scenario where both parties are acting cooperatively, the listener will usually give feedback fairly freely.)

Feedback about truth evaluation is an essential part of any conversational transaction.

A speaker says one thing, and then, perhaps after a response from the listener, that speaker says a second thing based on context provided by the first thing they said and the listener's response, if any. If the listener is very skeptical about the first thing said, and doesn't let the speaker know, then the speaker will say the second thing, and they will be largely wasting their time, because the listener does not share the assumption that the first thing is true.

"In the Moment" Spontaneity

I have now considered two different aspects of intention in a conversation, relevant to the issue of mutual benefit:

  1. The speaker's intention to say something which they know or believe to be true, or likely to be true, or at least possibly true.
  2. The listener's intention to provide feedback to the speaker about their belief about the truth of what the speaker said.

There is one more layer of intention that we can consider.

(Potentially the layers of intention go on forever. The third layer plays a special role, because it turns out that this third layer is directly relevant to understanding the nature of music.)

If the listener is willing to give feedback to the speaker about their beliefs about the truth of what the speaker said, then the speaker may choose to exploit this by deliberately saying something, just to find out what the listener believes about it.

This counts as an exploitation, because the speaker is saying something for the sake of getting information out of the listener, and what the speaker says is not a reflection of their own thoughts at the moment they said it. So the speaker is getting information about the listener's current thinking, but the speaker is not providing information about their own current thinking.

If there is some way that the listener can detect speech from the speaker that is spoken with this intention, then it would be beneficial for the listener not to make an effort to evaluate the truth of any such speech.

So, how could such speech be detected?

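The things we are trying to distinguish are speech that reflects what the speaker is actually thinking at the moment they say it, and speech that is spoken only to draw out the listener's response.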

For the sake of the current argument, we don't necessarily have to know precisely how such a distinction might be made.

I will assume, for the moment, that there is some way that a listener can distinguish, to some extent, between speech that is a genuine expression of the speaker's own "in the moment" thoughts about something, and speech which is deliberately spoken for the purpose of observing the listener's response, and which does not necessarily represent anything that the speaker is actually thinking.

Also I will adopt a terminology of spontaneous versus non-spontaneous, ie speech reflecting the speaker's "in the moment" thoughts is spontaneous, and speech deliberately spoken for some other reason is non-spontaneous.

The hypothesis then is that if a listener perceives a speaker's speech to be less spontaneous, then the listener will make less effort to evaluate the truth of what was said.

I have used the word "less" in this hypothesis – recognising that spontaneous and non-spontaneous is more of a continuum than a binary distinction. There will be circumstances where a speaker has good intentions, but at the same time they decide to say something that they have previously planned and even possibly rehearsed. It would be an over-reaction for a listener to completely suppress all evaluation of truth whenever a speaker showed any signs of non-spontaneity in their speech. The important thing is that there is a continuum of motivation, ie the more spontaneous I perceive your speech to be, the more effort I'm willing to make to evaluate the truth of your speech and give you feedback about it.
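
The continuum can be made concrete with a toy model. This is only a sketch, and its assumptions are not part of the hypothesis itself: perceived spontaneity is treated as a score between 0 and 1, and the effort put into truth evaluation is taken to be simply proportional to that score. The function name `truth_evaluation_effort` and the linear relationship are illustrative choices.

```python
def truth_evaluation_effort(perceived_spontaneity: float, max_effort: float = 1.0) -> float:
    """Effort a listener is willing to spend on evaluating truth.

    Assumption (illustrative only): effort grows linearly with perceived
    spontaneity, a score from 0.0 (fully planned) to 1.0 (fully "in the moment").
    """
    if not 0.0 <= perceived_spontaneity <= 1.0:
        raise ValueError("perceived_spontaneity must be between 0 and 1")
    return max_effort * perceived_spontaneity


# A partly rehearsed remark still gets some evaluation;
# only the extreme non-spontaneous case gets none.
for label, spontaneity in [("off the cuff", 1.0),
                           ("partly rehearsed", 0.5),
                           ("fully scripted", 0.0)]:
    print(f"{label}: effort {truth_evaluation_effort(spontaneity):.2f}")
```

Any increasing function would serve equally well here; the only point is that suppression of truth evaluation is gradual rather than all-or-nothing.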

Music

So, what does any of this have to do with music?

I propose that music is an extreme form of non-spontaneous conversation.

What does this mean?

To use a more technical term, music is a superstimulus.

That is, music is a superstimulus for the perception of non-spontaneous conversation.

Whatever quality it is that non-spontaneous conversation has that allows a listener to perceive the non-spontaneity of a speaker's speech, music has more of that quality, to the greatest extent possible.

If music has more of this quality, what effect then does it have on the brain of the listener?

The answer is: music shuts down the process of truth evaluation.

In other words, without music, the listener's brain does this:

  1. Listen to speaker's speech.
  2. Determine meaning of what was said.
  3. Evaluate the truth of the meaning.
  4. Respond to the speaker, with the response including information about the listener's evaluation of the truth of the speaker's speech.

With music, the process is:

  1. Listen to speaker's speech.
  2. Determine meaning of what was said.
  3. THE END

This hypothesis can explain the emotional effects of music, if we insert two additional steps into the original process, relating to the emotional significance of a speaker's speech.

The extra steps are:

  1. Determine emotional significance of the meaning of what the speaker said.
  2. After truth evaluation, reduce the emotional significance in proportion to the likelihood that what the speaker said is not true.

So the full list of steps, without music, is:

  1. Listen to speaker's speech.
  2. Determine meaning of what was said.
  3. Determine the emotional significance of the meaning of what the speaker said.
  4. Evaluate the truth of the meaning.
  5. After truth evaluation, reduce the emotional significance in proportion to the likelihood that what the speaker said is not true.
  6. Respond to the speaker, with the response including information about the listener's evaluation of the truth of the speaker's speech.

And, with music:

  1. Listen to speaker's speech.
  2. Determine meaning of what was said.
  3. Determine the emotional significance of the meaning of what the speaker said.
  4. THE END
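
The two lists above can be sketched as a single function in which the truth-evaluation steps are either run or skipped. This is a toy illustration, not a model of the brain. The names `Utterance`, `ListenerState`, `listen` and `perceived_as_music`, the 0-to-1 scales, and the choice to scale emotion by multiplying it by the estimated probability of truth are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    meaning: str
    emotional_significance: float  # hypothetical scale: 0.0 (none) to 1.0 (maximal)
    probability_true: float        # the listener's own estimate, 0.0 to 1.0


@dataclass
class ListenerState:
    felt_emotion: float
    response: Optional[str]  # None when the listener gives no feedback


def listen(utterance: Utterance, perceived_as_music: bool) -> ListenerState:
    # Steps 1-3: hear the speech, determine its meaning,
    # and determine the emotional significance of that meaning.
    emotion = utterance.emotional_significance

    if perceived_as_music:
        # THE END: truth evaluation never runs, so the emotion stays at full strength.
        return ListenerState(felt_emotion=emotion, response=None)

    # Steps 4-5: evaluate truth, then scale the emotion down accordingly
    # (assumed here to be a simple multiplication by the probability of truth).
    emotion *= utterance.probability_true

    # Step 6: respond, including information about the truth evaluation.
    response = f"I'd put that at about {utterance.probability_true:.0%} likely."
    return ListenerState(felt_emotion=emotion, response=response)


claim = Utterance("we have won the lottery",
                  emotional_significance=0.9, probability_true=0.1)
print(f"without music: {listen(claim, perceived_as_music=False).felt_emotion:.2f}")
print(f"with music:    {listen(claim, perceived_as_music=True).felt_emotion:.2f}")
```

In this sketch an exciting but implausible claim produces very little emotion in ordinary conversation, and the full amount when the truth-evaluation steps are skipped.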

We now have an explanation of how music causes emotion, which is that it prevents the reduction of the emotional significance of the meaning of what a speaker said.

To fully account for the emotional effect of music, we have to explain what happens when no one is actually talking to the listener, or saying any words at all, not even song lyrics.

We can suppose a more general process of calculating the emotional significance of meanings derived from sources other than actual speech.

For example, the content of a film scene. Or the listener's own imagination.

This leads to the following set of steps without music:

  1. From some source, determine a hypothetical meaning – which can be an idea about something, or a hypothetical scenario.
  2. Determine the emotional significance of the hypothetical meaning.
  3. Evaluate the truth of the hypothetical meaning.
  4. After truth evaluation, reduce the emotional significance in proportion to the likelihood that the hypothetical meaning is not true.

And again, with music:

  1. From some source, determine a hypothetical meaning – which can be an idea about something, or a hypothetical scenario.
  2. Determine emotional significance of the hypothetical meaning.
  3. THE END (the perceived emotion remains un-reduced)
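
As a worked example of these generalised steps, the sketch below applies the same arithmetic to meanings that do not come from a speaker at all. The sources, the numbers and the function name `felt_emotion` are invented for illustration, and multiplying by the estimated probability of truth is just one possible reading of "reduce", not something the hypothesis specifies.

```python
def felt_emotion(emotional_significance: float,
                 probability_true: float,
                 music_playing: bool) -> float:
    """Emotion left over once the (optional) truth evaluation has run."""
    if music_playing:
        # Truth evaluation is shut down, so nothing is reduced.
        return emotional_significance
    # Assumed reduction rule: scale by the estimated probability of truth.
    return emotional_significance * probability_true


# (source of the hypothetical meaning, emotional significance, probability it is true)
scenarios = [
    ("film scene", 0.8, 0.0),   # the viewer knows it is fiction
    ("daydream", 0.6, 0.25),    # an imagined scenario that might just happen
]

for source, significance, p_true in scenarios:
    without = felt_emotion(significance, p_true, music_playing=False)
    with_music = felt_emotion(significance, p_true, music_playing=True)
    print(f"{source}: {without:.2f} without music, {with_music:.2f} with music")
```

On this reading, a film scene known to be fiction would carry no emotional weight without music, and its full weight with music.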

Just a Side-Effect, or Also Functional?

This then is my hypothesis about music: music is a superstimulus for an aspect of speech perception, which, as a result, has an emotional effect on the listener.

But does music have any actual purpose in itself?

People have been creating and listening to music for at least 42,000 years (possibly much, much longer, but hard evidence is lacking for any older date).

Also, in the present day, people spend significant time, money and effort on creating and listening to music.

If the only thing we can say about music is that it is a superstimulus for the perception of non-spontaneity of conversational speech, then people are spending large amounts of time, money and effort on creating something so that they can listen to it and their brains can respond by deciding that what they are listening to is not spontaneous conversational speech.

To account for both the length of time that music has existed, as a thing, and for the current-day effort that people put into music, we have to assume that music additionally serves a biological function in itself.

In other words, music has evolved as a side-effect of how conversational speech is perceived, but this side-effect turned out to have its own benefits, and as a result, the phenomenon of music continued to exist.

The specific effect of music is the emotional state that it causes, ie emotion whose strength has not been reduced by truth evaluation.

This suggests that if music has a function, it relates to the benefits of a person being able to experience, at full strength, their own emotional responses to purely hypothetical meanings, where those meanings might come from something that someone else said, or they might come from the listener's own private thoughts and imagination.