Music: The Hidden Sense and the Three Ways of Knowing

23 December, 2007

Music perception reveals to us an otherwise hidden sense. The three ways of knowing are genetic knowledge, experiential learning and cultural learning. Cultural learning can be verified or unverified. "Musicality" perception is a hidden sense that increases the reliability of unverified cultural learning.

by Philip Dorrell

(Note to regular readers: this is yet another attempt to explain my latest ideas and theories about the nature and evolutionary history of music.)

Music Perception: The 21st Sense

According to this article at http://www.howstuffworks.com/, there are maybe 20 known human senses.

"Music perception" is not one of the senses given in that list of 20. And if we believe that music is a thing in itself which is quite distinct from anything else that is not music – and most people who like music would regard that as a rather obvious fact about the universe – then we must believe that there is a separate sense of music.

Where is this sense? Most of what we perceive in music is a function of our ability to hear, so our sense of music certainly depends on our sense of hearing. But music is not perceived in the ears.

Music: The Only Brain Sense

It could be argued that the perception of music is really a brain function, just like all our other perceptual capabilities, and as such is not really a sense.

But a case can be made that the perception of music involves the observation within the brain of the brain's own activity patterns, and that this observation is made by non-neuronal brain cells (i.e. glial cells). These non-neuronal "musicality detectors" are like a special type of sensory cell, so the perception of musicality can be considered to be a sense.

For the most part the brain is devoid of sensory cells – if some part of your brain is stimulated (electrically or otherwise), you don't feel that your brain has been stimulated. Instead, what you feel depends on the function of the particular part of the brain that has been stimulated (e.g. poke at a part of your brain that deals with your legs, and what you feel is something within your legs).

So if music perception is truly a sense, then it is unique in that the relevant sensory cells are found within the brain itself.

The Three Ways of Knowing

The three ways of knowing are:

Genetic: your genes "know" something, and what they know was learnt via evolution.
Experiential learning: you learn from experience.
Cultural learning: information is transferred from another person's mind into your own.

Of course to learn you have to know how to learn, and ultimately the experiential and cultural ways of knowing must depend on a form of genetic knowledge, i.e. we learn from experience and we import knowledge from other people's minds because our brains know how to do those kinds of learning, and that knowledge is encoded in our genes.

Cultural learning can be further divided into two sub-categories: verified and unverified. Verified is when you accept information from another person provisionally, and only accept it fully when you have verified it against your own understanding and (perhaps) from your own future experience. Unverified is when you accept information without any further qualification.

Unverified cultural learning seems a very risky proposition, but it creates opportunities that would not otherwise exist. There are some important ideas about how the world works that can only be verified by living one's life according to those ideas, and then seeing what the final result is. Those ideas have to be used before they can be verified.

The Human Cultural Explosion

The downside of unverified cultural learning is that the learner is put at risk. Much of the information that we can receive culturally consists of spoken statements about the nature of reality. But speech can be dishonest. And if a speaker thinks that the listener might believe what is being said, then the temptation exists to speak falsely, in order to manipulate the listener.

This leads to a simple prediction about human evolution:

The development of any possible method to distinguish honest speech from dishonest speech would result in an acceleration of cultural evolution.

The logic of this prediction is as follows:

There is a trade-off between the advantages of unverified cultural learning and the risk of manipulation.
If the risk of manipulation can be minimised, then evolution will favour maximising the degree of unverified cultural learning, so as to receive the maximum possible benefits.
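
The trade-off can be made concrete with a toy expected-value calculation. This is only an illustrative sketch: the payoff numbers, the function name `expected_gain`, and the idea of describing a detector by two acceptance rates are all assumptions made for the example, not part of the theory itself.

```python
def expected_gain(p_honest, benefit, cost,
                  accept_honest=1.0, accept_dishonest=1.0):
    """Expected payoff per statement heard (all inputs are hypothetical).

    p_honest         -- fraction of statements that are honest
    benefit          -- gain from accepting an honest statement
    cost             -- loss from accepting a dishonest statement
    accept_honest    -- probability of accepting an honest statement
    accept_dishonest -- probability of accepting a dishonest statement
    """
    gain = p_honest * accept_honest * benefit
    loss = (1.0 - p_honest) * accept_dishonest * cost
    return gain - loss

# No detector: every statement is accepted without verification.
naive = expected_gain(0.7, benefit=1.0, cost=3.0)

# Imperfect detector: accepts 80% of honest and 40% of dishonest statements.
filtered = expected_gain(0.7, benefit=1.0, cost=3.0,
                         accept_honest=0.8, accept_dishonest=0.4)
```

With these made-up numbers, accepting everything loses on average (`naive` is negative), while even a rough filter makes unverified learning pay (`filtered` is positive). The lower the risk of manipulation, the more unverified learning is worth doing.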

A further qualification can be made, which is that since the goal of this type of learning is to accumulate truth about the world in the long term, and not necessarily to determine immediately the truth or otherwise of every spoken statement, it is only necessary for an "honesty detection" mechanism to work probabilistically, i.e. to be right more often than it is wrong.
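
A small calculation shows why "right more often than wrong" can be enough in the long term. The sketch below assumes, purely for illustration, that the listener pools many independent judgments about the same idea by simple majority, and that each judgment is correct with the same probability `p`. Both the pooling rule and the function name `majority_correct` are hypothetical.

```python
from math import comb

def majority_correct(p, n):
    """Probability that more than half of n independent judgments are right,
    when each judgment is right with probability p. Use an odd n to avoid ties."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

single = majority_correct(0.6, 1)    # one judgment: just p itself
triple = majority_correct(0.6, 3)    # majority of three judgments
many = majority_correct(0.6, 101)    # majority of 101 judgments
```

Under these assumptions a detector that is right only 60% of the time becomes steadily more reliable as judgments accumulate, whereas a detector at exactly 50% never improves, however many judgments are pooled.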

Now the development of such a mechanism may have occurred in steps, and the acceleration of cultural evolution may have followed in corresponding steps. But there is evidence of a major human cultural "explosion" about 70,000 years ago, slightly preceding the emigration of some modern humans from Africa about 60,000 years ago. This suggests, perhaps, the sudden evolution of some major improvement in the mechanisms of cultural learning.

Musicality Perception = Probabilistic Honesty Perception

The latest version of my music theory unites these two ideas: music perception as a within-brain sense, and the need for a probabilistic "honesty sense".

Very briefly, the theory can be stated as follows:

Musicality is a perceived aspect of normal speech, which represents a probabilistic estimate of the speaker's honesty (or at least the lack of any intention to manipulate the listener).
Music is a false and contrived super-stimulus for the perception of the musicality of speech.

To make this theory into a plausible evolutionary account of why we respond to music as we do, there are two major explanatory gaps to be filled.

The first is that the musical sense appears to be operating within the brain of the listener, whereas if there is an intention to monitor the inner workings of the speaker's mind, then the sense should be operating within the speaker's brain. This gap can be closed by assuming that there is some degree of correlation between brain activity when listening to speech sounds and brain activity when making those same speech sounds. In other words, the neural activity patterns in the listener's brain are an approximate echo of the activity patterns in the speaker's, and the echo is good enough to provide useful information about the current mental state of the speaker.

The second explanatory gap is that music appears to operate on emotion, whereas the requirement appears to be for a perception of truth. This is a somewhat more difficult gap to close, but two assumptions can help to close it:

Firstly, the musical "sensory" cells are probably non-neuronal cells which are not able to make synaptic connections to neurons. Given the lack of synaptic connections, any output from the musical "sense" must be non-synaptic, i.e. via the emission of some specific hormone or neurotransmitter which carries a meaning of "truth", and which can be received by a suitable population of neurons able to interpret this musical neurotransmitter as having that meaning. When the musical sense evolved, evolution had to "make do" with whatever neurotransmitter best represented truth, and there may not be any specific non-synaptic neurotransmitter representing truth as such.
Secondly, emotions can be regarded as partial abstractions of truth. That is, a particular type of emotion occurs when a particular type of proposition about the world and the person's situation is true. In such cases, the propensity to express the relevant emotion is correlated with the propensity to believe the truth of the proposition. If (and this is the big if) there is feedback from the emotion to the original proposition, or at least to the consequences of believing said proposition (i.e. what to do about it), then an increase in emotion could be considered at least partially equivalent to an increase in the apparent "truthfulness" of the proposition. In effect, the music perception system is treating emotion as a proxy for perceived truth.

To give a specific example of an emotion as a partial abstraction, if someone says to me "your dog just got run over", then this will make me feel the emotion of sadness, if I am inclined to believe the speaker. Sadness can be regarded as an abstraction along the lines of "something good is not there any more", with an abstract consequence of the form "plans for the future must now be abandoned". This abstract consequence can be reconciled with the specific consequence – that I won't be able to go for walks in the park with my dog any more (in some cases the abstract consequences are not reconciled with our specific response to a situation, in which case we feel a conflict between what we intend to do and what our emotions "want" us to do).

The False Musicality of Music

It needs to be emphasised that the above is a theory of music perception as a hidden sense which is part of speech perception.

Music itself is a super-stimulus, which means that music is a contrived form of speech which exaggerates the perceived musicality. The form of music is determined by the manner in which musicality is perceived as a function of the observation of brain patterns in those parts of the brain where the musical sensory cells exist. Individual items of music can be regarded as local maxima of an optimisation problem determined by this "musicality function".
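
The idea of "local maxima" can be illustrated with a toy hill-climbing search. Everything here is hypothetical: the real musicality function is unknown, so the list `toy_scores` is an invented stand-in for the musicality of a row of slightly different tunes, and `hill_climb` is just one simple way of finding a peak.

```python
def hill_climb(scores, start):
    """Move from `start` to the better adjacent index until no neighbour
    scores higher; return the index of the local maximum reached."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(scores)]
        if not neighbours:
            return i
        best = max(neighbours, key=lambda j: scores[j])
        if scores[best] <= scores[i]:
            return i  # no neighbour is better: this is a local maximum
        i = best

# Invented musicality scores for eight neighbouring "tunes".
toy_scores = [1, 3, 2, 5, 4, 2, 6, 1]

# Climb from every starting tune and collect the distinct peaks reached.
peaks = sorted({hill_climb(toy_scores, s) for s in range(len(toy_scores))})
```

Here the search ends at one of three different peaks (indices 1, 3 and 6) depending on where it starts. In the same way there can be many distinct items of music, each one locally "as musical as it can be" without being the single best possible tune.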

Applying this to the "your dog just got run over" example above, the implication is that if someone sings to me a song which says "your dog just got run over", the false perceived musicality of music should exaggerate the emotion of sadness so greatly that, regardless of how credible I think the singer is, my reaction is one of extreme sadness, as if my dog really had died.

Since this isn't quite what happens when we listen to music, even to sad music, there has to be something else going on.

Musical Immunity: Defending Against the False Musicality of Music

If music is a false form of "musical" speech, then music must have been something that was discovered after music perception evolved.

My theory suggests that, when this happened, our ancestors responded to music emotionally as if those emotions were true. Very likely this would have caused confusion and corruption of our ancestors' perception of reality. In particular, they sang songs to each other, and the content of any song sung regularly would have become a perceived "truth" about reality.

Given the absolute importance of musicality perception as an aid to cultural learning, music perception could not just evolve away. So humans had to evolve a secondary defense against the false musicality of music.

One could speculate about what this defense could be, but a reasonable and simple solution would be:

Learn to recognise music as a distinct category of speech which is separate from "normal" speech, and once recognised, disconnect the emotional response to music from any permanent effect on the listener's understanding of reality.

This recognition system is somewhat like the immune system. Just like the immune system that protects us against diseases, the musical immune system cannot know in advance what the specific threats are going to be, and must learn to recognise them according to general (genetically pre-programmed) criteria. In the case of diseases this is required because new diseases keep on evolving. In the case of music it is required because people keep on making up new items of music (and sometimes new kinds of music).

The idea of disconnection is consistent with our subjective response to songs about dogs dying, i.e. we feel something subjectively similar to the sadness that we would feel if the dog had died, but we don't feel the consequent depression, and once the song has finished, there is no belief that any particular dog did die (unless we have additional reason to believe that the song was written about a real-life incident that happened to some particular dog).

The "Hiddenness" of Musicality Perception

I described musicality perception as a "hidden" sense. But there is an inconsistency here: we are consciously aware of the musicality of music, because of the emotional effect that it has on us, but we are not consciously aware of the musicality of normal speech.

In the first instance, the musicality of speech is much more subtle than that of music. Experimental study would be required before any exact number could be put on this, but I would guess that the ratio of speech musicality to musical musicality might be between 1:20 and 1:100. This difference could explain why we are only conscious of the musicality of music, i.e. we can be consciously aware of a stimulus that is up to 100 times stronger than its normal level, even though we are not consciously aware of it at that normal level.

A second reason for the musicality of speech to be hidden from conscious awareness is that it reduces any possibility of the listener becoming vulnerable to manipulation, whether that be by the speaker or by the listener. (For example, a listener could manipulate their own perception of reality by exposing themselves to a speaker who said certain things with greater musicality.) Because the purpose of musicality perception is to gradually and probabilistically alter the perception of reality, no particular instance of perceived speech musicality has any significance in the short term, and there is no meaning which can be reliably attached to it. Thus the listener is better off not even knowing what degree of speech musicality they are currently perceiving when they listen to speech, and our brains have evolved so that this is the case.