Limitations of Statistical Methods for Analysing Music

14 November, 2005

Statistical analysis of music promises to deliver algorithms that can distinguish "good" music from "bad" music. But analyses disconnected from biology are unlikely to lead to any major improvement in our understanding of what music is, even if they result in the discovery of significant correlations between observed characteristics of music and its popularity.

by Philip Dorrell

In the News

Every now and then there is a news item about some company or research group which has discovered a new method for predicting musical "hits". The latest of these is from Brian Whitman and Tristan Jehan at MIT. Before that there was Hit Song Science.

For commercial reasons, these enterprises don't reveal the full details of the algorithms they use, because their algorithms are their "crown jewels", which they hope to sell as a service to musical hopefuls and their promoters.

The "Science" of Music

What I cannot find in any of the descriptions of these prediction systems is any reference to the biology of music. The ultimate predictor of how much a person is going to like a piece of music would be a full simulation of the workings of the parts of the person's brain involved in perceiving music. And if there is to be any shortcut to enable a "good enough" simulation, without simulating the precise operation of each one of tens of millions of neurons, it is probably going to involve some understanding of what music perception is perception of, and what, if anything, music means.

As far as I can tell, commercial statistical analyses of music are not based on any attempt to scientifically understand the biology of music. The researchers think up any number of characteristics to statistically analyse, they throw a lot of data into the analysis, they look at the results to see if anything significant falls out, and sometimes they find something.

Unfortunately, even if these analyses do uncover correlations between the quality of music and aspects of its mathematical structure, and even if the researchers told us what those correlations were (which, for commercial reasons, they won't), we probably still wouldn't learn much about what music is.

An Analogy: The Statistical Analysis of Cars

Imagine that you wanted to understand what makes a car go. Let's pretend that you know nothing about mechanics or thermodynamics or chemistry or anything else relevant to car technology. You observe that cars can go quite fast, and you decide to learn what makes them go fast by measuring statistical correlations between the observable characteristics of cars and their maximum speed.

You start by analysing the relationship between a car's maximum speed, its color and its number of doors. You discover that red cars with two doors often travel considerably faster than other cars. However, although this observation provides a valid and useful predictor of maximum speed, it tells you very little about what makes a car go fast, because being red and having two doors have nothing whatsoever to do with the basic mechanics of what makes a car go.
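
The gap between predicting and explaining can be made concrete with a toy simulation. The sketch below is purely illustrative: the numbers, the field names (power, red_two_door) and the assumption that powerful cars are more often sold as red two-door models are all invented for the example, and are not measurements of real cars. In this made-up world, speed depends only on engine power, yet being a red two-door still "predicts" speed.

```python
import random
import statistics


def simulate_cars(n=2000, seed=0):
    """Build a toy population of cars (all values are hypothetical)."""
    rng = random.Random(seed)
    cars = []
    for _ in range(n):
        power = rng.uniform(50, 400)  # engine power, arbitrary units
        # Assumption for illustration: the more powerful the car, the more
        # likely it is to be sold as a red two-door model.
        red_two_door = rng.random() < (power - 50) / 350
        cars.append({"power": power, "red_two_door": red_two_door})
    return cars


def top_speed(car):
    # In this toy world speed depends ONLY on power, never on color or doors.
    return 80 + 0.4 * car["power"]


def mean_speed(cars, red_two_door):
    return statistics.mean(
        top_speed(c) for c in cars if c["red_two_door"] == red_two_door
    )


cars = simulate_cars()

# Observation: red two-door cars are faster on average (a valid predictor).
observed_gap = mean_speed(cars, True) - mean_speed(cars, False)

# Intervention: make every car a red two-door, leaving its engine untouched.
repainted = [dict(c, red_two_door=True) for c in cars]
speed_change = statistics.mean(
    top_speed(after) - top_speed(before) for after, before in zip(repainted, cars)
)

print(f"observed speed gap between groups: {observed_gap:.1f}")
print(f"speed gained by repainting:        {speed_change:.1f}")
```

The first number is large and the second is zero: the correlation is real, but acting on it changes nothing, because the cause sits in a variable the analysis never looked at.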

A more reasonable approach might be to look under the hood. You could identify various types of components inside, which are similar in different cars, and which can be measured. For example, there is a large metal object in the middle, the overall size of which is correlated with the speed of the car, and even more so if one compares the ratio of the size of this large metal object to the size of the car.

This ratio is more useful than the color and the door count as a predictor of the maximum speed of a car. But it still tells us very little about how a car works. If I make an object that looks like a car, and put a large metal object in the bonnet area, will it go fast? (Hint: it probably won't go at all.)

What's Wrong With Statistical Analysis

The problem with statistically analysing cars in this way is that you are analysing a pre-selected set of objects, i.e. cars which people have paid for and which they drive. As a result, the analysis does not tell you anything useful about the difference between things that are cars and things that are not cars.

One way that you could start to understand how a car works would be to observe that its owner regularly fills it with petrol, that the car uses up oxygen, and that it emits products of combustion from the exhaust. Without even knowing the details of what happens inside the engine, you would be able to deduce something about the consumption of chemical energy and its conversion to mechanical energy. You could then proceed to trace the flow of fuel into the engine, and you could trace the flow of mechanical energy out from the engine, through the transmission and into the rotational energy of the wheels.

The prerequisite to this analysis is a basic understanding of the physics of chemical and mechanical energy. Trying to understand how cars work without this prior knowledge of physics would be very difficult, no matter how much statistical analysis you did.

The upside is that if you do understand the basic underlying science, not only can you understand how existing cars work, but you can hope to have some success designing new improved cars, including even cars that work by different mechanisms of energy conversion (like an electric car, or one that runs on diesel).

What's Wrong With Statistically Analysing Music

Statistical analysis of music suffers from similar problems. There is an utter lack of understanding of even the basics of what goes on inside the brain when we perceive music. No one has any understanding of what music is, or why we should respond to it the way we do. We know that human beings are the result of evolution by natural selection, but no one has any idea what selective forces act in favour of the genes that give us the ability to enjoy music.

A Biological Approach

My own attempt to understand music scientifically started from a basic assumption about information processing. I assumed that music perception is some form of information processing, because music perception happens in the brain, and information processing is what the brain does. This led to two questions:

What is the output of the processing?
What is the meaning of the calculation that derives the output from the input?

I assumed that the primary output of music perception is a measure of the musicality of the music, i.e. how musical (or how "good") the music is (and that other effects, such as emotional effects, are secondary consequences of the primary output).

I then attempted to determine the meaning of this calculation (and its output). Music itself seems to have no intrinsic purpose, other than our enjoyment of it. So I considered the possibility that the intended input of music perception is actually something else, and music just happens to be a contrived stimulus, or super-stimulus, which maximises the perceived musicality of the input.

The natural stimulus which most closely resembles music is speech, so I concluded that the perception of musicality may be the perception of some important property of speech.

What's Next?

In the meantime, expect to hear occasional new claims of success in the statistical analysis of music, and of businesses selling proprietary "hit" prediction services. But don't expect any of these businesses to radically improve the musical landscape. In fact it is safe to conclude that anyone who claims to have a good predictive algorithm, but who does not have a corresponding generative algorithm (i.e. an algorithm to compose music), probably has not discovered the secret of what music is.

When a complete scientific understanding of music is achieved, you will surely know about it – even if its discoverers keep their theory a secret – because such an understanding will enable the creation of newer and stronger music, some of which will sound unlike any kind of music you have ever heard before.