Music, Twin Speak and the Language Prototype

7 June, 2015

What can 'Talking Twins' tell us about the nature of music?

The Hypothetical One-Consonant-One-Vowel Melodic/Rhythmic Language

In 2005 (current page here) I suggested the existence, historically, of a language that contained only those aspects of language that relate to music. In particular:

Melody (but in the form of speech melody, and probably not musical)
Rhythm (similarly, in the form of speech rhythm, and probably not musical)
One consonant
One vowel

The one consonant and one vowel prediction came from the simple observation that even non-sung music contains consonant-like sounds and vowel-like sounds, but the distinction between different consonants and different vowels is not of any major musical significance.

The Talking Twins

I suggested that this hypothetical language with music-like aspects reflected the state of human language at some time in the distant past.

I did not predict that it would be spoken by two 18-month-old twin boys talking to each other in their family kitchen.

Yet, in this very famous Talking Twins Video, the two boys Sam and Ren seem to be "speaking" a language rather similar to what I described.

There's no evidence that the twins' conversation had any real meaning, or actually communicated anything. Yet it seems fairly obvious that they were going through the motions of talking to each other, and listening to each other, and doing it via something that has the basic characteristics of a language, but which does not sound much like the native English of their parents.

The Language Prototype

When I wrote my original article in 2005, my idea about the relationship between music and speech was that music perception is some hidden aspect of speech perception.

However, my current thinking is that music perception is a mutated version of a component of the early speech-learning instinct. This component is that part of speech-learning which recognises speech as a special category of external stimulus whose nature is substantially determined by the language that the speech is in, where that language exists as a set of rules and patterns which don't particularly follow the logic or causality that applies to immediate physical reality.

One simple way to specify such an instinct is to assume that there is a language prototype, and that those stimuli which match that prototype are probably speech spoken in the language, and any stimuli which don't match the prototype are not speech.

The prototype may not exactly match what is and isn't in any particular spoken language, but it is close enough that the infant's brain will, with sufficient listening experience, eventually correctly re-classify those speech or non-speech stimuli which don't initially match the prototype.

In the jargon of neural networks, it's a classification problem with an initial partially correct solution. (Almost certainly this language prototype is just one of many prototypes in the brain that represent initial approximately correct solutions to important classification problems.)

The language prototype is largely defined by a criterion of simultaneous activity in a certain subset of neural regions in the brain, i.e. those regions dedicated to processing those aspects of sound that usually occur in spoken languages.
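To make the idea of "a classification problem with an initial partially correct solution" concrete, here is a minimal toy sketch. It is not a model of real neural processing: the region names, the contents of the prototype set, and the `learned_labels` override table are all hypothetical, chosen only to show the logic of "match the prototype first, then let experience correct the mistakes".

```python
# Toy model only: the region names and the prototype set are illustrative assumptions.
LANGUAGE_PROTOTYPE = frozenset(
    {"speech_melody", "speech_rhythm", "consonant_like", "vowel_like"}
)


def matches_prototype(active_regions, prototype=LANGUAGE_PROTOTYPE):
    """Return True when every prototype region is active at the same time."""
    return prototype <= set(active_regions)


def classify(active_regions, learned_labels=None):
    """Classify a stimulus as 'speech' or 'non-speech'.

    learned_labels (hypothetical) maps a frozenset of active regions to a
    label acquired through listening experience; it overrides the prototype.
    """
    key = frozenset(active_regions)
    if learned_labels and key in learned_labels:
        return learned_labels[key]
    return "speech" if matches_prototype(key) else "non-speech"
```

For example, a stimulus that activates all four hypothetical regions is classified as speech straight away, while one that is missing a region is initially classified as non-speech and only becomes "speech" once an entry for it has been added to `learned_labels`.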

Twin Speak: Language, Which Might Not Actually Be Communication

Where does this leave the twins Sam and Ren? My best guess is that the language prototype, as I have described it here, exists in a substantially similar form in every infant's brain, and is used to recognise speech when it's heard being spoken by older family members.

The difference with twins is that they co-exist as infants of the same age, and can potentially act as models for each other. Regardless of the specifics of language learning, there is the potential for twins to 'learn' a language from each other. Since neither of them 'knows' a proper language to start with, the result of such a learning process is somewhat circular, and they are effectively making it up as they go along.

If there is such a thing as a language prototype, then it might be expected to influence such a mutual 'learning' process.

For example, if twin A happens to say something that matches the prototype, then twin B will match that against the prototype, 'learn' it as language, and be motivated to repeat that same utterance back again. At which point twin B's utterance matches twin A's language prototype, so twin A now 'learns' the same utterance. And so on.
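The circular loop described above can be sketched as a small simulation. This is only a toy, under stated assumptions: `sounds_like_language` is a hypothetical stand-in for the prototype (it accepts repeated consonant-vowel syllables such as "da-da-da"), and the `Twin` class and `converse` function are names invented for this illustration.

```python
def sounds_like_language(utterance):
    """Hypothetical prototype test: two or more consonant-vowel syllables."""
    syllables = utterance.split("-")
    return len(syllables) >= 2 and all(
        len(s) == 2 and s[0] not in "aeiou" and s[1] in "aeiou"
        for s in syllables
    )


class Twin:
    def __init__(self, name):
        self.name = name
        self.learned = []  # utterances this twin has 'learned' as language

    def hear(self, utterance):
        """'Learn' a matching utterance and repeat it back; otherwise return None."""
        if not sounds_like_language(utterance):
            return None
        if utterance not in self.learned:
            self.learned.append(utterance)
        return utterance


def converse(first, second, opening, turns=4):
    """Let two twins echo each other, starting with first saying the opening."""
    transcript = [(first.name, opening)]
    listener, other = second, first
    utterance = opening
    for _ in range(turns):
        reply = listener.hear(utterance)
        if reply is None:
            break  # nothing matched the prototype, so the exchange stops
        transcript.append((listener.name, reply))
        listener, other = other, listener
        utterance = reply
    return transcript
```

Starting the exchange with "da-da-da" keeps the two twins echoing each other for as many turns as allowed, and both end up having 'learned' the same utterance. Starting with a sound that fails the prototype test ends the exchange immediately. Nothing in the loop assigns any meaning to the utterance.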

In this scenario, it is not necessary for the twins' mutually-learned speech to be entirely spontaneous, so, for example, Sam and Ren's "da-da-da-da-da..." may have started off with some older person saying "Dada" to one or both of them.

Note that this language prototype does not necessarily include any pre-defined meanings to be assigned to any particular set of sounds, given that its primary purpose is to distinguish spoken language from all other external stimuli.

A common question about "twin speak" is whether or not it constitutes actual communication. There is evidence that in some cases it does: there are historical cases of twins who grew up speaking a mutual secret language, and who had a corresponding difficulty later learning the 'real' language of the society that they lived in. But there are also situations where twins can be seen "talking" and "listening" to each other with no evidence that any real communication is going on. I would say this appears to be the case for most of the "twin talk" videos of very young children that can be found on YouTube, including the famous Sam and Ren video. (For my own mini-compilation of such videos, see my "Twin Speak" YouTube channel.)

If we accept the concept of a Language Prototype, whose purpose is to recognise language as a separate and important 'thing' before the infant brain proceeds to learn the specific meanings of spoken utterances, then it is not implausible that infants can learn to "talk" and "listen" to language without yet having associated any meaning with it.

Non-Speech, Speech, and Music

Music then enters into this story as a special third category of external stimulus – it's something which is not a normal non-speech stimulus, and also it's not speech.

Music is defined by activity in the same neural regions that define the language prototype, but it is additionally defined by constant patterns of inactivity, at a fine level of detail, within all those neural regions. This requirement for constant inactive sub-regions adds strong constraints which in practice are never satisfied by normal speech, so it readily distinguishes music from normal speech. At the same time, those constraints are very unlikely to be satisfied by an external stimulus that is not the result of people deliberately creating something that satisfies them. Music is therefore a third category of external stimulus, clearly distinguished from both normal (non-speech) stimuli and from speech.
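The three-way distinction can be illustrated with another toy sketch. Everything specific in it is an assumption made for illustration: the two region names, the six "units" per region standing in for fine-level sub-regions, and the threshold `MIN_SILENT_UNITS` for how many units must stay inactive throughout. A stimulus is represented as a list of time frames, each mapping a region to the set of units active in that frame.

```python
# Toy model only: regions, unit counts and threshold are illustrative assumptions.
REGIONS = ("melody", "rhythm")
UNITS_PER_REGION = 6
MIN_SILENT_UNITS = 2  # units per region that must never become active


def classify_stimulus(frames):
    """Return 'non-speech', 'speech' or 'music' for a list of time frames.

    Each frame is a dict mapping a region name to the set of unit indices
    that are active in that region during the frame.
    """
    if not frames:
        return "non-speech"
    # Language-prototype test: every region shows some activity in every frame.
    if not all(frame.get(region) for frame in frames for region in REGIONS):
        return "non-speech"
    # Musicality test: each region keeps a constant set of inactive units.
    for region in REGIONS:
        ever_active = set().union(*(frame[region] for frame in frames))
        if UNITS_PER_REGION - len(ever_active) < MIN_SILENT_UNITS:
            return "speech"
    return "music"
```

In this sketch, a stimulus that leaves a region silent is non-speech, one whose activity wanders over nearly all units of some region is speech, and one that activates every region while consistently avoiding at least two units in each is music.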

I have given here what I believe to be the criterion that defines musicality, but this criterion does not by itself answer the one question that matters the most, in the scientific investigation of music, which is: what is music actually for?

I will attempt to answer that question in my next blog posting, in a way that relates it to the concept of the Language Prototype.