Do Language and Music Mimic Nature?

In a new book, neuroscientist and author Mark Changizi explores how language and music separate us from our primate ancestors

By Mark Changizi

Editor's Note: The following is an excerpt from the first chapter of the new book Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man, by Mark Changizi. Copyright (c) 2011 by Mark Changizi.

If one of our last nonspeaking ancestors were found frozen in a glacier and revived, we imagine that he would find our world jarringly alien. His brain was built for nature, not for the freak-of-nature modern landscape we humans inhabit. The concrete, the cars, the clothes, the constant jabbering—it's enough to make a hominid jump into the nearest freezer and hope to be reawakened after the apocalypse.

But would modernity really seem so frightening to our guest? Although cities and savannas would appear to have little in common, might there actually be deep similarities? Could civilization have retained vestiges of nature, easing our ancestor's transition? And if so, why should it—why would civilization care about being a hospitable host to the freshly thawed really-really-great-uncle?

The answer is that, although we were born into civilization rather than melted into it, from an evolutionary point of view we're an uncivilized beast dropped into cultured society. We prefer nature as much as the next hominid, in the sense that our brains work best when their computationally sophisticated mechanisms can be applied as evolution intended. Living in modern civilization is not what our bodies and brains were selected to be good at.

Perhaps, then, civilization shaped itself for us, not for thawed-out time travelers. Perhaps civilization possesses signature features of nature in order to squeeze every drop of evolution's genius out of our brains for use in the modern world. Perhaps we're hospitable to our ancestor because we have been hospitable to ourselves.

Does civilization mimic nature? I believe so. And I won't merely suggest that civilization mimics nature by, for example, planting trees along the boulevards. Rather, I will make the case that some of the most fundamental pillars of humanity are thoroughly infused with signs of the ancestral world…and that, without this infusion of nature, the pillars would crumble, leaving us as very smart hominids (or "apes," as I say at times), but something considerably less than the humans we take ourselves to be today.

In particular, those fundamental pillars of humankind are (spoken) language and music. Language is at the heart of what makes us apes so special, and music is one of the principal examples of our uniquely human artistic side.

As you will see, the fact that speech and music sound like other aspects of the natural world is crucial to the story about how we apes got language and music. Speech and music culturally evolved over time to be simulacra of nature. Now that's a deep, ancient secret, one that has remained hidden despite language and music being right in front of our eyes and ears, and being obsessively studied by generations of scientists. And like any great secret code, it has great power—it is so powerful it turned clever apes into Earth-conquering humans. By mimicking nature, language and music could be effortlessly absorbed by our ancient brains, which did not evolve to process language and music. In this way, culture figured out how to trick nonlinguistic, nonmusical ape brains into becoming master communicators and music connoisseurs.

One consequence of this secret is that the brain of the long-lost, illiterate, and unmusical ancestor we thaw is no different in its fundamental design from yours or mine. Our thawed ancestor might do just fine here, because our language and music would harness his brain as well. Rather than jumping into a freezer, our long-lost relative might instead choose to enter engineering school and invent the next-generation refrigerator.

The origins of language and music may be attributable, not to brains having evolved language and music instincts, but rather to language and music having culturally evolved brain instincts. Language and music shaped themselves over many thousands of years to be tailored to our brains, and because our brains were cut for nature, language and music mimicked nature…and transformed ape to man.

Under the radar
If language and music mimic nature, why isn't this obvious to everyone? Why should this have remained a secret? It's not as if we have no idea what nature is like. We're not living on the International Space Station, and even those who are on the Space Station weren't raised up there! We know what nature looks and sounds like, having seen and heard countless examples of it. So, given our abundant experiences of nature, why haven't we noticed the signature of nature written (I propose) all over language and music?

The answer is that, ironically, our experiences with nature don't help us consciously comprehend what nature in fact looks and sounds like. What we are aware of is already an assembled interpretation of the actual data our senses and brains process. This is true of you whether you are a couch potato extraordinaire or a grizzled expedition guide just returned from Madagascar and leaving in the morning for Tasmania.

For example, I am currently in a coffee shop—a setting you'll hear about again and again—and when I look up from the piece of paper I'm writing on, I see people, tables, mugs, and chairs. That is, I am consciously aware of seeing these objects. But my brain sees much more than just the objects. My early visual system (involved in the first array of visual computations performed on the visual input from the retina) sees the individual contours, and does not see the combinations of contours. My intermediate-level visual areas see simple combinations of several contours—for instance, object corners such as "L" or "Y" junctions—but don't see the contours, and don't see the objects. It is my highest-level visual areas that see the objects themselves, and I am conscious of my perception of these objects. My conscious self is, however, rarely aware of the lower hierarchical levels of visual structure.

For example, [could] you recall [a] figure [from] the start of the chapter—[a] person's head with a lock and key on it? Notice that you [could] recall it in terms referring to the objects—in fact, I just referred to [an] image using the terms person, head, lock, and key. If, instead, I were to ask you if you recall seeing the figure that had a half dozen "T" junctions and several "L" junctions, you would likely not know what I was talking about. And if I were to ask you if you recall the figure that had about 40 contours, and I then went on and described the geometry of each contour individually, you would likely avoid me at cocktail parties.

Not only do you (your conscious self) not see the lower-level visual structures in the image, you probably won't find it easy to talk or think about them. Unless you have studied computational vision (i.e., studied how to build machines that see) or are a vision scientist, you probably haven't thought about how contours intersect one another in images. "Not only did I not see T or L junctions in the image," you might respond, "I don't even know what you're talking about!" We also have great trouble talking about the orientation and shapes of contours in our view of three-dimensional scenes (something that came to the fore in the theory of illusions I discussed in The Vision Revolution).

Thus, we may think we know what a chair looks like, but in a more extended sense, we have little idea, especially about all those lower-level features. And although parts of our brain do know what a chair looks like at these lower levels, they're not given a mouthpiece into our conscious internal speech stream. It is our inability to truly grasp what the lower-level visual features are in images that explains why most of us are hopeless at drawing what we see. Most of us must undergo training to become better at accessing the lower levels, and even some of the great master painters (such as Jan van Eyck) may have projected images onto their canvases and traced the lower-level structures.

Not only do we not truly know what nature looks like, we also don't know what it sounds like. When we hear sounds, we hear the meaningful events, not the lower-level auditory constituents out of which they are built. I just heard someone at the next table cutting something with her fork on a ceramic plate. I did not consciously hear the low-level acoustic structure underlying the sound, but my lower-level auditory areas did hear just that.

For both vision and audition, then, we have a hierarchy of distinct neural regions, each a homunculus ("little man") great at processing nature at its level of detail. If you could go out for drinks with these homunculi, they'd tell you all about what nature is like at lower and middle hierarchical scales. But they're not much for conversation, and so you are left in the dark, having good conscious access only to the final, highest parts of the hierarchy. You see objects and hear events, but you do not see or hear the constituents out of which they are built.

You may now be starting to see how language and music could mimic nature, yet we could be unaware of it. In particular: what if language and music mimic all the lower- and middle-level structures of nature, and only fail to mimic nature at the highest levels? All our servant homunculi would be happily and efficiently processing stimuli that appear to them to be part of nature. And yet, because the stimuli may have a structure that is not "natural" at the highest hierarchical level, our conscious self will only see the dissimilarity between our cultural artifacts and nature.

Why should we believe what we can't consciously perceive—that language and music mimic nature at all but the highest hierarchical level? Why not go all the way and make language and music completely like nature?

Let's not forget that language and music are not merely trying to mimic nature. They have jobs to do: writing is for putting thoughts on the record, speech is for transmitting thoughts to others, and music is perhaps for something like evoking feelings in others. Language and music want to capture as much of the structure of nature as they can so that they have an easy ride into our brains, but they must serve their purpose, and will have to sacrifice nature-mimicry when it is necessary to do so.

So one can see how sacrifices of nature-mimicry may sometimes be part of doing business. But why should the sacrifices be up near the top, where we have greater conscious access? The principal reason for this is that if the earlier regions of the hierarchy receive stimuli that they can't make any sense of, then they will output garbage to the next higher level, and so all levels above the unhappy level will be unhappy. Breaking nature-mimicry at one level will break it at all higher levels.

For example, I have argued in earlier research and in The Vision Revolution that writing looks like nature. In particular, I have suggested that written words look like visual objects. But words do not necessarily look natural at all levels up the hierarchy. Strokes look like contours, and letters look like object junctions; and thus the lower and middle levels of your visual hierarchy are happy. But because in alphabetic writing systems the letters in a word depend on how it is spoken, there is no effective way to make entire words look like objects. (For example, the junction-like letters in the words you are currently reading are simply placed side by side, which is not the way junctions in scenes are spatially related.) Your highest-level regions, of which you are most directly aware, only notice the nonnatural look of written words. And when visual signs do more closely match the visual structure of objects at the highest levels, people do see the resemblance to nature—this is why trademark logos and logographic writing systems like Chinese look (to your conscious self) much more object-like than the words you're reading here.

My claim in this book that language and music mimic nature must be understood in this light. I claim that they mimic nature, indeed, but not necessarily "all the way up." The reason why writing, speech, and music don't obviously seem like nature is that nature is not being injected at the higher levels, perhaps—as we've seen with writing—in order to better accomplish the functions they are designed to carry out.

We see, then, why it is that the nature-mimicry in language and music has remained a secret for so many millennia. If only your lower-level visual and auditory areas could speak! They'd have long ago let you know that language and music are built like nature. Because those lower homunculi are part of you, there is a sense in which you have known about this ancient, deep secret code all along. Pieces of meat inside you knew the secret, but weren't telling. In this light, one can view this book as a kind of psychoanalysis—if you're into that—digging up the homunculus-knowledge you already have deep inside you, and working through the ways it shaped who you are today.

Reprinted by arrangement with BenBella Books, Inc., from Harnessed by Mark Changizi. Copyright © 2011 by Mark Changizi.