Culturomics: The Challenge for Wordability

I have been thinking a lot about Culturomics recently. Frankly, it has given me a headache. But it has also reminded me that if Wordability were to be up to date with every single new word that enters the English language, I would be glued to my keyboard the whole time and would neither eat nor sleep.

Culturomics has existed as a word and a discipline for two years. It is a very exciting linguistic development, and one that is only possible because of advances in technology. With millions of books now existing in digital format, courtesy of Google, scientists are able to analyse this vast amount of data to derive conclusions about the English language that have never previously been possible.

The first paper, published at the end of 2010 in the journal Science (free log-in needed to view the link), analysed 4% of all published material and used this to give an indication of the number of words in the English language. The estimate came out at more than a million, far more than recorded by dictionaries.

This year, a new paper by Alexander M. Petersen, Joel Tenenbaum, Shlomo Havlin and H. Eugene Stanley has given Wordability something to think about. Rejoicing in the catchy title Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death, the paper applies science to the life of words and comes up with rules to explain the birth and death of words and the evolutionary processes that govern their existence.

Leaving aside the many complex equations and use of Greek letters, the writers come to some interesting conclusions. More than 8,000 words entered the English language last year, so you can understand why Wordability will only track those that really start to hit the headlines. The paper also says that the rate at which words are born and die is changing, with more words dying off and fewer words coming in, though those that do arrive have greater staying power because they describe completely new things, such as those in the field of technology.

What is particularly interesting is the way that evolutionary theory can be applied to words. As the authors say, “words are competing actors in a system of finite resources”. Factors such as being favoured by modern spell checkers can give a word “reproductive fitness” and allow it to survive against other words of a similar semantic bent.

I have thought about this paper quite a lot, and find myself wondering whether it will actually end up marking a point in time, with the evolutionary rules about to change. Is the technology which allows Culturomics to flourish and these observations to be made now going to be the agent which changes that evolutionary process?

The authors say that it takes around 30-50 years for a word to be fully accepted and to either make it into a dictionary or disappear into linguistic obscurity. I wonder whether this will now change, and a new pattern will start to emerge. I have bemoaned in the past how long it sometimes takes dictionary makers to recognise words which have gained significant currency. In our interconnected world, where ideas and words can fly across the globe and become accepted almost instantly, the evolutionary pattern identified by the authors may start to change. I suspect it may become quicker for words to become accepted, and that the survival characteristics that will govern this will also change. Words that are slightly silly, that have the capacity to be shared on social networks, that describe an action people can participate in, will be the ones that evolve rapidly and see off the other competing words around them.

It is a fascinating concept that words fight the same survival battles as species on earth. In the 21st century, it will be interesting to see what factors allow them to survive.
