It is well known that technology has contributed a great many new words to the English language. But how does a company that provides technology to help with language and communication cope with the ever-expanding tide of vocabulary?
SwiftKey has garnered praise and awards for its predictive text app. Its nifty software speeds up typing on Android devices by anticipating what users are going to type and suggesting it to them.
I wondered how the SwiftKey database keeps up to date, to ensure that it can offer users the newest words on the block. So I asked Dr Caroline Gasperin, who leads a team of eight language processing engineers responsible for most language-related tasks at the London-based company.
She explained that SwiftKey learns an individual’s linguistic habits, and that this in turn helps to grow its global database.
“Your SwiftKey will learn any word you teach it: you only have to type it once and it will be included in your personal language model on your device,” she said.
“Through the Personalisation feature – which allows you to sync it with your Gmail, Facebook and Twitter accounts – and through continuous use, SwiftKey learns the words you use and the contexts in which you use them so that its predictions and corrections are based on your own way of writing.”
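To make the idea of a personal language model a little more concrete, here is a minimal sketch of my own in Python. It is not SwiftKey's code: the `PersonalModel` class, its method names and the simple "count which word follows which" approach are all assumptions chosen for illustration. It only shows how a word typed once can enter a personal vocabulary, and how context can shape what is suggested next.

```python
from collections import Counter, defaultdict


class PersonalModel:
    """Hypothetical toy model; not SwiftKey's actual implementation."""

    def __init__(self):
        self.vocabulary = set()                  # every word seen so far
        self.next_words = defaultdict(Counter)   # word -> counts of following words

    def learn(self, text):
        """Add each word to the vocabulary and record which word follows which."""
        words = text.lower().split()
        self.vocabulary.update(words)
        for previous, following in zip(words, words[1:]):
            self.next_words[previous][following] += 1

    def predict(self, previous_word, k=3):
        """Suggest up to k words most often typed after previous_word."""
        counts = self.next_words.get(previous_word.lower(), Counter())
        return [word for word, _ in counts.most_common(k)]


model = PersonalModel()
model.learn("see you at the hackathon")
model.learn("the hackathon starts soon")
model.learn("the meeting starts soon")

print("hackathon" in model.vocabulary)  # the word was learned after one use
print(model.predict("the"))             # suggestions ranked by personal usage
```

In this toy version "hackathon" is suggested ahead of "meeting" after "the" simply because this imaginary user has typed it more often in that context. A real predictive keyboard will be far more sophisticated than this.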
This learning can then feed into the overall database, helping the word corpus to grow. Caroline said: “We’ve started putting in place the infrastructure for learning new words from our user base.
“As users use the Personalisation feature of SwiftKey, we are able to collect statistics about the words they use and identify words that we did not know before. We are putting in place a semi-automatic process to identify which of those words could become part of a standard dictionary and consequently become part of our downloadable language modules.
“This process consists of observing the frequency of use of words over time: words which used to have few occurrences across our user base, but which start becoming more frequent over time, and which are mentioned by several of our users instead of by just one or a few, are considered as good candidates for being added to our dictionaries.
“It’s worth adding we take our users’ privacy extremely seriously and have policies in place to safeguard this. We do not process a user’s data personally.”
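The frequency-based process Caroline describes can be sketched in a few lines of Python. This is my own illustration under stated assumptions, not SwiftKey's method: the function name `find_candidate_words`, the thresholds `min_users` and `min_growth`, the anonymous user labels and the sample data are all hypothetical. It flags an unknown word only when several different users have typed it and its use has grown between the earliest and latest time periods observed.

```python
from collections import defaultdict


def find_candidate_words(observations, known_words, min_users=3, min_growth=2.0):
    """Return unknown words that are spreading across users and rising in use.

    observations: iterable of (period, user_id, word) tuples, where period is
    an integer time bucket such as a week number (a hypothetical format).
    known_words: set of lower-case words already in the dictionary.
    """
    counts = defaultdict(lambda: defaultdict(int))  # word -> period -> count
    users = defaultdict(set)                        # word -> distinct users

    for period, user_id, word in observations:
        word = word.lower()
        if word in known_words:
            continue
        counts[word][period] += 1
        users[word].add(user_id)

    candidates = []
    for word, by_period in counts.items():
        if len(users[word]) < min_users:
            continue  # used by too few people: probably a personal word
        periods = sorted(by_period)
        if len(periods) < 2:
            continue  # not enough history to see a trend
        first, last = by_period[periods[0]], by_period[periods[-1]]
        if last >= min_growth * first:
            candidates.append(word)
    return sorted(candidates)


# Made-up sample data: "selfie" spreads across three users, "zzyx" stays with one.
observations = [
    (1, "u1", "selfie"),
    (2, "u1", "selfie"), (2, "u2", "selfie"),
    (3, "u1", "selfie"), (3, "u2", "selfie"), (3, "u3", "selfie"), (3, "u3", "selfie"),
    (1, "u4", "zzyx"), (2, "u4", "zzyx"), (3, "u4", "zzyx"), (3, "u4", "zzyx"),
    (1, "u1", "hello"), (3, "u2", "hello"),
]
known_words = {"hello"}

print(find_candidate_words(observations, known_words))
```

Here "zzyx" is typed just as often as "selfie" in the final period, but only by one person, so it is treated as a personal word rather than a candidate for the dictionary. Caroline calls the real process semi-automatic, so a list like this would presumably be a starting point for review rather than a final decision.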
So has the way that new words are assimilated changed, and is the process quicker than before? Caroline said: “We look into how many different people have used an unknown word in order to consider it as a potential new word in the language instead of a personal word.
“We take our users’ privacy seriously, so we’ve developed ways to discover words in wide use instead of focusing on single users.
“We haven’t followed users’ language use for long enough to know whether new words are being adopted faster than before, but we are working on getting those statistics.”
I have long believed that new words are being created and accepted into the language considerably more quickly than before, with technology the principal driver behind that evolution. It would be interesting to revisit SwiftKey at some point soon to see whether those promised statistics back up that theory. In the meantime, the company gives us a very clear steer about how its core business has to adapt to the ever-changing delights of the English language.