
COBUILD English Usage 4th Edition: Changes in vocabulary and grammar

In the second of our blog posts about the new edition of COBUILD English Usage, Penny Hands details some of the findings that came out of the team’s research into the ways in which new words and uses are created.

The second stage of the COBUILD English Usage update involved a survey of the current state of various aspects of the English language. It was carried out specially for this edition using the constantly updated Collins Corpus, as well as social media research and crowdsourcing.

It’s all very well having billions of words of corpus, but how do you find new words in it? It’s for this reason that a linguist’s job is a 24-hour one, constantly on the lookout for new words and uses. Corpora allow us to track these changes and to look for different ways that they are used, and to establish who uses them and in what context.

One really useful source of data is the Language Observatory Group (LOG) Facebook page, set up by Mike McCarthy, where members add their observations about changes in the language.

The aim is not to gripe about ‘annoying’ things we hear people say, but some members care about that happening more than others. Mike has a certain refreshing tolerance for people expressing their preference for, or dislike of, certain neologisms, taking the view that a lot of fashions in clothes, music, etc, seemed odd or silly when they came out (and then do again when we look back on them).

New words are created all the time, often coming into the language via younger people. Occasionally we see a completely new word appear apparently from nowhere; more often, though, new words come about by people recycling existing ones so that they are used in a slightly different way.

The resulting findings hopefully provide a handy reference guide to new words and uses, but they also represent a fascinating snapshot of today’s society with all its attitudes and preoccupations.

Comparing the Bank of English section of the Collins Corpus with the ‘New Monitor’ corpus (which contains recent material from news and social media websites), we explored the ways in which language has evolved, looking at content from social media sites and news articles produced over the last 10 years.
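To give a feel for what comparing an older corpus with a newer one can involve, here is a minimal sketch in Python. It is not the method or the tooling used by the COBUILD team; the function names, the tiny sample texts and the `min_count` threshold are all made up for illustration. It simply lists words that appear repeatedly in a 'monitor' sample but not at all in a 'reference' sample.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def candidate_new_words(reference_texts, monitor_texts, min_count=2):
    """Return words seen at least min_count times in the monitor texts
    that never occur in the reference texts (hypothetical helper)."""
    reference_vocab = {tok for text in reference_texts for tok in tokenize(text)}
    monitor_counts = Counter(
        tok for text in monitor_texts for tok in tokenize(text)
    )
    candidates = [
        word
        for word, count in monitor_counts.items()
        if count >= min_count and word not in reference_vocab
    ]
    # Most frequent first, then alphabetical for a stable order.
    return sorted(candidates, key=lambda w: (-monitor_counts[w], w))


# Invented sample data standing in for the two corpora.
reference = ["You can vote for the answer you like.", "Ask the crowd for help."]
monitor = [
    "Please upvote this answer.",
    "I will upvote it if you downvote mine.",
    "Crowdfunding works.",
]

print(candidate_new_words(reference, monitor))
```

A list like this would only ever be a starting point: typos, names and rare older words would all show up as 'new', so each candidate would still need checking by a person.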

Firstly, based on data from Collins’ new words database, we looked at some of the most popular ways of creating new language.

Common ways of doing this include adding a prefix or a suffix to an existing word, combining words, or using words in new ways, perhaps by giving them a new function or part of speech.

So the first thing we did was to follow up some hunches we had about new-word creation. As predicted, a lot of the new words we were seeing coming through in our dictionary department were ones created from existing words, combined with prefixes and suffixes.

Here are some of the most prominent innovations that came up in our survey of the current state of the language.

Prefixes

Common examples were:

crowd

crowdsourcing

crowdlending

crowdwritten

crowdworking

crowdfinancing

crowdsharing

up- and down-

upthread, upvote, uptick

downthread, downvote

Suffixes

Common examples were:

-less cashless, contactless, driverless, paperless

-free traffic-free, GMO-free, carbon-free, meat-free, lactose-free
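As a rough illustration of how a hunch about prefixes and suffixes might be followed up, the sketch below groups unfamiliar tokens by the affixes listed above. It is an assumption-laden toy rather than the team's actual procedure: `PREFIXES`, `SUFFIXES`, `group_by_affix` and the sample word lists are all hypothetical.

```python
from collections import defaultdict

# Affixes taken from the examples above; the matching rules are a simplification.
PREFIXES = ("crowd", "up", "down")
SUFFIXES = ("less", "-free")


def group_by_affix(tokens, known_words):
    """Group tokens not in known_words under each matching prefix or suffix."""
    groups = defaultdict(set)
    for token in tokens:
        word = token.lower()
        if word in known_words:
            continue  # already in the (hypothetical) dictionary list
        for prefix in PREFIXES:
            if word.startswith(prefix) and len(word) > len(prefix):
                groups[prefix + "-"].add(word)
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix):
                groups["-" + suffix.lstrip("-")].add(word)
    return {affix: sorted(words) for affix, words in groups.items()}


# Invented sample data.
tokens = ["Crowdlending", "upvote", "downthread", "contactless",
          "GMO-free", "upon", "careless"]
known = {"upon", "careless"}

for affix, words in sorted(group_by_affix(tokens, known).items()):
    print(affix, words)
```

Simple string matching like this is noisy (without the `known` list, 'upon' would count as an up- word), so the output is best treated as a shortlist for a lexicographer to review.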

Verbing

This one was flagged up among others on the LOG Facebook page by Gavin Dudeney, who spotted the use of ‘sciencing’ on Radio 4.

The new probe is due to touch down on Mars soon and will be ‘sciencing’ as soon as it does.

This observation led us to investigate the current craze for verbing.

What we found, on investigating the social media sections of the Collins corpus, was a multitude of verbs based on brands.

Brand names have always been a rich source of verbing – hoovering, xeroxing, googling – but they seem to be proliferating in our current climate. I wonder if that’s because of the way that we all feel part of the action – we have agency over what gets bought and sold on these sites.

Why are you asking this here when you can just google the answer?

Jen snapchatted the whole thing.

Now we usually netflix it or chill at home with some good food.

We also found plentiful examples of airbnbing, eBaying, Instagramming and Ubering.

Adjectives as nouns

The next tendency we investigated was the sudden increase we had noticed in the use of adjectives as nouns.

Spread the happy. (Nutella®)

Committed to great since ’78. (Ben & Jerry’s®)

Find your fabulous.

And, by extension, a HarperCollins book …

‘Because’ as a preposition

Finally, we observed the repurposing of because as a preposition:

Why bother discussing this? Because language.

Not bothering with this. Because lazy.

Not going out tonight. Because working.

Here’s a snapshot of the concordance for ‘Because language’:

Note the line from the 2018 social media corpus containing the acronym ‘nsfw’, which stands for ‘not safe for work’, often used as a warning for an email subject line or social media post when sharing a link to potentially inappropriate content:

‘… hilarious nsfw because language.’

See also below a Twitter user’s use of ‘Because’ + adjective:

Note the use of a full stop to create a pause for emphasis.

Finally, if you’re interested in looking into this type of research further, take a look at Jack Grieve’s inaugural lecture, ‘The Future of Language Change’ at the University of Birmingham in December, 2018.

Professor Grieve shows how the study of language change is fast becoming a data science, and demonstrates what can be done with social media and high-level analysis tools.

He shows a series of graphs to demonstrate how we can now track usage from its initial use on social media and its exact location. We can see on what days certain words are typically used, where a brand-new coinage starts, and its pattern of diffusion over time. We can even home in on a particular city or neighbourhood, and see in which district a word emerges.

In the past, linguists used to say that you can never know where a word started because you’re not there to notice it. But now that isn’t true, at least for language used on social media. Language change research is making huge strides – and we’re the lucky ones who are here to see it.

Grammar and register


This article has been written by Julie Moore, who is an ELT materials developer and lexicographer.

Our last post focused on the difference between a prescriptive and a descriptive approach to grammar. A descriptive grammar, such as the Collins COBUILD English Grammar, describes the language which people actually use, and draws from that a set of norms for usage. These norms, in turn, are used to help learners use English in a way that will, hopefully, come across as normal and natural.

While it doesn’t make judgments about ‘good’ and ‘bad’ grammar, a descriptive grammar does, however, still need to draw distinctions about what is typical in different contexts and what is therefore generally considered appropriate. Language which is perfectly normal in everyday conversation or in social media chat, for example, may be inappropriate or even unacceptable in an academic essay or a business report.  The idea that different types of language are typically used in different contexts is known as register.

Spoken vs written language:

Perhaps the most obvious distinction to make is between spoken and written language. As corpus linguists have begun to study the grammar of not just written texts but of spoken, conversational English as well, a number of important differences have become apparent in the way we use language when we speak and when we write. Carter and McCarthy (2015) highlight two broad differences:

  1. They explain that some of the established grammatical features found in writing need to be rethought when it comes to speaking. For example, whereas written language has clear sentences, spoken language tends to be instead structured around turns, where each turn may or may not consist of what we’d conventionally think of as a complete sentence.
  2. They point out the existence of small words or phrases in spoken language which stand on their own and function independently of grammatical structures, for example, well, anyway, fine, and great.

Consider the following dialogue between two students in a university library. What do you notice about the structure of the turns? Could any of them be considered fully-formed sentences?

A: You finished yet?
B: Nearly.
A: Want to go and grab a coffee?
B: When I get to the end of this bit, maybe.
A: Okay, fine.
B: You go. I’ll be there in a bit.

Only the final turn here contains what we’d conventionally recognise as a fully-formed sentence. So why is this ‘looser’ approach to grammar acceptable in speech but not necessarily in writing? A lot comes down to shared understanding and context. When you’re talking to someone face-to-face, you rely a lot on the shared context (i.e. you and your listener are in the same place, at the same time, looking at the same surroundings) and your shared understanding – about each other and why you’re there. This means that there’s a lot that can remain unsaid, and this is what Carter and McCarthy (2015) term ‘situational ellipsis’. In writing, we generally have to be more explicit because we don’t share the same immediate context as our reader. That means we have to fill the ‘information gap’ between us, especially if our potential audience is unknown. We have to spell things out clearly to make sure our reader understands our message; we can’t judge by their expression whether they’ve understood or whether they look a bit puzzled, and they can’t signal understanding or ask for clarification.

Audience and purpose:

The register you choose, whether in speech or writing, also depends very much on your audience and purpose. Imagine, for example, that you witness a minor car accident in the street and you react in the following three ways.

  1. You take a picture and post it on social media with a comment.
  2. You tell your family about what happened when you get home.
  3. One of the drivers takes your contact details and some time later you receive a letter from her insurance company asking you to write a report of what you saw.


In each of the three situations, how might your language differ in terms of …
– the amount of detail you include?
– vocabulary?
– grammar?

Which of the following examples do you think might be used in each context? Which grammatical features give you a clue?

At 8.30 on the morning of 25 January 2017, I was walking along Clifton Road.
Nasty smash on Clifton Rd … no one hurt, but road blocked & loads of traffic backing up.
The guy was going way too fast, he was never going to stop.
The black vehicle may have been travelling above the speed limit.

The very careful, formalized order of the time adverbials in the first example signals a (semi)legal register. This is how police reports typically describe the time of events and it’s a form that lay people who find themselves in a legal context, such as writing a statement to an insurance company, tend to adopt. As well as it just being ‘the norm’, we use this type of language because we understand the need to be clear and accurate, and to provide as much detail as possible in this particular context; we recognize the purpose of the communication as well as the audience.

In the second and third examples, we see instances of slightly more informal grammatical forms – loads of … and way too + qualitative adjective – which are typical of speech or informal writing, such as on social media. In the final example, by contrast, the use of may have to express possibility is a slightly more formal choice than might have or could have. Collins COBUILD English Grammar includes many more examples of grammatical features typically used more in formal or informal registers.

Specialized registers:

As well as the broad register categories of spoken and written or formal and informal, certain features are typical of a more specialized register. We’ve already seen an example of a legal register; some other features most usually found in specialized contexts include:

  • Literary: Her pale face grew paler yet. (yet after a comparative adjective)
  • Old-fashioned or very formal: It is my decision, is it not? (an uncontracted negative tag)
  • Technical: non-ferrous metals such as copper, lead and aluminium (a normally uncountable (mass) noun being used in the plural form to refer to different types of a substance)
  • Academic: a clear demonstration of the brain mechanisms at work (a long noun phrase) 

What happens if you break the rules?

Throughout this post, I’ve been using lots of hedging language – typically, usually, tend to – because I’ve been describing tendencies rather than hard-and-fast rules. Of course, speakers break them all the time. But what happens when we get a mismatch in register? The text below is from a television advert (for totaljobs.com). It’s delivered by a primary school teacher addressing a group of five-year-olds:

I put it to you that on the morning of the 17th you did enter the Story Time Corner and with malice aforethought you did inflict grievous injury upon one Mr Boo-Boo Bananas.

The effect here is humorous because the use of typically legal language sticks out as marked in the context. This is fine if you’re aiming for humour, but less good if you’re a learner who inadvertently uses linguistic features that don’t match the communicative context. In the classroom, we tend to mention register in relation to vocabulary (children vs. kids, thank you vs. cheers), but if we’re going to help our students avoid embarrassing faux pas, then it’s something to bring up in relation to grammar too.

Explore this topic in greater detail with our free guided worksheet.


References:
Carter, R. & McCarthy, M. (2015) ‘Spoken Grammar: Where Are We and Where Are We Going?’ Applied Linguistics