Bígí ag tweetáil – the internet can give Irish new life, says DCU researcher

22 Apr 2016

Dr Teresa Lynn

Technology can help breathe new life into the Irish language, says Dr Teresa Lynn, who is improving machine translation systems.

If you grew up in Ireland, it’s a safe bet that you spent years learning the Irish language. Yet many of us are a bit shy to unleash our inner Gaeilgeoir in conversation – perhaps due to a fear of getting the grammar wrong or needing to substitute a few words in English along the way.

Online platforms such as Twitter can help us reconnect with our Irish, and we should be brave and use the cúpla focal there, according to Dr Teresa Lynn from Dublin City University, who sees that Irish is getting a new lease of life online.

The researcher in natural language processing at the ADAPT Centre has worked on building parsing models to help language tools such as machine translation systems or language-learning software to better identify the structures of sentences in Irish.

Sentence puzzles

Machine translation systems compare sentences between languages and work off the probabilities of translations being correct, and Lynn has had a long interest in how to ‘teach’ computers about relationships between the words in those sentences.

“For English-Irish machine translation to work well, the computer needs to break down the structures of sentences and understand the relationships between words,” she explained.

During her PhD at DCU and Macquarie University in Sydney, Australia, Lynn developed parsing software that could ‘learn’ about the structures of sentences in Irish. “A parser is a tool that reads through a sentence and breaks it up into syntactic structure – the subject, the object and the prepositional phrase and so on,” she explained.
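To make the idea of a parse concrete, here is a minimal sketch in Python of the kind of structure a parser might output for the English sentence ‘She answered the door’. It is a hand-annotated illustration, not Lynn’s software: the `Token` class, the relation labels and the helper functions are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int     # 1-based position of the word in the sentence
    form: str      # the word as written
    head: int      # index of the word it depends on (0 marks the root)
    relation: str  # illustrative label for the link to the head


# Hand-annotated analysis of "She answered the door" (assumed, not parser output).
SENTENCE = [
    Token(1, "She", 2, "subject"),
    Token(2, "answered", 0, "root"),
    Token(3, "the", 4, "determiner"),
    Token(4, "door", 2, "object"),
]


def root(tokens):
    """Return the token that heads the whole sentence."""
    return next(t for t in tokens if t.head == 0)


def dependents(tokens, head_index, relation):
    """Return the words attached to a given head by a given relation."""
    return [t.form for t in tokens
            if t.head == head_index and t.relation == relation]


verb = root(SENTENCE)
print("verb:", verb.form)
print("subject:", dependents(SENTENCE, verb.index, "subject"))
print("object:", dependents(SENTENCE, verb.index, "object"))
```

A real parser would predict the `head` and `relation` values automatically after learning from annotated sentences; here they are simply written in by hand to show what ‘relationships between words’ can look like as data.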

But the Irish language has particular features that can differ quite substantially from English.

For example, Irish has two forms of the verb to be – Tá and Is – and the correct one to use depends on the context, notes Lynn. We also tend to ‘cleft’ sentences in Irish, fronting the information in a particular way: in Irish the structure might be ‘It was herself who answered the door’ and in English it is more likely to be ‘she answered the door’.

Lynn’s doctoral research, which could ultimately support better online translations into Irish, analysed a body of 1000 Irish-language sentences, ‘teaching’ the software about the relationships between the words.

Now a post-doc at DCU, she is helping to develop Tapadóir, an in-house machine translation system for the Department of Arts, Heritage and the Gaeltacht, which is tailored to nuances of the written material the Department encounters.

“To train a general machine translation system you give it large amounts of parallel data – the English text and the equivalent Irish text,” explained Lynn. “But if you want it to work well in more specific environments, you can then tune a system to a certain type of text and genre.”

Love of Irish

Lynn grew up in Drogheda, and she credits her father’s interest in languages and a trip to the Gaeltacht in Gweedore, Co Donegal, with inspiring her love of Irish. But when it came to university, she chose to study applied computational linguistics at DCU.

That led her to work in software localisation. While working in Melbourne, Australia, on a machine translation system from Tagalog (Filipino) into English, she realised she could use her skills to improve machine translation for the Irish language too.

Next stop was her PhD in DCU and Sydney, during which she was involved in a project to automatically live-translate tweets about the soccer World Cup finals in 2014 in Brazil. That sparked an interest in language use on social media, which led to her undertaking a Fulbright scholarship with Prof Kevin Scannell in St Louis University to analyse Irish language use on social media.

Tweeting as Gaeilge

The colloquialism and ‘noisiness’ of social media content provided Lynn with an extra edge to her work. “I had been analysing text from newswire and literature, which was all well-structured,” she recalled.

But for this work she analysed a corpus of 1500 Irish-language tweets, and found that tweeters were coming up with new and interesting twists on the Irish language.

“People were using a mix of English and Irish words in sentences and there were new variations in the way people were spelling things,” she said, recalling how she looked at a tweeted word ‘áicbheaird’ for quite a while before realising what it meant. Awkward.

“There were lots of cute little things like that, and it made me realise that Irish isn’t old and dated, and that people were being creative with it online,” said Lynn, pointing out that languages tend to change over time. “Language is a tool for communication. We don’t speak English in the same way that Shakespeare wrote it.”

Bígí ag tweetáil

Lynn is now working on a national strategy for Irish language technology, and she recently gave a TEDx Fulbright talk about the use of the language in social media (which itself has been translated into other languages).

The bottom line? She encourages anyone who has Irish to use it online, and not to worry too much about grammatical accuracy. “It is our language,” she said. “I’ve met people from Russia, Germany and India who have learned Irish and use it. Irish people should be using it too.”

Dr Claire O’Connell is a scientist-turned-writer with a PhD in cell biology and a master’s in science communication
