Natural Language Processing (NLP) – University of Copenhagen

NLP is a subfield of artificial intelligence concerned with extracting linguistic information (e.g. grammatical, semantic, or pragmatic) from text. Today, NLP models rely on large-scale data and statistical learning to accomplish their goals. The same machine learning techniques also power machine translation. NLP models can be used, for example, to find the grammatical structure of sentences, to identify the topics a text contains, and to determine which words in a document refer to people, places, or organisations.

In our research group, we are interested in the biases that can influence NLP models, for example due to demographic variation in language, the selection of specific data sets, or the use of certain algorithms. We are actively working on improving the quality of NLP models for all languages, domains, and demographics, i.e., we want to build models that process the Facebook update of a Danish teen with the same accuracy as an American newspaper report or a French novel.

Our work has been reported on in several international newspapers, has been nominated for and has won several best-paper awards at scientific conferences, and has established Copenhagen as an international centre for NLP.


NLPL: Nordic Language Processing Laboratory (2017-19)
Infrastructure grant from NeIC
Led by Bjørn Lindi
More information

Digital Disinformation (2016-18)
Research grant from the Carlsberg Foundation
Led by Rebecca Adler-Nissen
More information

ReProsis: Real Time Big Data Product Analysis – Product Management System in International Markets (2016-18)
Research project funded by Eurostars
Led by Dirk Hovy and Isabelle Augenstein
More information

From dogma to data (2015-18)
Research project supported by the Danish Research Council
Led by Anders Søgaard with Henrik Palmer Olsen, iCourts, Faculty of Law, University of Copenhagen
More information

Uncertain archives (2015-18)
Research project supported by the Danish Research Council
Led by Kristin Veel
More information

Interactive text simplification for dyslexics (2015-18)
Research project supported by Trygfonden
More information

From a thesaurus to a Danish FrameNet (2016-17)
Infrastructure grant from the Carlsberg Foundation

Semantic processing across domains (2014-17)
Research project supported by the Danish Research Council
Led jointly with Bolette S. Pedersen
More information (in Danish)

LOWLANDS: Parsing low-resource languages and domains (2013-17)
Research project (ERC Starting Grant) supported by the European Research Council
Led by Anders Søgaard
More information

Finding Waldo in a haystack of informal writing styles (2016-17)
Research grant from the Data Transparency Lab
More information

RELIP: Reading between the lines (2016)
Seed money, Patient At Home
More information (in Danish)