Paraphrasing and semantic overlap detection

The same piece of information can be expressed in many different ways, and this is one of the major challenges in building robust Natural Language Processing (NLP) applications. It is commonly assumed that such applications can be improved with knowledge of how natural language expressions relate to each other, for instance in terms of paraphrases (same semantic content, different wording) or entailments (one expression implied by the other). The Stevin DAESO project, on which I work with Erwin Marsi and others, investigates the detection of semantic overlap between Dutch sentences and the exploitation of this knowledge in a range of NLP applications. DAESO grew out of the IMOGEN project, an earlier NWO project on multimodal generation that I did together with Mariët Theune and others.