IXA group


Talk: Computational explorations of creative language (C. Strapparava, 2017-07-07)

Speaker: Carlo Strapparava, FBK-irst (Fondazione Bruno Kessler – Istituto per la ricerca scientifica e Tecnologica)
Date: July 7, 2017
Time: 09:30
Place: UPV/EHUko Informatika Fakultatea, Manuel de Lardizabal 1, 20018 Donostia
Title: Computational explorations of creative language


Dealing with creative language, and in particular with affective, persuasive and even humorous language, has often been considered outside the scope of computational linguistics. Nonetheless, current NLP techniques can be used to begin exploring it. We briefly review some computational work on these typically creative genres.

Short bio:

Carlo Strapparava is a senior researcher at FBK-irst (Fondazione Bruno Kessler – Istituto per la ricerca scientifica e Tecnologica) in the Human Language Technologies Unit.
His research activity covers artificial intelligence, natural language processing, intelligent interfaces, human-computer interaction, cognitive science, knowledge-based systems, user models, adaptive hypermedia, lexical knowledge bases, word-sense disambiguation, affective computing and computational humour. He is the author of over 200 papers published in scientific journals, book chapters and conference proceedings. He has the Italian scientific habilitation for full professor in informatics and engineering.
He regularly serves on the program committees of the major NLP conferences (ACL, EMNLP, etc.). He was an executive board member of SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics (2007-2010), and a member of the Senseval (Evaluation Exercises for the Semantic Analysis of Text) organisation committee (2005-2010).
In June 2011, he received a Google Research Award on Natural Language Processing, specifically on the computational treatment of creative language.

PhD Thesis: Automatic Scansion of Poetry (M. Agirrezabal, 2017/06/19)

Title: Automatic Scansion of Poetry
Author: Manex Agirrezabal Zabaleta
Supervisors: Dr. Iñaki Alegria Loinaz and Dr. Mans Hulden
Date: June 19, 2017, Monday
Time: 12:00
Where:  Faculty of Informatics, Ada Lovelace Room (UPV/EHU)

Research questions:

  • What do we need to know when analyzing a poem, and how can we capture it?
  • Does language-specific linguistic knowledge contribute when analyzing poetry?
  • Is it possible to analyze a poem without any language-specific information?
    Is such analysis something that can be learnt?



Neural Machine Translation. Open workshop with Kyunghyun Cho (2017-05-29)

Neural Machine Translation
Open workshop with Kyunghyun Cho
Donostia, 2017-05-29

The third generation of machine translation systems is currently under active development. After initially dominating the field, rule-based machine translation (RBMT) systems have been gradually replaced by data-driven approaches in the last two decades, with statistical machine translation (SMT) systems prevailing as the main paradigm. In the last two years, deep learning approaches have significantly impacted the field, with the rise of neural machine translation (NMT) as the new state-of-the-art in automated translation. This event presents advanced results in the field, in particular for machine translation of Basque.

The MODELA project was created to advance research and development in deep-learning approaches to machine translation and to address the many challenges of Basque machine translation. The project is financed by the Basque Government and is being carried out by the following entities: Ametzagaiña, Elhuyar, ISEA, UPV/EHU (IXA group) and Vicomtech-IK4.


The main speaker will be Kyunghyun Cho (Center for Data Science, New York University), an eminent researcher in the area and the most cited on NMT, a field in which he has obtained a Google prize. He is also a brilliant speaker.

Date: May 29, 2017, 11:00
Place: UPV/EHUko Informatika Fakultatea, Manuel de Lardizabal 1, 20018 Donostia

  • 11.00-11.15: Introduction and presentation of the project
  • 11.15-12.30: Neural Machine Translation (Kyunghyun Cho)
  • 12.30-13.15: First results in the Modela project

Sponsor: Modela project and University of the Basque Country

The next day, Tuesday May 30, at 15:00, he will meet the students of our Master in Language Technology.

PhD position in Innsbruck with Michael Ustaszewski

After finishing our Erasmus Mundus LCT Master in 2016, Michael Ustaszewski is now a postdoc assistant at the University of Innsbruck and Unit Manager (liaison with the Department of Translation Studies) at the Innsbruck Translation Centre. His group works on corpus-based translation and has asked us to publish this call for PhD position candidates:

The Department of Translation Studies at the University of Innsbruck invites applications for a PhD position in the framework of the two-year research project “TransBank: A Meta-Corpus for Translation Research” funded by the Austrian Academy of Sciences.

The goal of the project is to build a large, open and expandable bank of translated texts and their original texts. Its main innovative feature is the ability to exploit a rich set of metadata labels characterising each text and text pair for the compilation and download of sub-corpora, tailored to the requirements of specific translation-related research questions.

The PhD student will be involved in all stages of the corpus building process, thus having the opportunity to gather translation data relevant to his/her specific research interest. The student will work autonomously on the development of the metadata labelset and on collecting translation data, on the basis of which he or she will conduct quantitative and/or qualitative analyses for his/her thesis. Work will be carried out in close collaboration with the project’s two principal investigators and two MA students.

The successful candidate should meet the following requirements:

  • Master’s degree in Translation Studies, Corpus Linguistics, Computational Linguistics or a related field
  • proven familiarity with translation theory
  • strong interest in data-driven research methodologies and linguistic annotation
  • excellent teamwork skills
  • proficiency in English on a level suitable for written and spoken scientific communication
  • solid programming skills in a scripting language (e.g. Python) will be an asset, as will knowledge of German or any other language(s)

The two-year position with a weekly working time of 20 hours (50%) commences in September 2017 and offers an annual stipend of € 19,117 plus allowances for conference attendance. The position involves enrolment in the PhD programme in Linguistics and Media Studies at the University of Innsbruck.

Applications should include:

  1. A cover letter (1 page maximum) that relates the candidate’s experience and interests to the TransBank project
  2. A two-page thesis proposal describing the research question and methodology underlying the candidate’s envisaged analyses using TransBank data
  3. A CV listing any publications
  4. Copies of relevant diplomas and certificates
  5. A recommendation letter by the candidate’s MA thesis supervisor or a university professor
  6. A copy of the MA thesis or the latest draft

To apply, please submit the documents in two PDF files (one containing documents 1 to 5, one containing document 6) by 10 April 2017 via the upload form at http://transbank.info/jobs

Shortlisted applicants will be interviewed in person or via Skype towards the end of April.

Further information:

Details on the research project can be found on the project website http://www.transbank.info
For enquiries about the position and the application process, please contact mail[at]transbank.info
Information about the Department of Translation Studies at the University of Innsbruck: http://translation.uibk.ac.at
For information on the PhD programme in Linguistics and Media Studies at the University of Innsbruck and the enrolment process, please refer to

Mikel Artetxe awarded in Hackathon on Language Technologies organized by Red.es

Yesterday, Mikel Artetxe was awarded the second prize in the First Hackathon on Language Technologies, held in Barcelona and organized by Red.es in collaboration with the Spanish Plan to promote Language Technology, managed by the Spanish Government’s SESIAD agency.

This hackathon was organized in the context of “4 Years From Now” (4YFN), the business platform created by Mobile World Capital Barcelona to promote technological startups. Several IXA members participated as organizers (German Rigau, Iñaki Alegria and Rodrigo Agerri).

Eight projects took part in the final session yesterday in Barcelona. Mikel developed a free tool for automatically creating bilingual dictionaries that offer examples of real word usage (a free alternative to applications such as Linguee).

German Rigau keynote speaker in the JRC Conference TEXT MINING IN POLICY MAKING

IXA Group member German Rigau was a keynote speaker last Monday at the JRC Conference “TEXT MINING IN POLICY MAKING”, organised by the European Commission in Brussels to present the new JRC competence centre on text mining. The event included a showcase of success stories of JRC applied text mining solutions. German Rigau addressed challenges related to textual data.

“This conference was an opportunity for policy makers from EU institutions to understand better the benefits of text mining in policy making processes, and pave the way forward for a better use of these solutions in policy making.

Information needed by policy makers is increasingly embedded in large amounts of textual data available on the Internet, e.g. traditional or social media, or in large public or proprietary document sets.

Text mining, the automatic extraction of information from text, offers policy makers timely access to important information which would otherwise be inaccessible. Indeed, the sheer volume of data makes it nearly impossible to extract the available information manually.”

Our papers in Japan (COLING 2016)

We have six papers at COLING 2016, taking place in Osaka, Japan, on Dec 11 2016.

HAP/LAP master theses (2016-09-27)

Master HAP/LAP
EMLCT master
Master thesis defences

Date: September 27th
Place: Ada Lovelace room


Universal Dependencies for Buryat.
Author: Elena Badmaeva
Supervisors: Koldo Gojenola and Gosse Bouma

LexSynSimpleText, a lexical and syntactic simplifier: first steps.
Author: Maria Eguimendia
Supervisors: Arantza Diaz de Ilarraza and Gosse Bouma

Data Sparsity in Highly Inflected Languages: The Case of Morphosyntactic Tagging in Polish.
Author: Michael Ustaszewski
Supervisors: Rodrigo Agerri and German Rigau

Multilingual Central Repository version 3.0: improving a very large lexical knowledge base.
Author: Daniel Parera Perez
Supervisor: German Rigau Claramunt

Book: Microparameters in the Grammar of Basque

Edited by Beatriz Fernández (UPV/EHU) and Jon Ortiz de Urbina (Deusto University), this book is an endeavor to present and analyze some standard topics in the grammar of Basque from a micro-comparative perspective. From case and agreement to word order and the left periphery, and including an incursion into determiners, the book combines fine-grained theoretical analyses with empirically detailed descriptions. Working from a micro-parametric perspective, the contributions to the volume address in depth some of the exuberant variation attested in the different dialects and subdialects of Basque. At the same time, although the contributions focus mainly on Basque data, cross-linguistic evidence is also presented and discussed.
After all, the goal pursued in this book is to attempt to explain variation in Basque as a particular instantiation of variation in human language at large. The volume presents and analyzes a wide range of empirical phenomena, many typologically marked among European languages, and will therefore be a welcome resource to linguists looking for detailed description and/or theoretical discussion.

Nora Aranberri: Machine Translation for Translators (Innsbruck, 2016-07-20)

Our colleague Nora Aranberri was the lecturer in the workshop on “Machine Translation for Translators: Taking Advantage of the New Technology” at SummerTrans 2016.

The International Translation Summer School SummerTrans was founded in Innsbruck in 2004. From 11 to 20 July 2016 the University of Innsbruck hosted the 7th International Translation Summer School “SummerTrans VII: Quality and Competence in Translation”. Addressing trainee translators, professional translators and translation researchers alike, its varied programme featured cutting-edge courses and workshops aiming to advance participants’ theoretical knowledge of and practical skills in translation and interpreting, including state-of-the-art translation technology and human-machine interaction in translation.
SummerTrans VII welcomed more than 60 participants from 16 countries, ranging from Tunisia, across half of Europe, to India and China.
Michael Ustaszewski, one of our students in the Erasmus Mundus LCT master (2014-2016), is now a lecturer at the University of Innsbruck and one of the organizers of SummerTrans 2016 🙂
Michael told us that the workshop participants are now familiar with state-of-the-art translation technology and human-machine interaction in translation.