Statistics and Mathematics for NLP

Face-to-face degree course

Introduction to the basic concepts of statistics in the field of natural language processing (NLP). The course covers statistical inference, including the most common statistical tests and interval estimation. It also introduces basic concepts of functions and linear algebra for NLP tasks, with the final aim of understanding the rationale behind neural networks.

Name | Institution | Category | Doctor | Teaching profile | Area | E-mail
EZEIZA RAMOS, NEREA | University of the Basque Country | Profesorado Agregado | Doctor | Bilingual | Computer Languages and |
IRIGOYEN GARBIZU, ITZIAR | University of the Basque Country | Profesorado Agregado | Doctor | Bilingual | Science of Computation and Artificial |


Ability to understand and apply the appropriate statistical tests according to the objectives set, given a set of data. 35.0 %
Ability to understand basic mathematical language. 25.0 %
Ability to understand the underlying intuition in vector spaces and perform basic calculations. 40.0 %

Study types:
Face-to-face hours: 20
Non face-to-face hours: 30
Total hours: 50
Applied laboratory-based groups: 20 face-to-face hours, 30 non face-to-face hours, 50 total hours

Name | Hours | Percentage of classroom teaching
Lectures | 25.0 | 40 %
Computer practicals, laboratory, field trips, external visits | 50.0 | 40 %

Name | Minimum weighting | Maximum weighting
Attendance and participation | 10.0 % | 10.0 %
Practical tasks | 30.0 % | 60.0 %
Written examination | 30.0 % | 60.0 %

Pose the correct hypotheses according to the objectives and characteristics of the data to perform statistical tests.

Interpret the results of a statistical test and complement it with estimation by intervals.

Know the definition of a function. Know the concept of the derivative. Perform basic linear algebra calculations.

Learn to use specific software to perform calculations related to natural language processing tasks.
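
As an illustration of the outcomes above, the following is a minimal sketch, not official course material, of how a paired test on NLP data might be run and complemented with an interval estimate. It assumes a hypothetical comparison of two taggers on the same 200 sentences; the counts and the function names `mcnemar_exact` and `paired_diff_interval` are invented for the example. The test is the exact (binomial) form of the McNemar test, and the interval is the simple Wald approximation for a difference of paired proportions.

```python
from math import comb, sqrt


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts.

    b = items system A got right and system B got wrong
    c = items system A got wrong and system B got right
    Under H0 (both systems equally accurate), b ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence against H0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def paired_diff_interval(b: int, c: int, total: int, z: float = 1.96):
    """Wald interval for the accuracy difference (A minus B) on paired data.

    z = 1.96 gives an approximate 95 % interval.
    """
    diff = (b - c) / total
    se = sqrt(b + c - (b - c) ** 2 / total) / total
    return diff - z * se, diff + z * se


# Hypothetical counts, invented for illustration only.
b, c, total = 25, 10, 200
p_value = mcnemar_exact(b, c)
low, high = paired_diff_interval(b, c, total)
print(f"p-value: {p_value:.4f}")
print(f"95% interval for the accuracy difference: ({low:.3f}, {high:.3f})")
```

The p-value answers whether the two systems differ at all; the interval indicates how large the difference plausibly is, which is why the two are reported together.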

Final exam to assess the subject.



1. Introduction to hypothesis testing: independence test, McNemar test

2. Real functions. Concept of the derivative

3. Matrix calculus
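
To make topics 2 and 3 concrete, here is a minimal sketch, offered as an illustration rather than as part of the syllabus, of a numerical derivative and a matrix-vector product, the two ingredients combined in a single neural-network layer. The helper names `derivative`, `matvec` and `sigmoid`, and the weights used, are assumptions made for this example; only the Python standard library is used.

```python
from math import exp


def derivative(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z: float) -> float:
    """Logistic function, a common neural-network activation."""
    return 1.0 / (1.0 + exp(-z))


# Derivative of f(x) = x^2 at x = 3 (the analytic value is 2 * 3 = 6).
print(derivative(lambda x: x ** 2, 3.0))

# One layer: a linear map followed by an element-wise activation.
W = [[1, 2], [3, 4]]  # illustrative weights
x = [1, 1]            # illustrative input vector
z = matvec(W, x)      # [1*1 + 2*1, 3*1 + 4*1] = [3, 7]
print(z, [sigmoid(v) for v in z])
```

Training a network amounts to computing derivatives of a loss with respect to entries of matrices such as `W`, which is where the two topics meet.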


R.H. Baayen (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics using R. Cambridge University Press

G. Strang (2019). Linear Algebra and Learning from Data. Cambridge University Press

A. Trask (2019). Grokking Deep Learning. Manning Publications Co.