# Statistics and Mathematics for NLP

## General details of the subject

- Mode: Face-to-face degree course
- Language: English

## Description and contextualization of the subject

This subject introduces basic statistical concepts as applied to natural language processing (NLP). It covers statistical inference, including the most common statistical tests and interval estimation. It also introduces basic concepts of functions and linear algebra used in NLP tasks, with the ultimate aim of understanding the rationale behind neural networks.

## Teaching staff

| Name | Institution | Category | Doctor | Teaching profile | Area | E-mail |
| --- | --- | --- | --- | --- | --- | --- |
| IRIGOYEN GARBIZU, ITZIAR | University of the Basque Country | Profesorado Agregado | Doctor | Bilingual | Science of Computation and Artificial Intelligence | itziar.irigoien@ehu.eus |

## Competencies

| Name | Weight |
| --- | --- |
| Ability to understand and apply the appropriate statistical tests according to the objectives set, given a set of data. | 35.0 % |
| Ability to understand basic mathematical language. | 25.0 % |
| Ability to understand the underlying intuition in vector spaces and perform basic calculations. | 40.0 % |

## Study types

| Type | Face-to-face hours | Non face-to-face hours | Total hours |
| --- | --- | --- | --- |
| Lecture-based | 10 | 15 | 25 |
| Applied computer-based groups | 20 | 30 | 50 |

## Training activities

| Name | Hours | Percentage of classroom teaching |
| --- | --- | --- |
| Computer work practice, laboratory, site visits, field trips, external visits | 50.0 | 40 % |
| Lectures | 25.0 | 40 % |

## Assessment systems

| Name | Minimum weighting | Maximum weighting |
| --- | --- | --- |
| Attendance and participation | 10.0 % | 10.0 % |
| Written examination | 30.0 % | 60.0 % |

## Learning outcomes of the subject

Pose the correct hypotheses according to the objectives and characteristics of the data to perform statistical tests.

Interpret the results of a statistical test and complement them with interval estimation.

Know the definition of a function. Understand the concept of the derivative. Perform basic linear algebra calculations.

Learn to use specific software to perform calculations related to natural language processing tasks.

## Ordinary call: guidelines and withdrawal

Final exam to assess the subject.

## Extraordinary call: guidelines and withdrawal

Final exam to assess the subject.

## Syllabus

1. Introduction to hypothesis testing: independence test, McNemar test

2. Real functions. The concept of the derivative

3. Matrix calculus
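
To give a flavour of the first topic, the following is a minimal sketch of an exact McNemar test, which compares two paired outcomes (for example, two classifiers evaluated on the same sentences) using only the discordant counts. The choice of Python, the function name `mcnemar_exact`, and the example counts are assumptions made for illustration; they are not taken from the course materials.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts.

    b: items where only the first system is correct (hypothetical count)
    c: items where only the second system is correct (hypothetical count)
    Under the null hypothesis, b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: there is no evidence against the null hypothesis.
        return 1.0
    k = min(b, c)
    # Probability of observing k or fewer discordant pairs of the rarer kind.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Illustrative counts only: 2 vs. 8 discordant pairs.
p_value = mcnemar_exact(2, 8)
print(f"p-value: {p_value:.4f}")
```

A small p-value would suggest that the two systems differ in their error rates; with few discordant pairs, the exact binomial form shown here is usually preferred over the chi-square approximation.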

## Bibliography

#### Basic bibliography

R.H. Baayen (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics using R. Cambridge University Press

G. Strang (2019). Linear Algebra and Learning from Data. Cambridge University Press

A. Trask (2019). Grokking Deep Learning. Manning Publications Co.