Exploit your data: machine learning for multivariate data analysis (March-April 2026)

Participant profile

Doctoral students of the UPV/EHU

Calendar

Biscay Campus: March-April 2026

Duration / Timetable

20 hours (4-hour classes run over 5 weeks)

Time: 09:30 to 13:30

Attendance Requirement

Students are expected to attend at least 90% of the classes and to submit a final practical assignment (see points 3 and 5 of the Basic regulations for participation in transversal training activities organised by the Doctoral School).

ALL ABSENCES must be justified with supporting documentation.

Language

English

Modality

Face-to-face

Pre-requisites

Participants must bring their own laptops with R and the RStudio IDE installed. Basic knowledge of R is recommended.

Location and dates

Campus: Biscay Campus (Leioa)
Dates: March 6, 12, 20, 27; April 17
Location: To be confirmed

Teacher

Giulia Gorla. I hold a Bachelor's degree in Chemistry and Industrial Chemistry and a Master's degree in Chemistry from the University of Insubria (Como). In 2023, I obtained my Ph.D. in Chemical and Environmental Sciences, specializing in analytical chemistry, graduating with honors (Cum Laude) with a doctoral thesis titled "Infrared spectroscopy and Chemometrics: facing analytical chemistry issues through data." My scientific interests encompass the analysis of spectroscopic data, hyperspectral imaging, and the application and development of chemometrics, including Machine Learning and Deep Learning techniques, on several types of data. I am currently working as a postdoctoral fellow in the IBeA Research Group (Ikerkuntza eta Berrikuntza Analitikoa - Analytical Research and Innovation) within the Department of Analytical Chemistry at the University of the Basque Country (UPV/EHU). You can find more about my work experience on my LinkedIn profile page and about my research interests on my ORCID record (0000-0002-2311-9333).

Group size

25

Registration

REGISTRATION (available from January 7, 2026)
NOTICE: To participate in the school's transversal activities, you must have paid the registration fee for the 2025/26 academic year.

Competences to be acquired by the doctoral student:

  • b) Ability to conceive, design or create, implement and adopt a substantial process of research or creation.
  • d) Ability to critically analyse, evaluate and synthesise new and complex ideas.
  • f) Ability to promote, in academic and professional contexts, scientific, technological, social, artistic or cultural progress within a knowledge-based society.

Objectives

The main objective of the course is to train students in the use and understanding of machine learning techniques applied to research. Students will learn to use these techniques autonomously and apply them to their research projects. In terms of the competencies and skills to be developed, students will gain the ability to manage data acquisition, structuring, analysis, and visualization, and to critically evaluate the results obtained. Their capacity for critical, logical, and mathematical reasoning will be fostered, along with their ability to solve problems and create models that reflect real-world situations. Additionally, students will learn to design and implement experiments and to analyze and interpret the results.

Self-directed learning will be promoted, as well as the ability to communicate conclusions and knowledge clearly to both specialized and non-specialized audiences. Students will work in teams during the practical sessions and learn to select the most appropriate technique for each problem. They will be trained in the use of statistical software and in the critical evaluation of the results obtained, considering their applicability and possible limitations.

The assessment will be based on the practical work completed during the course and will culminate in a final presentation where students will apply the techniques learned to their own field of study. Mastery of the techniques and the ability to apply them effectively in their areas of interest will be expected.

Format

The course will combine theoretical lectures with practical sessions where students will work in groups or individually on real datasets to apply the techniques they have learned. Active participation and discussion of results will be encouraged to enhance learning, ensuring that students can use machine learning techniques to extract the maximum information from their data.

Content

This course is designed to provide doctoral students from various disciplines with both a theoretical and practical understanding of multivariate analysis tools. It will cover the fundamentals of experimental design, data acquisition strategies, preprocessing, and models for exploration, prediction, and classification. The course structure includes both theoretical lectures and practical sessions focused on solving real-world problems using datasets and implementing the methods in R. R is used exclusively as a computational tool: the course will not teach the R language itself, only the specific code needed to perform multivariate analyses. All necessary foundational concepts for multivariate analysis, along with the basic use of R required to apply these methods, will be introduced. The course is divided into the following modules:

  1. Introduction to the Multivariate Approach and Its Advantages
    Presentation of basic concepts and the importance of multivariate analysis across various disciplines. Discussion of the benefits of using multivariate techniques for extracting complex information.
  2. Experimental Design and Theory of Sampling
    Strategies for designing effective experiments and analyzing the results obtained.
  3. Exploratory Data Analysis
    Principal Component Analysis (PCA) for dimensionality reduction and data interpretation. Detection of outliers, types of outliers, and their impact on analysis. Residual inspection to assess model quality. Clustering techniques for identifying natural groups within the data.
  4. Data Preprocessing
    Visual inspection of data using visualization tools. Preprocessing techniques.
  5. Regression and Prediction Methods
    Introduction to multiple linear regression and multivariate regression methods: Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Evaluation of prediction models through internal and external validation.
  6. Classification Methods
    Classification techniques. Comparison and evaluation of classification models using metrics such as accuracy and precision, and tools such as ROC curves.