# Course

# Métodos de inferencia estadística / Methods of Statistical Inference

## General course information

- Mode of delivery: In person
- Language: English

## Course description and context

Methods of Statistical Inference is a compulsory subject in the second semester of the Master's in Language Acquisition in Multilingual Settings (LAMS). It is an instrumental subject that provides the statistical tools required to analyze data in the context of language acquisition. Students will therefore acquire the tools they need for the data analyses they will carry out in the other courses of this master's program, as well as in their final project or Ph.D. thesis, where applicable.

Methods of Statistical Inference requires no previous knowledge of either statistics or statistical software. Classes will take place in a computer room where the required software is available to students. Each student will have access to a computer and will thus be able to follow the instructions given in class and work, together with the professor and classmates, on developing the most appropriate statistical analysis for each situation under study.

## Teaching staff

Name | Institution | Category | Doctorate | Teaching profile | Area | Email |
---|---|---|---|---|---|---|
NUÑEZ ANTON, VICENTE ALFREDO | Universidad del País Vasco/Euskal Herriko Unibertsitatea | Full Professor (Catedrático de Universidad) | Doctor | Non-bilingual | Quantitative Methods for Economics and Business | vicente.nunezanton@ehu.eus |

## Types of teaching

Type | In-class hours | Out-of-class hours | Total hours |
---|---|---|---|
Lecture | 18 | 5 | 23 |
Seminar | 0 | 10 | 10 |
Classroom practice | 6 | 10 | 16 |
Computer practice | 6 | 20 | 26 |

## Learning activities

Activity | Hours | In-person percentage |
---|---|---|
Group discussion | 4.0 | 100 % |
Theoretical lectures | 16.0 | 100 % |
Reading and practical analysis | 10.0 | 0 % |
Computer practice | 10.0 | 100 % |
Seminars and tutorials, laboratory sessions, etc., carried out with the project supervisor | 10.0 | 0 % |
Individual and/or group work | 20.0 | 0 % |

## Assessment systems

Name | Minimum weighting | Maximum weighting |
---|---|---|
Continuous assessment through class attendance | 20.0 % | 20.0 % |
Assignments and projects | 80.0 % | 80.0 % |

## Learning outcomes

Specific skills:

- Be able to identify and distinguish the main characteristics of the different types of variables, as well as the descriptive statistical methods that can be used to describe each type. In this way, students will be able to assess the usefulness and applicability of these methods in their professional field.

- Be able to use the different estimation methods (point and confidence-interval estimation) and understand their properties, so that students can select the most appropriate analysis for the specific situation under study.

- Be able to apply the most appropriate statistical methodology for designing hypothesis tests, allowing students to make specific decisions in their professional field.

- Be able to obtain and interpret the results of specific statistical analyses applied to data in Language Acquisition by making use of the most appropriate sources of information, as well as of the required statistical or text editing software tools.

Cross-sectional skills:

- Ability to provide motivated judgments well supported by previously obtained data.

- Ability to fluently communicate orally and in writing.

- Ability to work in groups, showing respect, responsibility, initiative and leadership with the working team.

- Ability to develop analytical thinking and critical reflection.

- Ability to communicate in a foreign language, particularly in English.

Teaching Methodology:

The teaching methodology for this course will be based on lectures (L), computer software classes of exercises (CSCE) and discussion seminars (DS). The lectures will introduce the theoretical contents of the course and will encourage active student participation through take-home questions and/or exercises that students should work on outside regular class time, before the next class. In the CSCE sessions, the students and the professor will together solve exercises on real data sets drawn from the context of language acquisition; using the statistical software packages available for this course, these exercises will illustrate the theoretical contents described in the lectures. Students will also learn to interpret the corresponding computer output and extract the relevant information from it. In the discussion seminars, the professor and the students will discuss real practical settings in the context of language acquisition, illustrating the design, methodology, hypotheses under study, and interpretation of results, so that students learn the advantages and disadvantages of each decision made along the way. Overall, the activities designed for the course will help students better understand and assess the applicability of the concepts covered in class and, thus, select the most appropriate methods for each specific application.

## Ordinary call: guidelines and withdrawal

Grading Process: The final grade for this course will be based on the homework exercises that students complete throughout the course, as well as on each student's individual participation during the computer software classes of exercises (CSCE) and discussion seminars (DS). The exercises and examples covered in the CSCE sessions, discussion seminars, and lectures, together with the regularly assigned homework exercises, are all part of the students' ongoing evaluation.

Grading Process Weight:

Student's class participation: 20%

Homework exercises: 80%

Students who do not hand in all of the assigned homework exercises will receive a final grade of "Fail" ("Suspenso") for the course. Students who do not hand in any of the homework exercises will receive a final grade of "Absent" ("No presentado") for the course.

## Extraordinary call: guidelines and withdrawal

The grade for the course's second call in each academic year will, in all cases, be based entirely (100%) on a final written exam. This exam will assess all the skills developed and the contents covered in the different activities during the in-class period of the course.

## Syllabus

Objectives: The main objective of this course is for students to become familiar with the basic elements of descriptive statistics and statistical inference, as well as with the commonly used statistical techniques, and the assumptions they require, for the analysis of real data sets in the context of language acquisition. Students should also be able to use these techniques appropriately. The topics to be covered are: descriptive statistics for the different types of variables, relationships between variables, regression analysis, analysis of variance, point and interval estimation, hypothesis testing, and basic nonparametric statistical tools. The contents of this course will allow students to deal with problems of statistical inference, which are central to any real data analysis and to any study that includes a quantitative component.

DETAILED PROGRAM

CHAPTER 1. RELEVANCE OF STATISTICAL CONCEPTS AND IDEAS IN THE CONTEXT OF LANGUAGE ACQUISITION.

Introduction. Uses of Statistics. Main ideas in selecting an appropriate statistical analysis. Why do we need Statistics in Language acquisition?

References: Baayen (2008); Larson-Hall (2010); Mackey and Gass (2005); Moore (1991); Peña and Romo (1997), Chapter 1; Tanur et al. (1989).

CHAPTER 2. DIFFERENT TYPES OF STATISTICAL VARIABLES AND THEIR DESCRIPTION.

Types of variables. Variable distributions: graphs and numerical description. Qualitative variables: Frequency tables, bar graph, Pareto chart, pie chart and mode. Quantitative variables: Histogram, frequency polygon, stem and leaf plot, mean, standard deviation, coefficient of skewness, median, quartiles and interquartile range, boxplot and trimmed mean. Transformations: linear and nonlinear. Normal and Log-normal distributions.

References: Chen (2005); Larson-Hall (2010); Moore (1991), Chapter 4; Peña and Romo (1997), Chapters 2-6; Cao et al. (2001), Chapter 1.
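As an illustration of the numerical summaries listed for quantitative variables in this chapter, the following is a minimal sketch using only Python's standard library. Python is used here purely for illustration; the course itself works with SPSS, MINITAB and R. The function name `describe`, the 20% trimming proportion and the sample scores are all hypothetical.

```python
import statistics


def describe(values, trim=0.2):
    """Summarize a quantitative variable with the measures listed in Chapter 2."""
    ordered = sorted(values)
    # Quartiles computed with the module's default "exclusive" method.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    k = int(len(ordered) * trim)  # observations dropped from each tail
    trimmed = ordered[k:len(ordered) - k]
    return {
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),  # sample standard deviation (n - 1)
        "median": q2,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "trimmed_mean": statistics.mean(trimmed),
    }


# Hypothetical test scores; the value 30 is a deliberate outlier.
scores = [2, 4, 4, 5, 7, 9, 30]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With an outlier in the data, the median and trimmed mean stay close to the bulk of the scores while the mean is pulled upward, which is one reason the chapter covers both kinds of measure.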

CHAPTER 3. ACCOUNTING FOR THE RELATION BETWEEN DIFFERENT STATISTICAL VARIABLES. CONTINGENCY TABLES.

Joint distribution. Contingency tables. Marginal and conditional distributions. Studying relationships: correlation and scatterplots. Correlation and heterogeneity. Correlation and causality. Least squares regression and relationships between categorical variables. Comparing populations. Testing relationships. Correlation test. Contingency tables and tests of independence.

References: Cao et al. (2001), Chapter 2; Chen (2005); Larson-Hall (2010); Moore (1991), Chapter 5; Peña and Romo (1997), Chapters 7-10 and 23-24; Ruiz-Maya and Martín-Pliego (2005), Chapter 12.
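The test of independence for contingency tables mentioned in this chapter can be sketched as follows. This is an illustrative Python example rather than course material: the function name and the counts are hypothetical, and the decision is made by comparing the statistic with the tabulated 5% critical value for 1 degree of freedom (3.841) instead of computing a p-value.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts: rows = type of instruction, columns = passed / failed.
observed = [[30, 10],
            [20, 20]]
chi2, df, expected = chi_square_independence(observed)
CRITICAL_5PCT_DF1 = 3.841  # tabulated chi-square critical value, alpha = 0.05, 1 df
print(f"chi-square = {chi2:.3f}, df = {df}")
print("reject independence at 5%:", chi2 > CRITICAL_5PCT_DF1)
```

For tables with more rows or columns, the critical value would need to be read from the statistical tables for the corresponding degrees of freedom.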

CHAPTER 4. STATISTICAL ANALYSIS TECHNIQUES.

Regression Analysis. ANOVA and MANOVA. Repeated Measures ANOVA. Starting hypotheses, model selection and how statistical inferential questions should be addressed.

References: Baayen (2008); Chen (2005); Cryer and Miller (1991); Hair et al. (2000); Johnson and Wichern (1988); Larson-Hall (2010); Myers (1990); Peña (2002a, 2002b).

CHAPTER 5. INFERENCE: POINT AND INTERVAL ESTIMATION AND HYPOTHESIS TESTING.

Insights into the maximum likelihood and method of moments approaches to point estimation. Interval estimation. Hypothesis testing. Cases: One mean, two means, one proportion and two proportions. Nonparametric alternative testing methods.

References: Bain and Engelhardt (1992), Chapters 9, 11, 12 and 14; Cao et al. (2001), Chapters 8-11; Chen (2005); Conover (1980); Cryer and Miller (1991); Larson-Hall (2010); Peña (2001), Chapters 7-10; Peña and Romo (1997), Chapters 19-22; Ruiz-Maya and Martín-Pliego (2005), Chapters 7-13.
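Two of the cases listed in this chapter, interval estimation for one proportion and hypothesis testing for two proportions, can be sketched with the large-sample normal approximation. The Python code below is only an illustration under that assumption; the function names and the data are hypothetical, and the interval shown is the simple Wald interval, which can be inaccurate for small samples or proportions near 0 or 1.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def proportion_ci(successes, n, level=0.95):
    """Large-sample (Wald) confidence interval for one proportion."""
    p_hat = successes / n
    z_star = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


# Hypothetical data: 30 of 40 learners pass in group 1, 20 of 40 in group 2.
lower, upper = proportion_ci(30, 40)
z, p_value = two_proportion_z_test(30, 40, 20, 40)
print(f"95% CI for p1: ({lower:.3f}, {upper:.3f})")
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

Cases involving means with small samples would instead rely on the t distribution, whose quantiles are not in Python's standard library and would be taken from statistical tables or a statistics package.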

CHAPTER 6. USE OF STATISTICAL SOFTWARE: SPSS, MINITAB, R AND OTHER ALTERNATIVES.

Deciding on what statistical software to use. Basic commands. General and specific applications.

References: Baayen (2008); Chen (2005); Cryer and Miller (1991); Larson-Hall (2010); Martín et al. (2008); Pérez (2004, 2005); Ryan et al. (1985); Ugarte et al. (2009).

## Bibliography

#### Required materials

The required material for this course, available on eGela, includes:

- Class notes

- SPSS commands material provided for this course

- MINITAB commands material provided for this course

- Statistical tables

- Datasets, both in SPSS and MINITAB format, provided for this course

- Homework material provided for this course

#### Basic bibliography

Baayen, R.H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics using R. Cambridge: Cambridge University Press.

Bain, L.J. and Engelhardt, M. (1992). Introduction to Probability and Mathematical Statistics. Second Edition. New York: Duxbury Press.

Cao, R. et al. (2001). Introducción a la Estadística y sus Aplicaciones. Madrid: Pirámide.

Chen, R. (2005). Research methodology; Quantitative approaches. In Mind & Context in Adult Second Language Acquisition (C. Sanz, Ed.). Washington, DC: Georgetown University Press, 21-68.

Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. New York: Wiley.

Cryer, J.D. and Miller, R.B. (1991). Statistics for Business: Data Analysis and Modelling. Boston: PWS-Kent Publishing Company.

Hair, J.F. Jr, Anderson, R.E., Tatham, R.L. and Black, W.C. (2000). Análisis Multivariante. Quinta Edición. Madrid: Prentice Hall.

Johnson, R.A. and Wichern, D.W. (1988). Applied Multivariate Statistical Analysis. Second Edition. New York: Prentice Hall.

Larson-Hall, J. (2010). A Guide to Doing Statistics in Second Language Research Using SPSS. London: Routledge.

Mackey, A. and Gass, S.M. (2005). Second Language Research. Methodology and Design. London: Lawrence Erlbaum Associates, Publishers.

Martín, Q., Cabero, Ma.T. and de Paz, Y. (2008). Tratamiento de Datos con SPSS. Prácticas Resueltas y Comentadas. Madrid: Thomson.

Moore, D.S. (1991). Statistics. Concepts and Controversies. Third Edition. New York: W.H. Freeman and Company.

Myers, R. (1990). Classical and Modern Regression with Applications. Second Edition. Boston: PWS-KENT Publishing Company.

Peña, D. (2001). Fundamentos de Estadística. Madrid: Alianza Editorial.

Peña, D. (2002a). Regresión y Diseño de Experimentos. Madrid: Alianza Editorial.

Peña, D. (2002b). Análisis de Datos Multivariantes. Madrid: Alianza Editorial.

Peña, D. and Romo, J. (1997). Introducción a la Estadística para las Ciencias Sociales. Madrid: McGraw Hill.

Pérez, C. (2004). Técnicas de Análisis Multivariante de Datos. Aplicaciones con SPSS. Madrid: Pearson. Prentice Hall.

Pérez, C. (2005). Técnicas Estadísticas con SPSS 12. Aplicaciones al Análisis de Datos. Madrid: Pearson. Prentice Hall.

Ruiz-Maya, L. and Martín-Pliego, F.J. (2005). Fundamentos de Inferencia Estadística. 3ª edición. Madrid: Thomson.

Ryan, B.F., Joiner, B.L., Ryan Jr., T.A. (1985). Minitab Handbook. Second Edition. Boston: PWS-Kent Publishing Company.

Tanur, J.M. et al. (1989). Statistics. A Guide to the Unknown. Third Edition. Pacific Grove, California: Wadsworth & Brooks/Cole Advanced Books & Software.

Ugarte, M.D., Militino, A.F. and Arnholt, A.T. (2009). Probability and Statistics with R. Second Edition. New York: Chapman and Hall.

#### Further reading

Baayen, R.H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics using R. Cambridge: Cambridge University Press.

Bain, L.J. and Engelhardt, M. (1992). Introduction to Probability and Mathematical Statistics. Second Edition. New York: Duxbury Press.

Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. New York: Wiley.

Johnson, R.A. and Wichern, D.W. (1988). Applied Multivariate Statistical Analysis. Second Edition. New York: Prentice Hall.

Myers, R. (1990). Classical and Modern Regression with Applications. Second Edition. Boston: PWS-KENT Publishing Company.

Peña, D. (2001). Fundamentos de Estadística. Madrid: Alianza Editorial.

Peña, D. (2002a). Regresión y Diseño de Experimentos. Madrid: Alianza Editorial.

Peña, D. (2002b). Análisis de Datos Multivariantes. Madrid: Alianza Editorial.

Ruiz-Maya, L. and Martín-Pliego, F.J. (2005). Fundamentos de Inferencia Estadística. 3ª edición. Madrid: Thomson.

Ugarte, M.D., Militino, A.F. and Arnholt, A.T. (2009). Probability and Statistics with R. Second Edition. New York: Chapman and Hall.

#### Journals

- The American Statistician

- Chance

- Significance

- Applied Statistics

- Journal of Applied Statistics

- Statistical Modelling

#### Links

- American Statistical Association (http://www.amstat.org/)

- The Royal Statistical Society (https://www.rss.org.uk/)

- International Statistical Institute (https://isi-web.org/)

- Statistical Modelling Society (http://www.statmod.org/)