Thesis: Behavior modelling with data obtained from the Internet and contributions to cluster validation (I. Perona, 2016/02/05)


Author: Iñigo Perona Balda

Title: Behavior modelling with data obtained from the Internet and contributions to cluster validation

  •  Abstract: This PhD thesis contributes to modelling behaviours found in different types of data acquired from the Internet and to the field of clustering evaluation. Two types of Internet data were processed: Internet traffic, with the objective of attack detection, and web surfing activity, with the objective of web personalization. Both kinds of data are sequential in nature. To this end, machine learning techniques, mostly unsupervised, were applied. Moreover, contributions were made in cluster evaluation to make it easier to select the best partition in clustering problems.
    With regard to network attack detection, the gureKDDCup database was generated first; it adds payload data to the KDDCup99 connection attributes, because payload data is essential for detecting non-flood attacks. Then, by modelling this data, a network Intrusion Detection System (nIDS) was proposed in which context-independent payload processing achieved satisfactory detection rates.
    In the web mining context, web surfing activity was modelled for web personalization. Generic and non-invasive systems to extract knowledge were proposed, using only the information stored in web server log files. Contributions were made in two areas: problem detection and link suggestion. For the former, a meaningful list of navigation attributes was proposed to group and detect different navigation profiles. For the latter, a general and non-invasive link suggestion system was proposed and evaluated in a link prediction context.
    With regard to the analysis of Cluster Validity Indices (CVIs), the most extensive CVI comparison found up to that moment was carried out, using an evaluation methodology based on a partition similarity measure. Moreover, we analysed the behaviour of CVIs in a real web mining application and with a large number of clusters, a setting in which they tend to be unstable, and proposed a procedure that automatically selects the best partition by analysing the slope of different CVI values.

Supervisors: Olatz Arbelaitz, Javier Muguerza

Date: 5 February 2016
Time: 10:30
Venue: Sala de grados David Cato, Faculty of Law
