020IABDM2

Mining Massive Data Sets

This teaching unit covers the fundamentals of designing dedicated software systems for analytical processing of large data sets. The course begins with the design principles of relational database systems for business data analysis, including declarative queries, query optimization and transaction management, and traces how these systems evolved to support complex analytical problems and scientific data management. The course then looks at the fundamental architectural changes needed to process data at a scale beyond the limits of a single computer, including parallel databases, MapReduce, column stores and distributed key-value stores, as well as systems that compute low-latency analytical results over real-time data streams. Finally, the course examines advanced data management systems that support a variety of data models, including tree-structured data (XML and JSON) and graph-structured data (RDF), and new workloads such as machine learning tasks (Spark) and mixed workloads (Google Cloud Dataflow).


Contact hours: 30 hours


Student workload: 70 hours


Assessment method(s): Final exam, projects, practical work

This course is offered in the following degree programs
 Master in Data Sciences