COURSE SPECIFICATION

The course information below may be subject to change, either during the session because of unforeseen circumstances or following a review of the course at the end of the session. Questions about the course should be directed to the course instructor.

 

Course Title

Data Science

ECTS Credits

6

Teaching Language

English

Instructor(s), Affiliation

To be confirmed

Delivery Method and Hours

Lectures: 35 contact hours

Tutorials: 10 contact hours

Independent Study Hours: 105

Total: 150

Pre-requisites or Other Academic Requirements

Programming:

Familiarity with the basics of Python

Python 3.7 or 3.8, installed as part of the Anaconda Python distribution for data science, or an equivalent setup.

Libraries: scipy, numpy, matplotlib, pandas, sklearn
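One possible way to confirm that the libraries listed above are available before the course starts is a short check script. This is only a sketch and not part of the official course materials: the function name check_environment is hypothetical, and it assumes that each library is imported under the name given in the list (sklearn is the import name of scikit-learn).

```python
import importlib
import sys

# Import names of the libraries listed in the prerequisites.
REQUIRED = ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]


def check_environment(modules=REQUIRED):
    """Map each module name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = None
    return report


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in check_environment().items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Any library reported as NOT INSTALLED would need to be installed, for example through the Anaconda distribution, before the first tutorial.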

Mathematics and statistics:

Working knowledge of linear algebra, calculus, basic probability and statistics

 

SYLLABUS

Course Objectives

This course introduces students to the mathematical foundations of data analysis and to visualising and analysing data with Python. The course has three main parts. The first part introduces the statistical basis of data analysis, including the mean and variance of random variables, Bayes’ formula, the central limit theorem, linear regression, confidence intervals, and hypothesis testing (z-test and t-test). The second part covers data cleaning and visualisation, including creating plots of various types, with many examples in Python. The final part introduces the basics of machine learning for data analysis, covering the k-nearest neighbours algorithm, regression using machine learning, and principal component analysis, with emphasis on both mathematical understanding and the ability to apply these methods in Python.

Learning Outcomes

By the end of the course, the student should be able to

       - visualise, analyse and clean data sets using Python;

       - formulate hypotheses about data and test them;

       - fit regression models to data;

       - use machine learning to build models of complex data sets and make accurate predictions;

       - explain the mathematics behind these methods.

Textbook and Supplementary Readings

"Mathematical Foundations for Data Analysis". Jeff M. Phillips. Springer.

"Mathematics for Machine Learning". Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong. Cambridge University Press.

Assessment

Assessment is by a final programming project, which counts for 100% of the final grade.

Grading System

Letter Grading