WHAT WE DO
Provide the student with the correct intuition behind data science problems and some of the algorithms to solve them, including:
The geometric interpretation
Both theoretical and practical limitations
Comparison with other algorithms
Provide the student with the necessary language to translate fluently:
The problems of data science into the mathematical language used in machine learning.
The algorithms presented in the literature (whether in scientific articles or textbooks) into solutions for specific problems.
Block one is focused on two main objectives:
Using three algorithms (perceptron, linear regression and logistic regression), introduce the student to the methods and language of data science.
Make an accurate diagnosis of the student's level in order to plan the remaining blocks better.
1. Perceptron (Classification)
Statement of a binary classification problem.
Stages of a learning problem.
Geometric interpretation of linear classification
Algebraic formulation of linear classification
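As an illustration of the algebraic formulation, the following is a minimal sketch of the classic perceptron learning rule, assuming labels in {-1, +1} and a small, linearly separable toy dataset. The function names and the data are hypothetical and chosen only for this example.

```python
def train_perceptron(points, labels, epochs=100):
    """Perceptron learning rule for labels in {-1, +1}."""
    w = [0.0] * len(points[0])  # normal vector of the separating hyperplane
    b = 0.0                     # bias (offset of the hyperplane)
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every point is on the correct side
            break
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable toy data.
points = [(2.0, 1.0), (3.0, 2.0), (-1.0, -1.0), (-2.0, -3.0)]
labels = [1, 1, -1, -1]
w, b = train_perceptron(points, labels)
predictions = [predict(w, b, x) for x in points]
```

Geometrically, each update rotates and shifts the hyperplane toward the misclassified point; on separable data the loop stops once no mistakes remain.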
2. Linear regressions (Forecasting)
Statement of a regression problem
Exact solution and matrix algebra
Approach using the gradient method
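The two approaches listed above can be compared on synthetic data. This is a sketch assuming NumPy, a mean-squared-error loss, and hypothetical values for the true coefficients, the learning rate and the iteration count.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 1 + 2x plus a little noise.
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, size=50)])
true_w = np.array([1.0, 2.0])
y = X @ true_w + 0.01 * rng.normal(size=50)

# Exact solution: solve the normal equations (X^T X) w = X^T y.
w_exact = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient method: descend the mean squared error from a zero start.
w_gd = np.zeros(2)
learning_rate = 0.1  # assumed step size, small enough for this data
for _ in range(5000):
    gradient = (2.0 / len(y)) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient
```

On a small, well-conditioned problem like this one both routes reach essentially the same coefficients; the gradient method becomes attractive when forming or solving the normal equations is too expensive.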
3. Logistic regression (Bayesian inference)
Binary classification using logistic regression
Sigmoid function and interpretation
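To make the interpretation of the sigmoid concrete, here is a small sketch: the linear score w.x + b is read as the log-odds of the positive class, and the sigmoid converts it into a probability. The weights, bias and input below are hypothetical values, not fitted parameters.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Probability of the positive class for input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


def log_odds(p):
    """Inverse of the sigmoid: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical parameters; the linear score is 1.5*2.0 - 0.5*1.0 + 0.25 = 2.75.
w, b = [1.5, -0.5], 0.25
p = predict_proba(w, b, [2.0, 1.0])
```

A score of zero corresponds to probability 0.5, so the decision boundary of the classifier is again a hyperplane, as in the perceptron.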
The main objective of block two is to continue the study of the algorithms from block one, as well as to introduce the first non-parametric and unsupervised algorithms.
Decision trees generalize the perceptron by allowing non-linear classification, and with them we will begin the study of non-parametric algorithms.
The PCA method will be the first example of an unsupervised algorithm that we will study, in addition to reinforcing the idea of correlation studied in the previous block.
Finally, we will begin the study of proximity algorithms, which, in addition to being the second unsupervised and non-parametric example, will allow us to introduce the idea of clustering.
1. Decision trees
What is a decision tree?
Entropy and Gini function
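The two impurity measures named above can be sketched as follows, assuming the usual definitions: Shannon entropy in bits and Gini impurity, both computed from the class proportions at a node. The helper names are hypothetical.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class present at a node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


mixed_node = ["a", "a", "b", "b"]  # maximally impure for two classes
pure_node = ["a", "a", "a", "a"]   # a single class, no impurity
```

Both functions are zero on a pure node and largest when the classes are evenly mixed, which is why a tree chooses the split that reduces them the most.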
2. Principal component analysis (PCA)
Interpretation in terms of variance
Interpretation in terms of distance
Relationship to linear algebra
Singular value decomposition
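The link between the variance interpretation and the singular value decomposition can be checked numerically. This sketch assumes NumPy and hypothetical 2-D data stretched along one axis; it compares the variances obtained from the SVD of the centered data with the eigenvalues of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 points with much more spread along the first axis.
data = rng.normal(size=(200, 2)) * np.array([3.0, 0.5])
centered = data - data.mean(axis=0)

# SVD of the centered data; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained_variance = S**2 / (len(data) - 1)

# The same variances as eigenvalues of the sample covariance matrix.
covariance = np.cov(centered, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(covariance))[::-1]

# Coordinates of every point along the first principal direction.
projected = centered @ Vt[0]
```

The variance of the projected coordinates equals the largest eigenvalue, which is the sense in which the first component captures the most variance.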
3. Proximity and clustering algorithms
Euclidean distances and other metrics
The curse of dimensionality
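A small experiment can illustrate both topics. The sketch below defines two metrics and then measures, for random points in the unit cube, how much farther the farthest point is than the nearest one. The sample size, the dimensions and the helper name relative_contrast are assumptions made for this example.

```python
import math
import random


def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = [
        euclidean(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)


contrast_low = relative_contrast(2)     # low-dimensional space
contrast_high = relative_contrast(500)  # high-dimensional space
```

In high dimension the distances concentrate around a common value, so the contrast shrinks and the notion of a nearest neighbour becomes less informative.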
Block three has three objectives:
Firstly, we seek to introduce the concept of regularization in machine learning, which is essential to compare algorithms through their generalization capacity.
The second objective is to expand the palette of algorithms that the student understands by means of two fundamental techniques for classification and forecasting: neural networks and time series.
Finally, we begin the presentation and analysis of another family of useful and common algorithms in machine learning, the so-called stochastic algorithms. We will focus on their relationship with neural networks, linear regressions and decision trees, and we will complement this block with an invitation to boosting.
1. Regularization in Machine Learning
Fitting vs overfitting
In linear regressions
In decision trees: pruning
Perceptron: support vector machines
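For the linear regression case, one common form of regularization is an L2 penalty (ridge regression). The outline does not specify which penalty the course uses, so this choice is an assumption. The sketch below uses NumPy, hypothetical data and a hypothetical penalty strength.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y; lam = 0 gives ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


rng = np.random.default_rng(0)

# Hypothetical data: 30 samples, 5 features, known coefficients plus noise.
X = rng.normal(size=(30, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=30)

w_ols = ridge_fit(X, y, lam=0.0)     # unregularized fit
w_ridge = ridge_fit(X, y, lam=10.0)  # penalized fit with smaller coefficients
```

The penalty shrinks the coefficient vector toward zero, trading a slightly worse fit on the training data for less sensitivity to noise.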
2. Invitation to Deep learning
Neural network architectures
Convolution and its interpretation: CNN
3. Stochastic algorithms
Stochastic gradient descent (regressions and neural networks)
Random forests (decision trees)
4. Invitation to time series
Components of a time series
White stochastic noise