# Overview of Machine Learning

Below is a high-level overview of Machine Learning.

Machine Learning is a branch of Artificial Intelligence: statistical tools for learning from data in order to derive predictive insights. In short, pattern recognition from examples.

## Terminology

- Label: a correct output for an input; a fact or true answer
- Input: a variable used to predict the label
- Example: a set of inputs and a corresponding label
- Model: a mathematical function that takes input variables and tries to approximate the label
- Training: adjusting the model to minimise the error between the approximated label and the actual label; the process of optimising the weights, including gradient descent and evaluation
- Prediction: the output of the model on unlabelled data
- Hyper-parameter tuning
- Back-propagation
- Epoch: one traversal through the entire training set
- Gradient Descent: optimisation; the process of reducing the error
- Batch size: the number of examples that the error is computed on
- Weights: the parameters of a function that are optimised
- Evaluation: periodically determining whether the model is good enough, based on a set of metrics
- Softmax: helps handle multiple labels (classes); all output values are normalised to sum to one
- Over-fitting: the model does not generalise well to unseen examples
- Under-fitting: the model is too inaccurate
- Feature Engineering: using insights to calculate or engineer extra features/inputs
- Neuron: one unit that combines inputs
- Activation Function
- Hidden Layer: a set of neurons that operate on the same set of inputs
- Features: transformations of inputs, typically using an Activation Function
- Ground Truth
- Error = ground truth value - prediction value
- Root Mean Squared Error (RMSE): for Regression; square each error (so it becomes non-negative), take the mean, then take the square root
- Cross-Entropy: a differentiable error value for Classification; the log loss
- Confusion Matrix: for evaluating a model; True Positives (TP), False Positives (FP), False Negatives (FN), True Negatives (TN)
- Accuracy: an intuitive measure of skill for classifiers if the
dataset is balanced; the fraction of predictions that are correct (fails if the dataset is unbalanced)
- Precision: use when what you are trying to find is common; the accuracy when the classifier says positive; positive predictive value = TP / (TP + FP) (good if the dataset is unbalanced)
- Recall: use when what you are trying to find is rare; the accuracy when the truth is positive; true positive rate = TP / (TP + FN) (good if the dataset is unbalanced)
- Training Dataset
- Validation Dataset
- Test Dataset
- Cross-validation: if test data is scarce, use different splits of the training and validation datasets
- Dense features: continuous numbers; Neural Networks are good for these
- Sparse/Wide features: independent, discrete, categorical values; feature cross pairs; Linear models are good for these

## ML Steps

1. Explore the data
2. Split data into train/validation/test datasets
3. Benchmark the performance to be obtained

## Classes of Machine Learning

### Supervised Learning

Learning from past examples to predict future values.

- Model Types
  - Regression: the label has a continuous real value
  - Classification: the label has a discrete set of values or classes
- Datasets: what makes a good dataset?
  - Positive examples
  - Negative examples
  - Negative examples that are near misses
  - Exhaustive coverage of examples
  - Examples of outliers, so they can be learned and handled gracefully
- Neural Networks
  - Single neuron line function: w1.x1 + w2.x2 > bias ?
  - Optimisation: Gradient Descent; iteratively reducing the error between the output and the label

### Unsupervised Learning

Using unlabelled data to discover relationships between data.

- Clustering

### Semi-supervised Learning

## Applications of Machine Learning

- Natural Language Processing (NLP): the processing of any natural language in order to understand both its grammatical syntax and its semantics
- Computer Vision: methods for acquiring, processing, analysing and reasoning about images or video sequences in order to extract meaningful, useful information that can be interpreted and acted upon as desired
- Robotics

## Deep Learning

- Deep Neural Networks

## Code Libraries

- Python Libraries
  - PyTorch
  - FastAI
  - TensorFlow
- TODO

## Classical Machine Learning

- TODO

## AI Adoption Strategy

1. Preference 1: use pre-built AI services/models
2. Preference 2: customise pre-built AI services/models
3. Preference 3: create new models. Rule of thumb: only when you have > 100k high-quality examples

## Workflow Options

- Kubeflow Pipelines
- TODO

## Feature Engineering Tips

- Have a reasonable hypothesis for why a specific feature may be relevant to the problem; otherwise discard it.
- The feature value must be known at the time a prediction is needed. Don't use historical data that was only determined later; be careful when training on data from a Data Warehouse!
- Ensure the feature data is legal and ethical to use.
- Feature values need to be numerical WITH a meaningful magnitude, or at least representable in numeric form as a vector.
- There must be enough examples of each feature input value, e.g. at least 5 examples or samples; for real values you may need to group/bin them together.
- Discard values that are too specific, like a transaction id.
- One-hot encode categorical values: a vector/list representing each input category, where only one item in the list has a value of 1 and the others are zero.
- Build the vocabulary of category keys during training pre-processing.
- Don't mix magic numbers representing missing data (e.g.
null or -1) with real data; perhaps have 2 values instead: one for whether the value was provided and another for the actual value (or zero if it was not provided).
- Use feature crosses, guided by intuition. For example, a yellow car in New York is likely a taxi, so combine the 2 features so that a yellow car in another city is not misrepresented because of the training data from New York. Another example: bucketise Latitude/Longitude into 0.1 degree buckets and do a feature cross, which is essentially the same as putting lat/long points onto grid cells.
- Use a wide and deep network if you have both dense and sparse features.
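
The tips on one-hot encoding, building a vocabulary, and keeping missing values separate from real data can be sketched in plain Python. This is a minimal illustration, not a library API: the helper names (`build_vocabulary`, `one_hot`, `with_missing_indicator`) and the choice to encode unseen categories as all zeros are assumptions made for the example.

```python
def build_vocabulary(values):
    """Map each distinct category seen in the training data to a stable index."""
    return {key: index for index, key in enumerate(sorted(set(values)))}


def one_hot(value, vocabulary):
    """Return a list with a single 1 at the category's index and zeros elsewhere."""
    encoded = [0] * len(vocabulary)
    if value in vocabulary:  # assumption: unseen categories become all zeros
        encoded[vocabulary[value]] = 1
    return encoded


def with_missing_indicator(value):
    """Split a possibly-missing number into (was_provided, value_or_zero)."""
    if value is None:
        return (0, 0.0)
    return (1, float(value))


# Hypothetical training column of car colours.
training_colours = ["yellow", "black", "yellow", "white"]
colour_vocab = build_vocabulary(training_colours)

print(one_hot("yellow", colour_vocab))   # [0, 0, 1]
print(with_missing_indicator(None))      # (0, 0.0)
print(with_missing_indicator(3))         # (1, 3.0)
```

Building the vocabulary once from the training data keeps the index of each category fixed between training and prediction.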
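
The feature-cross tip (bucketise latitude/longitude into 0.1 degree cells, then combine with another feature such as car colour) might look like the sketch below. The function names and the `colour|lat|lon` key format are hypothetical; a real pipeline would typically hash or one-hot encode the resulting key.

```python
import math


def bucketise(value, width=0.1):
    """Return the integer index of the bucket of the given width that contains value."""
    return math.floor(value / width)


def grid_cell(latitude, longitude, width=0.1):
    """Place a lat/long point onto a grid cell of width x width degrees."""
    return (bucketise(latitude, width), bucketise(longitude, width))


def feature_cross(colour, latitude, longitude):
    """Combine colour and grid cell into one categorical feature value."""
    lat_bucket, lon_bucket = grid_cell(latitude, longitude)
    return f"{colour}|{lat_bucket}|{lon_bucket}"


# Two nearby points (illustrative coordinates) fall into the same cell,
# while the same colour in a distant location gets a different crossed value.
print(feature_cross("yellow", 40.75, -73.98))
print(feature_cross("yellow", 40.78, -73.97))
print(feature_cross("yellow", 51.55, -0.12))
```

Because the crossed value is categorical, the model can learn a separate weight for "yellow in this cell" without that weight leaking to yellow cars elsewhere.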
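
Returning to the evaluation metrics defined under Terminology, the formulas for accuracy, precision = TP / (TP + FP), recall = TP / (TP + FN) and RMSE can be checked with a small sketch. The data is made up for illustration, and returning 0.0 when a denominator is zero is an assumption of this example rather than a standard.

```python
import math


def confusion_counts(labels, predictions):
    """Count TP, FP, FN, TN for binary labels encoded as 0 or 1."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0  # assumption: 0.0 when undefined


def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0  # assumption: 0.0 when undefined


def rmse(truths, predictions):
    """Square each error, take the mean, then take the square root."""
    squared = [(t - p) ** 2 for t, p in zip(truths, predictions)]
    return math.sqrt(sum(squared) / len(squared))


# Unbalanced toy dataset: only 2 of 10 examples are positive.
labels      = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
predictions = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(labels, predictions)

print(accuracy(tp, fp, fn, tn))  # 0.8
print(precision(tp, fp))         # 0.5
print(recall(tp, fn))            # 0.5
print(rmse([3.0, 5.0], [2.0, 7.0]))
```

On this unbalanced data the accuracy looks healthy at 0.8 even though only half of the positives are found, which is why precision and recall are preferred in that situation.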
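
The single neuron and gradient descent items under Neural Networks can also be illustrated in a few lines. This is a sketch under stated assumptions: the neuron uses one weight per input, and the gradient descent loop fits a one-input linear model by minimising mean squared error over the full training set (batch size equal to the dataset size). The learning rate, epoch count and data are arbitrary choices for the example.

```python
def neuron_fires(inputs, weights, bias):
    """Single neuron as a line function: is the weighted sum of inputs above the bias?"""
    return sum(w * x for w, x in zip(weights, inputs)) > bias


def train_linear(examples, learning_rate=0.1, epochs=500):
    """Fit prediction = weight * x + offset by gradient descent on mean squared error."""
    weight, offset = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):  # one epoch = one traversal of the entire training set
        grad_weight = 0.0
        grad_offset = 0.0
        for x, label in examples:
            error = label - (weight * x + offset)  # ground truth - prediction
            grad_weight += -2 * error * x / n
            grad_offset += -2 * error / n
        # Step against the gradient to reduce the error.
        weight -= learning_rate * grad_weight
        offset -= learning_rate * grad_offset
    return weight, offset


# With these hand-picked weights the neuron behaves like a logical AND.
print(neuron_fires([1.0, 1.0], [0.6, 0.6], 1.0))  # True
print(neuron_fires([1.0, 0.0], [0.6, 0.6], 1.0))  # False

# Toy examples generated from label = 2 * x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, offset = train_linear(examples)
print(weight, offset)  # close to 2 and 1
```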