Key Points
- Introduction to Machine Learning & Deep Learning
- Deep Learning to Identify Smartphone Applications
- Overview of Deep Neural Network Concepts
- An Introduction to Keras with a Binary Classification Task
- Classifying Smartphone Apps with Keras
- Tuning Neural Network Models for Better Accuracy
- Effective Deep Learning Workflow on HPC
- Post-Analysis for Model Tuning Experiments
- Using Multicore CPUs or GPUs for Keras Computation
- Dealing with Issues in Data and Model Training
- Deep Learning in the Real World
Glossary
- activation function
- A mathematical function that introduces non-linearity into a neural network (NN) model. (See the first sketch after this glossary.)
- artificial neural network
- A type of machine learning model consisting of connected artificial neurons, inspired by neurons in the brain. Often abbreviated to just “neural network.”
- backpropagation
- An algorithm that updates/corrects the weights so as to bring the predicted outcome closer to the expected outcome.
- baseline model
- A simple benchmark model whose reasonable results act as a standard, or basis, for evaluating the performance of other models.
- batch size
- A hyperparameter that dictates the number of training samples used simultaneously in one iteration of model training. This hyperparameter mostly affects the speed of training. (See the compile-and-fit sketch after this glossary.)
- bias
- A type of error in prediction resulting from wrong assumptions during training. There are various types of bias that can lead to poor performance on new data. High-bias models do not capture the training dataset closely and result in underfitting.
- convolutional neural network
- A type of neural network specializing in image-related tasks, designed to capture local correlations/patterns in data that has spatial, temporal, or sequential order.
- convolution layer
- Applies a filter, or kernel (a set of weights), to the input to create a feature map. (See the Conv2D sketch after this glossary.)
- features
- The attributes of the data used for training and testing ML algorithms. Also simply referred to as the “input.”
- forward propagation
- Calculates the output based on the current weights (and biases).
- hidden layer
- A layer of (artificial) neurons between the input and output layers. These layers introduce non-linearity into the model.
- hyperparameters
- “Settings,” or variables, that are set before training and control the learning process of a model. Hyperparameters govern how the model learns and adjusts the parameters. Hyperparameters for a neural network include the number of layers and the number of neurons in each layer, the learning rate, the activation function, the optimizer, and many more.
- inference
- The use of a trained model to make predictions or generate outputs on new, unseen data; in other words, deploying the model. (See the prediction sketch after this glossary.)
- input layer
- The first layer of a neural network that receives the input data and passes it to the next layer.
- learning rate
- A hyperparameter that sets the step size, i.e., the amount by which the machine learning model adjusts its parameters at each step.
- loss function
- Measures how well a model’s (current) output/predictions compare with the actual target values (also referred to as ground-truth labels). Also known as a cost or error function.
- malware
- A type of software designed to harm the user; this can include damaging a device, spying, or stealing personal data.
- neural network
- See Artificial neural network.
- optimizer
- An algorithm used to iteratively improve the model during training. It adjusts the parameters (weights and biases) according to the loss. (See the compile-and-fit sketch after this glossary.)
- output layer
- The final layer in a neural network that generates the prediction or output.
- overfitting
- Occurs when a machine learning model becomes overly attuned to noise and specific details of the training data, leading to poor performance on new data. Models with high variance are prone to this phenomenon. (See the early-stopping sketch after this glossary.)
- parameters
- Internal variables that are adjusted during training; refers to the weights and biases.
- shape
- The shape of a tensor or vector refers to its dimensionality. The shape of a 1-D tensor (a vector) is its sequence length. Can also be referred to as “size.”
- underfitting
- Occurs when a machine learning model fails to capture the underlying pattern of the training data. This typically happens when the model lacks sufficient complexity or is not trained long enough.
- variance
- A type of error resulting from following the behavior of the training data (including its noise) too closely. High variance can prevent the model from generalizing well to new data; too much variance leads to overfitting.
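To make several of the terms above concrete, the sketches below use Keras, the library featured in this course. First, a minimal sketch of activation functions, assuming TensorFlow/Keras is installed: ReLU and sigmoid map the same inputs through different non-linearities.

```python
import numpy as np
from tensorflow import keras

x = np.array([-2.0, -0.5, 0.0, 1.5], dtype="float32")

# ReLU clips negative values to zero; sigmoid squashes values into (0, 1).
print(keras.activations.relu(x).numpy())     # [0.  0.  0.  1.5]
print(keras.activations.sigmoid(x).numpy())  # e.g., sigmoid(0.0) = 0.5
```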
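Next, a compile-and-fit sketch that ties together the input, hidden, and output layers, the loss function, the optimizer and its learning rate, and the batch size. The toy dataset, layer sizes, and hyperparameter values are illustrative assumptions, not recommendations.

```python
import numpy as np
from tensorflow import keras

# Toy binary-classification data: 200 samples with 8 features each,
# labeled 1 when the features sum to a positive value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")  # ground-truth labels

model = keras.Sequential([
    keras.layers.Input(shape=(8,)),               # input layer
    keras.layers.Dense(16, activation="relu"),    # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # output layer
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
    loss="binary_crossentropy",                           # loss function
    metrics=["accuracy"],
)

# Each batch of 32 samples goes through forward propagation, the loss is
# computed, and backpropagation updates the weights once per batch.
model.fit(X, y, batch_size=32, epochs=5, validation_split=0.2)
```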
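A convolution layer can be sketched the same way: a Conv2D layer slides small kernels (sets of weights) over the input and produces one feature map per filter. The image size and filter count here are arbitrary choices.

```python
import numpy as np
from tensorflow import keras

# A batch of one 28x28 single-channel "image" of random pixels.
image = np.random.rand(1, 28, 28, 1).astype("float32")

# Eight 3x3 kernels, each producing one feature map.
conv = keras.layers.Conv2D(filters=8, kernel_size=3, activation="relu")
feature_maps = conv(image)
print(feature_maps.shape)  # (1, 26, 26, 8): spatial size shrinks by kernel_size - 1
```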
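Overfitting is often diagnosed by watching the validation loss rise while the training loss keeps falling. One common guard, shown here continuing from the compile-and-fit sketch (it reuses model, X, and y), is Keras's EarlyStopping callback:

```python
# Stop training once the validation loss stops improving, and keep the
# weights from the best epoch, to avoid fitting noise in the training data.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)
model.fit(X, y, epochs=50, batch_size=32,
          validation_split=0.2, callbacks=[early_stop])
```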
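Finally, a prediction sketch of inference: calling the trained model from the sketches above on new, unseen samples.

```python
# The sigmoid output layer yields probabilities in (0, 1);
# a 0.5 threshold turns them into class labels.
X_new = rng.normal(size=(3, 8)).astype("float32")
probs = model.predict(X_new)
labels = (probs > 0.5).astype(int)
print(probs.ravel(), labels.ravel())
```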
Further Reading
Other Courses
- Howard, Jeremy. “Practical Deep Learning for Coders.” 2022. https://course.fast.ai/.
  A course that teaches how to apply deep learning and machine learning to practical problems. This course assumes some coding experience.
- Ng, Andrew. “AI for Everyone.” Coursera. https://www.coursera.org/learn/ai-for-everyone/.
  A good overview course on AI that is not too technical.
Building neural networks
- Belcic, Ivan, and Cole Stryker. “What is Learning Rate in Machine Learning.” IBM, 27 November 2024. https://www.ibm.com/think/topics/learning-rate.
  A more in-depth explanation of learning rates: why the learning rate is important and how to determine the optimal learning rate.
- Coursera Staff. “What Does Batch Size Mean in Deep Learning? An In-Depth Guide.” Coursera, 23 April 2025. https://www.coursera.org/articles/what-does-batch-size-mean-in-deep-learning.
  A deeper look at the batch size hyperparameter, its impact, and how to find the optimal batch size.
- “Pipelines, Mind Maps and Convolutional Neural Networks.” Towards Data Science. https://towardsdatascience.com/pipelines-mind-maps-and-convolutional-neural-networks-34bfc94db10c.
  This article describes how a data scientist used a “mind map” to keep track of each change made to a network (and the effect of each change), applying the discipline needed to reach a better-performing network more efficiently.
Optimizers
- Amananandrai. “10 famous Machine Learning Optimizers.” Dev.to, 3 May 2024. https://dev.to/amananandrai/10-famous-machine-learning-optimizers-1e22.
- Zohrevand, A., and Z. Imani. “An Empirical Study of the Performance of Different Optimizers in the Deep Neural Networks.” 2022 International Conference on Machine Vision and Image Processing (MVIP), Ahvaz, Iran, 2022, pp. 1-5. doi: 10.1109/MVIP53647.2022.9738743. https://ieeexplore.ieee.org/document/9738743.
- “Linear Regression: Gradient Descent.” Google Machine Learning Education: Machine Learning Crash Course, 8 November 2024. https://developers.google.com/machine-learning/crash-course/linear-regression/gradient-descent.
- Ruder, Sebastian. “An overview of gradient descent optimization algorithms.” Ruder.io, 2016. arXiv preprint arXiv:1609.04747. https://www.ruder.io/optimizing-gradient-descent/#adagrad.