Key Points
- Introduction to Machine Learning & Deep Learning
- Deep Learning to Identify Smartphone Applications
- Overview of Deep Neural Network Concepts
- An Introduction to Keras with a Binary Classification Task
- Classifying Smartphone Apps with Keras
- Tuning Neural Network Models for Better Accuracy
- Effective Deep Learning Workflow on HPC
- Post-Analysis for Model Tuning Experiments
- Using Multicore CPUs or GPUs for Keras Computation
- Dealing with Issues in Data and Model Training
- Deep Learning in the Real World
Glossary
- activation function
- A mathematical function that introduces non-linearity into a neural network (NN) model. (See the first sketch after this glossary.)
- artificial neural network
- A type of machine learning model consisting of connected artificial neurons, inspired by neurons in the brain. Often abbreviated to just “neural network.”
- backpropagation
- An algorithm that updates/corrects the weights so as to bring the predicted outcome closer to the expected outcome.
- baseline model
- A simple benchmark model whose reasonable results act as a standard, or basis, for evaluating the performance of other models.
- batch size
- A hyperparameter that dictates the number of training samples used simultaneously in one iteration of model training. This hyperparameter mostly affects the speed of training. (See the compile-and-fit sketch after this glossary.)
- bias
- A type of error in prediction resulting from wrong assumptions during training. There are various types of bias that can lead to poor performance on new data. High-bias models do not capture the training dataset closely and result in underfitting.
- convolutional neural network
- A type of neural network specializing in image-related tasks, designed to capture local correlations/patterns in data that has spatial, temporal, or sequential order.
- convolution layer
- Applies a filter, or kernel (a set of weights), to the input to create a feature map. (See the Conv2D sketch after this glossary.)
- features
- The attributes of the data used for training and testing ML algorithms. Also simply referred to as the “input.”
- forward propagation
- Calculates the output based on the current weights (and biases).
- hidden layer
- A layer of (artificial) neurons between the input and output layers. These layers introduce non-linearity into the model.
- hyperparameters
- “Settings,” or variables, that are set before training and control the learning process of a model. Hyperparameters govern how the model learns and adjusts the parameters. Hyperparameters for a neural network include the number of layers and the number of neurons in each layer, the learning rate, the activation function, the optimizer, and many more.
- inference
- The use of a trained model to make predictions or generate outputs on new, unseen data; in other words, deploying the model. (See the prediction sketch after this glossary.)
- input layer
- The first layer of a neural network that receives the input data and passes it to the next layer.
- learning rate
- A hyperparameter that sets the step size, i.e., the amount by which the machine learning model adjusts its parameters at each step.
- loss function
- Measures how well a model’s (current) output/predictions compare with the actual target values (also referred to as ground-truth labels). Also known as a cost or error function.
- malware
- A type of software designed to harm the user; this can include damaging a device, spying, or stealing personal data.
- neural network
- See Artificial neural network.
- optimizer
- An algorithm used to iteratively improve the model during training. It adjusts the parameters (weights and biases) according to the loss. (See the compile-and-fit sketch after this glossary.)
- output layer
- The final layer in a neural network that generates the prediction or output.
- overfitting
- Occurs when a machine learning model becomes overly attuned to noise and specific details of the training data, leading to poor performance on new data. Models with high variance are prone to this phenomenon. (See the early-stopping sketch after this glossary.)
- parameters
- Internal variables that are adjusted during training; refers to the weights and biases.
- shape
- The shape of a tensor or vector refers to its dimensionality. The shape of a 1-D tensor (a vector) is its sequence length. Can also be referred to as “size.”
- underfitting
- Occurs when a machine learning model fails to capture the underlying pattern of the training data. This typically happens when the model lacks sufficient complexity or is not trained long enough.
- variance
- A type of error resulting from following the behavior of the training data (including its noise) too closely. High variance can prevent the model from generalizing well to new data; too much variance leads to overfitting.
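To make several of the terms above concrete, the sketches below use Keras, the library featured in this course. First, a minimal sketch of activation functions, assuming TensorFlow/Keras is installed: ReLU and sigmoid map the same inputs through different non-linearities.

```python
import numpy as np
from tensorflow import keras

x = np.array([-2.0, -0.5, 0.0, 1.5], dtype="float32")

# ReLU clips negative values to zero; sigmoid squashes values into (0, 1).
print(keras.activations.relu(x).numpy())     # [0.  0.  0.  1.5]
print(keras.activations.sigmoid(x).numpy())  # e.g., sigmoid(0.0) = 0.5
```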
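Next, a compile-and-fit sketch that ties together the input, hidden, and output layers, the loss function, the optimizer and its learning rate, and the batch size. The toy dataset, layer sizes, and hyperparameter values are illustrative assumptions, not recommendations.

```python
import numpy as np
from tensorflow import keras

# Toy binary-classification data: 200 samples with 8 features each,
# labeled 1 when the features sum to a positive value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")  # ground-truth labels

model = keras.Sequential([
    keras.layers.Input(shape=(8,)),               # input layer
    keras.layers.Dense(16, activation="relu"),    # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # output layer
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
    loss="binary_crossentropy",                           # loss function
    metrics=["accuracy"],
)

# Each batch of 32 samples goes through forward propagation, the loss is
# computed, and backpropagation updates the weights once per batch.
model.fit(X, y, batch_size=32, epochs=5, validation_split=0.2)
```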
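A convolution layer can be sketched the same way: a Conv2D layer slides small kernels (sets of weights) over the input and produces one feature map per filter. The image size and filter count here are arbitrary choices.

```python
import numpy as np
from tensorflow import keras

# A batch of one 28x28 single-channel "image" of random pixels.
image = np.random.rand(1, 28, 28, 1).astype("float32")

# Eight 3x3 kernels, each producing one feature map.
conv = keras.layers.Conv2D(filters=8, kernel_size=3, activation="relu")
feature_maps = conv(image)
print(feature_maps.shape)  # (1, 26, 26, 8): spatial size shrinks by kernel_size - 1
```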
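Overfitting is often diagnosed by watching the validation loss rise while the training loss keeps falling. One common guard, shown here continuing from the compile-and-fit sketch (it reuses model, X, and y), is Keras's EarlyStopping callback:

```python
# Stop training once the validation loss stops improving, and keep the
# weights from the best epoch, to avoid fitting noise in the training data.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)
model.fit(X, y, epochs=50, batch_size=32,
          validation_split=0.2, callbacks=[early_stop])
```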
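Finally, a prediction sketch of inference: calling the trained model from the sketches above on new, unseen samples.

```python
# The sigmoid output layer yields probabilities in (0, 1);
# a 0.5 threshold turns them into class labels.
X_new = rng.normal(size=(3, 8)).astype("float32")
probs = model.predict(X_new)
labels = (probs > 0.5).astype(int)
print(probs.ravel(), labels.ravel())
```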
Further Reading
Other Courses
- Howard, Jeremy. “Practical Deep Learning for Coders.” 2022. https://course.fast.ai/.
  A course that teaches how to apply deep learning and machine learning to practical problems. This course assumes some coding experience.
- Ng, Andrew. “AI for Everyone.” Coursera. https://www.coursera.org/learn/ai-for-everyone/.
  A good overview course on AI that is not too technical.
Building neural networks
- Belcic, Ivan, and Cole Stryker. “What is Learning Rate in Machine Learning.” IBM, 27 November 2024. https://www.ibm.com/think/topics/learning-rate.
  A more in-depth explanation of learning rates: why the learning rate is important and how to determine the optimal learning rate.
- Coursera Staff. “What Does Batch Size Mean in Deep Learning? An In-Depth Guide.” Coursera, 23 April 2025. https://www.coursera.org/articles/what-does-batch-size-mean-in-deep-learning.
  A deeper look at the batch size hyperparameter, its impact, and how to find the optimal batch size.
- “Pipelines, Mind Maps and Convolutional Neural Networks.” Towards Data Science. https://towardsdatascience.com/pipelines-mind-maps-and-convolutional-neural-networks-34bfc94db10c.
  This article describes how a data scientist used a “mind map” to keep track of each change made to a network (and the effect of each change), applying the discipline needed to reach a better-performing network more efficiently.
Optimizers
- Amananandrai. “10 famous Machine Learning Optimizers.” Dev.to, 3 May 2024. https://dev.to/amananandrai/10-famous-machine-learning-optimizers-1e22.
- Zohrevand, A., and Z. Imani. “An Empirical Study of the Performance of Different Optimizers in the Deep Neural Networks.” 2022 International Conference on Machine Vision and Image Processing (MVIP), Ahvaz, Iran, 2022, pp. 1-5. doi: 10.1109/MVIP53647.2022.9738743. https://ieeexplore.ieee.org/document/9738743.
- “Linear Regression: Gradient Descent.” Google Machine Learning Education: Machine Learning Crash Course, 8 November 2024. https://developers.google.com/machine-learning/crash-course/linear-regression/gradient-descent.
- Ruder, Sebastian. “An overview of gradient descent optimization algorithms.” Ruder.io, 2016. arXiv preprint arXiv:1609.04747. https://www.ruder.io/optimizing-gradient-descent/#adagrad.