Introduction to Machine Learning & Deep Learning
Overview
Teaching: 0 min
Exercises: 0 min
Questions
What is deep learning?
What are the top deep learning algorithms?
What are the differences between Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL)?
Objectives
Learning general concepts of machine learning, deep learning and their relationship with artificial intelligence.
Diving deeper into deep learning concepts and algorithms and the tools necessary.
Artificial Intelligence, Machine Learning, and Deep Learning
Artificial intelligence (AI) has recently gained a lot of popularity; many AI-powered technologies have been deployed and have tremendously improved our lives. In this episode, we will attempt to explain what AI is and clarify the difference between AI, machine learning (ML), and deep learning (DL). These three terms are closely related, but they are neither identical nor interchangeable.
Artificial Intelligence
For many years, scientists dreamed of enabling computers to think for themselves. Deep learning, with its structure loosely inspired by the human brain, has enabled machines to learn and to apply what they learn to the tasks they are given. Like other machine learning methods, deep learning algorithms must first be trained before they can recognize patterns and carry out the tasks assigned to them.
The English Oxford Dictionary gives the following definition for AI:
“The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”
Machine Learning
Machine learning (ML) is a subset of AI methods that provides computer systems the ability to automatically learn and improve from experience without being explicitly programmed. In ML, there are many different algorithms that help to solve problems.
Deep Learning
Deep learning (DL) is a specific branch of ML that aims to achieve a high level of performance (approaching, or even exceeding, human intelligence capabilities) based on a specific model architecture known as artificial neural networks (often simply known as neural networks [NN]). As we shall see later in this lesson, NN models are strongly inspired by the neural structure of the human brain and how it works.
DL models are generally more sophisticated and more powerful than traditional ML algorithms. They can perform intelligent tasks that are difficult or impractical with traditional ML, such as accurate recognition of objects in images and videos, handwriting and speech recognition, and interacting with humans (e.g. chatbots).
Relationships between Artificial Intelligence, Machine Learning and Deep Learning
(TODO) Put in the 3 concentric circles here.
Contrast and Comparison between Machine Learning and Deep Learning
Machine learning key features
- Machine learning uses algorithms to parse data, learn from that data, and make informed decisions based on what it has learned.
- Can be trained on smaller training datasets.
- Takes less time to train.
- Training can be conducted on a CPU.
- Training costs can be low.
- The output is in numerical form for classification and scoring applications.
- Output accuracy varies greatly (e.g. 50% < accuracy < 99%).
- Limited capability for hyperparameter tuning.
- Feature selection is necessary.
Deep learning key features
- Deep learning structures algorithms in layers to create an “artificial neural network” that can learn and make intelligent decisions on its own.
- Requires large datasets for training.
- Takes a longer time to train.
- Training is typically performed on a GPU so that it finishes in a reasonable time.
- Training costs are very high.
- The output can be in any form, including free form elements such as free text and sound.
- The output accuracy can exceed 99%.
- Hyperparameters can be tuned in various ways.
- There is no need for feature selection. (Source)
Overview of Machine Learning
The machine learning process can be divided into five components:
1) The input is data generated and collected in real life, such as voices, images, etc.;
2) Features are attributes of the data that are used for training and testing ML algorithms; they are usually obtained by feature extraction;
3) The output is any value(s) one wants to obtain from the input;
4) Learning algorithms, or models, capture the patterns in the features by adjusting a set of parameters through training. Briefly, machine learning models are essentially mathematical functions that map the features (inputs) to the outputs;
5) Inference is carried out by applying the model, with the parameters determined during training, to new inputs.
Figure: Overview of machine learning.
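The five components above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the data values are made up, the two features (message length and number of links) are hypothetical, and the nearest-centroid rule is a deliberately simple stand-in for a real learning algorithm.

```python
# Hypothetical toy data (made up for illustration).
# Features: (message length, number of links); labels: 0 = normal, 1 = suspicious.
features = [(120.0, 0.0), (95.0, 1.0), (110.0, 0.0),
            (30.0, 4.0), (25.0, 5.0), (40.0, 3.0)]
labels = [0, 0, 0, 1, 1, 1]


def train(features, labels):
    """Learning: the model parameters are one centroid (mean feature vector) per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Inference: return the class whose centroid is closest to the new sample x."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda y: squared_distance(centroids[y]))


centroids = train(features, labels)        # parameters determined by training
print(predict(centroids, (100.0, 0.0)))    # long message, no links
print(predict(centroids, (35.0, 4.0)))     # short message, many links
```

Here `train` plays the role of the learning algorithm (step 4) and `predict` plays the role of inference (step 5); the names are illustrative and do not come from any library.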
Applications of Machine Learning
Machine learning has various applications, including:
- Image recognition (e.g. facial recognition for Facebook or Google Photos or the face unlock feature on smartphones)
- Movie or product recommendation (e.g. app store or play store application recommendations)
- Fraud detection
- Medical diagnosis
- Autonomous driving
Emerging applications include self-driving cars, health monitoring, and many more.
In cybersecurity areas, machine learning has been used or researched for these purposes:
- Detection of malware, both existing and novel types
- Detection of malicious URLs
- Detection of network attacks, intrusions or suspicious activities
Discussion Topic
Discuss with your partner an example for each application listed above, and consider what the input and output could be for each.
Overview of Deep Learning
Complex tasks require a complex AI architecture to address them. Deep learning algorithms are well suited to such tasks because of their layered structure.
Deep learning enables a machine to adapt to new data and adjust its behavior as needed. This versatility makes it possible for a machine, through its hidden-layer architecture, to efficiently analyse problems that would otherwise be far too complex to address through manual programming. Deep learning is particularly advantageous when handling huge volumes of unstructured data, because it learns useful features directly from the raw data instead of relying on hand-crafted ones. Note that supervised tasks such as classification still require labelled examples.
Deep neural networks consist of multiple layers of interconnected nodes, each building upon the previous layer to refine and optimize the resulting prediction or categorization. This progression of computations through the network is called forward propagation. The input and output layers of a deep neural network are called visible layers. The input layer is where the deep learning model ingests the data for processing, and the output layer is where the final prediction or classification is made.
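To make forward propagation concrete, here is a minimal sketch in plain Python of a network with two inputs, one hidden layer of three nodes, and a single output node. The weights and biases are hand-picked, hypothetical values (a real network would learn them), and the choice of ReLU and sigmoid activations is an assumption made for illustration.

```python
import math


def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Sigmoid activation: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases):
    """One fully connected layer: each node computes a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Hypothetical, hand-picked parameters (not trained).
W_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]   # 3 hidden nodes, 2 inputs each
b_hidden = [0.0, 0.1, -0.1]
W_output = [[0.7, -0.5, 0.2]]                        # 1 output node, 3 hidden inputs
b_output = [0.05]


def forward(x):
    """Forward propagation: input layer -> hidden layer -> output layer."""
    hidden = relu(dense(x, W_hidden, b_hidden))
    output = dense(hidden, W_output, b_output)
    return sigmoid(output[0])


print(forward([1.0, 2.0]))   # a value between 0 and 1, usable as a class probability
```

Each call to `dense` corresponds to one layer building on the output of the previous one, which is exactly the layer-by-layer progression described above.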
Another process, called backpropagation, calculates how much each weight and bias contributed to the error in the prediction by moving backwards through the layers; an optimization algorithm such as gradient descent then uses this information to adjust the weights and biases in order to train the model. Together, forward propagation and backpropagation allow a neural network to make predictions and correct for any errors accordingly. Over time, the algorithm becomes gradually more accurate. (Source)
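The training loop can be sketched with the simplest possible "network": a single node with one weight and one bias. This is a simplification, since with only one layer the backward pass reduces to a single chain-rule step; the training data (generated from y = 2x + 1), the learning rate, and the number of steps are all assumptions chosen for illustration.

```python
# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0          # parameters start from an arbitrary guess
learning_rate = 0.05     # assumed step size for gradient descent


def mean_squared_error(w, b):
    """Average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


initial_loss = mean_squared_error(w, b)

for step in range(2000):
    # Forward pass: compute predictions and their errors.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # Backward pass: gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2.0 * sum(errors) / len(xs)
    # Gradient descent update: move the parameters against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mean_squared_error(w, b)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"loss: {initial_loss:.3f} -> {final_loss:.6f}")
```

In a deep network the same idea applies, except that the gradients for earlier layers are obtained by propagating the error backwards through the later layers.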
The figure below is an example of a deep learning network.
Figure: Overview of deep learning.
There are many variants of neural network algorithms, each serving specific goals. The following list contains some of the more popular ones:
- Fully connected neural networks, also known as multilayer perceptrons (MLPs)
- Convolutional Neural Networks (CNNs)
- Recurrent Neural Networks (RNNs)
- Deep Belief Networks (DBNs)
We will return to these names, what they are in a nutshell, and their applications in the last episode of this lesson.
Deep learning tools and services
There are many deep learning tools and services currently available. Below is a list of commonly used tools for deep learning:
- TensorFlow
- Keras
- H2O.ai
- Caffe
- DeepLearningKit
- Torch
- Theano
Introduction to TensorFlow and Keras
In this lesson, we will provide a hands-on introduction to NN using two closely related Python libraries: TensorFlow and Keras.
TensorFlow is a powerful computation framework that provides a convenient way to express the low-level mathematics of neural networks, which involves series of tensors (briefly, multidimensional arrays that generalize matrices). In this training module, we will leverage the capabilities of TensorFlow via a higher-level API named Keras.
Keras is a high-level software framework used to perform training and inference for neural networks. Originally, Keras was a stand-alone project (https://keras.io) that provided a uniform, high-level, user-friendly API for various lower-level neural network libraries: TensorFlow, Microsoft Cognitive Toolkit (CNTK), and Theano. (Theano is now unmaintained, so it is not recommended for new projects.) Keras requires the use of a low-level neural network library (TensorFlow, Theano or CNTK), which provides the actual computational capabilities. We will use only the TensorFlow backend in this workshop.
With Keras, construction of a neural network has become tremendously easier. As we will soon see, Keras users need not be overburdened by the different conventions and bookkeeping rules that are encountered when using lower-level libraries. They can instead focus on the structure of the network, providing almost 1:1 mapping to human-friendly neural network diagrams.
The popularity and adoption rate of Keras were so high that TensorFlow developers announced in January 2017 that they had adopted Keras as the high-level API for TensorFlow. As a result, there has been an integrated Keras API since TensorFlow version 1.5.
Figure: Keras-TensorFlow Chart.
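As an illustration of how Keras code mirrors a network diagram, the following minimal sketch defines a small fully connected network. It assumes TensorFlow 2.x is installed; the layer sizes (4 inputs, 8 hidden nodes, 3 output classes), the activations, the optimizer, and the loss are arbitrary choices made for this example, not the model used later in the lesson.

```python
import tensorflow as tf

# Each line in the list corresponds to one layer in the network diagram.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                       # input layer: 4 features
    tf.keras.layers.Dense(8, activation="relu"),      # hidden layer: 8 nodes
    tf.keras.layers.Dense(3, activation="softmax"),   # output layer: 3 classes
])

# Choose how the model is trained: optimizer, loss function, and reported metric.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Print a layer-by-layer summary, including the number of trainable parameters.
model.summary()
```

The model is only defined and compiled here; training it would additionally require a call to `model.fit` with training data.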
Next section
Next, we will introduce the structure of deep learning networks. We will focus on the popular and powerful family of models known as artificial neural networks (ANNs), which in their multi-layer form are more often called deep neural networks (DNNs).
(TODO) Explain the roadmap of the episodes, incl the “problem” we want to answer (ie. smartphone app classification), and activities we will be doing.
Key Points
Deep learning is an advanced form of machine learning that uses neural networks, loosely modeled on the neurons in the human brain, to learn from data and to refine the model as it learns more.