An Introduction to Deep Learning

Syed Shah, March 23, 2020, 3:03 pm

Author: Johanna Pingel, product marketing manager, MathWorks

Deep learning is getting lots of attention lately, and for good reason. It’s making a big impact in areas such as computer vision and natural language processing. It’s a key technology behind driverless cars and behind voice control in consumer devices such as phones and hands-free speakers.

Let’s explore three key concepts within deep learning:

What is deep learning?
What is the difference between deep learning and machine learning?
How to get started training a deep learning model

What is deep learning?

In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks.

The term “deep” usually refers to the number of hidden layers in the neural network. Traditional neural networks contain only two or three hidden layers, while deep networks can have as many as 150. One of the most popular types of deep neural network is the convolutional neural network (CNN or ConvNet). A CNN convolves learned features with input data and uses 2D convolutional layers, which makes this architecture well suited to processing 2D data such as images.
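To make the convolution step concrete, here is a minimal sketch in plain Python of what a single 2D convolutional filter does to an image. It is an illustration only: the function name `conv2d_valid`, the toy 4x4 image, and the hand-picked kernel values are assumptions made for this example. In a real CNN the kernel weights are learned during training, and a framework would run the operation far more efficiently.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning libraries, this computes cross-correlation,
    meaning the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
# In a CNN, these weights would be learned rather than chosen by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The resulting feature map is largest in the column where the image switches from dark to bright, which is the sense in which a filter "detects" a feature. A convolutional layer applies many such filters at once, and deeper layers combine their outputs into more complex features.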

Using an image example, a fully trained deep learning model will be able to automatically identify objects in images, even if it has never seen those exact images before. Ever wondered how certain websites can identify specific people in photos that were just uploaded? That’s deep learning at work.

Many of the techniques used in deep learning today have been around for decades. For example, deep learning has been used to recognize handwritten postal codes in the mail service since the 1990s.

So why has deep learning surged in popularity recently?

The main reason is accuracy: deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Two main factors have made these advances possible:

Deep learning requires large amounts of labeled data. For example, driverless car development requires millions of images and thousands of hours of video. Labeled data sets of this size have only recently become widely available.
Deep learning requires substantial computing power. High-performance GPUs have a parallel architecture that is efficient for deep learning. When combined with clusters or cloud computing, this enables development teams to reduce training time for a deep learning network from weeks to hours or less.

What is the difference between deep learning and machine learning?

Deep learning and machine learning both offer ways to train models and classify data. Let’s compare these two approaches to see what scenarios determine the use of each.

Using a standard machine learning approach, we would need to manually select the relevant features of an image, such as edges or corners, to train the machine learning model. The model then references these features when analyzing and classifying new objects.
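As a rough illustration of this manual workflow, the sketch below hand-crafts two edge-based features and feeds them to a very simple classifier. Everything here is an assumption made for the example: the function names, the striped toy images, and the nearest-centroid rule were chosen for brevity and are not a recommended pipeline.

```python
def edge_features(image):
    """Hand-crafted features: total intensity change across columns and across rows."""
    h, w = len(image), len(image[0])
    change_across_columns = sum(
        abs(image[r][c + 1] - image[r][c]) for r in range(h) for c in range(w - 1)
    )
    change_across_rows = sum(
        abs(image[r + 1][c] - image[r][c]) for r in range(h - 1) for c in range(w)
    )
    return (change_across_columns, change_across_rows)


def train_centroids(samples):
    """Average the feature vectors of each class. `samples` is a list of (image, label)."""
    sums, counts = {}, {}
    for image, label in samples:
        features = edge_features(image)
        running = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            running[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def classify(image, centroids):
    """Assign the label whose average feature vector is closest to this image's features."""
    features = edge_features(image)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(features, centroids[label])),
    )


# Toy training set: images with vertical stripes versus horizontal stripes.
vertical_a = [[0, 1, 0, 1] for _ in range(4)]
vertical_b = [[1, 0, 1, 0] for _ in range(4)]
horizontal_a = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
horizontal_b = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]

centroids = train_centroids([
    (vertical_a, "vertical"),
    (vertical_b, "vertical"),
    (horizontal_a, "horizontal"),
    (horizontal_b, "horizontal"),
])

# A new image with a single vertical edge, not seen during training.
new_image = [[0, 0, 1, 1] for _ in range(4)]
prediction = classify(new_image, centroids)
print(edge_features(new_image), prediction)
```

The point to notice is that we decided which features matter before any training happened. The classifier never sees the pixels, only the two numbers we chose to compute from them.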

With a deep learning workflow, relevant features are automatically extracted from images. In addition, deep learning performs “end-to-end learning” – where a network is given raw data and a task to perform, such as classification, and it learns how to do this automatically.

Another key difference is that deep learning algorithms scale with data, whereas shallow learning converges. Shallow learning refers to machine learning methods that plateau at a certain level of performance as more examples and training data are added, while deep networks often continue to improve as the amount of data grows.

When choosing between machine learning and deep learning, we should ask ourselves whether we have a high-performance GPU and lots of labeled data. If we lack either of these, we’ll have better luck using machine learning over deep learning. This is because deep learning is generally more complex, so we need at least a few thousand images to get reliable results. We will also need a high-performance GPU so the model spends less time analyzing all those images.

If we select machine learning, we have the option to try many different classifiers. We might also know which features to extract to produce the best results. Plus, with machine learning, we have the flexibility to choose a combination of approaches: we can try different classifiers and features to see which arrangement works best for the data.
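One way to picture this trial-and-error process is a small comparison loop. The sketch below scores two simple classifiers on the same data using leave-one-out evaluation. The two classifiers, the one-feature toy data set, and all function names are assumptions chosen for illustration. In practice we would use a machine learning library and real data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_centroid(train, x):
    """Predict the label whose average training feature vector is closest to x."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(points) for col in zip(*points)]
        for label, points in groups.items()
    }
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


def nearest_neighbor(train, x):
    """Predict the label of the single closest training example."""
    _, label = min(train, key=lambda item: sq_dist(x, item[0]))
    return label


def leave_one_out_accuracy(classifier, data):
    """Hold out each example in turn, train on the rest, and count correct predictions."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if classifier(train, x) == y:
            correct += 1
    return correct / len(data)


# Hypothetical one-feature data set: class "a" forms two groups on either side of class "b".
data = [
    ((0.0,), "a"), ((1.0,), "a"), ((9.0,), "a"), ((10.0,), "a"),
    ((4.0,), "b"), ((5.0,), "b"), ((6.5,), "b"),
]

classifiers = {
    "nearest centroid": nearest_centroid,
    "1-nearest neighbor": nearest_neighbor,
}
scores = {name: leave_one_out_accuracy(fn, data) for name, fn in classifiers.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

On this toy data the nearest-neighbor rule does better, because class "a" is split into two groups whose average falls in the middle of class "b". A different data set or a different choice of features could reverse the ranking, which is why it pays to try several arrangements.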

So, in general, deep learning is more computationally intensive, while machine learning techniques are often simpler to apply.