Welcome to our first lecture. We will start by looking at various deep learning applications, discuss the history of deep learning, compare deep learning to classical machine learning, look at common deep learning frameworks, and then give a brief introduction to Neon. To start, let's look at deep learning in action. Deep learning has a large and diverse range of applications, and I will share a few with you now. One of the most common applications is taking an image and providing the category or label of that image. For example, here the input is an image of a cat and the output is the label cat. The details of these and other networks that I will show you will be explained in a couple of lectures. My goal for now is to show you various applications.

In this example, the inputs to the model are an image, and the outputs are bounding boxes around the various objects of interest. Furthermore, for each bounding box, or for each object, there is a corresponding label. Deep networks can also be trained to generate sentences or captions that describe an image. In this example, the input to the model is an image, and the outputs are bounding boxes around the various objects of interest, with a description for each object. For example, this image has several bounding boxes around objects of interest, each one with a short description such as white laptop, or woman wearing a black shirt. This is another common deep learning model, used for semantic segmentation, where the input is an image and the outputs are labels for each pixel in the image. In these examples, the input is an image or text together with a query about part of the image or text, and the output is the response to that query. For example, on the right, the input is some text that describes a series of actions on an apple, and another input is the question, where is the apple? The model outputs the correct answer, kitchen.

In this example, the input is speech and the output is text corresponding to that speech. The speech signal is first represented as a spectrogram, which is a time-frequency representation of the speech signal, and the spectrogram becomes the input to the network. This model, on the other hand, does the opposite, which is a harder problem. The input is text, and the output is speech. In fact, for this text-to-speech system, five different deep learning models are used. Deep learning has also been used to learn actions. In this example, the input is an image with a score, and the output is an action that the network takes to maximize the score. This is an area known as reinforcement learning. In this example, a generative adversarial network is used to generate images using a short caption as the input to the network. For example, the input is, this small bird has a pink breast and crown, and black primaries and secondaries, and the output is a set of images on the right showing birds generated by the network which match the description.

In my organization, we have worked with our customers on a variety of projects using deep learning. Let me briefly highlight three of them, which will hopefully provide you with a better appreciation for how deep networks can be used by enterprises. We worked with Manulife, an insurance company in Canada, to develop a system for their financial analysts. Our system helps them consume the avalanche of financial information published every day.
The system processes new documents every day, highlights topics of interest to them, and also allows them to quickly discover the competitive landscape of a market sector. We have also worked with the agricultural company Blue River Technology, a company that uses computer vision and robotics to accurately measure and characterize crops, allowing farmers to make plant-by-plant decisions. This system needs to be embedded in their agricultural vehicles, and the task is difficult for humans due to occlusions and difficult lighting conditions. We developed a specialized deep learning model and then used compression techniques to deploy the model in a low-power embedded system that allows the farmers to classify each plant individually and treat it accordingly. We also worked with a customer doing oil exploration, specifically building a system for automatically detecting fault lines from seismic reflection data. The purple lines are the seismic reflection data, and the darkness of the green line indicates the likelihood of a fault. This is my last example for now. As you can see, the applications of deep learning are numerous and diverse. In this course, you will gain the foundational knowledge to apply deep learning algorithms to these and other domains.

So now, let's go over a few definitions. These are not set in stone, as different people have their own variations of what Artificial Intelligence, Machine Learning, and Deep Learning constitute, but these are common definitions that I use. Artificial intelligence, or AI, can be described as a program that can sense, reason, act, and adapt. Machine Learning is a subset of AI and can be described as a program that learns a function that maps features from the input data to some desired output, and whose performance improves with more data. Finally, deep learning is a subset of machine learning and can be described as a program that learns to extract features from the data, with increased complexity at each layer, and also learns a function that maps these features to some desired output. This will become clearer in a few slides.

Now that we have looked at deep learning in action, I will briefly talk about the history of the field of deep learning. Deep learning is an old field that started in the 1960s, or earlier depending on who you ask, when neural networks were used for binary classification. Their popularity declined in the 1970s when they could not deliver on the hype that the community had created. In the 1980s, the backpropagation algorithm was developed, facilitating the training of deep networks and reviving interest in this area. However, they were displaced in the 1990s by support vector machines, or SVMs, due to the nice theoretical properties of SVMs. In the past seven years, though, there has been a revived interest, as deep learning has vastly outperformed other techniques, in particular for vision, speech, and natural language processing tasks. This recent success is due to larger data sets, faster hardware, and smarter algorithms. In addition, the success of deep learning can also be attributed to the openness of the community. Researchers from competing companies are open about their algorithms and training methodologies, and are able to build on each other's knowledge. What companies tend not to share are their trained models and data sets. However, academics are usually more open and will often share their trained models and make their data sets publicly available.
Here are a few quotes on why machine learning is so important today. Bill Gates said, "If you invent a breakthrough in artificial intelligence, so machines can learn, that is worth ten Microsofts." Sundar Pichai, the CEO of Google, said, "Machine learning is a core, transformative way by which we're re-thinking how we're doing everything." Finally, Tom Dietterich said, "Computers are already more intelligent than humans on many tasks, including remembering things, doing arithmetic, doing calculus, trading stocks and landing aircraft."

Despite all the promise of AI, there are some people, however, who fear that AI will bring about the rise of robots that will take over the world. This hype comes from Hollywood movies, from reporters hyping a story to get viewers, and from scientists wanting attention for their work. The reality, to paraphrase Andrew Ng, is that worrying about evil robots is like worrying about the overpopulation of Mars. Instead, as a society we should worry about the need to re-educate people whose jobs will disappear as AI advances. For example, autonomous vehicles will likely make transportation safer, more comfortable, and more efficient, but they will also reduce the demand for jobs in the transportation sector. Other sectors will similarly be affected, and as a society, we will be better off by preparing for this economic restructuring. Okay, enough with the public policy discussion.

Now, let's look at the types of machine learning algorithms and the differences and similarities between classical machine learning and deep learning. Machine learning, which includes deep learning, can be broadly split into four categories: supervised, unsupervised, semi-supervised, and reinforcement learning. While there is active research in all areas, supervised learning is the most common type used in industry. Supervised learning maps input data to labels and therefore requires labeled data to train. Unsupervised learning attempts to discover patterns in unlabeled data. One of the reasons for the lack of wide adoption of unsupervised learning is that it is often easier to spend a week or two labeling data than months of research to get your unsupervised algorithm to deliver. Semi-supervised learning uses a combination of both labeled and unlabeled data to train the model. Reinforcement learning is used to teach an agent to perform certain actions based on rewards; however, it usually takes time to get the reward for a given action. In this course, I will refer to machine learning algorithms that are not deep learning as classical machine learning algorithms.

Let's explore the differences between classical machine learning and deep learning, using the task of image classification. In classical machine learning, you have an image with N by N pixels, and you engineer a set of K features, where K is typically much less than the dimension of the image, N squared. These features are selected by a domain expert. For example, to design a car classifier, an expert may decide to select the ratio of the height to the length, or the number of circular regions in the image. I am making up these features; in practice, a computer vision expert would use more advanced features. Each image is mapped to a K-dimensional feature representation. Then you apply your favorite algorithm, such as an SVM or logistic regression, to learn to associate these patterns of features with the label vehicle. I'm showing various machine learning algorithms, and it's okay if you don't know what they are. A toy sketch of this classical pipeline is shown below.
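To make the classical pipeline concrete, here is a minimal sketch, not the lecture's actual code. The two features stand in for the made-up features above (a height-to-length ratio and the amount of "circular" structure), the toy images and labels are random placeholders, and scikit-learn's SVM is just one possible choice of classical classifier.

```python
import numpy as np
from sklearn.svm import SVC  # logistic regression or another classical model would also work

def extract_features(image):
    """Map an N x N grayscale image to a K = 2 dimensional feature vector.

    Stand-ins for the lecture's made-up features: the bounding-box
    aspect ratio of the bright pixels, and the fraction of the image
    they cover. A real system would use more advanced features.
    """
    mask = image > image.mean()               # crude foreground segmentation
    rows, cols = np.where(mask)
    height = rows.max() - rows.min() + 1
    length = cols.max() - cols.min() + 1
    return np.array([height / length, mask.mean()])

# Toy data: random "images" with random labels, only to show the shape of the pipeline.
rng = np.random.default_rng(0)
images = rng.random((20, 32, 32))
labels = rng.integers(0, 2, size=20)          # e.g. 1 = vehicle, 0 = not a vehicle

X = np.stack([extract_features(img) for img in images])   # K-dimensional representations
classifier = SVC(kernel="rbf").fit(X, labels)              # learn decision boundaries in feature space
print(classifier.predict(extract_features(images[0]).reshape(1, -1)))
```

The key point is the division of labor: a human expert writes extract_features, and the learning algorithm only operates on the resulting K-dimensional vectors.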
The K features that are selected should be unique to their particular object. In this example, we have selected two features, that is, K equals two, for each image from five different classes: vehicles, animals, faces, fruits, and chairs. Each class is clustered together and separated from the other classes. That is, these two features that the expert selected are very good features. This is a made-up example; in practice, two features are typically not sufficient. Supervised learning, using the machine learning algorithms listed here, allows for the determination of decision boundaries. However, in practice, choosing good features is very challenging, and it may be better to learn these features. This is one of the benefits of deep learning.

Deep learning is end-to-end, meaning that we pass the N by N pixels directly to a deep network and we provide the desired output labels. The data is passed through a series of linear and non-linear transformations that map the data to a distribution over all the labels. The weights of these transformations are learned during the training phase. These weights can also be called feature extractors. That is, we can view a deep network as a series of feature extractors that extract features of higher complexity the deeper in the network you are. The network learns to extract good features without the need for a domain expert. This marks a conceptual shift in thinking, from how do we engineer the best features, to how do we engineer a model that will find the best features automatically?

Now that we have compared deep learning to classical machine learning, let's discuss some common deep learning frameworks that are used to construct deep learning models. There are a number of common frameworks; you can think of these frameworks as libraries for deep learning that facilitate designing deep learning models. They hide the low-level implementations from the deep learning practitioner. Frameworks such as Neon, Caffe2, and Caffe are well suited for industry due to their stability, scalability, and speed. Frameworks such as Theano, PyTorch, and Torch7 are well suited for research applications due to their flexibility and ability to be debugged. Frameworks such as TensorFlow and MXNet attempt to appeal to both the industry and research domains. There are other popular frameworks such as CNTK, PaddlePaddle, Chainer, Keras, Lasagne, BigDL, and DL4J. While in this course we will focus on Neon, the concepts are general and apply to the various frameworks.

Neon is an open source framework available on GitHub. It is designed for speed in both design and performance. Benchmarks such as convnet-benchmarks show that Neon is the fastest framework for a variety of workloads, and Intel is committed to Neon's performance leadership across all the main hardware platforms. Neon is built on top of Python and optimized at the low level with C++, CUDA C++, and assembly. Python is fast to prototype in and allows Neon users to take advantage of the rich ecosystem of existing packages.

Training a model only requires the six steps shown here. Once you import the right libraries, the lines of code shown are sufficient to train a simple model. First, specify the backend, for example, CPU. Second, load the data, for example, the MNIST data set; MNIST is a popular data set with images of the digits zero through nine. Third, specify the model architecture, in this case a two-layer network with each layer having ten units. Fourth, define the training parameters. Fifth, train your model. And sixth, evaluate the model. A minimal sketch of these six steps follows.
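The sketch below follows Neon's own MNIST multilayer perceptron example. It is only an illustration of the six steps; the module paths and argument names follow a recent Neon release and may differ slightly in other versions, and the two ten-unit layers simply mirror the architecture described above.

```python
from neon.backends import gen_backend
from neon.data import MNIST
from neon.initializers import Gaussian
from neon.layers import Affine, GeneralizedCost
from neon.models import Model
from neon.optimizers import GradientDescentMomentum
from neon.transforms import Rectlin, Softmax, CrossEntropyMulti, Misclassification
from neon.callbacks.callbacks import Callbacks

# 1. Specify the backend (CPU here) and the minibatch size.
be = gen_backend(backend='cpu', batch_size=128)

# 2. Load the data: MNIST images of the digits zero through nine.
mnist = MNIST(path='data/')
train_set = mnist.train_iter
valid_set = mnist.valid_iter

# 3. Specify the model architecture: two layers of ten units each,
#    with rectified linear units and a softmax output.
init = Gaussian(loc=0.0, scale=0.01)
layers = [Affine(nout=10, init=init, activation=Rectlin()),
          Affine(nout=10, init=init, activation=Softmax())]
mlp = Model(layers=layers)

# 4. Define the training parameters: cost function and optimizer.
cost = GeneralizedCost(costfunc=CrossEntropyMulti())
optimizer = GradientDescentMomentum(learning_rate=0.1, momentum_coef=0.9)

# 5. Train the model.
callbacks = Callbacks(mlp, eval_set=valid_set)
mlp.fit(train_set, optimizer=optimizer, num_epochs=10, cost=cost, callbacks=callbacks)

# 6. Evaluate the model on held-out data.
error = mlp.eval(valid_set, metric=Misclassification())
print('Misclassification error = %.1f%%' % (error * 100))
```

Notice how little boilerplate is involved: the backend, data iterator, layers, cost, and optimizer are each declared in a line or two, and fit and eval do the rest.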
In the next lecture, we'll study the various units, layers, and hyperparameters that need to be defined. A common question is, how do you choose a model? Deep learning practitioners often rely on models published by other researchers as a basis. That is, you find an existing model and hyperparameters that work on a similar problem and adapt them to your task. Here are the various functions supported by Neon, and the list continues to grow. Neon supports various backends, can easily load various datasets, offers various techniques to initialize the model weights, and provides various optimizers to train the weights. A model in Neon can use various activations and layers, as well as various cost functions and metrics to evaluate the results. I encourage you to download and look through the provided examples. In the first exercise, you will become familiar with the MNIST dataset, the Gaussian initializer, and the gradient descent with momentum optimizer, and you will use rectified linear unit activations, linear layers, the multiclass cross-entropy cost, and the accuracy metric.

Today, we started by looking at some applications of deep learning, including image and speech recognition. We then discussed the history of deep learning and how it compares to classical machine learning. Finally, we looked at some common deep learning frameworks and gave a brief introduction to Neon. In the next lecture, we will look at deep learning models through a more detailed lens by discussing the architecture and hyperparameters of a deep learning model.