Even data that has a schema might still be unstructured if it's not useful for your intended purpose. Here's an example. Imagine that you're selling products online. After a product is delivered, an email is sent out asking for feedback about the experience. Upon reviewing the first dozen or so emails, you begin to regret not sending a survey, because compiling the results from the text of each email is going to be impossible. For the purpose of identifying best practices and worst practices, the email text data is unstructured. However, you could use sentiment analysis to tag the emails and to group them. Let the machine learning do the reading for you and sort the emails into representative groups. Now you can look at the most positive and most negative emails to identify what behaviors to enforce or avoid. The machine learning process turns the unstructured data into structured data for your purposes.
Distinguish between one-off reasoning problems that are best solved by humans, big data problems that can be solved by crunching a lot of data, and machine learning problems that are best solved using modeling. I was once asked if a machine learning model could distinguish upside-down images from right-side-up images. Could you train a model to do that? I suppose so, but most modern cameras add metadata to the image header about the orientation of the camera at the time the image was taken. That data is accurate and easily accessed. So in this case, reading the metadata would be a better solution than training a machine learning model.
It's important to recognize that machine learning has two stages, training and inference. Sometimes the term prediction is used instead of inference, but inference is often the better word because prediction implies a future state. For example, recognizing the image of a cat is not really predicting it to be a cat; it's really inferring from pixel data that a cat is represented in the image.
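To make the two stages concrete, here is a minimal sketch in Python. It is not how any particular product does it: the nearest-centroid "model", the function names `train` and `infer`, and the tiny labeled data set are all made up for illustration. The point is only that training produces an artifact from labeled examples, and inference is a separate step that uses that artifact later, typically somewhere else.

```python
import json
import math


def train(examples):
    """Training stage: learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def infer(model, features):
    """Inference stage: return the label whose centroid is closest."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical labeled data: (weight in kg, height in cm) -> label.
labeled = [
    ([4.0, 25.0], "cat"),
    ([5.0, 23.0], "cat"),
    ([30.0, 60.0], "dog"),
    ([25.0, 55.0], "dog"),
]

model = train(labeled)

# Stand-in for putting the model into production: serialize the trained
# artifact, then load it wherever inference actually runs.
serialized = json.dumps(model)
deployed = json.loads(serialized)

print(infer(deployed, [4.5, 24.0]))  # cat
```

Notice that `infer` never sees the training data. It only needs the trained artifact, which is why the inference side has to be planned for separately.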
Data engineers often focus on training the model and minimize or forget about inference. It's not enough to build a model; you need to operationalize it. You need to put it into production so that it can run inferences.
If you have an ML question that refers to labels, it is a question about supervised learning. If the question is about regression or classification, it's using supervised machine learning.
A very common source of structured data for machine learning is your data warehouse. Unstructured data includes things like pictures, audio or video, and freeform text. People sometimes forget that structured data might make great training data because it's already pre-tagged. This example shows that birth data can be used to train a model to predict births. Another example I like to use is real estate data. There's a ton of information online about houses: how big they are, how many bedrooms they have, and so forth, and also the history of when houses sold and how much was paid for them. This is great training data for building a home pricing evaluation model. In other words, the goal would be to describe the house to the machine learning model and have it return a price of what the house might be worth.
If you don't define a metric or measure how well your model works, how will you know it's working sufficiently to be useful for your business purpose? You should be familiar with Mean Square Error, or MSE. Gradient descent is an important method to understand; it's how an ML problem is turned into a search problem. MSE and RMSE are measures of how well the model fits reality, that is, how well the model works to categorize or predict. RMSE is the root of the Mean Square Error. One reason for using RMSE rather than MSE is that RMSE is in the units of the measurement, making it easier to read and understand the significance of the value. Categorizing produces discrete values, and regression produces continuous values.
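The error measures and the idea of gradient descent as a search can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production recipe: the prices are made-up numbers in thousands of dollars, the function names `mse`, `rmse`, and `fit_line` are hypothetical, and the learning rate and step count were simply picked to work on this toy data.

```python
import math


def mse(actual, predicted):
    """Mean Square Error: the average of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root of the MSE: expressed in the same units as the measurement."""
    return math.sqrt(mse(actual, predicted))


def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    """Search for w and b in y = w*x + b by stepping downhill on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Move a small step in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up house prices, in thousands of dollars.
actual = [200.0, 300.0, 400.0]
predicted = [210.0, 290.0, 430.0]
print(mse(actual, predicted))   # in squared thousands of dollars, hard to read
print(rmse(actual, predicted))  # in thousands of dollars, easier to interpret

# Toy data that lies exactly on y = 2x + 1, so the search should
# end up very close to w = 2 and b = 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)
```

The loop in `fit_line` is the "search" part: it never solves for the answer directly, it just keeps adjusting `w` and `b` in whichever direction makes the MSE smaller.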
Classification and regression each use different methods. Is the result you're looking for like deciding whether an instance is in category A or category B? If so, it's a discrete value and therefore uses classification. Is the result you're looking for more like a number, such as the current value of a house? If so, it's a continuous value and therefore uses regression. If the question describes cross entropy, it's a classification ML problem.
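As a rough illustration of why cross entropy signals a classification problem, here is a small Python sketch of the two-category (binary) form. The function name `binary_cross_entropy` and the sample labels and probabilities are assumptions made up for this example. The labels are discrete (0 or 1), and the model's output is a probability of belonging to category 1.

```python
import math


def binary_cross_entropy(labels, probabilities):
    """Average cross entropy between 0/1 labels and predicted probabilities.

    Each probability must be strictly between 0 and 1.
    """
    total = 0.0
    for y, p in zip(labels, probabilities):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Discrete labels: 1 means category A, 0 means category B.
labels = [1, 0, 1]
confident_right = [0.9, 0.1, 0.8]  # high probability on the true category
confident_wrong = [0.1, 0.9, 0.2]  # high probability on the wrong category

good = binary_cross_entropy(labels, confident_right)
bad = binary_cross_entropy(labels, confident_wrong)
print(good < bad)  # True
```

The loss is small when the model puts high probability on the correct category and grows quickly when it is confidently wrong, which is the behavior you want when the answer is a category rather than a number.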