Spark includes an API named Spark MLLib (often referred to as Spark ML), which you can use to create machine learning solutions. Machine learning is a technique in which you train a predictive model using a large volume of data so that when new data is submitted to the model it can predict unknown values. The most common types of machine learning are supervised learning and unsupervised learning. In a supervised learning scenario, you start with a large volume of data that includes both features (categorical and numeric values that describe characteristics of the entity you’re trying to predict something about) and labels (the value your model will predict. Training the model involves applying a statistical algorithm that fits the features to the labels. Because your initial data includes known values for the labels, you can train the model and test its accuracy with these known label values – giving you confidence that the model will work accurately with new data for which the label values aren’t known. Unsupervised learning is a technique in which there are no known label values, and the model is trained to group (or cluster) similar entities together based on their features.
In this lab, we’ll focus on supervised learning; and specifically a type of machine learning called classification in which you train a model to identify which category, or class an entity belongs to. You will train a classifier to use features of flights that are enroute to an airport, and predict whether they will be late or on-time.
- Understand creation of an Azure Databricks Workspace and cluster using the Azure Portal
- Learn to work with Spark MLLib on Azure Databricks
Accessing and ending the Lab Environment
SkillMeUp Real Time Labs use a virtual machine for all lab exercises. This allows you access to all of the tools and software needed to complete the lab without requiring you to install anything on your local computer.
The virtual machine may take several minutes to fully provision due to software installation and supporting files to copy.
After you have completed all of the lab exercises ensure you click the End Lab button to get access to your certification of completion.
Accessing Microsoft Azure
Launch a browser from the virtual machine and navigate to the URL below. Your Azure Credentials are available by clicking the Cloud Icon at the top of the Lab Player.