
Lab: Implementing Structured Streaming with Azure Databricks

Overview
Spark Structured Streaming lets you use the DataFrame API to read and process an unbounded stream of data. This kind of processing is used in real-time scenarios to aggregate data over temporal intervals, or windows. Spark can process streaming data from a wide range of sources, including Azure Event Hubs and Kafka. In this lab, you will run a Spark job that continually processes a real-time stream of data.
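
To make the idea of aggregating over temporal windows concrete, here is a minimal plain-Python sketch of a tumbling-window count. It does not use Spark; it only illustrates the grouping that a windowed streaming aggregation performs. The function name `tumbling_window_counts`, the one-minute window, and the sample events are assumptions made for illustration and are not part of the lab.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def tumbling_window_counts(events, window=timedelta(minutes=1)):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp, key) pairs. Hypothetical helper,
    shown only to illustrate windowed aggregation.
    """
    counts = Counter()
    for timestamp, key in events:
        # Floor the timestamp to the start of the window it falls into.
        windows_since_epoch = (timestamp - EPOCH) // window
        window_start = EPOCH + windows_since_epoch * window
        counts[(window_start, key)] += 1
    return dict(counts)


# Sample events (made up): three fall in the 12:00 window, one in 12:01.
events = [
    (datetime(2024, 1, 1, 12, 0, 5), "ok"),
    (datetime(2024, 1, 1, 12, 0, 40), "error"),
    (datetime(2024, 1, 1, 12, 0, 55), "ok"),
    (datetime(2024, 1, 1, 12, 1, 10), "ok"),
]
result = tumbling_window_counts(events)
```

A streaming engine performs the same grouping incrementally as records arrive, rather than over a finished list.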

Details
  • Estimated time required to complete: 1 hour
  • You will have access to this environment for 2 hours
  • Learning Credits Required: 10
Who this lab is designed for
  • Data Professionals
  • Data Engineers
  • Data Scientists

Learning Objectives

  • Understand how to create an Azure Databricks workspace and cluster using the Azure portal
  • Learn to work with Spark Structured Streaming on Azure Databricks

Exercises

Exercise 1: Environment Setup

Accessing and ending the Lab Environment

SkillMeUp Real Time Labs use a virtual machine for all lab exercises. This allows you access to all of the tools and software needed to complete the lab without requiring you to install anything on your local computer.

The virtual machine may take several minutes to fully provision while software is installed and supporting files are copied.

After you have completed all of the lab exercises, click the End Lab button to get access to your certificate of completion.

Accessing Microsoft Azure

Launch a browser from the virtual machine and navigate to the URL below. Your Azure credentials are available by clicking the Cloud icon at the top of the Lab Player.

https://portal.azure.com

Exercise 2: Deploying the Databricks Environment

In this exercise, you will provision a Databricks workspace, an Azure storage account, and a Spark cluster.

Task 1: Provision a Databricks Workspace

Exercise 3: Processing Streaming Data with Spark Structured Streaming
In this exercise, you will process a stream of data that simulates status information generated by Internet of Things (IoT) devices. The data will be written to a blob storage container where it can be accessed by your Spark cluster.
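
As a rough sketch of what such a job might look like in PySpark, the function below reads JSON files from a storage folder as a stream and counts status values per one-minute window. It is not the lab's actual notebook code. The field names (`device`, `status`, `time`), the function name `build_status_counts`, and the JSON file format are all assumptions, since the page does not describe the data layout.

```python
def build_status_counts(spark, input_path):
    """Return a streaming DataFrame of status counts per one-minute window.

    Hypothetical helper. PySpark is imported inside the function so that this
    module can still be loaded on a machine where PySpark is not installed.
    """
    from pyspark.sql.functions import col, window
    from pyspark.sql.types import (
        StringType,
        StructField,
        StructType,
        TimestampType,
    )

    # Assumed schema for the simulated device messages (not from the lab).
    schema = StructType([
        StructField("device", StringType()),
        StructField("status", StringType()),
        StructField("time", TimestampType()),
    ])

    # File-based streaming sources require an explicit schema.
    stream = spark.readStream.schema(schema).json(input_path)

    # Group records into one-minute tumbling windows and count each status.
    return stream.groupBy(window(col("time"), "1 minute"), col("status")).count()


# Usage in a Databricks notebook, where `spark` is predefined. The path and
# query name are placeholders, not values from the lab:
#
#   counts = build_status_counts(spark, "<path-to-your-blob-container-folder>")
#   query = (counts.writeStream
#            .outputMode("complete")
#            .format("memory")
#            .queryName("status_counts")
#            .start())
```

The memory sink used in the usage comment is convenient for exploring results in a notebook; a production job would normally write to a durable sink instead.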

Benefits
Real Time Labs allow you to learn technology in an isolated environment without the hassle or cost of setting up a dedicated learning environment.
