Apache Spark Streaming with Scala Training Course

319 Phaya Thai Road, Pathum Wan, Pathum Wan District, Bangkok, Thailand

Description

Scala is a condensed version of Java for large scale functional and object-oriented programming. Apache Spark Streaming is an extended component of the Spark API for processing big data sets as real-time streams. Together, Spark Streaming and Scala enable the streaming of big data.

This instructor-led, live training (online or onsite) is aimed at software engineers who wish to stream big data with Spark Streaming and Scala.

By the end of this training, participants will be able to:

  • Create Spark applications with the Scala programming language.
  • Use Spark Streaming to process continuous streams of data.
  • Process streams of real-time data with Spark Streaming.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Details :

Course Code : sparkscala
Duration : 21 hours (usually 3 days including breaks)
Requirements
– Programming and scripting experience
Audience
– Software Engineers
Fees : 94820 THB(Price per participant)
Venue : Regus, Bangkok

Course Outline

Introduction

Scala Programming in Depth Review

  • Syntax and structure
  • Flow control and functions

Spark Internals

  • Resilient Distributed Datasets (RDD)
  • Spark script to graph to cluster

Overview of Spark Streaming

  • Streaming architecture
  • Intervals in streaming
  • Fault tolerance

Preparing the Development Environment

  • Installing and configuring Apache Spark
  • Installing and configuring the Scala IDE
  • Installing and configuring JDK

Spark Streaming Beginner to Advanced

  • Working with key/value RDD’s
  • Filtering RDD’s
  • Improving Spark scripts with regular expressions
  • Sharing data on a cluster
  • Working with network data sets
  • Implementing BFS algorithms
  • Creating Spark driver scripts
  • Tracking in real time with scripts
  • Writing continuous applications
  • Streaming linear regression
  • Using Spark Machine Learning Library

Spark and Clusters

  • Bundling dependencies and Spark scripts using the SBT tool
  • Using EMR for illustrating clusters
  • Optimizing by partitioning RDD’s
  • Using Spark logs

Integration in Spark Streaming

  • Integrating Apache Kafka and working with Kafka topics
  • Integrating Apache Fume and working with pull-based/push-based Flume configurations
  • Writing a custom receiver class
  • Integrating Cassandra and exposing data as real-time services

In Production

  • Packaging an application and running it with Spark-Submit
  • Troubleshooting, tuning, and debugging Spark Jobs and clusters

Summary and Conclusion

REGISTER FOR THE COURSE

Course Schedule

Add A Review


Please enter input field

sanu
NobleProg
Training Course
Training Institute

Book your course now

Enquiry

Your enquiry submitted successfully

Enquiry Submission failed

Please enter input field(s)

Claim this course

To manage this course details kindly claim this course.