Tuning Apache Spark: Powerful Big Data Processing Recipes Course

Uncover the lesser-known secrets of powerful big data processing with Spark and Kafka

What you’ll learn

  • How to attain a solid foundation in the most powerful and versatile technologies involved in data streaming: Apache Spark and Apache Kafka
  • Form a robust and clean architecture for a data streaming pipeline
  • Ways to implement the correct tools to bring your data streaming architecture to life
  • How to create robust processing pipelines by testing Apache Spark jobs
  • How to create highly concurrent Spark programs by leveraging immutability
  • How to solve repeated problems by leveraging the GraphX API
  • How to solve long-running computation problems by leveraging lazy evaluation in Spark
  • Tips to avoid memory leaks by understanding the internal memory management of Apache Spark
  • Troubleshoot real-time pipelines written in Spark Streaming


Requirements

  • You don’t need to be a Spark expert to take this course, but you should be familiar with Java or Scala.


Video Learning Path Overview

A Learning Path is a specially tailored course that brings together two or more different topics that lead you to achieve an end goal. Much thought goes into the selection of the assets for a Learning Path, and this is done through a complete understanding of the requirements to achieve a goal.

Today, organizations have a difficult time working with large datasets. This is where data streaming and Spark come in.

Beginning with a step-by-step approach, you’ll get comfortable using Spark and learn practical, proven techniques for improving particular aspects of programming and administration in Apache Spark. You’ll be able to perform tasks and get the best out of your databases much faster.

The simple and practical solutions provided will get you back in action in no time at all!

By the end of the course, you will be well versed in using Spark in your day-to-day projects.

Key Features

  • From blueprint architecture to complete code solution, this course covers every important aspect of architecting and developing a data streaming pipeline.
  • Test Spark jobs using unit, integration, and end-to-end techniques to make your data pipeline robust and bulletproof.
  • Solve several painful issues like slow-running jobs that affect the performance of your application.

Author Bios

  • Anghel Leonard is currently a Java chief architect. He is a member of the Java EE Guardians and has 20+ years’ experience. He has spent most of his career architecting distributed systems. He is also the author of several books, a speaker, and a big fan of working with data.
  • Tomasz Lelek is a Software Engineer, programming mostly in Java and Scala. He has been working with the Spark and ML APIs for the past 5 years, with production experience in processing petabytes of data. He has recently spoken at the Polish conferences Confitura and JDD (Java Developers Day) and at the Krakow Scala User Group. He has also conducted a live coding session at the Geecon Conference. He is a co-founder of initlearn, an e-learning platform built with the Java language.

Who this course is for:

  • Application Developers, Data Scientists, Analysts, Statisticians, Big Data Engineers, or anyone with some Spark experience who works with large amounts of data on a day-to-day basis will feel comfortable with the topics presented.