  • Pune : +91-20-2427 2383 / 2426 4291 / 2426 0308 / 7410 071 951
  • Karad : 02164 - 225500 / 225800

Big Data - Apache Spark

Course Name : Big Data - Apache Spark

Duration : 60 hours

Fees : Rs. 15500/- (Incl 18% GST)

Target Audience

  • Students and freshers.
  • Professionals looking to switch to a Big Data / Spark developer role.

Prerequisites

  • Core Java programming skills
  • Familiarity with any RDBMS (such as Oracle or MySQL)
  • Awareness of XML and familiarity with Linux commands

System Requirements

  • Processor: 64-bit Core i3 or above
  • RAM: minimum 8 GB; 16 GB or more recommended
  • 64-bit Linux (Ubuntu or CentOS) with the Java 8 JDK installed

Not Covered in This Course

  • Data science (math/statistics). However, implementing statistical formulae in a Spark job will be covered.
  • Machine learning. However, a simple ML program using Spark MLlib will be demonstrated.
  • Spark administration. However, basic configuration and performance-related configuration will be discussed.
  • Reporting and visualization tools.
  • Spark clusters on the cloud. However, setting up a multi-node cluster with minimal configuration will be taught.

Java 8 Functional Programming

  • Functional interfaces
  • Method references
  • Lambda expressions
  • Java 8 Streams & operations
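
As a rough illustration of how the four topics above fit together, the following minimal sketch defines a functional interface, implements it with a method reference, uses a lambda expression as a predicate, and combines both in a stream pipeline. The class name `FunctionalSketch`, the interface `Transformer`, and the sample words are hypothetical and are not taken from the course material.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical example class; names are illustrative only.
public class FunctionalSketch {

    // A functional interface declares exactly one abstract method.
    @FunctionalInterface
    interface Transformer {
        String apply(String input);
    }

    // Lambda expression implementing the built-in Predicate interface.
    static final Predicate<String> IS_LONG = word -> word.length() > 4;

    // Method reference implementing the custom Transformer interface.
    static final Transformer TO_UPPER = String::toUpperCase;

    // Stream pipeline: keep long words, upper-case them, collect into a list.
    static List<String> longWordsUpper(List<String> words) {
        return words.stream()
                .filter(IS_LONG)
                .map(TO_UPPER::apply)
                .collect(Collectors.toList());
    }

    // Stream reduction: total number of characters across all words.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "rdd", "kafka", "sql", "scala");
        System.out.println(longWordsUpper(words)); // [SPARK, KAFKA, SCALA]
        System.out.println(totalLength(words));    // 21
    }
}
```

The same filter/map/reduce style reappears later in Spark's RDD transformations and actions, which is presumably why the course starts with it.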

 

Scala Programming Language

  • Scala Introduction & REPL
  • Data types & Variables
  • Basic programming constructs
  • Functions: Simple, Anonymous, Currying, Higher-Order, Pure, Closures
  • Collections: Array, Tuple, List, Map, ...
  • Functional Programming, Lambda expressions, apply()
  • OOP: Class, Getter/Setter, Constructors, Inheritance, Overriding, abstract, Traits
  • Case classes & Pattern matching
  • Generics, Variance
  • Companion Objects
  • Functors & Monads

 

Apache Spark 2

  • Apache Spark 2
  • Distributed Computing Challenges
  • Spark & Hadoop
  • Spark Architecture & Components
  • Spark Installation & Deployment
  • Spark Shell
  • Spark Web UI

 

Apache Spark 2 - Spark Core

  • Spark RDD, Transformations & Actions, Data Load & Save
  • Types of RDD: Key-value, Two Pair, ...
  • RDD Internals: Distributed/Partitions, Lineage, Persistence
  • Spark in Eclipse IDE
  • Implementing & Submitting Spark Job
  • Execution of Spark Job (RDD)

 

Apache Spark 2 - Spark SQL

  • Spark SQL Introduction
  • Architecture
  • SQLContext
  • Data Frames & Datasets
  • Implementing & Executing Spark SQL job
  • Interoperating with RDDs
  • User Defined Functions
  • File Formats & Loading data
  • Spark-Hive Integration

 

Apache Spark 2 - Spark Streaming

  • Apache Kafka Introduction
  • Kafka Architecture
  • Kafka Cluster Components
  • Kafka Cluster Configuration
  • Kafka Java API
  • Kafka Applications
  • Spark Streaming Introduction
  • Features & Workflow
  • Streaming Context & DStreams
  • Transformations on DStreams
  • Windowing Concept; Windowed Operators: Slice, Window and ReduceByWindow; Stateful Operators
  • Streaming Data Source
  • Apache Kafka Data Sources

 

Apache Spark 2 - Spark ML Introduction

  • Machine Learning Quick Overview
  • Introduction to Spark Machine Learning
  • Implement an ML program using MLlib

Contact us

Pune Centre

SunBeam Institute of Information Technology, Pune

'Sunbeam', Plot No. R/2, Market Yard Road, Behind Hotel Fulora, Gultekdi, Pune - 411 037. MH-INDIA.

+91-20-2427 2383 / 2426 4291 / 2426 0308 / 7410 071 951

Karad Centre

SunBeam Institute of Information Technology, Karad

'Anuda Chambers', 203 Shaniwar Peth, Near Gujar Hospital, Karad - 415 110, Dist. Satara, MH-INDIA.

02164 - 225500 / 225800