• Hinjawadi Pune: +91 7410071951 / 7410071952 / 9881208115
• Karad: +91 9561213447 / 9881208114

PG Diploma in Big Data Analytics (PG-DBDA)

Installation (Ubuntu and CentOS), Basics of Linux, Configuring Linux, Shells, Commands, and Navigation, Common Text Editors, Administering Linux, Introduction to Users and Groups, Linux shell scripting, shell computing, Introduction to enterprise computing, Remote access.

Introduction to Cloud Computing: Cloud Computing Basics, Understanding Cloud Vendors (AWS/Azure/GCP), Definition, Characteristics, Components, Cloud provider, SaaS, PaaS, IaaS and other Organizational scenarios of clouds, Administering & Monitoring cloud services, Benefits and limitations, Deploying applications on the cloud, Comparison among SaaS, PaaS, IaaS, Cloud Products and Solutions, Cloud Pricing, Compute Products and Services, Elastic Compute Cloud, Dashboard, Launching Linux VM, Accessing Linux VM, Launching and Accessing Windows server VM, Launching WordPress website, Storage, Databases, Migration Hub, Security, Identity and Compliance, Monitoring and Management Services, Analytics on Cloud, Machine Learning framework on Cloud, Hadoop Framework on Cloud

Download Admission Booklet

Python Programming: Python basics, If, If-else, Nested if-else, Looping, For, While, Nested loops, Control Structure, Break, Continue, Pass, Strings and Tuples, Accessing Strings, Basic Operations, String slices, Working with Lists, Accessing list, Operations, Function and Methods, Files, Modules, Dictionaries, Functions and Functional Programming, Declaring and calling Functions, Declare, assign and retrieve values from Lists, Introducing Tuples, Accessing tuples, Visualizing using Matplotlib and Seaborn, OOP concepts, Class and object, Attributes, Inheritance, Overloading, Overriding, Data hiding, Operations Exception, Exception Handling, except clause, Try-finally clause, User Defined Exceptions, Data wrangling, Data cleaning

R Programming: Reading and Getting Data into R, Exporting Data from R, Data Objects - Data Types & Data Structure, Viewing Named Objects, Structure of Data Items, Manipulating and Processing Data in R (Creating, Accessing, Sorting data frames, Extracting, Combining, Merging, reshaping data frames), Control Structures, Functions in R (numeric, character, statistical), working with objects, Viewing Objects within Objects, Constructing Data Objects, Packages - tidyverse, dplyr, tidyr etc., Queuing Theory, Non-parametric Tests - ANOVA, Chi-Square, t-Test, U-Test, Interactive reporting with R Markdown, Introduction to R Shiny

OOP Concepts, Data Types, Operators and Language Constructs, Inner Classes and Inheritance, Interface and Package, Exceptions, Collections, Threads, java.lang, java.util, Java Virtual Machine, Reflection in JVM, JVM’s architecture, Lambda Expressions, Functional Programming and Interfaces, Introduction to Streams, Introduction to JDBC API

Introduction to Business Analytics using some case studies, Summary Statistics, Making Right Business Decisions based on data, Statistical Concepts, Descriptive Statistics and its measures, Probability theory, Probability Distributions (Continuous and discrete: Normal, Binomial and Poisson distribution) and Data, Sampling and Estimation, Statistical Interfaces, Predictive modeling and analysis, Bayes’ Theorem, Central Limit theorem, Data Exploration & preparation, Concepts of Correlation, Covariance, Outliers, Regression Analysis, Forecasting Techniques, Simulation and Risk Analysis, Optimization, Linear, Nonlinear, Integer, Overview of Factor Analysis, Directional Data Analytics, Functional Data Analysis, Predictive Modelling (From Correlation To Supervised Segmentation): Identifying Informative Attributes, Segmenting Data By Progressive Attributive, Models, Induction And Prediction, Supervised Segmentation, Visualizing Segmentations, Trees As Set Of Rules, Probability Estimation; Overfitting And Its Avoidance: Generalization, Holdout Evaluation Vs Cross Validation; Decision Analytics: Evaluating Classifiers, Analytical Framework, Evaluation, Baseline, Performance And Implications For Investments In Data; Evidence And Probabilities: Explicit Evidence Combination With Bayes Rule, Probabilistic Reasoning; Business Strategy: Achieving Competitive Advantages, Sustaining Competitive Advantages

Python Libraries – Pandas, NumPy, SciPy, Scrapy, Plotly, Beautiful Soup

Database Concepts (File System and DBMS), OLAP vs OLTP, Database Storage Structures (Tablespace, Control files, Data files), Structured and Unstructured data, SQL Commands (DDL, DML & DCL), Stored functions and procedures in SQL, Conditional Constructs in SQL, Data collection, Designing Database schema, Normal Forms and ER Diagram, Relational Database modelling, Tools and methods for gathering data in a systematic fashion, Data Warehousing concept, NoSQL, Data Models - XML, Working with MongoDB, Cassandra - overview, architecture, comparison with MongoDB, Working with Cassandra, Connecting DBs with Python, Introduction to Data Driven Decisions, Enterprise Data Management, Data preparation and cleaning techniques

Introduction to Big Data: Big Data - Beyond The Hype, Big Data Skills And Sources Of Big Data, Big Data Adoption, Research And Changing Nature Of Data Repositories, Data Sharing And Reuse Practices And Their Implications For Repository Data Curation, Overlooked And Overrated Data Sharing, Data Curation Services In Action, Open Exit: Reaching The End Of The Data Life Cycle, The Current State Of Meta-Repositories For Data, Curation Of Scientific Data At Risk Of Loss: Data Rescue And Dissemination

Hadoop: Introduction to Big Data programming - Hadoop, The ecosystem and stack, The Hadoop Distributed File System (HDFS), Components of Hadoop, Design of HDFS, Java interfaces to HDFS, Architecture overview, Development Environment, Hadoop distribution and basic commands, Eclipse development, The HDFS command line and web interfaces, The HDFS Java API (lab), Analyzing the Data with Hadoop, Scaling Out, Hadoop event stream processing, Complex event processing, MapReduce Introduction, Developing a MapReduce Application, How MapReduce Works, Anatomy of a MapReduce Job Run, Failures, Job Scheduling, Shuffle and Sort, Task execution, MapReduce Types and Formats, MapReduce Features, Real-World MapReduce

Hadoop Environment: Setting up a Hadoop Cluster, Cluster specification, Cluster Setup and Installation, Hadoop Configuration, Security in Hadoop, Administering Hadoop, HDFS – Monitoring & Maintenance, Hadoop benchmarks

Apache Airflow: Introduction to Data warehousing and Data lakes, Designing Data warehousing for an ETL Data Pipeline, Designing Data Lakes for ETL Data Pipeline, ETL vs ELT

Introduction to Hive, Programming with Hive: Data warehouse system for Hadoop, Optimizing with Combiners and Partitioners (lab), Bucketing, More common algorithms: sorting, indexing and searching (lab), Relational manipulation: map-side and reduce-side joins (lab), Evolution, purpose and use, Case Studies on Ingestion and warehousing

HBase: Overview, comparison and architecture, Java client API, CRUD operations and security

Apache Spark APIs for large-scale data processing: Overview, Linking with Spark, Initializing Spark, Resilient Distributed Datasets (RDDs), External Datasets, RDD Operations, Passing Functions to Spark, Job optimization, Working with Key-Value Pairs, Shuffle operations, RDD Persistence, Removing Data, Shared Variables, EDA using PySpark, Deploying to a Cluster, Spark Streaming, Spark MLlib and ML APIs, Spark DataFrames/Spark SQL, Integration of Spark and Kafka, Setting up Kafka Producer and Consumer, Kafka Connect API, MapReduce, Connecting DBs with Spark

Business Intelligence: requirements, content and management, Information Visualization, Data Analytics Life Cycle, Analytic Processes and Tools, Analysis vs. Reporting, MS Excel: Functions, Formulas, Charts, Pivots and Lookups, Data Analysis ToolPak: Descriptive Summaries, Correlation, Regression, Introduction to Power BI, Modern Data Analytic Tools, Visualization Techniques, Visual Encodings, Visualization algorithms, Data collection and binding, Cognitive issues, Interactive visualization, Visualizing big data – structured vs unstructured, Visual Analytics, Geo-mapping, Dashboard Design

Case Studies on Business intelligence, Analytics, Industry/Enterprise reports etc.

Apache Phoenix: Apache Phoenix Overview, Need of Phoenix, Features, Installation and Configurations, Views and Multi Tenancy, Secondary Indexes, Joins, Query Optimizations, Roadmap of Phoenix.

Supervised and Unsupervised Learning, Uses of Machine Learning, Clustering, K-means, Hierarchical Clustering, Decision Trees, Classification problems, Bayesian analysis and Naïve Bayes classifier, Random Forest, Gradient Boosting Machines, Association rule learning, PCA, Apriori, Support Vector Machines, Linear and Non-linear classification, ARIMA, XGBoost, CatBoost, Neural Networks and their applications, TensorFlow 2.x framework: Deep learning algorithms, KNN, NLP, BERT in NLP, NLP transformers, NLTK, Introduction to PyTorch framework, AI and its applications

Software: A Process, Various Phases in Software Development, Software life cycle: Agile model (self-study of other models), Introduction to Coding Standards, Software Quality Assurance

Contact us

Sunbeam Hinjawadi Pune

Authorized Training Center of C-DAC

"Sunbeam IT Park", Ground Floor, Phase 2 of Rajiv Gandhi Infotech Park, Hinjawadi, Pune - 411057, MH-INDIA

+91 7410 071 951 / 7410 071 952

Sunbeam Karad

Authorized Training Center of C-DAC

'Anuda Chambers', 203 Shaniwar Peth, Near Gujar Hospital, Karad - 415 110, Dist. Satara, MH-INDIA.

02164 - 225500 / 225800