Greens Technologys, located in Adyar and OMR, offers Apache Spark training in Chennai. The course gives you the knowledge and skills to work as a Spark developer and prepares you for the Cloudera Certified Associate (CCA) Spark and Hadoop Developer certification exam (CCA175).
This Apache Spark course in Chennai covers HDFS, Flume, Sqoop, RDDs, Spark Streaming, MLlib, Spark SQL, and the Kafka cluster and API in depth.
It also helps you build the essential Apache Spark and Scala skills: real-time processing, Spark SQL, Spark Streaming, machine learning programming, GraphX programming, and Spark shell scripting.
Content:
- SCALA (Object Oriented and Functional Programming)
- Getting started with Scala.
- Scala Background, Scala Vs Java and Basics.
- Interactive Scala – REPL, data types, variables, expressions, simple functions.
- Running the program with Scala Compiler.
- Explore the type lattice and use type inference
- Define methods and use pattern matching.
- Scala Environment Set up.
- Scala set up on Windows.
- Scala set up on UNIX.
- Functional Programming.
- What is Functional Programming.
- Differences between OOP and FP.
- Collections (Very Important for Spark)
- Iterating, mapping, filtering and counting
- Regular expressions and matching with them.
- Maps, Sets, groupBy, Options, flatten, flatMap
- Word count, IO operations, file access, flatMap
- Object Oriented Programming.
- Classes and Properties.
- Objects, Packaging and Imports.
- Traits.
- Objects, classes, inheritance, Lists with multiple related types, apply
- Integrations
- What is SBT?
- Integration of Scala in Eclipse IDE.
- Integration of SBT with Eclipse.
- SPARK CORE.
- Batch versus real-time data processing
- Introduction to Spark, Spark versus Hadoop
- Architecture of Spark.
- Coding Spark jobs in Scala
- Exploring the Spark shell and creating a SparkContext.
- RDD Programming
- Operations on RDD.
- Transformations
- Actions
- Loading Data and Saving Data.
- Key Value Pair RDD.
- Broadcast variables.
- Persistence.
- Configuring and running the Spark cluster.
- Exploring a multi-node Spark cluster.
- Cluster management
- Submitting Spark jobs and running in the cluster mode.
- Developing Spark applications in Eclipse
- Tuning and Debugging Spark.
- CASSANDRA (NOSQL DATABASE)
- Learning Cassandra
- Getting started with architecture
- Installing Cassandra.
- Communicating with Cassandra.
- Creating a database.
- Create a table
- Inserting Data
- Modelling Data.
- Creating an Application with Web.
- Updating and Deleting Data.
- SPARK INTEGRATION WITH NOSQL (CASSANDRA) AND AMAZON EC2
- Introduction to Spark and Cassandra Connectors.
- Setting up Spark with Cassandra.
- Creating a SparkContext to connect to Cassandra.
- Creating a Spark RDD on the Cassandra database.
- Performing transformations and actions on the Cassandra RDD.
- Running a Spark application in Eclipse to access data in Cassandra.
- Introduction to Amazon Web Services.
- Building a 4-node Spark cluster in Amazon Web Services.
- Deploying in Production with Mesos and YARN.
- SPARK STREAMING
- Introduction to Spark Streaming.
- Architecture of Spark Streaming
- Processing Distributed Log Files in Real Time
- Discretized streams (DStreams) and RDDs.
- Applying Transformations and Actions on Streaming Data
- Integration with Flume and Kafka.
- Integration with Cassandra
- Monitoring streaming jobs.
- SPARK SQL
- Introduction to Apache Spark SQL
- The SQL context
- Importing and saving data
- Processing text, JSON and Parquet files
- DataFrames
- User-defined functions
- Using Hive
- Local Hive Metastore server
- SPARK MLLIB.
- Introduction to Machine Learning
- Types of Machine Learning.
- Introduction to Apache Spark MLlib algorithms.
- Machine learning data types and working with MLlib.
- Regression and Classification Algorithms.
- Decision Trees in depth.
- Classification with SVM, Naive Bayes
- Clustering with K-Means
- Building the Spark server
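As a small taste of the Collections topics in the outline above (flatMap, filtering, groupBy, word count), here is a minimal sketch of a word count written with plain Scala collections. It is an illustration only, not course material: the object name `WordCountSketch`, the `wordCount` helper, and the sample input are all hypothetical, and it assumes nothing beyond the Scala standard library.

```scala
// Word count using only Scala standard-library collections (no Spark needed).
// All names here are hypothetical and chosen for illustration.
object WordCountSketch {
  // Split each line into lowercase words, then count occurrences per word.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\W+")) // one line becomes many words
      .filter(_.nonEmpty)                   // drop empty tokens
      .groupBy(identity)                    // word -> all of its occurrences
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input.
    val lines = Seq("Spark is fast", "Scala runs Spark jobs")
    wordCount(lines).toSeq.sortBy(_._1).foreach { case (word, count) =>
      println(s"$word $count")
    }
  }
}
```

The same flatMap and map style carries over to Spark RDDs, where a per-key aggregation such as `reduceByKey` typically takes the place of `groupBy`.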
Apache Spark Training Objectives:
- Understand what Apache Spark and Scala programming are
- Understand the difference between Apache Spark and Hadoop
- Learn Scala and its programming implementation
- Implement Spark on a cluster
- Write Spark Applications using Python, Java and Scala
- Understand RDDs and their operations, along with the implementation of Spark algorithms
- Define and explain Spark Streaming
- Learn about Scala classes and apply pattern matching
- Learn Scala-Java interoperability and other Scala operations
- Work on projects that use Scala to build Spark applications