Cassandra

by MobiGnosis

The large volume and wide variety of data required by today’s business processes call for a highly available, low-latency database. Apache Cassandra meets this need by enabling high-speed reads and writes across a replicated, distributed system.

Duration

1.5 months

Course Details

This Apache Cassandra training program provides data modeling experience in order to take advantage of Cassandra’s linearly scalable peer-to-peer design.

The growth of Big Data is reshaping the landscape of large businesses. While this raw data is difficult to harness, Apache Cassandra, the open source NoSQL distributed database management system, can handle large amounts of data across many commodity servers.

MobiGnosis’ Cassandra training program teaches the fundamentals of Cassandra, from the basics through to more advanced methodologies.

In this training you will learn Cassandra data models, Cassandra architecture, configuration, reading and writing data, and integration with Hadoop. The course also includes practice sessions to reinforce your understanding. Knowledge of this new-age technology is just what you need for a successful career, and our trainers will help you excel in it!

 

Key Learnings:

  • Architect Cassandra databases and implement commonly used design patterns
  • Model data in Cassandra based on query patterns
  • Access Cassandra databases using CQL and Java
  • Balance read/write speed against data consistency
  • Integrate Cassandra with Hadoop, Pig, and Hive
  • Know how and where to use Cassandra, and the core concepts that drive this database
  • Use the fault-tolerance and high-availability features of Cassandra
  • Understand the Apache Cassandra architecture and its more complex inner workings, such as the gossip protocol, read repair, and Merkle trees
  • Identify requirements and create a Cassandra data model by applying data modeling techniques
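
To give a flavour of the read/write versus consistency trade-off listed above, here is a minimal sketch of the commonly cited rule of thumb: a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (RF). The function names are hypothetical, chosen for illustration, and are not part of Cassandra or any driver.

```python
def quorum(replication_factor: int) -> int:
    """Size of a quorum: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                    # 2
print(overlaps(q, q, rf))   # True: quorum reads and quorum writes always overlap
print(overlaps(1, 1, rf))   # False: faster, but a read may miss the latest write
```

Reading and writing at a single replica is the fastest option, while quorum on both sides trades some latency for the overlap guarantee.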

 

Topics Covered During Classroom:

1. Basics

  • Revise CAP Theorem
  • Good fit use cases

2. Concepts

  • Who uses it?
  • Database or Datastore?
  • Masterless architecture
  • Seed node(s)
  • Gossip
  • Detecting a failed node
  • Replication
  • Partitioner
  • Snitch – summary
  • Snitch – property file, when to use? An example
  • Virtual node, ring architecture
  • Commodity vs Specialized hardware
  • Bootstrap process
  • Elastic linear scalability
  • Debate – heterogeneous machines, adding capacity
  • Deployment – 4 dimensions
  • Distributed workloads, Multi-DC setup
  • Regions and Zones (Cloud setup)
  • SEDA

3. Setup and installation

  • Acquiring and Installing C*
  • Understand the key components of cassandra.yaml
  • Configuration and installation structure
  • Directories – Data, Commit Log, Cache
  • System log configuration
  • nodetool & cqlsh

4. Concepts II

  • Keyspace
  • Admin/system keyspace
  • Column Family / Table
  • Primary key components
  • Visualizing PK based storage, on disk cells & row sizing
  • Fault tolerance via replication
  • Coordinator
  • Consistency Levels – read/write, immediate
  • Quorum
  • Applied consistency level – scenario game
  • Inconsistencies across nodes
  • Anti-entropy op & Read repair
  • Hinted handoff
  • Debate – RF change impact

5. Write and Read path

  • Why are C* writes fast?
  • Components of the write
  • Storage – a primer
  • A bit more about LSM
  • Write path flow
  • Data state
  • Memtable, SSTables, Commit log
  • When does the flush trigger?
  • Data file name structure
  • Overview of CDC
  • Row cache, Key cache, Chunk cache
  • Bloom filters
  • Key index sample / Partition index sample
  • Read path flow
  • Eager retry
  • Last write wins with tombstone example
  • Compaction
  • NTP and why it is important

6. Modeling

  • Query-driven design (QDD)
  • De-normalization
  • Row key & data partitioning
  • Visualizing the components of the PK
  • Choice of row key
  • How to fix a wide row?
  • And more.
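
As a taste of the "last write wins with tombstone" topic above, the following is a simplified sketch of timestamp-based reconciliation, not Cassandra's actual implementation. The `Cell` type and `reconcile` function are hypothetical names, and the tie-breaking rule (a tombstone wins when timestamps are equal) is a simplifying assumption made for this example.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]
    timestamp: int            # write time, e.g. microseconds since epoch
    tombstone: bool = False   # True marks a delete


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning version: highest timestamp wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Assumption for this sketch: on a timestamp tie, the delete wins.
    if a.tombstone != b.tombstone:
        return a if a.tombstone else b
    return a  # identical timestamps and kinds: arbitrary choice in this sketch


write = Cell("alice@example.com", timestamp=100)
delete = Cell(None, timestamp=200, tombstone=True)
stale = Cell("old@example.com", timestamp=150)  # arrives late, but was written earlier

winner = reconcile(reconcile(write, delete), stale)
print(winner.tombstone)  # True: the delete at t=200 outranks both writes
```

Because the outcome depends only on timestamps, a node with a skewed clock can make an older write appear newer, which is why clock synchronisation (the NTP topic) matters.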
Bangalore Branch

41, Sri Krishna Mansion, 3rd Floor, S End D Cross Rd, Bangalore
