Apache Cassandra is an open source distributed database management system designed to provide high availability with no single point of failure while handling massive amounts of data across many commodity servers.
The Cassandra training course is designed to build a deep understanding of Apache Cassandra, both for processing very large volumes of data streaming in at high speed and for retrieving valuable insights from that data.
Duration: 24 Hours
Prerequisites: Basic knowledge of Linux
Introduction to Big Data / NoSQL
A brief intro to NoSQL
CAP theorem
When to use NoSQL
Columnar storage
NoSQL ecosystem
Cassandra Basics
Architecture and Design
Cassandra nodes, clusters, datacenters
Keyspaces, tables, rows and columns
Partitioning, replication, tokens
Quorum and consistency levels
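To give a flavor of the "Quorum and consistency levels" topic, here is a minimal Python sketch of the arithmetic behind quorum reads and writes. The function names `quorum` and `is_strongly_consistent` are hypothetical helpers made up for illustration and are not part of any Cassandra driver; the sketch assumes a single datacenter and the usual rule that a quorum is a majority of the replicas.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority for the given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3                      # assumed replication factor for the example
    q = quorum(rf)              # majority of 3 replicas
    print(q)
    print(is_strongly_consistent(q, q, rf))   # quorum reads + quorum writes
    print(is_strongly_consistent(1, 1, rf))   # single-replica reads and writes
```

With a replication factor of 3, quorum reads combined with quorum writes always touch at least one common replica, whereas reading and writing a single replica each gives no such overlap.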
Data Modeling: basic to advanced
A brief intro to CQL
CQL Datatypes
Creating keyspaces & tables
Choosing columns and types
Choosing primary keys
Data layout for rows and columns
Time to live (TTL)
Querying with CQL
CQL updates
Collections (list / map / set)
Creating and using secondary indexes
Composite keys (partition keys and clustering keys)
Time series data
Best practices for time series data
Counters
Lightweight transactions (LWT)
Labs: creating and using indexes; modeling time series data
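The time series lab can be previewed with a small sketch of one common modeling idea: adding a time bucket to the partition key so that one source's readings do not accumulate in a single ever-growing partition. The code below is plain Python with no Cassandra driver; the names `partition_key`, `group_by_partition`, `sensor_id`, and the one-day bucket size are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(sensor_id: str, ts: datetime) -> tuple:
    """Composite partition key: (sensor_id, UTC day bucket as YYYY-MM-DD)."""
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day)


def group_by_partition(readings):
    """Group (sensor_id, timestamp, value) readings by partition key.

    Rows inside each partition are sorted newest first, mimicking a
    descending clustering order on the timestamp.
    """
    partitions = defaultdict(list)
    for sensor_id, ts, value in readings:
        partitions[partition_key(sensor_id, ts)].append((ts, value))
    for rows in partitions.values():
        rows.sort(key=lambda row: row[0], reverse=True)
    return dict(partitions)
```

In CQL terms, this layout corresponds to a composite partition key of the sensor and the day, with the timestamp as a clustering column in descending order.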
Cassandra Internals
Deep dive into the Cassandra design
SSTables, memtables, commit log
Cassandra Administration
Hardware selection
Cassandra distributions
Communication between Cassandra nodes
Writing data to and reading data from the storage engine
Data directories
Anti-entropy operations
Cassandra Compaction
Choosing and implementing compaction strategies
Cassandra best practices for garbage collection, composition, etc.