Working with Cassandra

Live Classroom
Duration: 3 days
Live Virtual Classroom
Duration: 3 days

Overview

Cassandra (C*) is a massively scalable NoSQL database that provides high availability and fault tolerance, as well as linear scalability when adding new nodes to a cluster. This course provides an in-depth introduction to working with Cassandra and using it to create effective data models, while focusing on the practical aspects of working with C*. The course covers important topics such as the internal architecture needed for making sound decisions, CQL (Cassandra Query Language), and the Java APIs for writing Cassandra clients.

What You'll Learn

  • Understand the needs addressed by C*
  • Be familiar with the operation and structure of C*
  • Be able to install and set up a C* database
  • Use the C* tools, including cqlsh, nodetool and CCM (Cassandra Cluster Manager)
  • Become familiar with the C* architecture and how a C* cluster is structured
  • Understand how data is distributed and replicated in a C* cluster
  • Understand core C* data modelling concepts and use them to create well-structured data models
  • Use data replication and eventual consistency intelligently
  • Understand and use CQL to create tables and query for data
  • Know and use the CQL data types (numerical, textual, uuid, etc.)
  • Understand the various kinds of primary keys available (simple, compound and composite primary keys)
  • Use more advanced capabilities like collections, counters, secondary indexes, CAS (Compare and Set), static columns and batches
  • Become familiar with the Java client API
  • Use the Java client API to write client programs that work with C*
  • Build and use dynamic queries with QueryBuilder
  • Understand and use asynchronous queries with the Java API

Curriculum

  • Why we need Cassandra
  • High level Cassandra overview
  • Cassandra features
  • Basic Cassandra installation and configuration

  • Cassandra architecture overview
  • Cassandra clusters and rings
  • Data replication in Cassandra
  • Cassandra consistency/eventual consistency
  • Introduction to CQL
  • Defining tables with a single primary key
  • Using cqlsh for interactive querying
  • Selecting and inserting/upserting data with CQL
  • Data replication and distribution
  • Basic data types (including uuid, timeuuid)

  • Defining a compound primary key
    • CQL for compound primary keys
    • Partition keys and data distribution
    • Clustering columns
    • Overview of internal data organization
  • Additional querying capabilities
    • Result ordering – ORDER BY and CLUSTERING ORDER BY
    • UPDATE and DELETE queries
    • Result filtering, ALLOW FILTERING
    • Batch queries
  • Data modelling guidelines
    • Denormalization
    • Data modelling workflow
    • Data modelling principles
    • Primary key considerations
  • Composite partition keys
    • Defining with CQL
    • Data distribution with composite partition keys
    • Overview of internal data organization

  • Indexing
    • Primary/partition keys and pagination with token()
    • Secondary indexes and usage guidelines
  • Cassandra counters
    • Counter structure and definition
    • Using counters
    • Counter limitations
  • Cassandra collections
    • Collection structure and uses
    • Defining collections (set, list, and map)
    • Querying collections (including INSERT, UPDATE, DELETE)
    • Limitations
    • Overview of internal storage organization
  • Static column – overview and usage
  • Static column guidelines
  • Materialized view: Overview and usage
  • Materialized view guidelines

  • Overview of consistency in Cassandra
  • CAP theorem
  • Eventual (tunable) consistency in C* – One, Quorum, All
  • Choosing CL One
  • Choosing CL Quorum
  • Achieving immediate consistency
  • Using other consistency levels
  • Internal repair mechanisms (Read repair, hinted handoff)
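
As a rough illustration of the consistency topics above, the following sketch works through the usual arithmetic behind "immediate consistency": a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the replication factor (R + W > RF). The class and method names are hypothetical and are not part of Cassandra or its Java driver; this is plain Java for illustration only.

```java
// Hypothetical helper, not part of Cassandra or the Java driver.
final class QuorumSketch {

    // Quorum size for a given replication factor: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read replica set must overlap every write replica set,
    // i.e. when R + W > RF.
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("RF=" + rf + " quorum=" + q);
        System.out.println("QUORUM read + QUORUM write: " + readsSeeLatestWrite(q, q, rf));
        System.out.println("ONE read + ONE write: " + readsSeeLatestWrite(1, 1, rf));
        System.out.println("ONE read + ALL write: " + readsSeeLatestWrite(1, rf, rf));
    }
}
```

With a replication factor of 3, QUORUM reads combined with QUORUM writes satisfy the overlap condition, while ONE reads with ONE writes do not.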

  • Overview of lightweight transactions
  • Using LWT, the [applied] column
  • IF EXISTS, IF NOT EXISTS, Other IF conditions
  • Basic CAS internals
  • Overhead and guidelines

  • Dealing with Write failure
    • Unavailable Node and NodeFailure
    • Requirements for Write operations
  • Key and row caches
    • Cache overview
    • Usage guidelines
  • Multi-data center support
    • Overview
    • Replication factor configuration
    • Additional Consistency Levels – LOCAL/EACH QUORUM
  • Deletes
    • CQL for Deletion
    • Tombstones
    • Usage Guidelines

  • API Overview
    • Introduction
    • Architecture and Features
  • Connecting to a Cluster
    • Cluster and Cluster.Builder
    • Contact Points, Connecting to a Cluster
    • Session Overview and API
    • Working with Sessions
  • The Query API
    • Overview
    • Dynamic Queries, Statement, SimpleStatement
    • Processing Query Results, ResultSet, Row
    • PreparedStatement, BoundStatement
    • Binding Values and Querying with PreparedStatements
    • CQL to Java Type Mapping
    • Working with UUIDs
    • Working with Time/Date Values
    • Working with Batches of SimpleStatement and PreparedStatement
  • Dynamic Queries and QueryBuilder
    • QueryBuilder Overview and API
    • Building SELECT, DELETE, INSERT, and UPDATE Queries
    • Creating WHERE Clauses
    • Other Query Examples
  • Configuring Query Behavior
    • Setting LIMIT and TTL
    • Working with Consistency
    • Using LWT
    • Working with Driver Policies
    • Load Balancing Policies – RoundRobinPolicy, DCAwareRoundRobinPolicy
    • Retry Policies – DefaultRetryPolicy, DowngradingConsistencyRetryPolicy, Other Policies
    • Reconnection Policies
  • Asynchronous Querying Overview
    • Synchronous vs. Asynchronous Querying
    • Executing Asynchronous Queries
    • java.util.concurrent.Future
    • Cassandra ResultSetFuture

Who should attend

The course is highly recommended for:

  • Java developers
  • Database administrators
  • Spring developers
  • Architects
  • Full stack developers/engineers
  • DevOps developers/engineers

Prerequisites

Participants need to have experience in Java development and working with databases. Participants also need to be able to navigate the Linux command line, and have a basic knowledge of Linux editors (such as VI/nano) for editing code.
