Data Restore

Restoring data from backups is crucial for ensuring data availability, especially in scenarios such as keyspace deletion, launching a new cluster from backup data, or replacing a node. The restoration process typically involves utilizing snapshots and incremental backup files.

There are two primary methods for restoring data from backups:

Using nodetool refresh

The nodetool refresh command loads SSTables that have been newly placed in a table's data directory into the running node, without requiring a restart. This method is useful when a new node replaces an unrecoverable node. To restore data from a snapshot using this method, follow these steps:

  • Create the necessary schema if it doesn't already exist.

  • Truncate the table if needed.

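  • Locate the snapshot directory under the Cassandra data directory, which is /var/lib/cassandra/data by default (e.g., /var/lib/cassandra/data/keyspace_name/table_name-UUID/snapshots/snapshot_name), and copy the SSTable files it contains into the table directory (e.g., /var/lib/cassandra/data/keyspace_name/table_name-UUID).

  • Execute the nodetool refresh command for the restored table: nodetool refresh keyspace_name table_name.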
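The copy and refresh steps above can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not a definitive procedure: the function name restore_snapshot, its argument order, and the NODETOOL override variable are hypothetical names introduced for illustration, and the table directory is assumed to exist already because the schema has been created.

```shell
# restore_snapshot <table_dir> <snapshot_name> <keyspace> <table>
# Copies the files of one snapshot into the live table directory,
# then asks the local node to load them without a restart.
restore_snapshot() {
  local table_dir="$1"
  local snapshot_name="$2"
  local keyspace="$3"
  local table="$4"
  # NODETOOL is a hypothetical override for dry runs; defaults to nodetool.
  local nodetool_bin="${NODETOOL:-nodetool}"
  local snapshot_dir="${table_dir}/snapshots/${snapshot_name}"

  if [ ! -d "$snapshot_dir" ]; then
    echo "snapshot directory not found: ${snapshot_dir}" >&2
    return 1
  fi

  # Copy the contents of the snapshot, not the snapshot directory itself.
  cp -p "${snapshot_dir}"/* "${table_dir}/" || return 1

  # Load the newly placed SSTables on this node.
  "$nodetool_bin" refresh "$keyspace" "$table"
}
```

A call would take the form restore_snapshot <table_dir> snapshot_name keyspace_name table_name, run on the node that holds the snapshot. The copied files may also need to be owned by the user that runs Cassandra, which this sketch does not handle.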

Using sstableloader

sstableloader is a tool for bulk loading a set of SSTable files into a Cassandra cluster. It streams the data to the nodes in the cluster rather than to a single local data directory, and it can be used to load external data, load existing SSTables, and restore snapshots. To restore data using sstableloader, follow these steps:

  • Create the required schema if it's not already present.

  • Truncate the table if necessary.

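  • Bring the backup data to a node from a storage service like AWS S3, Google Cloud, or MS Azure (e.g., download the backup data to /home/data/<keyspace_name>/<table_name>).

  • Run the following command:

    sstableloader -d <ip> /home/data/<keyspace_name>/<table_name>

Note: Replace <ip> with the IP address of a node in the target cluster. sstableloader takes the keyspace and table names from the last two directories of the path, so the SSTable files must sit in a directory named after the table, inside a directory named after the keyspace.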
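The staging and load steps above can be sketched as a shell function. This is a minimal sketch, assuming the backup has already been downloaded to a local directory: the function name load_backup, its arguments, and the SSTABLELOADER override variable are hypothetical names introduced for illustration. The sketch copies the files into a <keyspace>/<table> directory layout because sstableloader reads the keyspace and table names from the last two directories of the path it is given.

```shell
# load_backup <download_dir> <keyspace> <table> <host>
# Stages downloaded SSTable files in a <keyspace>/<table> layout and
# streams them to the cluster through the given contact node.
load_backup() {
  local download_dir="$1"
  local keyspace="$2"
  local table="$3"
  local host="$4"
  # SSTABLELOADER is a hypothetical override for dry runs.
  local loader="${SSTABLELOADER:-sstableloader}"
  local staging
  local target

  # Temporary staging area; it is left in place for inspection afterwards.
  staging="$(mktemp -d)" || return 1
  target="${staging}/${keyspace}/${table}"
  mkdir -p "$target" || return 1

  # Copy the downloaded SSTable files into the expected layout.
  cp -p "${download_dir}"/* "${target}/" || return 1

  # -d names the initial node(s) used to contact the cluster.
  "$loader" -d "$host" "$target"
}
```

A call would take the form load_backup /home/data keyspace_name table_name <ip>. If the download already has the <keyspace>/<table> layout, the staging copy is unnecessary and sstableloader can be pointed at that directory directly.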

By following these methods, data can be effectively restored from backups, ensuring data availability and recovery in various scenarios.

Copyright (c) 2023 EkStep Foundation under MIT License