This guide describes DRBD version 8.4 and above. For 8.3 please look here.


The DRBD User’s Guide

Brian Hellman

Florian Haas

Philipp Reisner

Lars Ellenberg

This guide has been released to the DRBD community, and its authors strive to improve it continually. Feedback from readers is always welcome and encouraged. Please use the DRBD public mailing list for enhancement suggestions and corrections.

License information

The text and illustrations in this document are licensed under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA").

In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.

Trademarks used in this guide

DRBD®, the DRBD logo, LINBIT®, and the LINBIT logo are trademarks or registered trademarks of LINBIT Information Technologies GmbH in Austria, the United States and other countries.

AMD is a registered trademark of Advanced Micro Devices, Inc.

Citrix is a registered trademark of Citrix, Inc.

Debian is a registered trademark of Software in the Public Interest, Inc.

Dolphin Interconnect Solutions and SuperSockets are trademarks or registered trademarks of Dolphin Interconnect Solutions ASA.

IBM is a registered trademark of International Business Machines Corporation.

Intel is a registered trademark of Intel Corporation.

Linux is a registered trademark of Linus Torvalds.

Oracle, MySQL, and MySQL Enterprise are trademarks or registered trademarks of Oracle Corporation and/or its affiliates.

Red Hat, Red Hat Enterprise Linux, and RPM are trademarks or registered trademarks of Red Hat, Inc.

SuSE, SUSE, and SUSE Linux Enterprise Server are trademarks or registered trademarks of Novell, Inc.

Xen is a registered trademark of Citrix, Inc.

Other names mentioned in this guide may be trademarks or registered trademarks of their respective owners.


Table of Contents

Please Read This First
I. Introduction to DRBD
1. DRBD Fundamentals
1.1. Kernel module
1.2. User space administration tools
1.3. Resources
1.4. Resource roles
2. DRBD Features
2.1. Single-primary mode
2.2. Dual-primary mode
2.3. Replication modes
2.4. Multiple replication transports
2.5. Efficient synchronization
2.5.1. Variable-rate synchronization
2.5.2. Fixed-rate synchronization
2.5.3. Checksum-based synchronization
2.6. Suspended replication
2.7. On-line device verification
2.8. Replication traffic integrity checking
2.9. Split brain notification and automatic recovery
2.10. Support for disk flushes
2.11. Disk error handling strategies
2.12. Strategies for dealing with outdated data
2.13. Three-way replication
2.14. Long-distance replication with DRBD Proxy
2.15. Truck based replication
2.16. Floating peers
II. Building, installing and configuring DRBD
3. Installing pre-built DRBD binary packages
3.1. Packages supplied by LINBIT
3.2. Packages supplied by distribution vendors
3.2.1. SUSE Linux Enterprise Server
3.2.2. Debian GNU/Linux
3.2.3. CentOS
3.2.4. Ubuntu Linux
4. Building and installing DRBD from source
4.1. Downloading the DRBD sources
4.2. Checking out sources from the public DRBD source repository
4.3. Building DRBD from source
4.3.1. Checking build prerequisites
4.3.2. Preparing the kernel source tree
4.3.3. Preparing the DRBD build tree
4.3.4. Building DRBD userspace utilities
4.3.5. Compiling DRBD as a kernel module
4.4. Building a DRBD RPM package
4.5. Building a DRBD Debian package
5. Configuring DRBD
5.1. Preparing your lower-level storage
5.2. Preparing your network configuration
5.3. Configuring your resource
5.3.1. Example configuration
5.3.2. The global section
5.3.3. The common section
5.3.4. The resource sections
5.4. Enabling your resource for the first time
5.5. The initial device synchronization
5.6. Using truck based replication
III. Working with DRBD
6. Common administrative tasks
6.1. Checking DRBD status
6.1.1. Retrieving status with drbd-overview
6.1.2. Status information in /proc/drbd
6.1.3. Connection states
6.1.4. Resource roles
6.1.5. Disk states
6.1.6. I/O state flags
6.1.7. Performance indicators
6.2. Enabling and disabling resources
6.2.1. Enabling resources
6.2.2. Disabling resources
6.3. Reconfiguring resources
6.4. Promoting and demoting resources
6.5. Basic Manual Fail-over
6.6. Upgrading DRBD
6.6.1. Updating your repository
6.6.2. Upgrading the packages
6.6.3. Migrating your configs
6.7. Downgrading DRBD 8.4 to 8.3
6.8. Enabling dual-primary mode
6.8.1. Permanent dual-primary mode
6.8.2. Temporary dual-primary mode
6.8.3. Automating promotion on system startup
6.9. Using on-line device verification
6.9.1. Enabling on-line verification
6.9.2. Invoking on-line verification
6.9.3. Automating on-line verification
6.10. Configuring the rate of synchronization
6.10.1. Permanent fixed sync rate configuration
6.10.2. Temporary fixed sync rate configuration
6.10.3. Variable sync rate configuration
6.11. Configuring checksum-based synchronization
6.12. Configuring congestion policies and suspended replication
6.13. Configuring I/O error handling strategies
6.14. Configuring replication traffic integrity checking
6.15. Resizing resources
6.15.1. Growing on-line
6.15.2. Growing off-line
6.15.3. Shrinking on-line
6.15.4. Shrinking off-line
6.16. Disabling backing device flushes
6.17. Configuring split brain behavior
6.17.1. Split brain notification
6.17.2. Automatic split brain recovery policies
6.18. Creating a three-node setup
6.18.1. Device stacking considerations
6.18.2. Configuring a stacked resource
6.18.3. Enabling stacked resources
6.19. Using DRBD Proxy
6.19.1. DRBD Proxy deployment considerations
6.19.2. Installation
6.19.3. License file
6.19.4. Configuration
6.19.5. Controlling DRBD Proxy
6.19.6. About DRBD Proxy plugins
6.19.7. Using a WAN Side Bandwidth Limit
6.19.8. Troubleshooting
7. Troubleshooting and error recovery
7.1. Dealing with hard drive failure
7.1.1. Manually detaching DRBD from your hard drive
7.1.2. Automatic detach on I/O error
7.1.3. Replacing a failed disk when using internal meta data
7.1.4. Replacing a failed disk when using external meta data
7.2. Dealing with node failure
7.2.1. Dealing with temporary secondary node failure
7.2.2. Dealing with temporary primary node failure
7.2.3. Dealing with permanent node failure
7.3. Manual split brain recovery
IV. DRBD-enabled applications
8. Integrating DRBD with Pacemaker clusters
8.1. Pacemaker primer
8.2. Adding a DRBD-backed service to the cluster configuration
8.3. Using resource-level fencing in Pacemaker clusters
8.3.1. Resource-level fencing with dopd
8.3.2. Resource-level fencing using the Cluster Information Base (CIB)
8.4. Using stacked DRBD resources in Pacemaker clusters
8.4.1. Adding off-site disaster recovery to Pacemaker clusters
8.4.2. Using stacked resources to achieve 4-way redundancy in Pacemaker clusters
8.5. Configuring DRBD to replicate between two SAN-backed Pacemaker clusters
8.5.1. DRBD resource configuration
8.5.2. Pacemaker resource configuration
8.5.3. Site fail-over
9. Integrating DRBD with Red Hat Cluster
9.1. Red Hat Cluster background information
9.1.1. Fencing
9.1.2. The Resource Group Manager
9.2. Red Hat Cluster configuration
9.2.1. The cluster.conf file
9.3. Using DRBD in Red Hat Cluster fail-over clusters
9.3.1. Setting up your cluster configuration
10. Using LVM with DRBD
10.1. LVM primer
10.2. Using a Logical Volume as a DRBD backing device
10.3. Using automated LVM snapshots during DRBD synchronization
10.4. Configuring a DRBD resource as a Physical Volume
10.5. Adding a new DRBD volume to an existing Volume Group
10.6. Nested LVM configuration with DRBD
10.6.1. Switching the VG to the other node
10.7. Highly available LVM with Pacemaker
11. Using GFS2 with DRBD
11.1. GFS primer
11.2. Creating a DRBD resource suitable for GFS2
11.2.1. Enable resource fencing for dual-primary resource
11.3. Configuring CMAN
11.4. Creating a GFS2 filesystem
11.5. Using your GFS2 filesystem with Pacemaker
12. Using OCFS2 with DRBD
12.1. OCFS2 primer
12.2. Creating a DRBD resource suitable for OCFS2
12.3. Creating an OCFS2 filesystem
12.4. Pacemaker OCFS2 management
12.4.1. Adding a Dual-Primary DRBD resource to Pacemaker
12.4.2. Adding OCFS2 management capability to Pacemaker
12.4.3. Adding an OCFS2 filesystem to Pacemaker
12.4.4. Adding required Pacemaker constraints to manage OCFS2 filesystems
12.5. Legacy OCFS2 management (without Pacemaker)
12.5.1. Configuring your cluster to support OCFS2
12.5.2. Using your OCFS2 filesystem
13. Using Xen with DRBD
13.1. Xen primer
13.2. Setting DRBD module parameters for use with Xen
13.3. Creating a DRBD resource suitable to act as a Xen VBD
13.4. Using DRBD VBDs
13.5. Starting, stopping, and migrating DRBD-backed domU’s
13.6. Internals of DRBD/Xen integration
13.7. Integrating Xen with Pacemaker
V. Optimizing DRBD performance
14. Measuring block device performance
14.1. Measuring throughput
14.2. Measuring latency
15. Optimizing DRBD throughput
15.1. Hardware considerations
15.2. Throughput overhead expectations
15.3. Tuning recommendations
15.3.1. Setting max-buffers and max-epoch-size
15.3.2. Tweaking the I/O unplug watermark
15.3.3. Tuning the TCP send buffer size
15.3.4. Tuning the Activity Log size
15.3.5. Disabling barriers and disk flushes
16. Optimizing DRBD latency
16.1. Hardware considerations
16.2. Latency overhead expectations
16.3. Tuning recommendations
16.3.1. Setting DRBD’s CPU mask
16.3.2. Modifying the network MTU
16.3.3. Enabling the deadline I/O scheduler
VI. Learning more about DRBD
17. DRBD Internals
17.1. DRBD meta data
17.1.1. Internal meta data
17.1.2. External meta data
17.1.3. Estimating meta data size
17.2. Generation Identifiers
17.2.1. Data generations
17.2.2. The generation identifier tuple
17.2.3. How generation identifiers change
17.2.4. How DRBD uses generation identifiers
17.3. The Activity Log
17.3.1. Purpose
17.3.2. Active extents
17.3.3. Selecting a suitable Activity Log size
17.4. The quick-sync bitmap
17.5. The peer fencing interface
18. Getting more information
18.1. Commercial DRBD support
18.2. Public mailing list
18.3. Public IRC Channels
18.4. Official Twitter account
18.5. Publications
18.6. Other useful resources
VII. Appendices
A. Recent changes
A.1. Volumes
A.1.1. Changes to udev symlinks
A.2. Changes to the configuration syntax
A.2.1. Boolean configuration options
A.2.2. syncer section no longer exists
A.2.3. protocol option is no longer special
A.2.4. New per-resource options section
A.3. On-line changes to network communications
A.3.1. Changing the replication protocol
A.3.2. Changing from single-Primary to dual-Primary replication
A.4. Changes to the drbdadm command
A.4.1. Changes to pass-through options
A.4.2. --force option replaces --overwrite-data-of-peer
A.5. Changed default values
A.5.1. Number of concurrently active Activity Log extents (al-extents)
A.5.2. Run-length encoding (use-rle)
A.5.3. I/O error handling strategy (on-io-error)
A.5.4. Variable-rate synchronization
A.5.5. Number of configurable DRBD devices (minor-count)
B. DRBD system manual pages
drbd.conf - DRBD Configuration Files
drbdadm - Utility for DRBD administration
drbdsetup - Configure the DRBD kernel module
drbdmeta - Manipulate the DRBD on-disk metadata
Index

List of Figures

1.1. DRBD’s position within the Linux I/O stack
2.1. DRBD resource stacking
6.1. Syncer rate example, 110MB/s effective available bandwidth
6.2. Syncer rate example, 80MB/s effective available bandwidth
8.1. DRBD resource stacking in Pacemaker clusters
8.2. DRBD resource stacking in Pacemaker clusters
8.3. Using DRBD to replicate between SAN-based clusters
10.1. LVM overview
17.1. Calculating DRBD meta data size (exactly)
17.2. Estimating DRBD meta data size (approximately)
17.3. GI tuple changes at start of a new data generation
17.4. GI tuple changes at start of re-synchronization
17.5. GI tuple changes at completion of re-synchronization
17.6. Active extents calculation based on sync rate and target sync time
17.7. Active extents calculation based on sync rate and target sync time (example)

List of Tables

4.1. Options supported by DRBD’s configure script
4.2. DRBD userland RPM packages
17.1. fence-peer handler exit codes
