Configuring Simpana® Storage Resources
Module 2 – Configuring Simpana Storage Resources
No unauthorized use, copy or distribution.
• Configure MediaAgents
• MediaAgent Functions
• Indexing Structure
• Add and Configure Disk Libraries
• Understanding Disk Libraries
• Deduplication Building Block Guidelines
• Detect and Configure Tape Libraries
• Supported Libraries and Settings
• Tape Media Lifecycle
• Media Handling
Topics
Configure MediaAgents
MediaAgent Functions
MediaAgents are the transition point for data moving on the data pipe to/from the Client Agent and the data path to/from the protected media. All data moving to/from protected storage must move through a MediaAgent. As such, resource provisioning for MediaAgent hosts (e.g. CPU, memory, and bandwidth) must be adequate for both the volume and the concurrency of data movement you expect them to handle.
A MediaAgent provides device control over media changers and removable media devices, and over writers to disk devices. This control defines the path upon which data moves to/from protected storage. In addition to normal device integrity checks, the MediaAgent can validate the integrity of data stored on the media during a recovery operation and validate the integrity of the data on the network during a data protection operation.
In the case where the MediaAgent component is co-located on the same host as the Client Agent, the exchange of data is contained within the host. This is called a SAN MediaAgent.
Device Control
With Tape libraries, MediaAgents are the primary control agent for media changers and tape devices. The Media and Library Manager service on the CommServe host determines the required tape media and its location. It then determines which MediaAgent has control of the media changer to load/unload the tape. Once loaded, the MediaAgent having the specified data path access to the tape device will mount/unmount the tape for reading/writing. In this manner a tape library can be shared with multiple MediaAgents. For IP-based libraries where the media changer is managed by 3rd party software, a MediaAgent will be the interface point of communication.
For disk libraries, data paths are either local or network. Shared disk devices can have only one local path (the exception being Global File Systems; see BOL for usage) and any number of network paths. Concurrent writes to a disk library are managed by the Common Internet File System (CIFS) protocol.
Deduplication Database
Deduplication is also managed by a MediaAgent through the use of a Deduplication Database. Data block signatures are compared: original blocks are indexed and written to storage media, while duplicate blocks are only indexed, using the location of the original block. The Deduplication Database is not used when restoring data. If multiple MediaAgents are involved, one MediaAgent can be dedicated to managing the Deduplication Database.
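The signature comparison described above can be illustrated with a minimal sketch. This is not Simpana's implementation: the use of SHA-256, the dictionary standing in for the Deduplication Database, the list standing in for storage media, and all function names are assumptions made for illustration only.

```python
import hashlib


def store_blocks(blocks, ddb, storage):
    """Write blocks, storing each unique block only once.

    ddb     -- dict mapping signature -> location of the original block
               (hypothetical stand-in for the Deduplication Database)
    storage -- list standing in for the storage media
    Returns the job index: one storage location per input block.
    """
    index = []
    for block in blocks:
        # SHA-256 is an assumption; the text does not name the algorithm.
        signature = hashlib.sha256(block).hexdigest()
        location = ddb.get(signature)
        if location is None:
            # Original block: write it and record where it lives.
            storage.append(block)
            location = len(storage) - 1
            ddb[signature] = location
        # Original or duplicate, the index entry points at the stored block.
        index.append(location)
    return index


def restore(index, storage):
    """Rebuild the data from the index alone; the DDB is not consulted."""
    return b"".join(storage[location] for location in index)


ddb, storage = {}, []
job_index = store_blocks([b"aaaa", b"bbbb", b"aaaa"], ddb, storage)
```

Note that `restore` takes only the index and the storage, which mirrors the statement that the Deduplication Database is not used when restoring data.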
Tip: The only data movement that does NOT require a MediaAgent is replication using Simpana ContinuousDataReplicator (CDR). If the replicated volume is to be backed up or physically snapshotted, then a MediaAgent is required.
Indexing Structure
[Slide diagram: Index file log shipping to ICS; Index file copied to media; Dedicated Index Cache and Index Cache Server; Shared Index Cache]
Simpana® software uses a distributed indexing structure that provides for enterprise level scalability and automated index management. This works by using the CommServe database to only retain job based metadata which will keep the database relatively small. Job and detailed index information will be kept on the MediaAgent protecting the job, automatically copied to media containing the job and optionally copied to an Index Cache Server.
Job summary data maintained in the CommServe database keeps track of all data chunks being written to media. As each chunk completes, it is logged in the CommServe database. This information also records the identity of the media the job was written to, which can be used when recalling off-site media for restores. This data is held in the database for as long as the job exists. This means that even if the data has exceeded defined retention rules, the summary information remains in the database until the job has been overwritten. An option to browse aged data can be used to browse and recover data on media that has exceeded retention but has not been overwritten.
Index Cache Server
Index Cache Server is an index cache sharing mechanism that saves an additional copy of the index cache for sharing purposes. This additional copy, the Index Cache Server, is located on one of the MediaAgent computers participating in the share. This Index Cache Server can be accessed by all participating MediaAgents.
Index Cache Server provides the following advantages:
• Index cache restores for data protection operations.
• Job restartability in GridStor™ Technology scenarios (when used with transaction logging).
• Index cache rebuilding in failover scenarios (when used with transaction logging).
• Maintaining a local cache prevents network disruptions from affecting the data protection operations.
Transaction logging
The index copy on the Index Cache Server is created by either copying the original index during the Archive Index phase of the data protection job, or dynamically through transactional log replay. Transactional logs are sent at the completion of each storage chunk. In the event the local cache is lost while indexing a job, the job can be restarted at the last transaction successfully entered on the Index Cache Server.
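As a rough illustration of transactional log replay, the sketch below ships one log per completed chunk to a shared copy and then rebuilds the index after the local cache is lost. The class, method names, and data layout are hypothetical; they only model the behavior described above and do not reflect the product's actual interfaces.

```python
class IndexCacheServer:
    """Hypothetical stand-in for the shared index copy."""

    def __init__(self):
        # job_id -> list of (chunk_number, index_entries), in arrival order
        self.transactions = {}

    def ship_log(self, job_id, chunk_number, entries):
        """Record the transaction log sent when a storage chunk completes."""
        self.transactions.setdefault(job_id, []).append(
            (chunk_number, list(entries))
        )

    def rebuild(self, job_id):
        """Replay shipped logs; return (index_entries, last_completed_chunk)."""
        index, last_chunk = [], 0
        for chunk_number, entries in self.transactions.get(job_id, []):
            index.extend(entries)
            last_chunk = chunk_number
        return index, last_chunk


ics = IndexCacheServer()
ics.ship_log("job1", 1, ["a.txt", "b.txt"])
ics.ship_log("job1", 2, ["c.txt"])

# The local index cache is lost while chunk 3 is being written.
rebuilt_index, last_chunk = ics.rebuild("job1")
restart_chunk = last_chunk + 1
```

In this model the job resumes at `restart_chunk`, the first chunk whose transaction log never reached the Index Cache Server.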
Shared Index Cache (Network Share)
A Network Share is a designated location on the network where one or more MediaAgents store their index cache. The Index Cache stored in a network share can be accessed from all participating MediaAgents. You might use a network share if you have a dedicated partition created exclusively for Index Cache and you wish to use this partition for index cache sharing.
Ensure that you have enough space to accommodate the index cache from all participating MediaAgents.
Note: When using a network share, the local index and the shared index are one and the same. A network disruption might corrupt the index and jobs might have to be restarted due to index cache failure.
Add and Configure Disk Libraries
Understanding Disk Libraries
As the cost of disk continues to go down, the speed and concurrency advantages of disk libraries make them the primary protected storage media of choice.
Types
There are three basic library configurations:
Dedicated disk libraries are created by first “adding” a disk library entity to the MediaAgent using either the right-click All Tasks menu or the Control Panel’s Library and Drive Configuration Tool. One or more “mount paths” can be created/added to the library. Mount Paths are configured as Shared Disk Devices. The Shared Disk Device in a Dedicated disk library has only one Primary Sharing Folder.
NOTE: EMC® Centera and HDS Data Retention Utility (DRU) devices can also be configured as direct attached disk libraries with support for hardware single instancing. Hardware single instancing is a Library property option that can be selected. The hardware single instancing
access to the same directory. For UNIX hosted MediaAgents, the Network File System (NFS) protocol can be used. NFS shared disks appear to the MediaAgent as local drives.
Replicated disk libraries are configured similarly to a shared disk library, with the exception that the Shared Disk Device has a replicated data path defined to a volume accessible via another MediaAgent. Replicated folders are read-only, and replication can be configured using CommVault’s® ContinuousDataReplicator (CDR) product or a third party replication hardware or software application.
Settings
While there are other settings available, the three most important settings are:
Usage Pattern determines how both writers and volumes are used when more than one data stream is in action. The default usage pattern is “Fill & Spill” which will use all writers and the available capacity of the first mount path before another mount path is used. The alternative is “Spill & Fill” which distributes each job stream to different mount paths which can improve performance if mount paths use different I/O devices.
Usable Capacity is managed through settings for Reserved Capacity and Managed Thresholds. Reserved capacity can be set on each mount path to allow maintenance (defragmentation) or use by other applications. Managed Thresholds are set at the library level and enabled through associated Storage Policy copies. Managed Thresholds allow the administrator to make maximum use of available capacity, extending retention and thus availability of data in the disk library.
Allocation management is available through library and mount path settings for the max number of concurrent writers. Maximum number of allowed writers should be set to prevent over-saturation of MediaAgent or disk resources. Pushing too many concurrent streams with inadequate resources can be detrimental to overall throughput.
Another allocation management tool is the fragmentation setting for reserving concurrent blocks for writing. With concurrent writes to the same disk, fragmentation is a concern that can impact both restores and auxiliary copy operations.
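To make the difference between the two usage patterns concrete, here is a small sketch of how streams might be assigned to mount paths under each one. It is a simplification under stated assumptions: only the per-mount-path writer limit is modeled (capacity is ignored), "Spill & Fill" is modeled as a round-robin starting point, and the function and parameter names are invented for illustration.

```python
def assign_streams(stream_count, mount_paths, pattern):
    """Return the mount path chosen for each stream, in order.

    mount_paths -- list of (name, max_writers) tuples (hypothetical layout)
    pattern     -- "fill_and_spill" or "spill_and_fill"
    """
    free_writers = {name: writers for name, writers in mount_paths}
    names = [name for name, _ in mount_paths]
    assignments = []
    for stream in range(stream_count):
        if pattern == "fill_and_spill":
            # Always try the first mount path first; spill only when full.
            candidates = names
        elif pattern == "spill_and_fill":
            # Rotate the starting mount path so streams spread out.
            start = stream % len(names)
            candidates = names[start:] + names[:start]
        else:
            raise ValueError(f"unknown usage pattern: {pattern}")
        chosen = next((n for n in candidates if free_writers[n] > 0), None)
        if chosen is None:
            raise RuntimeError("no free writers on any mount path")
        free_writers[chosen] -= 1
        assignments.append(chosen)
    return assignments


paths = [("mp1", 2), ("mp2", 2)]
fill_first = assign_streams(3, paths, "fill_and_spill")
spread_out = assign_streams(3, paths, "spill_and_fill")
```

With two mount paths of two writers each, three streams land on `mp1, mp1, mp2` under Fill & Spill and on `mp1, mp2, mp1` under Spill & Fill. The `RuntimeError` branch corresponds to the maximum-writers limit discussed under allocation management.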
Maintenance
You can schedule and run an analysis of fragmentation on a disk library’s mount paths. The
• Must meet IOPs requirements
• Iometer.org
• MediaAgent minimum 32 GB RAM
• Up to 50 concurrent write streams
• Up to 120 TB usable capacity
• Windows or Linux
• 300 – 500 GB capacity
[Slide diagram: MediaAgent; Deduplication Database (DDB); Disk Library]
Deduplication Building Blocks Guidelines
Simpana® software offers a variety of deduplication features that drastically change the way data protection is conducted. Client side deduplication can greatly reduce network usage, Dash Full can significantly reduce the time of synthetic full backups, and Dash Copy will greatly reduce the time it takes to copy backups to off-site disk storage. Additionally, SILO storage can copy deduplicated data to tape still in its deduplicated state. This chapter details how deduplication works and how best to configure and manage deduplicated storage.
When using Simpana deduplication, CommVault recommends using building block guidelines for scalability in large environments. There are two layers to a building block, the physical layer and the logical layer.
For the physical layer, each building block will consist of one or more MediaAgents, one disk library and one deduplication database.
For the logical layer, each building block will contain one or more storage policies. If multiple storage policies are going to be used they should all be linked to a single global deduplication
It is critical to provide adequate hardware to achieve maximum performance for a building block.
Performance starts with properly scaling the MediaAgent. There should be a minimum of 32 GB of RAM on each MediaAgent hosting the deduplication database.
The disk library can be sized up to 100 TB for a single building block. Mount paths should be sized between 2 and 8 TB.
In order to meet deduplication database IOPs requirements, high performance disks in a RAID array must be used. Enterprise class Solid State Disks are recommended. For more information go to:
http://documentation.commvault.com/commvault/release_10_0_0/books_online_1/english_us/prod_info/dedup_building_block.htm
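The sizing guidelines in this section lend themselves to a quick back-of-the-envelope calculation. The sketch below uses only the figures from the notes above (up to 100 TB of disk library per building block, mount paths of 2 to 8 TB); the function names are hypothetical, and real sizing should also account for the RAM and IOPs requirements, which are not modeled here.

```python
import math

MAX_LIBRARY_TB = 100       # disk library size per building block (from the notes)
MOUNT_PATH_TB = (2, 8)     # recommended mount path size range (from the notes)


def building_blocks_needed(total_library_tb):
    """Number of building blocks required to hold the given library capacity."""
    return math.ceil(total_library_tb / MAX_LIBRARY_TB)


def mount_path_count_range(library_tb):
    """Fewest and most mount paths for one library within the size range."""
    smallest, largest = MOUNT_PATH_TB
    return (math.ceil(library_tb / largest), math.ceil(library_tb / smallest))


blocks = building_blocks_needed(250)
fewest, most = mount_path_count_range(100)
```

For example, 250 TB of disk library capacity calls for three building blocks, and a full 100 TB library works out to between 13 mount paths (at 8 TB each) and 50 mount paths (at 2 TB each).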
Detect and Configure Tape Libraries
Supported Libraries and Settings
Removable media libraries provide the most economical means to collect and move protected data off-site. While optical and USB Drives are also supported removable media types, the pre-eminent removable media is tape. For this discussion we will use the term tape library as a suitable substitute for removable media library.
Types
There are three basic removable media type libraries:
Standalone tape drives are still classified and treated as libraries. The difference is that a standalone tape drive has no robotic media changer and no internal storage slots. Multiple standalone drives controlled by the same MediaAgent can be pooled together in order to support multi-stream jobs or cascade of a single stream job without having to respond to media handling requests. Media used by a Standalone library can be pre-stamped or new, and backup or restore jobs will prompt for it as necessary. Media handling prompts for user action appear on the MediaAgent or CommCell console, or can be sent as an alert if configured.
library drives. Barcoded libraries have a barcode reader and maintain an internal map/inventory of media in the library. A “blind” library has no barcode reader and is supported by the CommVault® software maintaining the map/inventory externally in the CommServe® metadata.
A common, but not required, characteristic of a robotic library is multiple drives. Drives are usually of the same type (and firmware!). A multiple drive library can be used to support either a multi-stream job or multiple concurrent jobs. A group of drives accessible in this manner from the same MediaAgent is called a Drive Pool. This gives the software flexibility in assigning idle drives from the pool rather than requesting and waiting for a specific drive.
Static or Dynamic Drive Libraries are distinguished by their ability to be accessed by two or more MediaAgent hosts.
A shared library is a static configuration where the drives and media changer are connected to only one of several MediaAgent hosts. For example: In a library with four tape drives, one MediaAgent may have control of the media changer and two drives within the library while another MediaAgent may have control over the other two tape drives. A drive connected to one MediaAgent host is not accessible from the other MediaAgent hosts. Should the MediaAgent component having media changer control fail, no further loading/unloading of media can occur until that MediaAgent is active again. Shared libraries in today’s world of Storage Area Networks (SAN) are not common.
Dynamic Drive Libraries are the most common configuration in larger environments and maximize the utility of tape libraries. In a Dynamic Library the library drives and media changer are on a SAN and can be accessed by multiple MediaAgent hosts. Drives not being used by one MediaAgent can be assigned to and used by another MediaAgent. If the MediaAgent with control of the media changer fails, control can be automatically passed to another MediaAgent. The primary advantage of a Dynamic Drive library is the use of multiple MediaAgents for processing reads/writes. Dynamic Drive capability is referred to as GridStor™ Technology. GridStor technology is a licensed option that enables load balancing and failover of data protection jobs.
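The two behaviors that distinguish a Dynamic Drive library, idle drives being handed to whichever MediaAgent asks and media changer control passing to a surviving MediaAgent, can be modeled with a toy class. Everything here is hypothetical: the class, its methods, and the rule that control passes to the first remaining online MediaAgent are illustrative assumptions, and drives held by a failed MediaAgent are deliberately not modeled.

```python
class DynamicDriveLibrary:
    """Toy model of a SAN-attached library shared by several MediaAgents."""

    def __init__(self, drives, media_agents, changer_controller):
        self.idle_drives = list(drives)
        self.in_use = {}                      # drive -> MediaAgent using it
        self.online = list(media_agents)
        self.changer_controller = changer_controller

    def reserve_drive(self, media_agent):
        """Give any idle drive to the requesting MediaAgent, or None."""
        if media_agent not in self.online:
            raise ValueError(f"{media_agent} is not online")
        if not self.idle_drives:
            return None
        drive = self.idle_drives.pop(0)
        self.in_use[drive] = media_agent
        return drive

    def release_drive(self, drive):
        """Return a drive to the idle pool so another MediaAgent can use it."""
        self.in_use.pop(drive)
        self.idle_drives.append(drive)

    def media_agent_failed(self, media_agent):
        """Mark a MediaAgent offline and fail over changer control if needed."""
        self.online.remove(media_agent)
        if self.changer_controller == media_agent:
            # Assumption: control passes to the first remaining MediaAgent.
            self.changer_controller = self.online[0] if self.online else None


library = DynamicDriveLibrary(["drive1", "drive2"], ["ma1", "ma2"], "ma1")
first = library.reserve_drive("ma1")
library.release_drive(first)
library.media_agent_failed("ma1")
second = library.reserve_drive("ma2")
```

In a static shared library, by contrast, `media_agent_failed` would leave `changer_controller` unavailable until the failed host returned.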
Settings
While there are other library management settings available (and should be reviewed by the administrator), the three most important settings are:
circumstances (exported, stuck tape) it may not be loadable. Default action is to use new media. While this allows jobs to continue without manual intervention, it could result in some tapes not being used to full capacity. (See next option!)
Appendable media option allows the continued writing to previously active media that still have capacity remaining. If for some reason media was not fully used, default action is to make the unused capacity available next time new media is requested for writing. Appendable media can only be written to by the same storage policy copy stream and only if the last previous write occurred within the specified time. These restrictions are there to ensure only similarly retained data is written to the media.
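The two restrictions on appendable media amount to a simple eligibility check, sketched below. The dictionary layout, field names, and the `is_appendable` function are hypothetical and exist only to illustrate the rule; the allowed time span stands in for the library setting mentioned above.

```python
from datetime import datetime, timedelta


def is_appendable(media, copy_stream, now, allowed_span):
    """True if the media may be appended to by the given copy stream.

    media        -- dict with "owner", "full", and "last_write" keys
                    (hypothetical layout)
    copy_stream  -- storage policy copy stream requesting media
    allowed_span -- timedelta: how recent the last write must be
    """
    return (
        not media["full"]
        and media["owner"] == copy_stream
        and now - media["last_write"] <= allowed_span
    )


tape = {
    "owner": "copyA/stream1",
    "full": False,
    "last_write": datetime(2024, 1, 1),
}
now = datetime(2024, 1, 10)

same_stream = is_appendable(tape, "copyA/stream1", now, timedelta(days=14))
other_stream = is_appendable(tape, "copyB/stream1", now, timedelta(days=14))
too_old = is_appendable(tape, "copyA/stream1", now, timedelta(days=7))
```

Only the first call succeeds: the second fails the ownership test and the third fails the time window, so in both of those cases new media would be requested instead.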
Maintenance
Maintenance is essential in a tape library as dirty drive heads are common and can make reading/writing data difficult. Newer libraries have sensors to automatically initiate cleaning when required. Otherwise, cleaning can be scheduled or conducted based on vendor recommended thresholds for usage and errors. Selecting the correct cleaning option for your environment is important.
Tape Media Lifecycle
Removable media moves primarily in a cycle between three logical states: Spare -> Active -> Full, then back to Spare. New media imported into a library is considered to be Undiscovered until the library inventory has been updated to the CommServe database. Appendable media is media that is assigned to a storage policy copy, is no longer active, but is not yet full. There are various reasons why media might not be filled with data, ranging from a user-initiated command to start new media to the media not being available for the next write action. Availability of Appendable media is controlled by a library setting, configurable to an allowed time span, and restricted to the same storage policy copy stream.
Media can also be marked Bad by the system or user, in which case no more writes will be attempted on the tape, but readable data on the tape is still restorable. Like Bad Media, Retired media is no longer used for writes, but may contain aged data which is restorable. Both Retired and Bad media should ultimately be deleted from the library/CommCell environment.
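The lifecycle described above can be pictured as a small state machine. The sketch below encodes the Spare, Active, Full cycle and the Undiscovered starting state directly from the text; the remaining transitions (Appendable returning to Active or Spare, and any in-use state moving to Bad or Retired) are assumptions added to make the model complete, and the names are invented for illustration.

```python
# Allowed transitions between logical media states.
# Entries involving Appendable, Bad, and Retired are assumptions.
ALLOWED_TRANSITIONS = {
    "Undiscovered": {"Spare"},
    "Spare": {"Active", "Bad", "Retired"},
    "Active": {"Full", "Appendable", "Bad", "Retired"},
    "Appendable": {"Active", "Spare", "Bad", "Retired"},
    "Full": {"Spare", "Bad", "Retired"},
    "Bad": set(),       # no further use; eventually deleted
    "Retired": set(),   # no further use; eventually deleted
}


def move(state, new_state):
    """Return new_state if the transition is allowed, else raise ValueError."""
    if new_state not in ALLOWED_TRANSITIONS[state]:
        raise ValueError(f"cannot move media from {state} to {new_state}")
    return new_state


# Walk one tape through discovery and a full usage cycle.
state = "Undiscovered"
history = [state]
for next_state in ["Spare", "Active", "Full", "Spare"]:
    state = move(state, next_state)
    history.append(state)
```

Bad and Retired are modeled as terminal states because neither is used for writes again, even though data on such media may remain restorable.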
Removable media states are represented by icons as shown in the graphic above. Media should
Read - Media in any logical state can be read if it contains valid data. Valid data is defined as data obtained by a data protection job that has been logged into the CommServe metadata. As data is written and tracked in large contiguous files called “chunks”, this means data from any recorded chunk can be read, regardless of whether the associated job succeeded, failed, or was killed.
NOTE: Only successful job data is copied during an auxiliary copy operation. As such, secondary copies do not contain chunks from failed or killed jobs. If numerous failed or killed jobs exist, an auxiliary copy job can sometimes be used to consolidate data on fewer media. Data on the source copy can then be deleted and the associated media freed up for re-use.
Ownership - Assigned media is “owned” by a Storage Policy copy stream. Once assigned, only that Storage Policy copy stream is able to write to that media. This ensures consistent retention and handling of data. Ownership is relinquished when the media is reclassified as spare media. However, data on aged media in the spare media pool can still be used for restore until the media has been re-assigned and written to.
Capacity - Capacity is measured in two forms – Used and Available. Used capacity is reported on the properties dialog for each media and is also available in various reports. It’s important to note that with hardware compression enabled for tape drives, the size of the written data reported is an estimate. Depending upon actual compression, the available remaining capacity