
Backup: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Line 47: Line 47:


Some backup systems<ref>such as [[Time_Machine_(macOS)#Operation|Apple's Time Machine]]</ref> can create a synthetic full backup from a series of incrementals, thus providing the equivalent of ''frequently'' doing a full backup.<ref name="NakivoTypesOfBackup"/>
Some backup systems<ref>such as [[Time_Machine_(macOS)#Operation|Apple's Time Machine]]</ref> can create a synthetic full backup from a series of incrementals, thus providing the equivalent of ''frequently'' doing a full backup.<ref name="NakivoTypesOfBackup"/>

=====[[Incremental backup#Synthetic full backup|Synthetic full backup=====
Tapes of disk archives of from multiple backups of the same source(s) can be consolidated onto a single '''Synthetic full backup'''<ref>in part to satisfy legal retention requirements, and may
intentionally omit some backups, either because they're redundant or if retaining them would violate regulations such as the European [[General_Data_Protection_Regulation#Right_to_erasure|GDPR Right_to_erasure]].<ref name="NetBackupAboutSyntheticBackups">{{cite web
|title=About synthetic backups
|url=https://www.veritas.com/content/support/en_US/doc/18716246-126559472-0/id-SF780163836-126559472
|website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017
|date=25 September 2017}}</ref><ref name="BackupExecSyntheticFullBackup">{{cite web
|title=Symantec Backup Exec: About the synthetic backup feature
|url=http://backup-exec.helpmax.net/en/symantec-backup-exec-advanced-disk-based-backup-option/about-the-synthetic-backup-feature
|website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>


=====Reverse incremental=====
=====Reverse incremental=====
Line 275: Line 286:
The [[Hard disk drive#Price evolution|steady improvement in hard disk drive price per byte]] has made feasible a [[Backup#Manipulation of data and dataset optimization|disk-to-disk-to-tape]] strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.<ref name="FernandoCombineDiskTapeBenefits">{{cite web |last1=Fernando |first1=Sal |title=Combine disk, tape benefits to protect data |url=http://www.zdnet.com/article/combine-disk-tape-benefits-to-protect-data/ |publisher=ZDNet |accessdate=13 November 2017 |date=30 April 2008}}</ref> This, with [[Comparison of file systems#File capabilities|file system technology]], has led to features suited to [[Backup#Manipulation_of_data_and_dataset_optimization|optimization]] such as:
The [[Hard disk drive#Price evolution|steady improvement in hard disk drive price per byte]] has made feasible a [[Backup#Manipulation of data and dataset optimization|disk-to-disk-to-tape]] strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.<ref name="FernandoCombineDiskTapeBenefits">{{cite web |last1=Fernando |first1=Sal |title=Combine disk, tape benefits to protect data |url=http://www.zdnet.com/article/combine-disk-tape-benefits-to-protect-data/ |publisher=ZDNet |accessdate=13 November 2017 |date=30 April 2008}}</ref> This, with [[Comparison of file systems#File capabilities|file system technology]], has led to features suited to [[Backup#Manipulation_of_data_and_dataset_optimization|optimization]] such as:
; Improved disk-to-disk-to-tape capabilities: Enable automated transfers to tape for safe offsite storage of disk archive files that were created for fast onsite restores.<ref name="EMCRetroWindows7">{{cite web |title=New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise |url=http://www.emc.com/about/news/press/us/2005/20050131-2906.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=23 November 2016 |date=31 January 2005}}</ref><ref name="NetBackupAboutReplicationDirector">{{cite web |title=About NetBackup Replication Director |url=https://www.veritas.com/support/en_US/doc/59229900-126796169-0/v58079997-126796169 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=13 July 2017}}</ref><ref name="BackupExecDuplicatingBackedUpData">{{cite web |title=Symantec Backup Exec: About duplicating backed up data |url=http://backup-exec.helpmax.net/en/backing-up-data/about-duplicating-backed-up-data/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>
; Improved disk-to-disk-to-tape capabilities: Enable automated transfers to tape for safe offsite storage of disk archive files that were created for fast onsite restores.<ref name="EMCRetroWindows7">{{cite web |title=New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise |url=http://www.emc.com/about/news/press/us/2005/20050131-2906.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=23 November 2016 |date=31 January 2005}}</ref><ref name="NetBackupAboutReplicationDirector">{{cite web |title=About NetBackup Replication Director |url=https://www.veritas.com/support/en_US/doc/59229900-126796169-0/v58079997-126796169 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=13 July 2017}}</ref><ref name="BackupExecDuplicatingBackedUpData">{{cite web |title=Symantec Backup Exec: About duplicating backed up data |url=http://backup-exec.helpmax.net/en/backing-up-data/about-duplicating-backed-up-data/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>
; Create synthetic full backups: For example, onto tapes from existing disk archive files—by ''copying'' multiple backups of the same source(s) ''from one archive file to another''. The second archive file is typically created in part to satisfy legal retention requirements, and may intentionally omit some backups—either because there is no need to retain them or because retaining them would violate regulations such as the European [[General_Data_Protection_Regulation#Right_to_erasure|GDPR Right_to_erasure]]. Therefore one application can exclude<ref group=note name=RetrospectExclusionInclusion>Exclusion and/or inclusion is done with Selectors in the Windows variant; this misleading term has been changed to Rules in the Macintosh variant.</ref> files and folders from the synthetic full backup.<ref name="RetrospectWindows12UG" /> This is termed a [[Incremental backup#Synthetic full backup|"synthetic full backup"]] because, after the transfer, the destination archive file contains the same data it would after a full backup of the non-excluded data.<ref name="EMCRetroWindows7" /><ref name="NetBackupAboutSyntheticBackups">{{cite web |title=About synthetic backups |url=https://www.veritas.com/content/support/en_US/doc/18716246-126559472-0/id-SF780163836-126559472 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=25 September 2017}}</ref><ref name="BackupExecSyntheticFullBackup">{{cite web |title=Symantec Backup Exec: About the synthetic backup feature |url=http://backup-exec.helpmax.net/en/symantec-backup-exec-advanced-disk-based-backup-option/about-the-synthetic-backup-feature/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>
; Automated data grooming: Frees up space on disk archive files by removing out-of-date backup data—usually based on an administrator-defined retention period.<ref name="eWeekWorldBackupDay" /><ref name="FernandoCombineDiskTapeBenefits" /><ref name="EMCRetroWindows7" /><ref name="NetBackupStorageLifecyclePolicy">{{cite web |last1=Kaczorek |first1=Mariusz |title=NetBackup Storage Lifecycle Policy (SLP): Overview
|url=https://www.settlersoman.com/netbackup-storage-lifecycle-policy-slp-overview
|website=Settlersoman |publisher=Settlersoman
|accessdate=2 February 2018 |date=15 August 2015}}</ref><ref name="BackupExecDataGrooming">{{cite web |last1=Jain |first1=Hemant |title=VOX Knowledge Base: Data Protection Knowledge Base: Data Protection |url=https://vox.veritas.com/t5/Articles/Automated-Disk-management-and-Data-retention-in-Backup-Exec-DLM/ta-p/809167 |website=VOX |publisher=Veritas Technologies LLC
|accessdate=13 January 2018 |date=14 April 2015
|quote=Employee [of Veritas]}}</ref><ref name="TechTargetTivoliSMVersusTraditional">{{cite web |last1=Dorion |first1=Pierre |title=IBM Tivoli Storage Manager vs. traditional backup |url=https://searchdatabackup.techtarget.com/tip/IBM-Tivoli-Storage-Manager-vs-traditional-backup |website=TechTarget |publisher=Tech Target Inc. |accessdate=30 October 2018 |date=January 2007 |at=Backup versions}}</ref><ref group=note>Some backup applications—notably [[Rsync#History|rsync]] and [[Code42#File_backup_and_sharing_services|CrashPlan]]—term removing backup data "pruning" instead of "grooming".[https://linux.die.net/man/1/rsync][https://support.code42.com/Administrator/5/Monitoring_and_managing/Archive_maintenance#Prune]</ref> One method of removing data is to keep the last backup of each day/week/month for the last respective week/month/specified-number-of-months, permitting compliance with regulatory requirements.<ref name="RetrospectMac12UG">{{cite web |title=Retrospect ® 12.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v12/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. |accessdate=28 December 2017 |format=PDF |year=2015 |pages=8-9(Improved Grooming)}}</ref> One application has a "performance-optimized grooming" mode that only removes outdated information from an archive file that it can quickly delete.<ref name="TitBITSMacintosh13">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 13 |url=https://tidbits.com/article/16311 |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=27 October 2016 |date=5 March 2016}}</ref> This is the only mode of grooming allowed for cloud archive files, and is also up to 5 times as fast when used on locally stored disk archive files. 
The "storage-optimized grooming" mode reclaims more space because it rewrites the archive file, and in this application also permits exclusion compliance with the [[General_Data_Protection_Regulation#Right_to_erasure|GDPR "right of erasure" ]]<ref name="RetrospectKnowledgeBase">{{cite web |title=Support: Knowledge Base |url=https://www.retrospect.com/en/support/kb/ |website=Retrospect |publisher=Retrospect Inc. |accessdate=4 May 2019 |date=24 April 2019 |at=#Resources (Auto Launching Guide ..., ... difference between "Backup" and "Duplicate", Avid Support ..., Instant Scan FAQ, Can't use Open File Backup ...), #Email Backup, #Top Articles (BackupBot – Deep Dive into ProactiveAI, How to Set Up Remote Backup, GDPR – Deep Dive into Data Retention Policies, Deep Dive - Components ['''and phases'''] of a Retrospect Backup, How to Set Up the Management Console, Management Console - How to Use Shared Scripts, How to Use Storage Groups, Support End-of-Life Announcement for Mac OS X 10.3, 10.4, and 10.5, Retrospect Compatibility with Apple File System (APFS)), #Hooks (Script Hooks: External Scripting with Event Handlers, Script Hooks: How to Protect MongoDB with Retrospect, Script Hooks: How to Protect MySQL with Retrospect, Script Hooks: How to Protect PostgreSQL with Retrospect)}}</ref> via rules<ref group=note name=RetrospectExclusionInclusion />—that can instead be used for other filtering.<ref name="TitBITSMacintosh15.1.1">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 15.1.1 |url=https://tidbits.com/watchlist/retrospect-15-1-1/ |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=20 June 2018 |date=28 May 2018}}</ref>
; Multithreaded backup server: Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupMultistreamMultiplex">{{cite web |title=What is the difference between multiplexing and multistreaming? |url=https://www.veritas.com/support/en_US/article.TECH10085 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=29 January 2015}}</ref><ref name="BackupExecRunConcurrentJobs">{{cite web |last1=McMillen |first1=Robert |title=How to run concurrent jobs in Backup exec 15 |url=https://www.youtube.com/watch?v=1-9x9So038g |via=YouTube |publisher=Google |accessdate=14 January 2018 |format=Video |date=21 July 2015}}</ref> In one application, all the categories of information for a particular "backup server" are stored by it; when an [[backup#User interface|"Administration Console"]] process is started, its process synchronizes information with all running LAN/WAN backup servers.<ref name="TidBITSEMCShips" />
; Multithreaded backup server: Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupMultistreamMultiplex">{{cite web |title=What is the difference between multiplexing and multistreaming? |url=https://www.veritas.com/support/en_US/article.TECH10085 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=29 January 2015}}</ref><ref name="BackupExecRunConcurrentJobs">{{cite web |last1=McMillen |first1=Robert |title=How to run concurrent jobs in Backup exec 15 |url=https://www.youtube.com/watch?v=1-9x9So038g |via=YouTube |publisher=Google |accessdate=14 January 2018 |format=Video |date=21 July 2015}}</ref> In one application, all the categories of information for a particular "backup server" are stored by it; when an [[backup#User interface|"Administration Console"]] process is started, its process synchronizes information with all running LAN/WAN backup servers.<ref name="TidBITSEMCShips" />
; Block-level incremental backup: The ability to back up only the blocks of a file that have changed, a [[Incremental backup#Block level incremental|refinement of incremental backup]] that saves space<ref name="TitBITSMacintosh11">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 11 |url=https://tidbits.com/article/14573
; Block-level incremental backup: The ability to back up only the blocks of a file that have changed, a [[Incremental backup#Block level incremental|refinement of incremental backup]] that saves space<ref name="TitBITSMacintosh11">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 11 |url=https://tidbits.com/article/14573

Revision as of 21:16, 26 May 2019

In information technology, a backup, or data backup, or the process of backing up, refers to the copying into an archive file[note 1][1] of computer data that is already in secondary storage—so that it may be used to restore the original after a data loss event. The verb form is "back up" (a phrasal verb), whereas the noun and adjective form is "backup".[2] (This article assumes at least a random access index to the secondary storage data to be backed up, and therefore does not discuss the venerable practice of pure tape-to-tape copying.)

Backups are primarily to recover data after its loss from data deletion or corruption, and secondarily to recover data from an earlier time, based on a user-defined data retention policy.[3] Though backups represent a simple form of disaster recovery and should be part of any disaster recovery plan, backups by themselves should not be considered a complete disaster recovery plan. One reason for this is that not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server by simply restoring data from a backup.[4]

Since a backup system contains at least one copy of all data considered worth saving, the data storage requirements can be significant. Organizing this storage space and managing the backup process can be a complicated undertaking. An information repository model may be used to provide structure to the storage. Nowadays, there are many different types of data storage devices that are useful for making backups. There are also many different ways in which these devices can be arranged to provide geographic redundancy, data security, and portability.

Before data are sent to their storage locations, they are selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and de-duplication, among others. Every backup scheme should include dry runs that validate the reliability of the data being backed up. It is important to recognize the limitations[5] and human factors involved in any backup scheme.

Storage, the base of a backup system

How, and how long, to store backup data are key decisions. A backup rotation scheme will reflect the planned data-retention policy.[1]

A backup strategy starts with a concept of an information repository, "a secondary storage space for data".[6]

Information repository models

Information repository

A repository is "a central place in which an aggregation of data is kept and maintained in an organized way, usually in computer storage."[7] It "may be just the aggregation of data itself into some accessible place of storage or it may also imply some ability to selectively extract data."[7]

Backup types

Full only / System imaging : A repository using this backup method contains complete source data copies taken at one or more specific points in time.[8] With system images, this technology is frequently used by computer technicians to record known good configurations. Imaging[9] is generally more useful for deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.

An incremental backup stores data from successive points in time. Unchanged data is not copied again.[8] Typically a full (usually non-image) backup of all files is made on one occasion (or at infrequent intervals), serving as the reference point for an incremental repository. After that, a number of incremental backups are made after successive time periods. A restore begins with the last full backup and then applies the incrementals in order.[10]
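The restore order described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each backup is an in-memory mapping from path to file content and that `None` marks a file recorded as deleted; the names `Backup` and `restore` are hypothetical and do not come from any particular backup product.

```python
from typing import Dict, List, Optional

# Hypothetical model: a backup maps each path to its content.
# A value of None records that the file was deleted in that period.
Backup = Dict[str, Optional[bytes]]


def restore(full: Backup, incrementals: List[Backup]) -> Dict[str, bytes]:
    """Rebuild the latest state: start from the full backup, then
    apply each incremental in chronological order."""
    state: Dict[str, bytes] = {p: c for p, c in full.items() if c is not None}
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                state.pop(path, None)   # file was deleted in this period
            else:
                state[path] = content   # file was created or changed
    return state


full_backup = {"a.txt": b"v1", "b.txt": b"v1"}
monday = {"a.txt": b"v2"}                      # a.txt changed
tuesday = {"b.txt": None, "c.txt": b"v1"}      # b.txt deleted, c.txt created
restored = restore(full_backup, [monday, tuesday])
print(restored)
```

Applying the same chain and writing the result out as a new backup, rather than to the live system, is one way to think about a synthetic full backup.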

Some backup systems[11] can create a synthetic full backup from a series of incrementals, thus providing the equivalent of frequently doing a full backup.[8]

Synthetic full backup

Tapes or disk archives from multiple backups of the same source(s) can be consolidated onto a single synthetic full backup.[12]

Reverse incremental

A reverse incremental backup method stores a recent archive file "mirror" of the source data and a series of differences between the mirror in its current state and its previous states. It starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions.[13] This can be done either using hard links, as Apple Time Machine does, or using binary diffs. Reverse incremental works particularly well if most restores are of the latest versions.
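As a rough sketch of the idea, the class below keeps a mirror of the latest state plus a list of reverse deltas. It is a simplification under stated assumptions: files are held in memory, and each delta stores the whole previous content of a changed file rather than a binary diff or hard link. The class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class ReverseIncremental:
    """Mirror of the latest state plus deltas that lead back in time."""

    def __init__(self, full: Dict[str, bytes]) -> None:
        self.mirror: Dict[str, bytes] = dict(full)
        # Each delta maps a path to its PREVIOUS content
        # (None means the file did not exist before that sync).
        self.reverse_deltas: List[Dict[str, Optional[bytes]]] = []

    def sync(self, live: Dict[str, bytes]) -> None:
        """Bring the mirror up to date and remember how to undo the sync."""
        delta: Dict[str, Optional[bytes]] = {}
        for path in set(self.mirror) | set(live):
            old, new = self.mirror.get(path), live.get(path)
            if old != new:
                delta[path] = old
        self.reverse_deltas.append(delta)
        self.mirror = dict(live)

    def version(self, steps_back: int) -> Dict[str, bytes]:
        """Return the state as of `steps_back` syncs ago (0 = latest)."""
        if not 0 <= steps_back <= len(self.reverse_deltas):
            raise ValueError("no such version")
        state = dict(self.mirror)
        first = len(self.reverse_deltas) - steps_back
        for delta in reversed(self.reverse_deltas[first:]):
            for path, old in delta.items():
                if old is None:
                    state.pop(path, None)
                else:
                    state[path] = old
        return state


repo = ReverseIncremental({"a": b"1"})
repo.sync({"a": b"2", "b": b"1"})
repo.sync({"b": b"2"})
latest = repo.version(0)
oldest = repo.version(2)
```

Restoring the latest version needs no delta processing at all, which is why this layout suits workloads where most restores are of the latest versions.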

Each differential backup saves the data that has changed since the last full backup.[8] This method has the advantage that at most two backups from the repository are needed to restore the data. One disadvantage, compared to the incremental backup method, is that as the time since the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup made since that full backup.

By standard definition, a differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since then, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals[13] and block-level incrementals[13] that compare parts of files instead of just entire files.
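The distinction comes down to which earlier backup serves as the reference point. The following sketch assumes a hypothetical history of `(kind, time)` pairs and a mapping of file modification times; the function names are illustrative only.

```python
from typing import Dict, List, Tuple


def reference_time(history: List[Tuple[str, float]], mode: str) -> float:
    """Pick the reference point for the next backup.

    differential: time of the last FULL backup.
    incremental:  time of the most recent full or incremental backup.
    """
    if mode == "differential":
        candidates = [t for kind, t in history if kind == "full"]
    elif mode == "incremental":
        candidates = [t for kind, t in history if kind in ("full", "incremental")]
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not candidates:
        raise ValueError("a full backup must exist first")
    return max(candidates)


def files_to_copy(mtimes: Dict[str, float], since: float) -> List[str]:
    """Files created or changed after the reference point."""
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


history = [("full", 100.0), ("incremental", 200.0)]
mtimes = {"a": 50.0, "b": 150.0, "c": 250.0}

incremental_set = files_to_copy(mtimes, reference_time(history, "incremental"))
differential_set = files_to_copy(mtimes, reference_time(history, "differential"))
print(incremental_set, differential_set)
```

In this example the differential set also contains `b`, which changed after the full backup but before the latest incremental, so the differential grows over time while the incremental stays small.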

Continuous data protection

Continuous data protection (CDP), also called continuous backup[14][15] or real-time backup, refers to backup of computer data by automatically saving a copy of every change made to that data, essentially capturing every version of the data that the user saves. It allows restoring data to any point in time.[16][17] The technique was patented by British entrepreneur Pete Malcolm in 1989.[18]

CDP logs every change on the host system, often by saving byte or block-level differences rather than file-level differences.[19][8] This backup method differs from simple disk mirroring[8] in that it enables a roll-back of the log and thus restoration of old images of data.
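A minimal sketch of such a change log follows. It assumes block-level changes recorded in chronological order and held in memory; the `ChangeLog` class is hypothetical and is not modelled on any specific CDP product.

```python
from typing import Dict, List, Tuple


class ChangeLog:
    """Append-only log of block writes, replayable up to any point in time."""

    def __init__(self) -> None:
        # Entries are (timestamp, block number, new block content),
        # assumed to be appended in chronological order.
        self._entries: List[Tuple[float, int, bytes]] = []

    def record(self, timestamp: float, block: int, data: bytes) -> None:
        self._entries.append((timestamp, block, data))

    def state_at(self, timestamp: float) -> Dict[int, bytes]:
        """Replay the log up to and including `timestamp`."""
        state: Dict[int, bytes] = {}
        for entry_time, block, data in self._entries:
            if entry_time > timestamp:
                break
            state[block] = data
        return state


log = ChangeLog()
log.record(1.0, 0, b"AAAA")
log.record(2.0, 1, b"BBBB")
log.record(3.0, 0, b"CCCC")   # block 0 is overwritten

before_overwrite = log.state_at(2.5)
after_overwrite = log.state_at(3.0)
```

A plain mirror would hold only the final state; keeping the log is what makes the earlier content of block 0 recoverable.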

In ideal continuous data protection, the recovery point objective is unlimited in content, even if the recovery time objective is not.[20]

CDP differs from RAID, replication, or mirroring by enabling rollback to any point in time. A related technique is journaling.

Captured changes can provide fine granularities of restorable objects ranging from crash-consistent images to logical objects such as files, databases and logs.[21]

Network bandwidth throttling[15] may be needed to reduce the impact of CDP in multimedia and CAD design environments.[22]

An alternative is snapshots, a near-continuous solution, whereby restore points are periodically created to track changes.

Storage media

From left to right, a DVD disc in plastic cover, a USB flash drive and an external hard drive

Regardless of the repository model that is used, the data has to be copied onto some archive file data storage medium.

Magnetic tape
Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity-to-price ratio when compared to hard disk, but the ratios for tape and hard disk have become closer.[23] Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer, but by 2014 LTO was edging out two other remaining viable "super" formats—IBM 3592 (now also referred to as the TS11xx series) and Oracle StorageTek T10000,[24] and further development of the smaller-capacity DDS format had been canceled. By 2017 Spectra Logic, which builds tape libraries for both the LTO and TS11xx formats, was predicting that "Linear Tape Open (LTO) technology has been and will continue to be the primary tape technology."[25] Tape is a sequential access medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast.
Hard disk
The capacity-to-price ratio of hard disks has been improving for many years, making them more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.[26] External disks can be connected via local interfaces like SCSI, USB, FireWire, or eSATA, or via longer distance technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, via Virtual Tape Libraries or otherwise, support data deduplication, which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data.[27][28][29] One disadvantage of hard disk backups vis-a-vis tape is that hard drives are close-tolerance mechanical devices and may be more easily damaged, especially while being transported (e.g., for off-site backups).[30] In the mid-2000s, several drive manufacturers began to produce portable drives employing ramp loading and accelerometer technology (sometimes termed a "shock sensor"),[31][32] and—by 2010—the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.[33] The manufacturers do not, however, guarantee these results and note that a drive may fail to survive even a shorter drop.[33] Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and claim a range of higher drop specifications.[33][34][35] Another disadvantage is that over a period of years the stability of hard disk backups is shorter than that of tape backups.[24][36][30]
Optical storage
Recordable CDs, DVDs, and Blu-ray Discs are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs have traditionally been lower than that of hard disks or tapes (though advances in optical media are slowly shrinking that gap[37][38]). Many optical disk formats are WORM type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity. A 2008 French study indicated the lifespan of typically-sold CD-Rs was 2-10 years,[39] but one manufacturer later estimated the longevity of its CD-Rs with a gold-sputtered layer to be as high as 100 years.[40]
SSD/Solid-state drive
Also known as flash memory, thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, etc., these devices are relatively expensive for their low capacity in comparison to hard disk drives, but are very convenient for backing up relatively low data volumes. Unlike its magnetic drive counterpart, a solid-state drive contains no moving parts, making it less susceptible to physical damage, and it can have very high throughput, on the order of 500 Mbit/s to 6 Gbit/s. The capacity offered by SSDs continues to grow and prices are gradually decreasing as they become more common.[41][34] Over a period of years the stability of flash memory backups is shorter than that of hard disk backups.[24]
Remote backup service AKA cloud backup
Adding cloud-based backup to the benefits of local and offsite tape archiving, the New York Times wrote, "adds a layer of data protection."[42] Offsite has historically been used to protect against events such as fires, floods, or earthquakes which could destroy locally stored backups.[43]

Factors for success include:

  • initial seed loading / cloud seeding
  • trusting a provider to maintain the privacy and integrity of their data (with confidentiality enhanced by encryption)
Floppy disk and its derivatives
During the 1980s and early 1990s, many personal/home computer users associated backing up mostly with copying to floppy disks. However, the data capacity of floppy disks did not keep pace with growing demands, rendering them effectively obsolete. Later "superfloppy" devices and related "non-floppy" devices provide greater storage capacity and remain supported as backup media by some developers.[27]

Managing the information repository

Regardless of the information repository model, or data storage media used for backups, a balance needs to be struck between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line tape library is a common example.

Information repository implementations include[44][45]:

On-line
On-line backup storage is typically the most accessible type of data storage, and can begin a restore in milliseconds. An internal hard disk or a disk array (possibly connected to a SAN) is one example of an on-line backup. This type of storage is convenient and speedy, but is relatively expensive and is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting virus payload.
Near-line
Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.
Off-line
Off-line storage requires some direct action to provide access to the storage media: for example inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media are on-site or off-site.
Off-site data protection
Backup media may be sent to an off-site vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. Importantly, a data replica can be off-site but also on-line (e.g., an off-site RAID mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.
Backup site or disaster recovery center (DR center)
In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote disk mirroring, which keeps the DR data as up to date as possible.

Selection and extraction of data

A successful backup job starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as files. These files are organized into filesystems. Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also useful to save metadata that describes the computer or the filesystem being backed up.

Deciding what to back up at any given time involves tradeoffs. By backing up too much redundant data, the information repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information.[46]

Files

Copying files
With a file-level approach, making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
Partial file copying
Instead of copying whole files, a backup may include only the blocks or bytes within a file that have changed in a given period of time. This technique can substantially reduce needed storage space, but requires a high level of sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
Deleted files
To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
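
Partial file copying can be illustrated with a minimal sketch, assuming fixed-size blocks and SHA-256 digests saved from the previous backup. The function names and the 4096-byte block size are illustrative assumptions, not taken from any particular backup product, and the sketch does not handle files that shrink or are deleted.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real tools choose their own


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_digests: list[str], new_data: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Map block index to block contents for blocks that are new or differ."""
    changes = {}
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            start = index * block_size
            changes[index] = new_data[start:start + block_size]
    return changes
```

Only the blocks returned by `changed_blocks` would need to be stored; a restore would have to reassemble the file from the last full copy plus every later set of changed blocks, which is the added complexity noted above.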

Filesystems

Filesystem dump
Instead of copying files within a file system, a block-level copy of the whole filesystem itself can be made. This is also known as a raw partition backup and is related to disk imaging. The process usually involves unmounting the filesystem and running a program like dd (Unix).[47] Because the disk is read sequentially and with large buffers, this type of backup can be much faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. However, because this method also reads the free disk blocks that contain no useful data, it can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as XFS, provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.[48]
Identification of changes
Some filesystems have an archive bit for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
Versioning file system
A versioning filesystem tracks all changes to a file. Versions may be retained all the way back to the file's creation time, or for a shorter period. The Wayback versioning filesystem for Linux is an example.[49]
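
The date-comparison approach to identifying changes can be sketched as follows. This is a minimal illustration that assumes the time of the last backup is available as a POSIX timestamp; the function name is hypothetical, and real backup software typically also considers file size, archive bits, or change journals.

```python
import os
from pathlib import Path


def files_changed_since(root: str, last_backup_time: float) -> list[Path]:
    """Return files under root modified after last_backup_time (a POSIX timestamp)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # st_mtime is the file's last-modification time in seconds.
            if path.stat().st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A limitation of this approach is that it trusts the file's timestamp, so a file whose modification time was reset would be missed.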

Live data

A snapshot is an instantaneous function of some filesystems that presents a copy of the filesystem as if it were frozen at a specific point in time, often by a copy-on-write mechanism. An effective way to back up live data is to temporarily quiesce them (e.g., close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods.[50] Snapshotting a file while it is being changed results in a corrupted file that is unusable, as most large files contain internal references between their various parts that must remain consistent throughout the file. This is also the case across interrelated files, as may be found in a conventional database or in applications such as Microsoft Exchange Server. The term fuzzy backup can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at a single point in time.[51]

Backup options for data files that cannot be or are not quiesced include:[52]

Open file backup
Many backup software applications undertake to back up open files in an internally consistent state.[53] File locking would be useful for regulating access to open files, but this may be inconvenient for the user. Some applications simply check whether open files are in use and try again later.[27] Other applications exclude open files that are updated very frequently.[54]
Interrelated database files backup
Some interrelated database file systems offer a means to generate a "hot backup"[55] of the database while it is online and usable. This may include a snapshot of the data files plus a snapshotted log of changes made while the backup is running. Upon a restore, the changes in the log files are applied to bring the copy of the database up to the point in time at which the initial backup ended.[56]
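
The restore step of a "hot backup" can be illustrated with a toy model in which a dictionary stands in for the database files and a list stands in for the log of changes made while the backup was running. The log entry format `(operation, key, value)` is an assumption made for this sketch and does not correspond to any real database's log format.

```python
from typing import Optional

# (operation, key, value); value is None for deletions. Hypothetical format.
LogEntry = tuple[str, str, Optional[str]]


def replay(snapshot: dict[str, str], log: list[LogEntry]) -> dict[str, str]:
    """Apply logged changes, in order, to a copy of the snapshot."""
    restored = dict(snapshot)
    for operation, key, value in log:
        if operation == "set":
            restored[key] = value
        elif operation == "delete":
            restored.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {operation}")
    return restored
```

Applying the log in its original order is what brings the restored copy to the state at the moment the backup ended.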

Metadata

Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this non-file data too.[57]

System description
System specifications are needed to procure an exact replacement after a disaster.
Boot sector
The boot sector can sometimes be recreated more easily than it can be saved. Still, it usually isn't a normal file, and the system won't boot without it.
Partition layout
The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
File metadata
Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
System metadata
Different operating systems have different ways of storing configuration information. Microsoft Windows keeps a registry of system information that is more difficult to restore than a typical file.
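
As a minimal sketch of saving file metadata alongside file contents, the following records the basic POSIX attributes of a file. The function name and the choice of fields are illustrative assumptions; ACLs and extended attributes, which the text also mentions, are not captured here and would need platform-specific calls.

```python
import os
import stat


def capture_metadata(path: str) -> dict:
    """Record basic POSIX metadata for path without following symlinks."""
    st = os.lstat(path)
    return {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits only
        "uid": st.st_uid,                  # numeric owner
        "gid": st.st_gid,                  # numeric group
        "mtime": st.st_mtime,              # last-modification time
        "size": st.st_size,
    }
```

On restore, values recorded this way could be reapplied with calls such as `os.chmod`, `os.chown`, and `os.utime`.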

Manipulation of data and dataset optimization

It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can provide many benefits, including improved backup speed, restore speed, data security, and media usage, as well as reduced bandwidth requirements.

Consolidation
One backup program describes its Consolidation option as "puts copies of all media files ... into the ... folder, and leaves the original files in their current locations."[58]
Compression
Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.[59]
Deduplication
When multiple similar systems are backed up to the same destination storage device, there exists the potential for much redundancy within the backed up data. For example, if 20 Windows workstations were backed up to the same archive file, they might share a common set of system files. The archive file only needs to store one copy of those files to be able to restore any one of those workstations. This technique can be applied at the file level or even on raw blocks of data, potentially resulting in a massive reduction in required storage space.[59] Deduplication can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
Duplication
Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup images to optimize restore speed or to have a second copy at a different location or on a different storage medium.
Encryption
High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.[60] Encrypting the data on these media can mitigate this problem, but presents new problems. Encryption is a CPU intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.[59]
Multiplexing
When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.[61]
Refactoring
The process of rearranging the backup sets in an archive file is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape. This is especially useful for backup systems that do incremental-forever style backups.
Staging
Sometimes backup jobs are copied to a staging disk before being copied to tape.[61] This process is sometimes referred to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of the final destination device with the source device as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
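
File-level deduplication, as in the example of 20 workstations sharing common system files, can be sketched with a content-addressed store. The class and method names are hypothetical, and the in-memory dictionaries stand in for what would be an on-disk archive in practice.

```python
import hashlib


class DedupStore:
    """Toy archive that stores each distinct file content only once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}               # digest -> content
        self.catalog: dict[tuple[str, str], str] = {}   # (machine, path) -> digest

    def backup(self, machine: str, path: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        # Store the content only if this digest has not been seen before.
        self.blobs.setdefault(digest, content)
        self.catalog[(machine, path)] = digest

    def restore(self, machine: str, path: str) -> bytes:
        return self.blobs[self.catalog[(machine, path)]]
```

The catalog keeps one entry per machine and path, so any workstation can still be restored individually even though identical content is held only once.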

Objectives

Recovery point objective (RPO)
The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of synchronization between the source data and the backup repository.[62][63]
Recovery time objective (RTO)
The amount of time elapsed between disaster and restoration of business functions.[64]
Data security
In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.[65]
Data retention period
Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.[65]
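
The relationship between backup frequency and the achievable recovery point can be shown with a small calculation. This sketch assumes that a restore always uses the most recent backup completed before the data loss event; the function name is illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional


def data_loss_window(backup_times: list[datetime],
                     failure_time: datetime) -> Optional[timedelta]:
    """Return how much work is rolled back when restoring the latest usable backup."""
    usable = [t for t in backup_times if t <= failure_time]
    if not usable:
        return None  # no backup exists from before the failure
    return failure_time - max(usable)
```

With backups every 12 hours the roll-back can approach 12 hours in the worst case, which is why a tighter RPO requires more frequent synchronization.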

Limitations

An effective backup scheme will take into consideration the following situational limitations[66]:

Backup window
The period of time when backups are permitted to run on a system is called the backup window. This is typically the time when the system sees the least usage and the backup process will have the least amount of interference with normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the backup window.
Performance impact
All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
Costs of hardware, software, labor
All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.
Network bandwidth
Distributed backup systems can be affected by limited network bandwidth.
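
Whether a backup fits into its window can be estimated with simple arithmetic. The sketch below assumes a constant sustained throughput and decimal units (1 GB = 1000 MB); both function names are illustrative, and real throughput varies with file sizes, compression, and competing load.

```python
def backup_duration_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy data_gb gigabytes at a sustained throughput."""
    seconds = (data_gb * 1000) / throughput_mb_per_s
    return seconds / 3600


def fits_window(data_gb: float, throughput_mb_per_s: float,
                window_hours: float) -> bool:
    """True if the estimated backup duration does not exceed the backup window."""
    return backup_duration_hours(data_gb, throughput_mb_per_s) <= window_hours
```

For instance, 2,000 GB at a sustained 100 MB/s takes about 5.6 hours and fits an 8-hour window, whereas at 50 MB/s the same job takes about 11.1 hours and does not.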

Implementation

Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.

Scheduling
Using a job scheduler can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.
Authentication
Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.
Chain of trust
Removable storage media are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.

Managing the backup process

Those who perform or oversee backups need to know how successful the backups are.

Measuring the process

To ensure that the backup scheme is working as expected, the following best practices should be enacted[67][68][69]:

Backup validation
(also known as "backup success validation") Provides information about the backup, and proves compliance to regulatory bodies outside the organization: for example, an insurance company in the USA might be required under HIPAA to demonstrate that its client data meet records retention requirements.[70] Disaster, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful backups to ensure business continuity. Thus many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).
Reporting
In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.
Logging
In addition to the history of computer generated reports, activity and change logs are useful for monitoring backup system events.
Validation
Many backup programs use checksums or hashes to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as copied to the archive file has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, and thus improve backup speed. This is particularly useful for the de-duplication process.
Monitored backup
Backup processes can be monitored locally via a software dashboard or by a third-party monitoring center. Both alert users to any errors that occur during automated backups. Some third-party monitoring services also allow collection of historical metadata that can be used for storage resource management purposes, such as projecting data growth and locating redundant primary storage capacity and reclaimable backup capacity.
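
Checksum-based validation of an archived copy, without reference to the original file, can be sketched as follows. The use of SHA-256 and the function names are assumptions made for this illustration; backup programs differ in which checksum or hash they use.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(archived_path: str, saved_checksum: str) -> bool:
    """True if the archived copy still matches the checksum saved at backup time."""
    return sha256_of_file(archived_path) == saved_checksum
```

The saved checksum would be recorded when the file is first copied, so a later mismatch indicates that the archived copy has been altered or damaged.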

Enterprise client-server backup

A computer sends its data to a backup server, during a scheduled backup window.

"Enterprise client-server" backup software describes a class of software applications that back up data from a variety of client computers centrally to one or more server computers, with the particular needs of enterprises in mind. They may employ a scripted client–server[71] backup model[72] with a backup server program running on one computer, and with small-footprint client programs (referred to as "agents" in some applications) running on the other computer(s) being backed up—or alternatively as another process on the same computer as the backup server program. Enterprise-specific requirements[72] include the need to back up large amounts of data on a systematic basis, to adhere to legal requirements for the maintenance and archiving of files and data, and to satisfy short-recovery-time objectives. To satisfy these requirements (which World Backup Day (31 March)[73][74][75] highlights), it is typical for an enterprise to appoint a backup administrator—who is a part of office administration rather than of the IT staff and whose role is "being the keeper of the data".[76]

In a client-server backup application, the server program initiates the backup activity by the client program.[1] This is distinct from a "personal" backup application such as Apple's Time Machine, in which "Time Machine runs on each Mac, independently of any other Macs, whether they're backing-up to the same destination or a different one."[77] If the backup server and client programs are running on separate computers, they are connected in either a single-platform or mixed-platform network. The client-server backup model originated when magnetic tape was the only financially feasible storage medium for doing backups of multiple computers onto a single archive file;[note 1][note 2][78] because magnetic tape is a sequential access medium, it was imperative (barring "multiplexed backup") that the client computers be backed up one at a time, as initiated by the backup server program.

What is described in the preceding paragraph is the "two-tier" configuration (in one application's diagram, the second-tier backup server program is named "server" preceded by the name of the application, and first-tier "agents" are backing up interactive server applications). That configuration controls the backup server program via either an integrated GUI or a separate Administration Console. In some client-server backup applications, a "three-tier" configuration splits off the backup and restore functions of the server program to run on what are called media servers—computers to which devices containing archive files are attached either locally or as Network-attached storage (NAS). In those applications the decision on which media server a script is to run on is controlled using another program called either a master server[79] or an optional central admin. server.[80]

Performance

The steady improvement in hard disk drive price per byte has made feasible a disk-to-disk-to-tape strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.[81] This, with file system technology, has led to features suited to optimization such as:

Improved disk-to-disk-to-tape capabilities
Enable automated transfers to tape for safe offsite storage of disk archive files that were created for fast onsite restores.[82][83][84]
Multithreaded backup server
Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).[72][85][86] In one application, all the categories of information for a particular "backup server" are stored by it; when an "Administration Console" process is started, its process synchronizes information with all running LAN/WAN backup servers.[78]
Block-level incremental backup
The ability to back up only the blocks of a file that have changed, a refinement of incremental backup that saves space[87][88][89] and may save time.[72][90] Such partial file copying is especially applicable to a database.
"Instant" scanning of source volumes
Uses the USN Journal on Windows NTFS and FSEvents on macOS (for non-APFS source volumes only) to reduce time of the scanning phase[91] on both incremental backups, thus fitting more sources into the scheduled backup window,[72][92][93] and on restores.[94]
Cramming or evading the scheduled backup window
One application has the "multiplexed backup" capability of cramming the scheduled backup window by sending data from multiple clients to a single tape drive simultaneously; "this is useful for low end clients with slow throughput ... [that] cannot send data fast enough to keep the tape drive busy .... will reduce the performance of restores."[85] Another application allows an enterprise that has computers transiently connecting to the network over a long workday to evade the scheduled window by using Proactive scripts.

Source file integrity

Backing up interactive applications via pausing
Interactive applications can be protected by having their services paused while their live data is being backed up, and then unpaused.[95] Alternatively, the backup application can back up a snapshot initiated at a natural pause.[27][91] Some enterprise backup applications accomplish pausing and unpausing of services via built-in provisions, for many specific databases and other interactive applications, that automatically become part of the backup software's script execution; these provisions may be purchased separately.[96][97][27] However, another application has also added "script hooks" that enable the optional automatic execution, at specific events during runs of a GUI-coded backup script, of portions of an external script containing commands pre-written in a standard scripting language.[91] For some databases, such as MongoDB and MySQL, that can be run on filesystems that do not support snapshots, the external script can pause writing during backup.[91] Since the external script is provided by an installation's backup administrator, its code activated by the "script hooks" may accomplish not only data protection, via pausing/unpausing interactive services,[91] but also integration with monitoring systems.[98]
Backing up interactive applications via coordinated snapshots
Some interactive applications such as databases must have all portions of their component files coordinated while their live data is being backed up. One database system—PostgreSQL—can do this via its own "snapshotting" MVCC running on filesystems that do not support snapshots, and can therefore be backed up without pausing using an external script containing commands that use "script hooks".[91] Another equivalent approach is to use some filesystems' capability of taking a snapshot, and to back up the snapshot without pausing the application itself. An enterprise backup application using filesystem snapshotting can be used either to back up all user applications running on a virtual machine[99][100] or to back up a particular interactive application that directly uses its filesystem's snapshot capability.[27] Conceptually this approach can still be considered client-server backup; the snapshotting capability by itself constitutes the client, and the backup server runs as a separate process that initiates (second paragraph) and then reads the snapshot on the machine that generated it. The software installed on each machine to be backed up is referred to as an "agent"; if "agents" are being used to back up all user applications running on a virtual machine, one or more such "agents" are controlled by a console.[101][100]
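
The pause, back up, unpause sequence can be sketched generically. This illustration is not the hook interface of any particular backup application: `pause` and `unpause` are hypothetical callables that an administrator's script would supply, for example to stop and resume writes to a database.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def paused(pause: Callable[[], None], unpause: Callable[[], None]) -> Iterator[None]:
    """Pause a service for the duration of the block, then always unpause it."""
    pause()
    try:
        yield
    finally:
        # Runs even if the backup raises, so the service is never left paused.
        unpause()
```

The backup itself would run inside a `with paused(...)` block; the `finally` clause reflects the design concern that a failed backup must not leave the interactive service stopped.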

User interface

To accommodate the requirements of a backup administrator who may not be part of the IT staff with access to the secure server area, enterprise client-server software may include features such as:

Administration Console
The backup administrator's backup server GUI management and near-term reporting tool.[68] Its window shows the selected backup server, with a standard toolbar on top. A sidebar on the left or navigation bar shows the clickable categories of backup server information for it; each category shows a panel, which may have a specialized toolbar below or in place of the standard toolbar. The built-in categories include activities—thus providing monitored backup, past backups of each individual source, scripts/policies/jobs (terminology depending on the application), sources (directly/indirectly), archive files, and storage devices.[98][102][103]
User-initiated backups and restores
These supplement the administrator-initiated backups and restores which backup applications have always had, and relieve the administrator of time-consuming tasks.[76] The user designates the date of the past backup from which files or folders are to be restored—once IT staff has mounted the proper volume(s) of the relevant archive file on the backup server.[81][98][104][105]
High-level/medium-term reports supplementing the Administration Console[68]
Within one application's Console panel displayed by clicking the name of the backup server itself in the sidebar, an activities pane on the top left of the displayed Dashboard has a moving bar graph for each activity going on for the backup server together with a pause and stop button for the activity. Three more backup validation panes give the results of activities in the past week: backups each day, sources backed up, and sources not backed up; as of 2019 the last two panes—together with failed backups—are summarized in an additional color-coded bullseye pane.[106] Finally a storage reporting pane has a line for each archive file, showing the last-modified date and depictions of the total bytes used and available;[87][98] as of 2019 this is supplemented by a pane that gives a linear-regression prediction for growth of each archive file.[106] For the application's Windows variant, the Dashboard acts as a display-only substitute for a non-existent Console[27]—but was upgraded in 2019 into an optional two-way Web-based Management Console.[91] Other applications have a separate reporting and monitored backup facility that can cover multiple backup servers.[107][108]
E-mailing of notifications about operations to chosen recipients[68]
Can alert the recipient to, e.g., errors or warnings, including extracts of logging to assist in pinpointing problems.[27][107][109]
Integration with monitoring systems[68]
Such systems provide longer-term backup validation. One application's administrators can deploy custom scripts that—invoking webhook code via script hooks—populate such systems as the freeware Nagios and IFTTT and the freemium Slack with script successes and failures corresponding to the activities category of the Console, per-source backup information corresponding to the past backups category of the Console, and media requests.[98] Another application has integration with two of the developer's monitoring systems, one that is part of the client-server backup application and one that is more generalized.[107] Yet another application has integration with a monitoring system that is part of the client-server backup application,[110] but can also be integrated with Nagios.[111]

LAN/WAN/Cloud

Advanced network client support
All applications include support for multiple network interfaces.[72][112][113] However, one application, unless deduplication is done by a separate sub-application between the client and the backup server, cannot provide "resilient network connections" for machines on a WAN.[114] One application can extend support to "remote" clients anywhere on the Internet for a Proactive script and for user-initiated backups/restores.[91]

Cloud seeding and Large-Scale Recovery

Because of the large amount of data already backed up,[72] an enterprise adopting cloud backup will likely need to do "seeding". This service uses a synthetic full backup to copy a large locally-stored archive file onto a large-capacity disk device, which is then physically shipped to the cloud storage site and uploaded.[115][116] After the large initial upload, the enterprise's backup software may facilitate reconfiguration for writing to and reading from the archive file incrementally in its cloud location.[117] The service may need to be employed in reverse for faster large-scale data recovery times than would be possible via an Internet connection.[115] Some applications offer seeding and large-scale recovery via third-party services, which may use a high-speed Internet channel to/from cloud storage rather than a shippable physical device.[118][119]
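
The case for seeding can be seen from a rough transfer-time estimate. The sketch below assumes decimal units (1 TB = 10^12 bytes), a fully utilized uplink with no protocol overhead, and a shipping time supplied by the caller; the function names and figures are illustrative only.

```python
def upload_days(data_tb: float, uplink_mbit_per_s: float) -> float:
    """Days needed to upload data_tb terabytes over a link of the given speed."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (uplink_mbit_per_s * 1e6)
    return seconds / 86400


def seeding_is_faster(data_tb: float, uplink_mbit_per_s: float,
                      shipping_days: float) -> bool:
    """True if shipping a disk device beats uploading over the network."""
    return shipping_days < upload_days(data_tb, uplink_mbit_per_s)
```

Under these assumptions, uploading 50 TB over a 100 Mbit/s link would take about 46 days, so shipping a device that arrives within a week would be considerably faster.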

See also

About backup
Related topics

Notes

  1. ^ a b In contrast to everyday use of the term "archive", the data stored in an "archive file" is not necessarily old or of historical interest.
  2. ^ Several client-server applications use the term "archiving" to describe a backup operation that deletes data from a client source once the data's backup is complete. Bokelman, Seth (26 February 2012). "what is archiving in Netbackup?". VOX. Veritas Technologies LLC. Retrieved 13 May 2018. "Retrospect ® 14.0 Mac User's Guide" (PDF). Retrospect. Retrospect Inc. March 2017. pp. 124-126 (Archiving). Retrieved 28 March 2017. "Backup Exec Archiving Option is no longer supported for Backup Exec 15 Feature Pack 1". Veritas Support. Veritas Technologies LLC. 20 June 2015. Retrieved 13 May 2018.

References

  1. ^ a b c Joe Kissell (2007). Take Control of Mac OS X Backups (PDF) (Version 2.0 ed.). Ithaca, NY: TidBITS Electronic Publishing. pp. 18-20 (The Archive), 24 (client-server), 82-83 (archive file), 112-114 (Off-site storage backup rotation scheme), 126-141 (old Retrospect terminology and GUI—still used in Windows variant), 165 (client-server), 128 (subvolume—later renamed Favorite Folder in Macintosh variant). ISBN 0-9759503-0-4. Retrieved 17 May 2019.
  2. ^ "back•up". The American Heritage Dictionary of the English Language. Houghton Mifflin Harcourt. 2018. Retrieved 9 May 2018.
  3. ^ S. Nelson (2011). "Chapter 1: Introduction to Backup and Recovery". Pro Data Backup and Recovery. Apress. pp. 1–16. ISBN 978-1-4302-2663-5. Retrieved 8 May 2018.
  4. ^ Cougias, D.J.; Heiberger, E.L.; Koop, K. (2003). "Chapter 1: What's a Disaster Without a Recovery?". The Backup Book: Disaster Recovery from Desktop to Data Center. Network Frontiers. pp. 1–14. ISBN 0-9729039-0-9.
  5. ^ Terry Sullivan (11 January 2018). "A Beginner's Guide to Backing Up Photos". The New York Times. a hard drive ... an established company ... declared bankruptcy ... where many ... had ...
  6. ^ McMahon, Mary (1 April 2019). "What Is an Information Repository?". wiseGEEK. Conjecture Corporation. Retrieved 8 May 2019. In the sense of an approach to data management, an information repository is a secondary storage space for data.
  7. ^ a b Rouse, Margaret (April 2005). "Definition: repository". whatis.com. TechTarget. Retrieved 1 May 2019.
  8. ^ a b c d e f Mayer, Alex (6 November 2017). "Backup Types Explained: Full, Incremental, Differential, Synthetic, and Forever-Incremental". Nakivo Blog. Nakivo. Full Backup, Incremental Backup, Differential Backup, Mirror Backup, Reverse Incremental Backup, Continuous Data Protection (CDP), Synthetic Full Backup, Forever-Incremental Backup. Retrieved 17 May 2019.
  9. ^ "Five key questions to ask about your backup solution". sysgen.ca. Archived from the original on 4 March 2016. Retrieved 23 September 2015.
  10. ^ Incremental Backup Archived 21 June 2016 at the Wayback Machine. Retrieved 10 March 2006
  11. ^ such as Apple's Time Machine
  12. ^ "Symantec Backup Exec: About the synthetic backup feature". Helpmax.net. HelpMax Software Help & Shop Inc. Retrieved 13 January 2018.
  13. ^ a b c Reed, Jessie (27 February 2018). "What Is Incremental Backup?". Nakivo Blog. Nakivo. Reverse incremental, Multilevel incremental, Block-level. Retrieved 17 May 2019.
  14. ^ "the Continuous option"
  15. ^ a b David Pogue (4 January 2007). "Fewer Excuses for Not Doing a PC Backup". The New York Times. options like "Enable Bandwidth Throttle" and "Don't back up if the CPU is over this % busy."
  16. ^ Behzad Behtash (10 May 2010). "Why Continuous Data Protection's Getting More Practical". Disaster recovery/business continuity. Informationweek. Retrieved 12 November 2011.
  17. ^ http://www.eweek.com/c/a/Data-Storage/How-to-Use-Continuous-Data-Protection-to-Improve-Backup-Disaster-Recovery/
  18. ^ Peter B. Malcolm (13 November 1989). "US Patent 5086502: Method of operating a data processing system". Google Patents. Retrieved 29 November 2016. Filing date Nov 13, 1989
  19. ^ Brian Posey (August 2016). "CDP technology offers organizations a steady data protection method". DataBackup. TechTarget. Other differentiators, Value in block-level backup. Retrieved 10 May 2019.
  20. ^ Richard May. "Finding RPO and RTO". Archived from the original on 3 March 2016.
  21. ^ "An Overview of Continuous Data Protection". Infosectoday.com. Retrieved 12 November 2011.
  22. ^ Carter, Nick (5 August 2010). "Off-Site Backup - The Bandwidth Hog". Accel Networks. Archived from the original on 7 July 2011.
  23. ^ Disk to Disk Backup versus Tape – War or Truce? Archived 12 July 2016 at the Wayback Machine (9 December 2004). Retrieved 10 March 2007
  24. ^ a b c Coughlin, Tom (29 June 2014). "Keeping Data for a Long Time". Forbes. Forbes Media LLC. para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life). Retrieved 19 April 2018.
  25. ^ "Digital Data Storage Outlook 2017" (PDF). Spectra. Spectra Logic. 2017. p. 14(Tape). Retrieved 11 July 2018.
  26. ^ "Bye Bye Tape, Hello 5.3TB eSATA". Retrieved 22 April 2007.
  27. ^ a b c d e f g h "Retrospect ® 12 Windows User's Guide" (PDF). Retrospect. Retrospect Inc. 2017. pp. 30-31(deduplication via "Snapshots"—a Retrospect term which predates and is distinct from Snapshot_(computer_storage)), 31-32(Dashboard), 41-43(removable disk drives), 216-218(selector as subset filter for synthetic full backups), 230-233(Scripted Verification), 280(Multiple Executions), 369(Duplicate Execution Options), 420(Startup Preferences—Launcher for auto-launch), 426-427(E-mail), 433-434(Open File Backup Tips—VSS snapshot at natural pause), 530-544(SQL Server Agent—coordinating VSS snapshot), 545-566(Exchange Server Agent—coordinating VSS snapshot). Retrieved 2 September 2018.
  28. ^ "Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG". DCIG. Archived from the original on 4 March 2016. Retrieved 26 February 2016.
  29. ^ "Veritas NetBackup™ Deduplication Guide". Veritas. Veritas Technologies LLC. 2016. Retrieved 26 July 2018.
  30. ^ a b Jacobi, John L. (29 February 2016). "Hard-core data preservation: The best media and methods for archiving your data". PC World. sec. External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks). Retrieved 19 April 2018.
  31. ^ "Ramp Load/Unload Technology in Hard Disk Drives" (PDF). HGST. Western Digital. November 2007. p. 3(sec. Enhanced Shock Tolerance). Retrieved 29 June 2018.
  32. ^ "Toshiba Portable Hard Drive (Canvio® 3.0)". Toshiba Data Dynamics Singapore. Toshiba Data Dynamics Pte Ltd. 2018. sec. Overview(Internal shock sensor and ramp loading technology). Retrieved 16 June 2018.
  33. ^ a b c "Iomega ® Drop Guard ™ Technology" (PDF). Hard Drive Storage Solutions. Iomega Corp. 20 September 2010. pp. 2(What is Drop Shock Technology?, What is Drop Guard Technology? (... features special internal cushioning .... 40% above the industry average)), 3(*NOTE). Retrieved 12 July 2018.
  34. ^ a b John Burek (15 May 2018). "The Best Rugged Hard Drives and SSDs". PC Magazine. Ziff Davis. What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive). Retrieved 4 August 2018.
  35. ^ Justin Krajeski; Kimber Streams (20 March 2017). "The Best Portable Hard Drive". The New York Times. Retrieved 4 August 2018.
  36. ^ "Best Long-Term Data Archive Solutions". Iron Mountain. Iron Mountain Inc. 2018. sec. More Reliable(average mean time between failure ... rates, best practice for migrating data). Retrieved 19 April 2018.
  37. ^ S. Wan; Q. Cao; C. Xie (2014). "Optical storage: An emerging option in long-term digital preservation". Frontiers of Optoelectronics. 7 (4): 486–492. doi:10.1007/s12200-014-0442-2.
  38. ^ Q. Zhang; Z. Xia; Y.-B. Cheng; M. Gu (2018). "High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites". Nature Communications. 9: 1183. doi:10.1038/s41467-018-03589-y.
  39. ^ Gérard Poirier; Foued Berahou (3 March 2008). "Journal de 20 Heures". Institut national de l'audiovisuel. approximately minute 30 of the TV news broadcast. Retrieved 3 March 2008.
  40. ^ "Archival Gold CD-R "300 Year Disc" Binder of 10 Discs with Scratch Armor Surface". Delkin Devices. Delkin Devices Inc. Archived from the original on 27 September 2013.
  41. ^ R. Micheloni; P. Olivo (2017). "Solid-State Drives (SSDs)". Proceedings of the IEEE. 105 (9): 1586–88. doi:10.1109/JPROC.2017.2727228. Retrieved 8 May 2018.
  42. ^ J. D. Biersdorfer (5 April 2018). "Monitoring the Health of a Backup Drive". The New York Times.
  43. ^ "Remote Backup". EMC Glossary. Dell, Inc. Retrieved 8 May 2018.
  44. ^ Stackpole, B.; Hanrion, P. (2007). Software Deployment, Updating, and Patching. CRC Press. pp. 164–165. ISBN 978-1-4200-1329-0. Retrieved 8 May 2018.
  45. ^ Gnanasundaram, S.; Shrivastava, A., eds. (2012). Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments. John Wiley and Sons. p. 255. ISBN 978-1-118-23696-3. Retrieved 8 May 2018.
  46. ^ Lees, D. (25 January 2017). "What to backup – a critical look at your data". Irontree Blog. Irontree Internet Services CC. Retrieved 8 May 2018.
  47. ^ Preston, W.C. (2007). Backup & Recovery: Inexpensive Backup Solutions for Open Systems. O'Reilly Media, Inc. pp. 111–114. ISBN 978-0-596-55504-7. Retrieved 8 May 2018.
  48. ^ Preston, W.C. (1999). Unix Backup & Recovery. O'Reilly Media, Inc. pp. 73–91. ISBN 978-1-56592-642-4. Retrieved 8 May 2018.
  49. ^ Wayback: A User-level Versioning File System for Linux Archived 6 April 2007 at the Wayback Machine (2004). Retrieved 10 March 2007
  50. ^ Staimer, Marc (2011). "Using different types of storage snapshot technologies for data protection". TechTarget. TechTarget Inc. Retrieved 4 December 2018.
  51. ^ Liotine, M. (2003). Mission-critical Network Planning. Artech House. p. 244. ISBN 978-1-58053-559-5. Retrieved 8 May 2018.
  52. ^ de Guise, P. (2008). Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. CRC Press. pp. 50–54. ISBN 978-1-4200-7640-0.
  53. ^ "Open File Backup Software for Windows". Handy Backup. Novosoft LLC. 8 November 2018. Retrieved 29 November 2018.
  54. ^ Reitshamer, Stefan (5 July 2017). "Troubleshooting backing up open/locked files on Windows". Arq Blog. Haystack Software. Stefan Reitshamer is the principal developer of Arq. Retrieved 29 November 2018.
  55. ^ Boss, Nina (10 December 1997). "Oracle Tips Session #3: Oracle Backups". www.wisc.edu. University of Wisconsin. Retrieved 1 December 2018.
  56. ^ "What is ARCHIVE-LOG and NO-ARCHIVE-LOG mode in Oracle and the advantages & disadvantages of these modes?". Arcserve Backup. Arcserve. 27 September 2018. Retrieved 29 November 2018.
  57. ^ Grešovnik, Igor (April 2016). "Preparation of Bootable Media and Images". Archived from the original on 25 April 2016. Retrieved 21 April 2016.
  58. ^ J. D. Biersdorfer (30 December 2016). "Back Up Your iTunes Collection — or Your Whole Computer". NYTimes.com.
  59. ^ a b c D. Cherry (2015). Securing SQL Server: Protecting Your Database from Attackers. Syngress. pp. 306–308. ISBN 978-0-12-801375-5. Retrieved 8 May 2018.
  60. ^ Backups tapes a backdoor for identity thieves Archived 5 April 2016 at the Wayback Machine (28 April 2004). Retrieved 10 March 2007
  61. ^ a b Preston, W.C. (2007). Backup & Recovery: Inexpensive Backup Solutions for Open Systems. O'Reilly Media, Inc. pp. 219–220. ISBN 978-0-596-55504-7. Retrieved 8 May 2018.
  62. ^ Definition of recovery point objective Archived 13 May 2007 at the Wayback Machine. Retrieved 10 March 2007
  63. ^ "Top four things to consider in business continuity planning". sysgen.ca. Archived from the original on 4 March 2016. Retrieved 23 September 2015.
  64. ^ Definition of recovery time objective Archived 16 May 2007 at the Wayback Machine. Retrieved 7 March 2007
  65. ^ a b Little, D.B. (2003). "Chapter 2: Business Requirements of Backup Systems". Implementing Backup and Recovery: The Readiness Guide for the Enterprise. John Wiley and Sons. pp. 17–30. ISBN 978-0-471-48081-5. Retrieved 8 May 2018.
  66. ^ Nelson, S. (2011). "Chapter 9: Putting It All Together: Sample Backup Environments". Pro Data Backup and Recovery. Apress. pp. 203–246. ISBN 978-1-4302-2663-5. Retrieved 8 May 2018.
  67. ^ Akhtar, A.N.; Buchholtz, J.; Ryan, M.; Setty, K. (2012). "Database Backup and Recovery Best Practices". ISACA Journal. 1: 1–6. Retrieved 8 May 2018.
  68. ^ a b c d e Dorion, Pierre (June 2008). "Why you need a data backup reporting tool". TechTarget. Tech Target Inc. Retrieved 13 November 2017.
  69. ^ Pritchard, S. (December 2017). "Cloud-to-cloud backup: What it is and why you need it". Computer Weekly. TechTarget. Retrieved 8 May 2018.
  70. ^ HIPAA Advisory Archived 11 April 2007 at the Wayback Machine. Retrieved 10 March 2007
  71. ^ Gripman, Stuart (27 March 2012). "Retrospect 9.0: powerful backup for professionals, organizations". MacWorld. Scheduling scripts(GUI scripting), Restoring(Proactive priorities). Retrieved 3 November 2017.
  72. ^ a b c d e f g Rassokhin?, Alexander? (2012). "Enterprise Network Backup Challenges". All About Backup. Novosoft LLC. Retrieved 13 November 2017.
  73. ^ Misener, Dan (29 March 2016). "World Backup Day highlights importance of protecting data". CBC News.
  74. ^ Anja Schmoll-Trautmann (31 March 2017). "World Backup Day: deutliche Lücken zwischen Sicherheitsrisiko und Nutzerverhalten" (in German). ZDNet.
  75. ^ Preimesberger, Chris (31 March 2017). "World Backup Day 2017: 'We Don't Know the Day Nor the Hour'". eWeek. QuinStreet. Ian Wood of Veritas. Retrieved 11 November 2017.
  76. ^ a b Dorion, Pierre (4 August 2008). "The true role of a backup administrator". TechTarget. TechTarget, Inc. Retrieved 13 November 2017. On the other hand, the role of a backup administrator should be one of administration, not operation....whose role is "being the keeper of the data"
  77. ^ Pond, James (26 August 2013). "Time Machine - FAQs 33. Backing - up multiple Macs". baligu.com. James Pond (originally). Retrieved 28 October 2018.
  78. ^ a b Engst, Adam (23 March 2009). "EMC Ships Modernized Retrospect 8". TidBITS. TidBITS Publishing Inc. New Backup Capabilities. Retrieved 3 November 2018.
  79. ^ "Backing Up Databases with Veritas NetBackup". Pivotal Documentation. Pivotal Software, Inc. 2018. About NetBackup Software. Retrieved 18 January 2019.
  80. ^ "Symantec Backup Exec: How CASO Works". Helpmax.net. HelpMax Software Help & Shop Inc. Retrieved 18 January 2019.
  81. ^ a b Fernando, Sal (30 April 2008). "Combine disk, tape benefits to protect data". ZDNet. Retrieved 13 November 2017.
  82. ^ "New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise". DellEMC [current]. EMC Corp. [orig. publisher]. 31 January 2005. Retrieved 23 November 2016.
  83. ^ "About NetBackup Replication Director". Veritas Support. Veritas Technologies LLC (US). 13 July 2017. Retrieved 18 November 2017.
  84. ^ "Symantec Backup Exec: About duplicating backed up data". Helpmax.net. HelpMax Software Help & Shop Inc. Retrieved 13 January 2018.
  85. ^ a b "What is the difference between multiplexing and multistreaming?". Veritas Support. Veritas Technologies LLC (US). 29 January 2015. Retrieved 19 November 2017.
  86. ^ McMillen, Robert (21 July 2015). "How to run concurrent jobs in Backup exec 15" (Video). Google. Retrieved 14 January 2018 – via YouTube.
  87. ^ a b Schmitz, Agen (6 March 2014). "Retrospect 11". TidBITS. TidBITS Publishing Inc. Retrieved 27 April 2017.
  88. ^ "How Veritas NetBackup block-level incremental backup works for Oracle database files". Symantec. Veritas Technologies LLC (US). 2013. Retrieved 18 November 2017.
  89. ^ Harbaugh, Logan (Fall 2015). "Developing a Real Backup Plan with Symantec's Backup Exec 15". EdTech. CDW LLC. Retrieved 14 January 2018.
  90. ^ Whitehouse, Lauren (September 2008). "The pros and cons of file-level vs. block-level data deduplication technology". TechTarget. Tech Target Inc. Retrieved 13 November 2017.
  91. ^ a b c d e f g h Cite error: The named reference RetrospectKnowledgeBase was invoked but never defined.
  92. ^ "About the Accelerator feature in NetBackup 7.5". Veritas Support. Veritas Technologies LLC (US). 10 November 2017. Retrieved 18 November 2017.
  93. ^ "Veritas Backup Exec Administrator's Guide: How Backup Exec determines if a file has been backed up". Veritas Support. Veritas Technologies LLC. 11 November 2017. Retrieved 7 February 2018.
  94. ^ Adam Engst (6 November 2012). "Retrospect 10 Reduces Backup Time with Instant Scan Technology". TidBITS. TidBITS Publishing Inc. Retrieved 25 October 2016.
  95. ^ Rassokhin?, Alexander? (2012). "Enterprise Backup Software: Backup Network Workstations, Email and Databases". All about Backup. Novosoft LLC. Retrieved 24 January 2018.
  96. ^ "Veritas NetBackup ™ 8.0 – 8.x.x Database and Application Agent Compatibility List". Veritas. Veritas Technologies LLC (US). 17 November 2017. Retrieved 19 November 2017.
  97. ^ "Backup Exec TM 16 Agents and Options" (PDF). Veritas. Veritas Technologies LLC. 2016. Retrieved 14 January 2018.
  98. ^ a b c d e "Retrospect ® 14.0 Mac User's Guide" (PDF). Retrospect. Retrospect Inc. March 2017. pp. 8-9(Script Hooks—backing up interactive applications with pausing and integration with monitoring system), 18-26(Overview of the Retrospect Console), 27-28(High-level Dashboard—high-level/medium-term reports), 29(How Retrospect Works—Smart Incremental), 31-33(Media Sets), 73(Adding network shares), 74-75(User-initiated backups and restores), 124-126(Archiving), 168-169(Email Preferences), 217(Retrospect for iOS). Retrieved 27 April 2019.
  99. ^ Seget, Vladan (20 December 2017). "Veeam Backup and Replication 9.5 U3 Released". ESXVirtualization. Retrieved 20 December 2017.
  100. ^ a b "Retrospect: Retrospect Virtual". Retrospect.com. Retrospect Inc. 2018. Retrieved 28 October 2018.
  101. ^ "Backup & Replication Console". Veeam Help Center. Veeam Software. 5 April 2016. Retrieved 28 October 2018.
  102. ^ "Symantec NetBackup ™ Administrator's Guide, Volume I Windows" (PDF). Symantec. Veritas Technologies LLC (US). 2012. pp. 35–45(Administration Console), 833–843(Activity Monitor), 888–894(Reports utility), 912(Remote Administration Console), 915–938(Java Console). Retrieved 18 November 2017.
  103. ^ "Symantec Backup Exec: About the Administration Console". Helpmax.net. HelpMax Software Help & Shop Inc. Retrieved 10 December 2017.
  104. ^ "OpsCenter Operational Restore". Veritas Support. Veritas Technologies LLC (US). 12 March 2012. Retrieved 18 November 2017.
  105. ^ "How Backup Exec Retrieve works". Helpmax.net. HelpMax Software Help & Shop Inc. Retrieved 14 January 2018.
  106. ^ a b "Data Hooks: Modular Web Plugins for Retrospect Dashboard". Retrospect. Retrospect Inc. 2019. screenshots. Retrieved 14 April 2019.
  107. ^ a b c Antony, Erica; Tim Burlowski (January 2008). "NetBackup Operations Manager: Monitoring, Alerting and Reporting for Veritas NetBackup" (PDF attachment). Symantec. Veritas Technologies LLC (US). pp. 4–5(monitoring), 6–7(alerting), 7(3rdPartyEventMgmt.), 11–18(reporting). Retrieved 18 November 2017.
  108. ^ "Windows® Enterprise Data Protection with Symantec Backup Exec™" (PDF). Symantec. Veritas Technologies LLC. 2007. pp. 5–8 (CASO). Retrieved 14 January 2018.
  109. ^ "How to configure notification recipients in Backup Exec 12.0 and above". Veritas Support. Veritas Technologies LLC. 10 November 2017. Retrieved 15 January 2018.
  110. ^ "Veritas Backup Exec Administrator's Guide: About the Job Monitor". Veritas Support. Veritas Technologies LLC. 11 November 2017. Retrieved 15 January 2018.
  111. ^ "Nagios plugins for monitoring BackupExec". Nagios Exchange. Nagios Enterprises. Retrieved 15 January 2018.
  112. ^ "EMC Announces Retrospect 8.0 Backup and Recovery Software For Mac". DellEMC [current]. EMC Corp. [orig. publisher]. 6 January 2009. Retrieved 10 November 2016.
  113. ^ "Veritas Backup Exec Administrator's Guide: Configuring network options for backup jobs". Veritas Support. Veritas Technologies LLC. 17 November 2017. Retrieved 15 January 2018.
  114. ^ "Veritas NetBackup™ Deduplication Guide" (PDF). Veritas. Veritas Technologies LLC (US). 2016. p. 171(Resilient network properties). Retrieved 18 November 2017.
  115. ^ a b "What Is an AWS Snowball Appliance?". AWS. Amazon.com. 2018. Retrieved 8 March 2018.
  116. ^ Rouse, Margaret (December 2011). "Definition: cloud seeding". TechTarget. Tech Target Inc. Retrieved 16 November 2017.
  117. ^ "Changing paths Cloud Mac" (Video). Retrospect Inc. 29 February 2016. Retrieved 7 October 2016 – via YouTube.
  118. ^ High, Dave; Mahmud, Fozz (10 March 2016). "NBU and the Amazon Storage Gateway VTL" (Video). Veritas. Veritas Technologies LLC. Retrieved 17 January 2018.
  119. ^ "Backup Exec 16: Best Practices for Using the Veritas Backup Exec Cloud Connector". Veritas Support. Veritas Technologies LLC. 25 October 2017. Retrieved 15 January 2018.