Tape Recovery - How Tape Works

The use of tape for data storage and data recovery in the computer industry goes back many decades. Tape provided a solid and robust means of storing code and data, along with a far lower cost-of-ownership than the available hard disk options. Today the cost of hard disk has plummeted, but tape storage is still considered to be the best available form of long-term archival storage in terms of price and resilience.

Those of us who were born long enough ago can remember treading gingerly, and speaking in low tones, when passing the floor of the computer building where the hard disks were housed. Disks were unreliable, low volume and expensive to run, whereas recovering data from tapes was fast enough and likely to work.

The concept of "near-line" storage developed, and still exists in the world of mainframe, AS/400 and large scale UNIX computing. Years ago a request to recover a file would result in a message popping up on the computer operator's screen to fetch the open reel tape labelled KV19473D and load it on drive 15. The data was recovered from the tape after only a short delay to the user.

These days the operator has been replaced by some form of robotic tape library, and the open reel tape by a tape cartridge that can be handled mechanically (for example IBM 3590 and TS1120, STK 9840, 9940 and T10000, and of course LTO Ultrium and DLT). This process developed into Hierarchical Storage Management, also named HSM, and allows for "infinite" storage (as infinite as you can afford space, tape drives and media).

With smaller systems, and this includes some systems that would look pretty large today such as MicroVAX, there was a more procedural use of tape data storage. Partly this was due to the cost of robotic equipment, but mostly it was because the rise of the mini and micro computer coincided with the start of the rise in lower cost, more reliable hard disks, the concept of client/server, and the daily tape backup as a source of data only required when a failure occurred, so avoiding the need for hard drive data recovery work.

Attempts at introducing HSM into this market, using intermediate storage such as Optical Disk and using tape for longer term archive, came and went throughout the 1990s, but largely tape was used as a backup and retrieval medium.

How Tape Storage differs from Disk

Setting aside the material differences and the low-level recording technologies used, the general concepts are no different between magnetic tape and hard disk. Each uses magnetism to encode data on a suitably receptive recording medium.

The real differences are in implementation and usage, and reflect the major physical differences between the two.

The short answer to "what is the difference" is that disk is a random access medium and tape is a sequential one. To go into greater depth, disks are generally pre-formatted with a known number of recordable "sectors" whereas tape is written on-the-fly.

The sequential access nature of tape reflects its physical character: it is long and narrow, and to get to some data at the far end the drive has to traverse the length of the tape. With disk recording, to recover any recorded sector all the drive must do is position the read head to the right track and wait for the data to spin past. With an access time of small fractions of a second versus anything up to a couple of minutes, you wouldn't get far implementing random access on tape.

The issue of formatting though is far from clear. Early open reel tapes, Exabytes and quarter-inch cartridges (the older version of SLR often known as streamers) had erase mechanisms that cleaned the tape ahead of the write head so recording was always to blank tape.

The smaller quarter-inch cartridges, DC2000 and more recently Travan, ADR and Ditto, were formatted with sectors (usually during manufacture). The very first DC2000 drives ran from the diskette controller in a PC and operated like diskettes. So in theory they were random access, but the practical access time meant that few people would live long enough to use them in that manner for a significant volume of data.

Newer tape formats (SDLT, LTO Ultrium, 3590, 3570 and many others), whilst not being pre-formatted with data sectors, do have a lot of servo data written to them during manufacture, and if they are erased they become useless. This includes servo tracking data that is used to assist in the data alignment process now that the recording densities have increased and there is little or no space left unused.

One, often unwelcome, feature of tape storage is the concept of "the last thing you wrote is the last thing you can recover". With a hard disk each sector is uniquely addressable. If data is written to sector 79 it has no impact upon sectors 78 and 80. With tape, as soon as recording finishes the drive determines that the last thing written is the new end-of-data. So if you have a tape containing 400GB and write 2MB to the start of it, there is just under 400GB sitting on the tape that cannot be accessed without recourse to a tape data recovery service.

In Data Recovery parlance this is over-writing or re-initialisation. Don't be fooled into thinking that there is any chance of getting back the data that has actually been over-written; that is the stuff of science fiction. The remaining but inaccessible data, however, can often be recovered from the tape.

The advantage that tape gives is that each file is usually stored contiguously and there are none of the frailties of file allocation tables involved when accessing the data.

This is all generally true, but there are no rules. Some tape recording formats (Legato Networker, NetBackup and ARCserve amongst them) take data from multiple sources and intertwine it on the tape (sometimes known as multiplexing or multi-streaming).

As said earlier there is nothing to stop the development of random-access tape, but the shape is wrong and it would never catch on.

There are, however, compromises. IBM 3570 and STK 9840 attempt to split the difference between the two styles of recording. They use a tape cassette, so the tape is on two reels within the case, rather than a single spool as with DLT and Ultrium, where the tape is transferred to a take-up reel within the drive. The "start" of tape is actually in the middle, so at load time the tape is half way from either end, and the data is stored on multiple tracks so that the drive can position across and along the tape to locate data. So it is a nod towards random access and gives a faster access time than your average tape, though the time to recover data from any single file is still generally considerably longer than with disk.

Tape Storage Concepts

We can set aside the actual recording technique and put the clock back to the 9-track ½-inch open reel tape for the concepts involved in tape data storage and tape data recovery. This type of tape was predominant during the 1980s, and to an extent the drives that followed had to imitate its methodology in order to replace it. This means that an Ultrium drive, a DLT drive and a DAT drive all take data and give it back exactly as the open reel drive did, even though they use radically different recording formats.

With the open reel tape, data was transferred to the drive as a sequence of data buffer loads named blocks. The drive would encode each of these with its own identification and error correction data, and with a gap in between each one. This inter-block gap is why you might sometimes hear people saying that they "used a larger block size to increase capacity". On open reel tapes the gap was of a fixed size, so the smaller the block size the greater the number of blocks required to store an amount of data. The greater the number of blocks, the greater the number of gaps, and so capacity was lost. Then again, with older tapes the larger the block size the more chance of hitting an unusable area of tape, so the whole thing was a bit hit-and-miss.

With modern drives the data block is merely what you send to the drive, and what you get back. Internally it is a matter of encoding and has little to do with how data is actually stored.

The above had a couple of exceptions, notably the earlier Exabyte 8mm helical scan drives. These split data into 1024 byte sections when writing to tape and would not share a 1024 byte storage unit between user data blocks. The consequence of this was that if you wrote 1025 byte blocks to a tape then each was written as 2048 bytes and the capacity of the tape was halved. There are exceptions to all rules.

Tape concepts

So, tape drives record to theoretically blank tape, have no pre-formatting, and if you record data at the start you have lost everything that you have overwritten and anything after the point at which you stop writing.

There is nothing either right or wrong with any of this, it is just the way they are. What tape gives you is high volume, low cost per gigabyte storage that you can drop on the floor, pick up and read afterwards. Don't try that with a hard disk and then expect to be able to easily recover your data.

File Marks aka Tape Marks

These are a sub-divider that you won't find on a disk. A file mark is a data pattern encoded by the drive and used to allow spacing to a particular position on a tape. If you want to recover data from backup set 3, the backup software doesn't read through backup sets 1 and 2 first; it skips file marks and then starts to read and recover data once it has found set 3.

With 4mm DAT there is an additional type of file mark named the set mark. This allows there to be two distinct types of data marker, though only Sytos Plus, SBACKUP and a few proprietary formats ever made use of this feature.

Helical scan drives (AIT, Exabyte and DAT) encode file marks so that they can be found during high speed search operations. Normally, as with a video recorder, the tape moves slowly during reading. It would take 2 to 3 hours to position down the tape at reading speed, so they kick the drive into fast seek and can then get to the next file mark in a fraction of the time. In video terms this is a "fast-forward", and it enables fast access when recovering data from the tape.

Don't be fooled by the name though. They sound like small markers when actually they can be several megabytes in size on some types of tape.

End of Recorded Media

When reading from a tape you might encounter a condition named "End of Recorded Media", sometimes reported as "Blank Check". On older drives, when recording completed the drive would erase a length of tape afterwards. Subsequent reading attempts would run into this length of blank tape and know they had reached the end. Modern drives encode a data pattern, similar in size to a file mark, that denotes the end of recording. Data recovery via normal means stops at this point. There is no way past, and specialist recovery methods and technology need to be employed to gain access to this lost data.

Mainframe, and some midrange, systems did not rely upon the drive reporting that the end of data had been reached but relied upon their own devices. IBM systems would encode a double file mark, HP systems used a triple file mark. These patterns denoted logical end of data.

These systems will still rely on their logical mechanism for saying "that's it", but the drive will still do its own thing. The moment recording stops the end-of-data (EOD) mark is written, and that is that without professional data recovery assistance.

Block Modes

Variable Block Mode

Disks are typically formatted with recordable sectors each of 512 bytes. IBM for the AS/400 uses 520 or 522 bytes. Tapes, of course, have to be different. Modern tape drives can record in either fixed block or variable block mode. This is to enable them to plug into systems that have differing pedigrees.

Mainframe systems, for example IBM 370/390 and AS/400 (OK it is not a mainframe but it behaves like one), write data in chunks that are the correct size for their purpose. The label block at the start of an IBM Labelled tape was defined as being 80 bytes long, so an 80 byte block was written to the tape. Since 80 byte blocks were not a practical proposition when dealing with open reel tapes, the actual data was written in larger chunks limited in size only by available memory in either the system or the tape drive formatter.

Fixed Mode Recording

Smaller systems and cheaper drives tended to deal with data pretty much as they did with disk. It did not matter how big the data was, it would be written out in equal sized chunks, and the drives available in this market segment obliged. The early quarter inch cartridge drives would only record data in 512 byte sections. Smaller UNIX systems and PC systems that have a tradition of recording in this manner still do. The only real difference between disk and tape here is that the tape block sizes for fixed mode recording have typically extended to 64KB or higher.

Later drives have been designed to be backwards compatible with this more primitive format and with the more expensive drives that operate in variable mode, or to be plugged in as direct replacements for these drives, and so can operate as either Fixed or Variable Block recording devices.

Block Numbering

Early drives relied on skipping file marks to position along tape, but later tape devices introduced the concept of block numbering. So each tape block has a unique number starting at 0.

This partly explains why the tape block sizes used have increased over time. The SCSI specification describes the block number using 3 bytes, a maximum of 16,777,215 blocks. With 512 byte blocks this would mean that the maximum capacity of tape would be in the region of 8.6GB, not very helpful if writing to an 800GB Ultrium 3 data cartridge.

Recording Techniques

Three fundamental tape storage formats have developed since the late 1980s.

Multi-track parallel
Helical Scan
Serpentine

The ground between parallel and serpentine formats has closed more recently, though, with drives having elements of both formats.

½" open reel - AKA 9-track parallel

The drive records 9 tracks of data at once to the tape surface. Recording begins at the physical start of the tape (PBOT) and ends at the physical end of the tape (PEOT). This format developed from the punch card idea with the eight bit byte and a parity bit. So this is one byte at a time recording.

The capacity of these tapes is tiny by today's standards. NRZI recording format managed a staggering 23MB at 800bpi on a 2400 foot tape. In its heyday, with a massive 6250 bits-per-inch, the capacity rose to an impressive 180MB.

Helical Scan

We are all more familiar with helical scan than we might realise. It is a technology that was developed for video recording (VHS and Video8) and sound recording (DAT).

The tape is wrapped around a cylinder that contains the read and write heads. The tape moves slowly whilst the cylinder spins quickly, with each rotation allowing data to be written and then read back to check (read-after-write).

The name helical scan springs from the pattern described by the head passing along a slowly moving tape, "describing a portion of a helix". (It is probably a more marketable name than "diagonal data".)

Exabyte Corporation took the Sony Video8 8mm recorder, added a SCSI interface and some additional checking and came up with a 2GB data storage format which was way ahead of its rivals, albeit briefly.

HP and Sony adapted developments in the audio market with 4mm media named Digital Audio Tape, added additional error correction and came up with DDS DAT. Sony later created AIT, an 8mm helical scan format, and even one of the STK mainframe drives used this technique.

Serpentine

The name arises from the pattern of the recording being forwards and backwards for a number of tracks, apparently a bit snake-like in character (according to some imaginative marketing person).

Early drives had a pair of recording heads, one for forwards recording and one for reverse. The drive would record forwards until physical end-of-tape (PEOT), reverse until physical beginning-of-tape (PBOT), then re-position the heads and repeat the process. Early drives recorded 4 tracks; the latest record hundreds and overlap with the parallel formats by recording several tracks simultaneously. Equally, parallel format recording drives now record along the tape forwards and then in reverse, so they have become almost serpentine.

In the data recovery context there is the issue that physical damage impacts multiple places in the recording, since the drive passes across each area of tape more than once. Of course this is only an issue if the tape snaps or becomes crumpled, and there is an argument as to how likely this is compared with helical scan devices, which have a much more complex tape path. We have no intention of entering the affray between exponents of each style of recording.

Conclusion

Tape still has a major part to play in data security and the long term archival of important information. As a data recovery specialist I see both failed hard disk drives and damaged tapes, and whilst tape recovery comes with its own set of challenges that can make it a tortuous process, seldom is a tape a complete failure and the data recovery success rate is well over 95%.
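
A worked check of the figures quoted above

Several of the numbers in this article (the 3-byte block number limit, the halved capacity of early Exabyte tapes with 1025 byte blocks, and the 23MB and 180MB open reel capacities) follow from simple arithmetic. The Python sketch below is only an illustration of that arithmetic, not a description of any drive's firmware. The function names are made up for this example, the open reel figures treat bits-per-inch as bytes-per-inch because each frame across the 9 tracks carries one byte, and the 0.3 inch inter-block gap used in the usage note is an assumed value chosen purely to show the effect of block size.

```python
import math


def max_addressable_bytes(block_size, address_bytes=3):
    """Capacity reachable when the block number fits in `address_bytes` bytes.

    Uses the article's figure of 16,777,215 blocks for a 3-byte number;
    counting block 0 as well would add one more block.
    """
    max_blocks = 2 ** (8 * address_bytes) - 1
    return max_blocks * block_size


def exabyte_stored_bytes(block_size, unit=1024):
    """Tape space used by one user block when a storage unit cannot be shared."""
    return math.ceil(block_size / unit) * unit


def open_reel_capacity(length_feet, bpi, block_size=None, gap_inches=0.0):
    """Approximate 9-track capacity in bytes.

    With no block size given this is the raw figure (length x density).
    With a block size, each block is followed by a gap of `gap_inches`
    (an assumed parameter, not taken from the article).
    """
    length_inches = length_feet * 12
    if block_size is None:
        return int(length_inches * bpi)
    inches_per_block = block_size / bpi + gap_inches
    whole_blocks = int(length_inches / inches_per_block)
    return whole_blocks * block_size


if __name__ == "__main__":
    print(max_addressable_bytes(512))        # 8589934080, about 8.6GB
    print(max_addressable_bytes(64 * 1024))  # 1099511562240, about 1.1TB
    print(exabyte_stored_bytes(1025))        # 2048
    print(open_reel_capacity(2400, 800))     # 23040000
    print(open_reel_capacity(2400, 6250))    # 180000000
```

Calling open_reel_capacity(2400, 6250, block_size=80, gap_inches=0.3) and then the same with block_size=32768 shows why people "used a larger block size to increase capacity": under that assumed gap, the 80 byte blocks leave most of the tape as gaps, while the larger blocks come close to the raw 180MB.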