| The use of tape for data storage and data recovery | | | | There is nothing either right or wrong with any of |
| in the computer industry goes back many decades. | | | | this, it is just the way they are. What tape gives |
| Tape provided a solid and robust means of storing | | | | you is high volume, low cost per gigabyte storage |
| code and data, along with a far lower | | | | that you can drop on the floor, pick up and read |
| cost-of-ownership than the available hard disk options. | | | | afterwards. Don't try that with a hard disk and then |
| Today the cost of hard disk has plummeted, but | | | | expect to be able to easily recover your data. |
| tape storage is still considered to be the best | | | | File Marks aka Tape Marks |
| available form of long-term archival storage in terms | | | | These are a sub-divider that you won't find on a disk. |
| of price and resilience. | | | | A file mark is a data pattern encoded by the drive |
| Those of us who were born long enough ago can | | | | and used to allow spacing to a particular position on a |
| remember treading gingerly, and speaking in | | | | tape. You want to recover data from backup set 3, |
| low tones when passing the floor of the computer | | | | well the backup software doesn't read through |
| building where the hard disks were housed. Disks | | | | backup sets 1 and 2 first, it skips file marks and then |
| were unreliable, low volume, and expensive to run | | | | starts to read and recover data once it has found |
| whereas to recover data from tapes was fast | | | | set 3. |
| enough and likely to work. | | | | With 4mm DAT there is an additional type of file |
| The concept of "near-line" storage developed, and still | | | | mark named as the set mark. This allows there to be |
| exists in the world of mainframe, AS/400 and large | | | | two distinct types of data marker, though only Sytos |
| scale UNIX computing. Years ago a request to | | | | Plus, SBACKUP and a few proprietary formats ever |
| recover a file would result in a message popping up | | | | made use of this feature. |
| on the computer operator's screen to fetch the open | | | | Helical Scan drives, AIT, Exabyte and DAT, encode |
| reel tape labelled KV19473D and load it on drive 15. | | | | file marks so that they can be found during high |
| The data was recovered from the tape after only a | | | | speed search operations. Normally, as with a video |
| short delay to the user. | | | | recorder, the tape moves slowly during reading. It |
| These days the operator has been replaced by some | | | | would take 2 to 3 hours to position down the tape |
| form of robotic tape library, and the open reel tape | | | | at reading speed so they kick the drive into fast |
| by a tape cartridge that can be handled mechanically | | | | seek and can then get to the next file mark in a |
| (for example IBM 3590 and TS1120, STK 9840, 9940 | | | | fraction of the time. In video terms this is a |
| and T10000, and of course LTO Ultrium and DLT). | | | | "fast-forward" and enables fast access when |
| This process developed into Hierarchical Storage | | | | recovering data from the tape. |
| Management, also named HSM, and allows for | | | | Don't be fooled by the name though. They sound like |
| "infinite" storage (as infinite as you can afford space, | | | | small little markers when actually they can be several |
| tape drives and media). | | | | megabytes in size on some types of tape. |
| With smaller systems, this includes some systems | | | | End of Recorded Media |
| that would look pretty large today such as MicroVax, | | | | When reading from a tape you might encounter a |
| there was a more procedural use of tape data | | | | condition named "End of Recorded Media", sometimes |
| storage. Partly this was due to the cost of robotic | | | | reported as "Blank Check". On older drives when |
| equipment, but mostly as the rise of the mini and | | | | recording completed the drive would erase a length |
| micro computer coincided with the start of the rise in | | | | of tape afterwards. Subsequent reading attempts |
| lower cost more reliable hard disks and the concept | | | | would run into this length of blank tape and know |
| of client/server and the daily tape backup as a | | | | they had reached the end. Modern drives encode a |
| source of data only required when a failure occurred | | | | data pattern, similar in size to a file mark, that |
| and so to avoid requiring hard drive data recovery | | | | denotes the end of recording. Data recovery via |
| work. | | | | normal means stops at this point, there is no way |
| Attempts at introducing HSM into this market, using | | | | past and specialist recovery methods and technology |
| intermediate storage such as Optical Disk and using | | | | need to be employed to gain access to this lost data. |
| tape for longer term archive, came and went | | | | Mainframe, and some midrange, systems did not rely |
| throughout the 90's but largely tape was used as a | | | | upon the drive reporting that the end of data had |
| backup and retrieval medium. | | | | been reached but relied upon their own devices. IBM |
| How Tape Storage differs from Disk | | | | systems would encode a double file mark, HP |
| Setting aside the material differences and the | | | | systems used a triple file mark. These patterns |
| low-level recording technologies used the general | | | | denoted logical end of data. |
| concepts are no different between magnetic tape | | | | These systems will still rely on their logical mechanism |
| and hard disk. Each uses magnetism to encode data | | | | for saying "that's it", but the drive will still do its own |
| on a suitably receptive recording medium. | | | | thing. The moment recording stops the EOD is |
| The real differences are in implementation and usage, | | | | written and that is that without professional data |
| and reflect the major physical differences between | | | | recovery assistance. |
| the two. | | | | Block Modes |
| The short answer to "what is the difference" is that | | | | Variable Block Mode |
| disk is a random access medium and tape is a | | | | Disks are typically formatted with recordable sectors |
| sequential one. To go into greater depth, disks are | | | | each of 512 bytes. IBM for the AS/400 use 520 or |
| generally pre-formatted with a known number of | | | | 522 bytes. Tapes, of course, have to be different. |
| recordable "sectors" whereas tape is written | | | | Modern tape drives can record in either fixed block |
| on-the-fly. | | | | or variable block mode. This is to enable them to plug |
| The sequential access nature of tape reflects its | | | | into systems that have differing pedigrees. |
| physical character, it is long and narrow and to get to | | | | Mainframe systems, for example IBM S/370, S/390 and |
| some data at the far end the drive has to traverse | | | | AS/400 (OK it is not a mainframe but it behaves like |
| the length of the tape. With disk recording to | | | | one), write data in chunks that were the correct size |
| recover any recorded sector all the drive must do is | | | | for their purpose. The label block at the start of an |
| position the read head to the right track and wait for | | | | IBM Labelled tape was defined as being 80 bytes |
| the data to spin past. So an access time of small | | | | long, so an 80 byte block was written to the tape. |
| fractions of seconds versus anything up to a couple | | | | Since 80 byte blocks were not a practical proposition |
| of minutes, you wouldn't get far implementing | | | | when dealing with open reel tapes the actual data |
| random access on tape. | | | | was written in larger chunks limited in size only by |
| The issue of formatting though is far from clear. | | | | available memory in either the system or the tape |
| Early open reel tapes, Exabytes and quarter-inch | | | | drive formatter. |
| cartridges (the older version of SLR often known as | | | | Fixed Mode Recording |
| streamers) had erase mechanisms that cleaned the | | | | Smaller systems and cheaper drives tended to deal |
| tape ahead of the write head so recording was | | | | with data pretty much as they did with disk. It did |
| always to blank tape. | | | | not matter how big the data was it would be written |
| The smaller quarter-inch cartridges, DC2000 and more | | | | out in equal sized chunks and the drives available in |
| recently Travan, ADR and Ditto, were formatted | | | | this market segment obliged. The early quarter inch |
| with sectors (usually during manufacture). The very | | | | cartridge drives would only record data in 512 byte |
| first DC2000 drives ran from the diskette controller in | | | | sections. Smaller UNIX systems and PC systems have |
| a PC and operated like diskettes. So in theory they | | | | a tradition of recording in this manner and still do. The |
| were random access, but the practical access time | | | | only real difference between disk and tape here is |
| meant that few people would live long enough to use | | | | that the tape block sizes for fixed mode recording |
| them in that manner for a significant volume of data. | | | | have typically extended to 64KB or higher. |
| Newer tape formats (SDLT, LTO Ultrium, 3590, 3570 | | | | Later drives have been designed to be backwards |
| and many others), whilst not being pre-formatted | | | | compatible with this more primitive format and with |
| with data sectors do have a lot of servo data | | | | the more expensive drives that operate in variable |
| written to them during manufacture and if they are | | | | mode, or to be plugged in as direct replacement for |
| erased become useless. This includes servo tracking | | | | these drives and so can operate as either Fixed or |
| data that is used to assist in the data alignment | | | | Variable Block recording devices. |
| process now that the recording densities have | | | | Block Numbering |
| increased and there is little or no space left unused. | | | | Early drives relied on skipping file marks to position |
| One, often unwelcome, feature of tape storage is | | | | along tape, but later tape devices introduced the |
| the concept of "the last thing you wrote is the last | | | | concept of block numbering. So each tape block has |
| thing you can recover". With a hard disk each sector | | | | a unique number starting at 0. |
| is uniquely addressable. If data is written to sector 79 | | | | This partly explains why the tape block sizes used |
| it has no impact upon sectors 78 and 80. With tape, | | | | have increased over time. The SCSI specification |
| as soon as recording finishes the drive determines | | | | describes the block number using 3 bytes, a |
| that the last thing written is the new end-of-data. So | | | | maximum of 16,777,215 blocks. With 512 byte blocks |
| if you have a tape containing 400GB and write 2MB | | | | this would mean that the maximum capacity of tape |
| to the start of it, there is just under 400GB sitting on | | | | would be in the region of 9GB, not very helpful if |
| the tape that cannot be accessed without recourse | | | | writing to an 800GB Ultrium 3 data cartridge. |
| to a tape data recovery service. | | | | Recording Techniques |
| In Data Recovery parlance this is over-writing or | | | | Three fundamental tape storage formats have |
| re-initialisation. Don't be fooled into thinking that there | | | | developed since the late 1980s. |
| is any chance of getting the data back that has | | | | Multi-track parallel |
| actually been over-written, that is the stuff of | | | | Helical Scan |
| science fiction, but the remaining but inaccessible data | | | | Serpentine |
| can often be recovered from the tape. | | | | Although the ground between parallel and serpentine |
| The advantage that tape gives is that each file is | | | | formats has closed more recently with drives having |
| usually stored contiguously and there are none of the | | | | elements of both formats. |
| frailties of file allocation tables involved when | | | | ½" open reel - also known as 9-track parallel |
| accessing the data. | | | | The drive records 9 tracks of data at once to the |
| This is all generally true, but there are no rules. Some | | | | tape surface. Recording begins at the physical start |
| tape recording formats (Legato Networker, | | | | of the tape (PBOT) and ends at the physical end of |
| NetBackup and ARCserve amongst them) take data | | | | the tape (PEOT). This format developed from the |
| from multiple sources and intertwine it on the tape | | | | punch card idea with the eight bit byte and a parity |
| (sometimes known as multiplexing or multi-streaming). | | | | bit. So this is one byte at a time recording. |
| As said earlier there is nothing to stop the | | | | The capacity of these tapes is tiny by today's |
| development of random-access tape, but the shape | | | | standards. NRZI recording format managed a |
| is wrong and it would never catch on. | | | | staggering 23MB at 800bpi on a 2400 foot tape. In |
| There are, however, compromises. IBM 3570 and | | | | its heyday, with a massive 6250 bits-per-inch the |
| STK 9840 attempt to split the difference between | | | | capacity rose to an impressive 180MB. |
| the two styles of recording. They use a tape | | | | Helical Scan |
| cassette, so the tape is on two reels within the case | | | | We are all more familiar with helical scan than we |
| rather than like DLT and Ultrium where there is a | | | | might realise. It is a technology that was developed |
| single spool and the tape is transferred to a take-up | | | | for video recording (VHS and Video8) and sound |
| reel within the drive. The "start" of tape is actually in | | | | recording (DAT). |
| the middle, so at load time the tape is half way from | | | | The tape is wrapped around a cylinder that contains |
| either end, and the data is stored on multiple tracks | | | | the read and write heads. The tape moves slowly |
| so that the drive can position across and along the | | | | whilst the cylinder spins quickly with each rotation |
| tape to locate data. So a nod towards | | | | allowing data to be written and then read back to |
| random-access and faster access time than your | | | | check (Read-after-write). |
| average tape though the time to recover data from | | | | The name Helical scan springs from the pattern |
| any single file is still generally considerably longer than | | | | described by the head passing along a slowly moving |
| with disk. | | | | tape as "describing a portion of a helix". (it is probably |
| Tape Storage Concepts | | | | a more marketable name than "diagonal data") |
| We can set aside the actual recording technique and | | | | Exabyte Corporation took the Sony Video8 8mm |
| put the clock back to the 9-track ½-inch open | | | | recorder, added a SCSI interface and some additional |
| reel tape for the concepts involved in tape data | | | | checking and came up with a 2GB data storage |
| storage and tape data recovery. This type of tape | | | | format which was way ahead of its rivals, albeit |
| was predominant during the 1980's and to an extent | | | | briefly. |
| the drives that followed had to imitate the | | | | HP and Sony adapted developments in the audio |
| methodology followed in order to replace it. This | | | | market with 4mm media named Digital Audio Tape, |
| means that an Ultrium drive, a DLT drive and a DAT | | | | added additional error correction and came up with |
| drive all take data and give it back exactly as the | | | | DDS DAT. Sony later created AIT, an 8mm |
| open reel drive did, even though they use radically | | | | helical scan format and even one of the STK |
| different recording formats. | | | | mainframe drives used this technique. |
| With the open reel tape data was transferred to the | | | | Serpentine |
| drive as a sequence of data buffer loads named | | | | The name arises from the pattern of the recording |
| blocks. The drive would encode each of these with | | | | being forward and backwards for a number of |
| its own identification and error correction data, and | | | | tracks, apparently a bit snake-like in character |
| with a gap in between each one. This inter-block gap | | | | (according to some imaginative marketing person). |
| is why you might sometimes hear people saying that | | | | Early drives had a pair of recording heads, one for |
| they "used a larger block size to increase capacity". | | | | forwards recording and one for reverse. The drive |
| On open reel tapes the gap was of a fixed size so | | | | would record forwards until physical end-of-tape |
| the smaller the block size the greater the number of | | | | (PEOT), reverse until physical beginning-of-tape |
| blocks required to store an amount of data. The | | | | (PBOT), then re-position the heads and repeat the |
| greater the number of blocks, the greater the | | | | process. Early drives recorded 4 tracks, the latest |
| number of gaps and so capacity was lost. Then | | | | record hundreds and overlap with the parallel formats |
| again, with older tapes the larger the block size the | | | | by recording several tracks simultaneously. |
| more chance of hitting an unusable area of tape so | | | | Equally parallel format recording drives now records |
| the whole thing was a bit hit-and-miss. | | | | along the tape forwards and then reverse so they |
| With modern drives the data block is merely what | | | | have become almost serpentine. |
| you send to the drive, and what you get back. | | | | In the data recovery context there is the issue that |
| Internally it is a matter of encoding and has little to | | | | physical damage impacts multiple places in the |
| do with how data is actually stored. | | | | recording since the drive passes across each area of |
| The above had a couple of exceptions, notably the | | | | tape. Of course this is only an issue if the tape snaps |
| earlier Exabyte 8mm helical scan drives. These split | | | | or becomes crumpled, and there is an argument as |
| data into 1024 byte sections when writing to tape | | | | to how likely this is compared with helical scan |
| and would not share a 1024 byte storage unit | | | | devices which have a much more complex tape path. |
| between user data blocks. The consequence of this | | | | We have no intention of entering the affray |
| was that if you write 1025 byte blocks to a tape | | | | between exponents of each style of recording. |
| then each was written as 2048 bytes and the | | | | Conclusion |
| capacity of a tape was halved. There are exceptions | | | | Tape still has a major part to play in data security |
| to all rules. | | | | and the long term archival of important information. |
| Tape concepts | | | | As a data recovery specialist I see both failed hard |
| So, tape drives record to theoretically blank tape, | | | | disk drives and damaged tapes, and whilst tape |
| have no pre-formatting, and if you record data at | | | | recovery comes with its own set of challenges that |
| the start you have lost everything that you have | | | | can make it a tortuous process, seldom is a tape a |
| overwritten and anything after the point at which | | | | complete failure and the data recovery success rate |
| you stop writing. | | | | is well over 95%. |