What technology does RAID 5 use? Types of RAID arrays

Many users have heard of RAID disk arrays, but few have a clear idea of what they actually are. As it turns out, there is nothing complicated here. Let's look at the essence of the term in plain language, with an explanation aimed at the average user.

What are RAID disk arrays?

First, let's look at the general interpretation offered by online publications. Disk arrays are entire information storage systems consisting of a combination of two or more hard drives, serving either to increase the speed of access to stored information, or to duplicate it, for example, when saving backup copies.

In theory, the number of hard drives in such a combination is unlimited; in practice it depends on how many connections the motherboard supports. So why are RAID disk arrays used at all? The point is that hard drive technology has long been frozen at one level (spindle speed of 7200 rpm, cache size, and so on). The only exception is SSD models, but even they are mainly growing in capacity. Meanwhile, progress in processors and RAM modules is far more noticeable. RAID arrays are therefore a way to gain performance when accessing hard drives.

RAID disk arrays: types, purpose

As for the arrays themselves, they are conventionally distinguished by number (0, 1, 2, etc.). Each number corresponds to one of the functions described above.

The main ones in this classification are disk arrays with numbers 0 and 1 (later it will be clear why), since they are the ones assigned the main tasks.

When creating an array from several connected hard drives, start in the BIOS settings, where the SATA configuration section must be set to RAID. It is important that the connected drives have identical parameters: capacity, interface, connection, cache, and so on.

RAID 0 (Striping)

Level 0 arrays are designed to speed up access to stored information (both writing and reading). As a rule, they combine two to four hard drives.

The main problem is that if one of the disks fails, the information on the others becomes useless as well. Data is written in blocks to each disk in turn, and in theory the performance gain is proportional to the number of hard drives (that is, four disks are about twice as fast as two). The data is lost because the blocks of a single file are spread across different disks, even though the user sees ordinary files in Explorer.

RAID 1

Level 1 arrays belong to the mirroring category and protect data by duplicating it.

Roughly speaking, the user gives up some performance but can be sure that if data disappears from one disk, it is still stored on the other.

RAID 2 and higher

Arrays numbered 2 and higher have a dual purpose. On the one hand, they store information; on the other, they are used to correct errors.

In other words, disk arrays of this type combine the capabilities of RAID 0 and RAID 1, but are not particularly popular among computer scientists, although their operation is based on the use

What is better to use in practice?

Of course, if the computer runs resource-intensive programs, for example modern games, RAID 0 is the better choice. If you work with important information that must be preserved at any cost, turn to RAID 1. Since arrays numbered two and above have not become popular, whether to use them is purely up to the user. By the way, RAID 0 is also practical if you often download multimedia files, say movies, or music in high-bitrate MP3 or in FLAC.

For the rest, rely on your own preferences and needs; the choice of array depends on them. And, of course, when building an array it is better to give preference to SSD drives, since they offer higher read and write speeds than conventional hard drives from the start. The drives should be identical in their characteristics and parameters, otherwise the array may not work properly. This is one of the most important conditions, so pay attention to it.

Almost everyone knows the proverb “Until thunder strikes, a man will not cross himself.” It is true to life: until a problem touches the user directly, he will not even think about it. The power supply dies and takes a couple of devices with it, and the user rushes to look for articles on how to feed a computer properly. The processor burns out or starts to malfunction from overheating, and a couple of links to sprawling forum threads on CPU cooling appear in “Favorites”.

It’s the same story with hard drives: as soon as yet another drive clicks its heads in farewell and leaves this mortal world, the PC owner starts fussing over better living conditions for the remaining drives. But even the most sophisticated cooler cannot guarantee a disk a long and happy life. A drive's service life is influenced by many factors: manufacturing defects, an accidental kick to the case (especially if the case stands somewhere on the floor), dust that gets past the filters, high-voltage interference from the power supply... There is only one way out: back up your information. And if you need backup on the fly, it is time to build a RAID array, since today almost every motherboard has some kind of RAID controller.

At this point we will stop and make a brief excursion into the history and theory of RAID arrays. The abbreviation RAID itself stands for Redundant Array of Independent Disks. Previously, inexpensive was used instead of independent, but over time this definition has lost its relevance: almost all disk drives have become inexpensive.

The history of RAID began in 1987, when Patterson, Gibson and Katz published the paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)". The paper described the technology of combining several ordinary disks into an array to obtain a faster and more reliable drive. The authors also told readers about several types of arrays, from RAID-1 to RAID-5. Subsequently, a zero-level RAID array was added to the arrays described almost twenty years ago, and it gained popularity. So what are all these RAID-x? What is their essence? Why are they called redundant? We will try to figure this out.

To put it very simply, RAID is a thing that lets the operating system not know how many disks are installed in the computer. Combining hard drives into a RAID array is the direct opposite of dividing a single space into logical drives: we form one logical drive from several physical ones. To do this, we need either the appropriate software (we won’t even discuss this option, it is not worth it), or a RAID controller, either built into the motherboard or a separate card inserted into a PCI or PCI Express slot. It is the controller that combines the disks into an array, and the operating system no longer works with the HDDs but with the controller, which does not tell it anything unnecessary. There are a great many ways of combining several disks into one, or more precisely, about ten.

What are RAID types?

The simplest of them is JBOD (Just a Bunch of Disks). Two hard drives are glued into one in series, information is written first to one and then to the other disk without breaking it into pieces and blocks. From two 200 GB drives, we make one 400 GB drive, which operates at almost the same, and in reality slightly lower, speed as each of the two drives.

JBOD can be regarded as a special case of a level-0 array, RAID-0. Arrays at this level are also called stripes; the full name is Striped Disk Array without Fault Tolerance. This option also combines n disks into one with n times the capacity, but the disks are joined in parallel rather than in series, and information is written to them in blocks (the block size is chosen by the user when the RAID array is created).

That is, if you need to write the sequence of numbers 123456 to two drives included in a RAID-0 array, the controller will divide this chain into two parts - 123 and 456 - and write the first to one disk, and the second to the other. Each disk can transfer data... well, at a speed of 50 MB/s, and the total speed of two disks from which data is taken in parallel is 100 MB/s. Thus, the speed of working with data should increase n times (in reality, of course, the increase in speed is less, since no one has canceled the losses for searching for data and transmitting it over the bus). But this increase is not given for nothing: if at least one disk fails, information from the entire array is lost.
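The 123456 example can be sketched in a few lines of Python. This is only an illustration of round-robin block placement, not how any particular controller is implemented; the function names `stripe` and `unstripe` and the block size of three characters are assumptions made for the example.

```python
def stripe(data, n_disks, block_size):
    """Cut data into fixed-size blocks and deal them to the disks in turn."""
    disks = [[] for _ in range(n_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        disks[index % n_disks].append(block)
    return disks


def unstripe(disks):
    """Reassemble the data by reading one block from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return "".join(out)


layout = stripe("123456", n_disks=2, block_size=3)
print(layout)            # [['123'], ['456']]
print(unstripe(layout))  # 123456
```

Note that `unstripe` needs every disk: if one list is missing, the sequence cannot be rebuilt, which is exactly the weakness described above.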

RAID level zero. The data is divided into blocks and scattered across disks. There is no parity or redundancy.

That is, there is no redundancy at all. This array can be called a RAID array only conditionally, yet it is very popular. Few people think about reliability; it cannot be measured by benchmarks, whereas everyone understands the language of megabytes per second. This is neither good nor bad, it is simply how things are. Below we will talk about how to have your cake and eat it too, keeping both speed and reliability.

By the way, an additional disadvantage of the stripe array is that it is not portable: moving the array to another machine is a real problem. Even if you bring both disks and the controller drivers to a friend, there is no guarantee that they will be recognized as one array and that the data will be usable. Moreover, there have been cases where simply connecting stripe disks (without writing anything!) to a “non-native” controller, different from the one on which the array was created, damaged the information in the array. We don’t know how relevant this problem is now, with the advent of modern controllers, but we still advise you to be careful.


Level 1 RAID array of four disks. The disks are divided into pairs, and the drives within the pair store the same data.

The first truly "redundant" array (and the first RAID to appear) was RAID-1. Its second name - mirror - explains the principle of operation: all disks allocated for the array are divided into pairs, and information is read and written to both disks at once. It turns out that each of the disks in the array has an exact copy. In such a system, not only the reliability of data storage increases, but also the speed of reading it (you can read from two hard drives at once), although the write speed remains the same as that of one drive.

As you might guess, the volume of such an array will be equal to half the sum of the volumes of all hard drives included in it. The disadvantage of this solution is that you need twice as many hard drives. But the reliability of this array is actually not even equal to the double reliability of a single disk, but much higher than this value. Failure of two hard drives within... well, let's say, a day is unlikely unless, for example, the power supply intervenes. At the same time, any sane person, seeing that one disk in a pair has failed, will immediately replace it, and even if the second disk fails immediately after that, the information will not go anywhere.

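As you can see, both RAID-0 and RAID-1 have their drawbacks. How can you get rid of them? If you have at least four hard drives, you can combine the two levels. Most often, RAID-1 mirrored pairs are combined into a RAID-0 array; this layout is commonly called RAID-10 (or 1+0), although some controller vendors label it RAID 0+1. The reverse order, a RAID-1 mirror built from several RAID-0 arrays, is RAID 0+1 in the strict sense. With four disks the two give the same capacity and speed, but the stripe of mirrors tolerates more combinations of failures and takes less time to recover when one disk fails, since only a single mirrored pair has to be resynchronized.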

The reliability of such a configuration of four hard drives is equal to the reliability of a RAID-1 array, and the speed is actually the same as that of RAID-0 (in reality, it will most likely be slightly lower due to the limited capabilities of the controller). At the same time, the simultaneous failure of two disks does not always mean a complete loss of information: this will only happen if the disks containing the same data fail, which is unlikely. That is, if four disks are divided into pairs 1-2 and 3-4 and the pairs are combined into a RAID-0 array, then only the simultaneous failure of disks 1 and 2 or 3 and 4 will lead to data loss, while in the event of the untimely death of the first and the third, second and fourth, first and fourth or second and third hard drives, the data will remain safe and sound.
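The failure combinations listed above can be checked by brute force. The following is a minimal sketch assuming the four-disk layout from the text (pairs 1-2 and 3-4 mirrored, the pairs striped together); `MIRROR_PAIRS` and `survives` are names invented for the illustration.

```python
from itertools import combinations

# Assumed layout: disks 1-2 form one mirrored pair, disks 3-4 the other,
# and the two pairs are combined into a stripe.
MIRROR_PAIRS = [{1, 2}, {3, 4}]


def survives(failed):
    """The array survives while every mirrored pair keeps one live disk."""
    return all(not pair <= failed for pair in MIRROR_PAIRS)


results = {two: survives(set(two)) for two in combinations([1, 2, 3, 4], 2)}
for two, ok in results.items():
    print(two, "data intact" if ok else "data lost")

fatal = [two for two, ok in results.items() if not ok]
print(len(fatal), "of", len(results), "double failures are fatal")  # 2 of 6
```

Only the pairs (1, 2) and (3, 4) lose data; the other four double failures leave a live copy of every block.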

However, the main disadvantage of RAID-10 is the high cost of the disks. The price of four (minimum!) hard drives cannot be called small, especially when the capacity of only two of them is actually available to us (few people think about reliability and what it costs, as we have already said). The large (100%) redundancy of data storage makes itself felt. All this has led to the recent popularity of an array variant called RAID-5. Implementing it takes at least three disks. In addition to the information itself, the controller also stores parity blocks on the array drives.

We will not go into the details of how the parity algorithm works; we will only say that if the information on one of the disks is lost, it can be restored from the parity data and the live data on the other disks. In total, the parity data takes up the capacity of one physical disk, and it is distributed evenly across all hard drives of the system, so that when any disk is lost, its contents can be recovered from the data and parity blocks on the remaining disks. The information is divided into large blocks and written to the disks in turn, that is, according to the 12-34-56 principle in the case of a three-disk array.
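For readers who do want a glimpse of the idea, RAID-5 parity is commonly computed as a byte-wise XOR of the data blocks in a stripe. The sketch below assumes a single stripe on a three-disk array holding the blocks "12" and "34"; the helper `xor_blocks` is a name made up for the example, and real controllers also rotate the parity block between disks, which is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: two data blocks plus the parity block computed from them.
d1 = b"12"
d2 = b"34"
parity = xor_blocks([d1, d2])

# Suppose the disk holding d2 fails: XOR the survivors to rebuild it.
rebuilt = xor_blocks([d1, parity])
print(rebuilt)  # b'34'
```

The same call rebuilds `d1` from `d2` and `parity`, which is why the loss of any single disk is recoverable.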

Accordingly, the total volume of such an array is the volume of all disks minus the capacity of one of them. Data recovery, of course, is not instantaneous, but such a system offers high performance and a margin of reliability at minimal cost (a 1000 GB array requires six 200 GB disks). However, the performance of such an array will still be lower than that of a stripe system: with each write operation, the controller also has to update the parity block.
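The capacity rule is simple enough to write down as a one-line calculation. The function name `raid5_usable_gb` is an assumption for this sketch, and it presumes identical disks.

```python
def raid5_usable_gb(disk_gb, n_disks):
    """Usable capacity of RAID-5: all disks minus one disk's worth of parity."""
    if n_disks < 3:
        raise ValueError("RAID-5 needs at least three disks")
    return disk_gb * (n_disks - 1)


print(raid5_usable_gb(200, 6))  # 1000
print(raid5_usable_gb(200, 3))  # 400
```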

RAID-0, RAID-1 and RAID 0+1, sometimes also RAID-5: these levels usually exhaust the capabilities of desktop RAID controllers. Higher levels are available only to complex systems based on SCSI hard drives. However, happy owners of SATA controllers with Matrix RAID support (such controllers are built into the ICH6R and ICH7R south bridges from Intel) can have RAID-0 and RAID-1 arrays at the same time on just two drives, and those with an ICH7R-based board can combine RAID-5 and RAID-0 if they have four identical drives.

How is this implemented in practice? Let's look at the simpler case with RAID-0 and RAID-1. Let's say you bought two 400 GB hard drives. You split each drive into a 100 GB part and a 300 GB part. Then, using the Intel Application Accelerator RAID Option ROM utility built into the BIOS, you combine the 100 GB parts into a stripe array (RAID-0) and the 300 GB parts into a mirror array (RAID-1). Now the fast 200 GB disk can hold, say, games, video material and other data that need a fast disk subsystem and are not very important (that is, data you will not much regret losing), while the mirrored 300 GB disk holds work documents, mail archives, utility software and other vital files. If one drive fails, you lose what was on the stripe array, but the data you placed on the second logical drive is duplicated on the remaining drive.

Combining RAID-5 and RAID-0 means that part of the volume of the four disks is allocated to a fast stripe array, and the other part (let it be 300 GB on each disk) holds data blocks and parity blocks. That is, you get one ultra-fast disk with a capacity of 400 GB (4 x 100 GB) and one reliable but slower 900 GB array (4 x 300 GB minus 300 GB for parity blocks).
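The arithmetic of both Matrix RAID examples can be restated as a short sketch. The function names are invented for the illustration, and each function assumes identical disks split into a "fast" slice and a "safe" slice, as in the text.

```python
def matrix_raid0_raid1(fast_gb, safe_gb):
    """Two identical disks, each split into a RAID-0 slice and a RAID-1 slice."""
    # The stripe adds both slices up; the mirror stores a single copy.
    return 2 * fast_gb, safe_gb


def matrix_raid0_raid5(n_disks, fast_gb, safe_gb):
    """n identical disks, each split into a RAID-0 slice and a RAID-5 slice."""
    # One slice's worth of the RAID-5 part goes to parity.
    return n_disks * fast_gb, (n_disks - 1) * safe_gb


print(matrix_raid0_raid1(100, 300))     # (200, 300)
print(matrix_raid0_raid5(4, 100, 300))  # (400, 900)
```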

As you can see, this technology is extremely promising, and it would be nice if other chipset and controller manufacturers support it. It is very tempting to have arrays of different levels on two disks, fast and reliable.

These are, perhaps, all the types of RAID arrays that are used in home systems. However, in life you may encounter RAID-2, 3, 4, 6 and 7. So let's still see what these levels are.

RAID-2. In an array of this type, disks are divided into two groups, one for data and one for error correction codes; with four data disks, for example, three more disks are needed to store the correction codes. Data is written to the data disks in the same way as in RAID-0, split into small blocks according to the number of disks intended for storing information. The remaining disks store error correction codes, which can be used to restore information if any hard drive fails. The codes are computed with the Hamming method, which has long been used in ECC memory: it allows on-the-fly correction of single-bit errors if they suddenly occur, and if two bits are transmitted incorrectly, this can still be detected with an additional parity check. However, no one wanted to keep a bulky structure of almost double the number of disks for this purpose, and this type of array did not become widespread.
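To show what on-the-fly correction means, here is a minimal sketch of the classic Hamming(7,4) code, in which four data bits are protected by three check bits. In a RAID-2 array each of the seven bit positions would live on its own disk; that mapping, and all function names below, are assumptions made for the illustration rather than a description of any specific product.

```python
def hamming74_encode(data_bits):
    """Encode four data bits as a seven-bit codeword (positions 1..7)."""
    d3, d5, d6, d7 = data_bits
    p1 = d3 ^ d5 ^ d7  # checks positions 1, 3, 5, 7
    p2 = d3 ^ d6 ^ d7  # checks positions 2, 3, 6, 7
    p4 = d5 ^ d6 ^ d7  # checks positions 4, 5, 6, 7
    return [p1, p2, d3, p4, d5, d6, d7]


def hamming74_correct(word):
    """Return (corrected codeword, 1-based position of the bad bit, or 0)."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4  # the syndrome names the faulty position
    if position:
        w[position - 1] ^= 1
    return w, position


def hamming74_data(word):
    """Extract the four data bits from a codeword."""
    return [word[2], word[4], word[5], word[6]]


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
damaged = list(codeword)
damaged[4] ^= 1  # flip the bit at position 5
fixed, position = hamming74_correct(damaged)
print(position)                       # 5
print(hamming74_data(fixed) == data)  # True
```

This plain code corrects any single flipped bit; detecting a double error reliably takes one more overall parity bit, which is not included in the sketch.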

The structure of a RAID-3 array is as follows: in an array of n disks, the data is split into 1-byte blocks and distributed across n-1 disks, and one more disk is used to store parity blocks. RAID-2 set aside several disks for this purpose, but most of the information on them was used only for on-the-fly error correction; for simple recovery after a disk failure, less information suffices, and one dedicated hard drive is enough.


RAID level 3 with a separate disk for storing parity information. There is no backup, but the data can be restored.

Accordingly, the differences between RAID-3 and RAID-2 are obvious: on-the-fly error correction is impossible, and there is less redundancy. The advantages are as follows: the speed of reading and writing data is high, and very few disks are required to create an array, just three. But an array of this type is good only for single-tasking work with large files, since speed suffers when there are frequent requests for small amounts of data.


A level 5 array differs from RAID-3 in that the parity blocks are evenly distributed across all disks in the array.

RAID-4 is similar to RAID-3, but the data is divided into blocks rather than bytes. This made it possible to “defeat” the problem of slow transfers of small amounts of data. Writing, however, is slow, because the parity for each block is generated during the write and stored on a single disk. Arrays of this type are used very rarely.

RAID-6 is the same RAID-5, but now each stripe carries two parity blocks, which are spread across the array disks. Thus, if two disks fail, information can still be recovered. Of course, increased reliability led to a decrease in the usable volume of disks and an increase in the minimum number of disks: now, if there are n disks in the array, the total volume available for recording data will be equal to the volume of one disk multiplied by n-2. The need to calculate two checksums at once determines the second drawback, which RAID-6 inherits from RAID-5: low data writing speed.

RAID-7 is a registered trademark of Storage Computer Corporation. The structure of the array is as follows: data is stored on n-1 disks, one disk is used to store parity blocks. But several important details were added to eliminate the main drawback of arrays of this type: a data cache and a fast controller that manages request processing. This made it possible to reduce the number of disk accesses to calculate the data checksum. As a result, it was possible to significantly increase the speed of data processing (in some places by five or more times).



RAID level 0+1 array, or a design of two RAID-1 arrays combined into RAID-0. Reliable, fast, expensive.

New disadvantages have also appeared: the very high cost of implementing such an array, the complexity of its maintenance, the need for an uninterruptible power supply to prevent data loss in the cache memory during power failures. You are unlikely to see an array of this type, but if you suddenly see it somewhere, write to us, we will also be happy to look at it.

Creating an Array

I hope you have already managed to choose the array type. If your board has a RAID controller, you will need nothing more than the required number of disks and the drivers for this controller. By the way, keep in mind: it makes sense to combine only disks of the same size into arrays, preferably of the same model. The controller may refuse to work with disks of different sizes, and most likely you will be able to use only a part of the larger disk, equal in volume to the smaller one. In addition, the speed of a stripe array is determined by the speed of its slowest disk. And my advice to you: do not try to make the RAID array bootable. It is possible, but if any failure occurs in the system, you will have a hard time, since restoring functionality will be very difficult. In addition, it is dangerous to place several operating systems on such an array: almost all boot managers overwrite information in the service areas of the hard drive and, accordingly, damage the array. It is better to choose a different scheme: one disk is bootable, and the rest are combined into an array.
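The cost of mixing disk sizes is easy to estimate. The sketch below assumes the behaviour described above, where every member of a stripe contributes only as much space as the smallest disk; the function names and the 200 GB and 250 GB sizes are chosen purely for illustration.

```python
def stripe_capacity_gb(disk_sizes_gb):
    """Each member contributes only as much as the smallest disk can hold."""
    return min(disk_sizes_gb) * len(disk_sizes_gb)


def wasted_gb(disk_sizes_gb):
    """Space on the larger disks that the array cannot use."""
    return sum(disk_sizes_gb) - stripe_capacity_gb(disk_sizes_gb)


print(stripe_capacity_gb([200, 250]))  # 400
print(wasted_gb([200, 250]))           # 50
```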



Matrix RAID in action. Part of the disk space is used by the RAID-0 array, the rest of the space is taken by the RAID-1 array.

Every RAID array starts with the RAID controller BIOS. Sometimes (only in the case of integrated controllers, and even then not always) it is built into the main BIOS of the motherboard, sometimes it is located separately and is activated after passing the self-test, but in any case you need to go there. It is in the BIOS that the necessary array parameters are set, as well as the sizes of data blocks, the hard drives used, and so on. Once you have determined all this, all you need to do is save the settings, exit the BIOS and return to the operating system.

There you must install the controller drivers (as a rule, a floppy disk with them is included with the motherboard or the controller itself, but they may also be supplied on a disc along with other drivers and utility software), reboot, and that’s it, the array is ready for use. You can split it into logical drives, format it and fill it with data. Just remember that RAID is not a panacea. It will save you from data loss if a hard drive dies and will minimize the consequences of such an outcome, but it will not save you from power surges or the failure of a low-quality power supply, which can kill both drives at once, array or no array.

Neglecting power quality and the temperature conditions of the disks can significantly shorten the life of an HDD; it happens that all the disks in an array fail and all data is irretrievably lost. In particular, modern hard drives (especially IBM and Hitachi) are very sensitive to the +12 V rail and do not like even the slightest change in voltage on it, so before purchasing all the equipment needed to build the array, it is worth checking the corresponding voltages and, if necessary, adding a new power supply unit to the shopping list.

Powering hard drives, as well as all other components, from a second power supply, at first glance, is simple to implement, but there are many pitfalls in such a power supply scheme, and you need to think a hundred times before deciding to take such a step. With cooling, everything is simpler: you just need to ensure airflow for all hard drives, plus do not place them close to each other. Simple rules, but, unfortunately, not everyone follows them. And cases when both disks in an array die at the same time are not uncommon.

In addition, RAID does not remove the need to back up your data regularly. Mirroring is mirroring, but if you accidentally corrupt or erase files, the second disk will not help you at all. So make backups whenever you can. This rule applies whether or not there are RAID arrays inside the PC.

So, are you RAIDy? Yes? Great! Just in pursuit of volume and speed, don’t forget another proverb: “Make a fool pray to God, he’ll break his forehead.” We wish you strong disks and reliable controllers!

Cost benefit of noisy RAID

RAID is good even without regard to money. But let's calculate the price of the simplest 400 GB stripe array. Two Seagate Barracuda SATA 7200.8 drives of 200 GB each will cost you about $230. RAID controllers are built into most motherboards, that is, we get them for free.

At the same time, a 400 GB drive of the same model costs $280. The difference is $50, and with that money you can buy a powerful power supply, which you will undoubtedly need. I'm not even talking about the fact that the performance of a composite "disk" at a lower price will be almost twice as high as the performance of a single hard drive.

Now let us do the calculation for a total volume of 250 GB. There are no 125 GB drives, so let's take two 120 GB hard drives. Each costs $90, while a single 250 GB hard drive costs $130. So at these volumes, you pay extra for performance. What if we take a 300 GB array? Two 160 GB disks cost approximately $200, one 300 GB disk costs $170... Again, no savings. It turns out that RAID pays off financially only with very large disks.

© Andrey Egorov, 2005, 2006. TIM Group of Companies.

Forum visitors ask us the question: “Which RAID level is the most reliable?” Everyone knows that the most common level is RAID5, but it is not without serious drawbacks that are not obvious to non-specialists.

RAID 0, RAID 1, RAID 5, RAID6, RAID 10 or what are RAID levels?

In this article, I will try to characterize the most popular RAID levels, and then formulate recommendations for using these levels. To illustrate the article, I created a diagram in which I placed these levels in the three-dimensional space of reliability, performance and cost efficiency.

JBOD (Just a Bunch of Disks) is a simple spanning of hard drives, which is not formally a RAID level. A JBOD volume can be an array of a single disk or an aggregation of multiple disks. The RAID controller does not need to perform any calculations to operate such a volume. In our diagram, the JBOD drive serves as the unit, or starting point: its reliability, performance, and cost values are the same as those of a single hard drive.

RAID 0 (“Striping”) has no redundancy and distributes information across all disks in the array at once in the form of small blocks (“stripes”). Due to this, performance increases significantly, but reliability suffers. As with JBOD, we get 100% of the disk capacity for our money.

Let me explain why the reliability of data storage decreases on any composite volume: if any one of the hard drives in it fails, all information is completely and irretrievably lost. According to probability theory, the reliability of a RAID0 volume equals the product of the reliabilities of its constituent disks. Each of these is less than one, so the total reliability is necessarily lower than that of any single disk.
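As a quick numerical illustration of the product rule, here is a sketch that assumes a made-up per-disk reliability of 0.95 (the chance that a disk survives some fixed period) and independent failures; neither figure comes from measurements.

```python
from math import prod


def raid0_reliability(disk_reliabilities):
    """A stripe survives only if every one of its disks survives."""
    return prod(disk_reliabilities)


two_disks = raid0_reliability([0.95, 0.95])
four_disks = raid0_reliability([0.95] * 4)
print(round(two_disks, 4))     # 0.9025
print(two_disks < 0.95)        # True
print(four_disks < two_disks)  # True
```

Every disk added to the stripe multiplies in another factor below one, so the volume becomes less reliable as it grows.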

A good level is RAID 1 (“Mirroring”, “mirror”). It protects against the failure of half of the available hardware (in the general case, one of two hard drives), provides acceptable write speed, and gains in read speed through parallelization of requests. The disadvantage is that you pay for two hard drives to get the usable capacity of one.

The starting assumption is that an HDD is a reliable thing. Accordingly, the probability of two disks failing at once equals (by the formula) the product of the probabilities, i.e. it is orders of magnitude lower! Unfortunately, real life is not theory! The two hard drives usually come from the same batch and operate under the same conditions, and when one of them fails, the load on the remaining one increases. So in practice, if one disk fails, urgent measures must be taken to restore redundancy. For this, it is recommended to use hot spare (HotSpare) disks with any RAID level except zero. The advantage of this approach is that reliability stays constant. The disadvantage is even greater cost (i.e. three hard drives to store the volume of one).
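The "product of the probabilities" formula can be illustrated with a hypothetical per-disk failure probability of 0.05 over some period. The sketch assumes fully independent failures, which, as the paragraph above stresses, real disks from one batch do not guarantee; treat the result as the optimistic bound.

```python
def mirror_loss_probability(p_fail, n_copies=2):
    """Data is lost only if every copy fails (independent failures assumed)."""
    return p_fail ** n_copies


p = 0.05  # hypothetical per-disk failure probability
print(round(mirror_loss_probability(p), 6))  # 0.0025
print(mirror_loss_probability(p) < p)        # True
```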

A mirror on many disks is the RAID 10 level. With this level, mirrored pairs of disks are arranged in a “chain”, so the resulting volume can exceed the capacity of a single hard drive. The advantages and disadvantages are the same as for RAID1. As in other cases, it is recommended to include HotSpare disks in the array at the rate of one spare for every five working disks.

RAID 5 is indeed the most popular of the levels, primarily due to its cost efficiency. By sacrificing the capacity of just one disk in the array to redundancy, we gain protection against the failure of any one of the volume’s hard drives. Writing to a RAID5 volume requires additional resources, since extra calculations are needed, but reading is faster than from a single hard drive, because data streams from several drives of the array are parallelized.

The disadvantages of RAID5 appear when one of the disks fails - the entire volume goes into critical mode, all write and read operations are accompanied by additional manipulations, performance drops sharply, and the disks begin to heat up. If immediate action is not taken, you may lose the entire volume. Therefore, (see above) you should definitely use a Hot Spare disk with a RAID5 volume.

In addition to the basic levels RAID0 - RAID5 described in the standard, there are combined levels RAID10, RAID30, RAID50, RAID15, which are interpreted differently by different manufacturers.

The essence of such combinations is briefly as follows. RAID10 is a combination of one and zero (see above). RAID50 is a level 0 stripe across level 5 volumes. RAID15 is a “mirror” of “fives”. And so on.

Thus, combined levels inherit the advantages (and disadvantages) of their “parents”. The “zero” in RAID 50 adds no reliability but has a positive effect on performance. RAID 15 is probably very reliable, but it is not the fastest and, moreover, extremely uneconomical (the useful capacity of the volume is less than half the total capacity of the disks in the array).

RAID 6 differs from RAID 5 in that each row of data (a stripe) has not one but two checksum blocks. The checksums are “multidimensional”, i.e. independent of each other, so the original data survives even the failure of two disks in the array. Calculating checksums with the Reed-Solomon method requires more intensive computation than RAID5, so the sixth level used to be almost never used. Now many products support it, since they are fitted with specialized chips that perform all the necessary mathematical operations.

According to some studies, restoring integrity after a single drive failure on a RAID5 volume composed of large SATA drives (400 and 500 gigabytes) ends in data loss in 5% of cases. In other words, in one case out of twenty, a second disk may fail while the RAID5 array is being regenerated onto a Hot Spare disk... Hence the recommendations of the best RAID practitioners: 1) always make backups; 2) use RAID6!

Recently new levels RAID1E, RAID5E, RAID5EE have appeared. The letter “E” in the name means Enhanced.

RAID level-1 Enhanced (RAID level-1E) combines mirroring and data striping. This mixture of levels 0 and 1 is arranged as follows. The data in a row is distributed exactly as in RAID 0. That is, the data row has no redundancy. The next row of data blocks copies the previous one with a shift of one block. Thus, as in standard RAID 1 mode, each data block has a mirror copy on one of the disks, so the useful volume of the array is equal to half the total volume of the hard drives included in the array. RAID 1E requires a combination of three or more drives to operate.
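The block placement described above can be sketched in a few lines. The function name and the `D`/`M` labels (data block and its mirror copy) are hypothetical, and real controllers may shift the mirror row in a different direction.

```python
def raid1e_layout(num_disks, num_blocks):
    """Return rows of block labels: each data row is followed by a mirror
    row holding the same blocks shifted by one disk (with wrap-around)."""
    if num_disks < 3:
        raise ValueError("RAID 1E needs three or more disks")
    rows = []
    for start in range(0, num_blocks, num_disks):
        # Data row: blocks laid out exactly as in RAID 0.
        data_row = [
            f"D{start + i}" if start + i < num_blocks else "--"
            for i in range(num_disks)
        ]
        # Mirror row: copy of the data row, shifted one disk to the right.
        mirror_row = [
            data_row[(i - 1) % num_disks].replace("D", "M")
            for i in range(num_disks)
        ]
        rows.extend([data_row, mirror_row])
    return rows

for row in raid1e_layout(3, 6):
    print("  ".join(row))
```

With three disks, the first two rows are `D0 D1 D2` and `M2 M0 M1`, so every block and its copy sit on different disks, and half of the total space goes to the copies.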

I really like the RAID1E level. For a powerful graphics workstation, or even for a home computer, it is the best choice! It has all the advantages of the zero and first levels: excellent speed and high reliability.

Let's now move on to RAID level-5 Enhanced (RAID level-5E). This is the same as RAID5, only with the spare drive built into the array. This integration is carried out as follows: on all disks of the array, 1/N of the space is left free, and this space is used as a hot spare if one of the disks fails. Due to this, RAID5E demonstrates, along with reliability, better performance, since reading and writing are performed in parallel on a larger number of drives and the spare drive is not idle, as it is in RAID5. Obviously, a spare disk built into the volume cannot be shared with other volumes (it is dedicated rather than shared). A RAID 5E volume is built on a minimum of four physical disks. The useful capacity of the logical volume is calculated using the formula N-2 (in units of one disk, where N is the number of disks).

RAID level-5E Enhanced (RAID level-5EE) is similar to RAID level-5E, but it has a more efficient spare drive allocation and, as a result, a faster recovery time. Like RAID5E, this RAID level distributes blocks of data and checksums in rows. But it also distributes the free blocks of the spare drive across the rows, rather than simply reserving part of the disk space for this purpose. This reduces the time required to restore the integrity of a RAID5EE volume. As in the previous case, the spare disk included in the volume cannot be shared with other volumes. A RAID 5EE volume is built on a minimum of four physical disks. The useful capacity of the logical volume is calculated using the formula N-2.
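The capacity formulas mentioned so far can be gathered into one small helper. This is only a sketch: the function and table names are hypothetical, capacity is expressed in units of one disk, and the RAID 6 entry (N-2, minimum of four disks) is an assumption that the text above does not state explicitly.

```python
# Usable capacity, in units of one disk, for an array of n identical disks.
# Each entry: (minimum number of disks, capacity formula).
# The RAID 6 entry is an assumption, not taken from the article.
RULES = {
    "0":   (2, lambda n: n),        # striping, no redundancy
    "1":   (2, lambda n: 1),        # classic mirror of two disks
    "10":  (4, lambda n: n / 2),    # stripe of mirror pairs
    "1E":  (3, lambda n: n / 2),    # striped mirror on 3+ disks
    "5":   (3, lambda n: n - 1),    # one disk's worth of checksums
    "5E":  (4, lambda n: n - 2),    # checksums + built-in spare
    "5EE": (4, lambda n: n - 2),    # same capacity, spare distributed
    "6":   (4, lambda n: n - 2),    # two checksum blocks per stripe
}

def usable_disks(level, n):
    """Return the usable capacity of the array in units of one disk."""
    min_disks, formula = RULES[level]
    if n < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    if level == "1" and n != 2:
        raise ValueError("RAID 1 is a mirror of exactly two disks")
    if level == "10" and n % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return formula(n)

print(usable_disks("5E", 4))   # 2
print(usable_disks("1E", 3))   # 1.5
```

Multiplying the result by the size of one disk gives the capacity of the logical volume; for example, four 500 GB disks in RAID 5E yield 2 x 500 GB.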

Oddly enough, I could not find any mention of a RAID 6E level on the Internet: so far this level is not offered or even announced by any manufacturer. But a RAID6E (or RAID6EE?) level could be offered on the same principle as the previous one. A Hot Spare disk must accompany any RAID volume, including RAID 6. Of course, we will not lose information if one or two disks fail, but it is extremely important to start regenerating the integrity of the array as early as possible in order to quickly bring the system out of “critical” mode. Since the need for a Hot Spare disk is beyond doubt for us, it would be logical to go further and “spread” it over the volume, as is done in RAID 5EE, in order to get the benefits of using a larger number of disks (better read/write speed and faster integrity restoration).

RAID levels in “numbers”.

I have collected some important parameters of almost all RAID levels in a table so that you can compare them with each other and better understand their essence.

[Table columns: Level | Redundancy | Disk capacity utilization | Read performance | Write performance | Built-in spare disk | Min. number of disks | Max. number of disks]

All “mirror” levels are RAID 1, 1+0, 10, 1E, 1E0.

Let's try once more to understand thoroughly how these levels differ.

RAID 1.
This is a classic “mirror”. Two (and only two!) hard drives work as one, being a complete copy of each other. Failure of either of these two drives does not result in loss of your data, as the controller continues to operate on the remaining drive. RAID1 in numbers: 2x redundancy, 2x reliability, 2x cost. Write performance is equivalent to that of a single hard drive. Read performance is higher because the controller can distribute read operations between two disks.

RAID 10.
The essence of this level is that the disks of the array are combined in pairs into “mirrors” (RAID 1), and then all these mirror pairs, in turn, are combined into a common striped array (RAID 0). That is why it is sometimes referred to as RAID 1+0. An important point: RAID 10 can only combine an even number of disks (minimum 4, maximum 16). Advantages: reliability is inherited from the “mirror”, and performance for both reading and writing is inherited from the “zero”.
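What "reliability inherited from the mirror" means in practice can be checked with a short sketch: the array survives as long as no mirror pair loses both of its disks. The pairing rule (disks 2k and 2k+1 form pair k) and the function names are assumptions made for illustration.

```python
from itertools import combinations

def raid10_survives(num_disks, failed):
    """True if no mirror pair has lost both disks.
    Assumed pairing: disks 2k and 2k+1 form mirror pair k."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    failed = set(failed)
    return all(
        not {2 * k, 2 * k + 1} <= failed
        for k in range(num_disks // 2)
    )

def surviving_fraction(num_disks, num_failures):
    """Fraction of all possible failure combinations the array survives."""
    combos = list(combinations(range(num_disks), num_failures))
    ok = sum(raid10_survives(num_disks, c) for c in combos)
    return ok / len(combos)

print(raid10_survives(4, [0, 2]))   # True: one disk lost in each pair
print(raid10_survives(4, [0, 1]))   # False: a whole pair is gone
```

Any single failure is survived. For four disks, `surviving_fraction(4, 2)` shows that four of the six possible double failures are survivable, because only two of them hit both halves of the same mirror.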

RAID 1E.
The letter "E" in the name means "Enhanced", i.e. "improved". The principle of this improvement is as follows: the data is “striped” in blocks across all disks of the array, and then “striped” again with a shift of one disk. RAID 1E can combine from three to 16 disks. Reliability corresponds to that of the “ten”, and performance becomes a little better due to greater “alternation”.

RAID 1E0.
This level is implemented like this: we create a “zero” (RAID 0) array from RAID1E arrays. Therefore, the total number of disks must be a multiple of three: a minimum of three and a maximum of sixty! In this case, we are unlikely to get a speed advantage, and the complexity of the implementation may adversely affect reliability. The main advantage is the ability to combine a very large number of disks (up to 60) into one array.

The similarity of all RAID 1X levels lies in their redundancy indicators: for the sake of reliability, exactly 50% of the total capacity of the array disks is sacrificed.

When working with disk subsystems, IT specialists often face two main problems.

  • The first is the low read/write speed; sometimes even the speeds of an SSD disk are not enough.
  • The second is the failure of disks, which means loss of data, the recovery of which may be impossible.

Both of these problems are solved using RAID (redundant array of independent disks), a storage virtualization technology that combines several physical disks into one logical unit.

Depending on the selected RAID specification, read/write speeds and/or data loss protection may be improved.

The RAID specification levels are 0, 1, 2, 3, 4, 5 and 6. In addition, there are combinations: 01, 10, 50, 05, 60 and 06. In this article we will look at the most common types of RAID arrays. But first, note that RAID arrays can be hardware or software.

Hardware and software RAID arrays

  • Software arrays are created after the operating system is installed, using software products and utilities; this dependence on the operating system is the main disadvantage of such disk arrays.
  • Hardware RAIDs create the disk array before the operating system is installed and do not depend on it.

RAID 1

RAID 1 (also called a "mirror") involves complete duplication of data from one physical disk to another.

The disadvantages of RAID 1 include the fact that you get only half of the disk space. That is, if you use TWO 250 GB disks, the system will see only ONE 250 GB disk. This type of RAID does not provide a gain in speed, but it significantly increases the level of fault tolerance, because if one disk fails, there is always a complete copy of it. Writing and erasing occur on both disks simultaneously, so if information was deleted intentionally, there will be no way to restore it from the other disk.

RAID 0

RAID 0 (also called Striping) involves dividing information into blocks and simultaneously writing different blocks to different disks.

This technology increases the read/write speed, allows the user to use the full total capacity of the disks, but reduces fault tolerance, or rather reduces it to zero. So, if one of the disks fails, it will be almost impossible to restore information. To build RAID 0, it is recommended to use only highly reliable disks.
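Dividing information into blocks and writing them to different disks can be sketched as follows. The round-robin placement and the function names are assumptions for illustration; the block size is a parameter chosen by the caller.

```python
def raid0_locate(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

def raid0_split(data, num_disks, block_size):
    """Split data into blocks and deal them out to the disks in turn."""
    disks = [[] for _ in range(num_disks)]
    for n, offset in enumerate(range(0, len(data), block_size)):
        disk, _ = raid0_locate(n, num_disks)
        disks[disk].append(data[offset:offset + block_size])
    return disks

print(raid0_split(b"ABCDEFGH", 2, 2))
# [[b'AB', b'EF'], [b'CD', b'GH']]
```

Each disk ends up with only part of every file, which is why the disks can work in parallel and also why the loss of any one disk makes the data unreadable.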

RAID 5

RAID 5 can be called a more advanced RAID 0. It requires at least 3 hard drives. Data is striped across the disks as in RAID 0, and one disk's worth of capacity is set aside for a special checksum, which allows you to preserve the information on the hard drives in the event of the “death” of one of them (but not more than one). The operating speed of such an array is high. Rebuilding the array after replacing a disk takes a lot of time.
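The checksum in question is typically a byte-wise XOR of the data blocks in a stripe, which is what makes it possible to rebuild any single lost block. A minimal sketch (function names are hypothetical):

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, checksum):
    """Rebuild the single missing block of a stripe from the survivors."""
    return xor_blocks(list(surviving_blocks) + [checksum])

stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
checksum = xor_blocks(stripe)

# Pretend the disk holding stripe[1] has died.
rebuilt = rebuild_missing([stripe[0], stripe[2]], checksum)
print(rebuilt == stripe[1])   # True
```

If two blocks of the same stripe are lost, one XOR checksum is not enough to recover them, which matches the "not more than one" limitation above.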

RAID 2, 3, 4

These are methods of distributed information storage using disks allocated for parity codes. They differ from each other only in block sizes. They are hardly ever used in practice, because a large share of disk capacity has to be devoted to storing ECC and/or parity codes, and because of their low performance.

RAID 10

It is a mix of RAID arrays 1 and 0. And it combines the advantages of each: high performance and high fault tolerance.

The array must contain an even number of disks (minimum 4) and is the most reliable option for storing information. The disadvantage is the high cost of the disk array: the effective capacity will be half of the total capacity of the disk space.

RAID 50

It is a mix of RAID arrays 5 and 0. A RAID 0 stripe is built, but its components are RAID 5 arrays rather than independent hard disks.

Peculiarities.

If the RAID controller breaks down, it is almost impossible to restore the information (does not apply to the Mirror). Even if you buy exactly the same controller, there is a high probability that the RAID will be assembled from other disk sectors, which means that information on the disks will be lost.

As a rule, disks are purchased in one batch. Accordingly, their working life may be approximately the same. For this reason, it is recommended to buy some spare disks at the same time as the disks for the array. For example, to configure a RAID 10 of 4 disks, you should buy 5 disks. Then, if one of them fails, you can quickly replace it with a new one before other disks fail.

Conclusions.

In practice, most often only three types of RAID arrays are used. These are RAID 1, RAID 10 and RAID 5.

In terms of cost/performance/fault tolerance, it is recommended to use:

  • RAID 1 (mirroring) to form a disk subsystem for user operating systems.
  • RAID 10 for data with high write and read speed requirements, for example, for storing 1C:Enterprise databases, a mail server, or AD.
  • RAID 5 to store file data.

The ideal server solution, according to the majority of system administrators, is a server with six disks. Two disks are “mirrored”, and the operating system is installed on this RAID 1. The four remaining drives are combined into RAID 10 for fast, trouble-free, reliable system operation.

If you are interested in this article, then you have probably encountered or expect to soon encounter one of the following problems on your computer:

- the physical capacity of the hard drive as a single logical drive is clearly not enough. Most often, this problem occurs when working with large files (video, graphics, databases);
- the hard drive's performance is clearly not enough. Most often, this problem occurs when working with non-linear video editing systems or when a large number of users simultaneously access files on the hard drive;
- the reliability of the hard drive is clearly not enough. Most often, this problem arises when it is necessary to work with data that must never be lost or that must always be available to the user. Sad experience shows that even the most reliable equipment sometimes breaks down and, as a rule, at the most inopportune moment.

Creating a RAID system on your computer can solve these and some other problems.

What is "RAID"?

In 1987, Patterson, Gibson, and Katz of the University of California, Berkeley, published “A Case for Redundant Arrays of Inexpensive Disks (RAID).” This article described different types of disk arrays, denoted by the abbreviation RAID: Redundant Array of Independent (or Inexpensive) Disks. RAID is based on the following idea: by combining several small and/or cheap disk drives into an array, you can get a system that is superior in capacity, speed and reliability to the most expensive disk drives. On top of that, from the computer's point of view, such a system looks like one single disk drive.
It is known that the mean time between failures of a drive array is equal to the mean time between failures of a single drive divided by the number of drives in the array. As a result, the array's mean time between failures is too short for many applications. However, a disk array can be made tolerant of the failure of a single drive in several ways.
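The relationship stated above is easy to put into numbers. In this sketch the function name is hypothetical and the 500,000-hour figure is an arbitrary illustrative value, not a measured one.

```python
def array_mtbf_hours(disk_mtbf_hours, num_disks):
    """Mean time between failures of a non-redundant array:
    the MTBF of a single drive divided by the number of drives."""
    if num_disks < 1:
        raise ValueError("the array needs at least one drive")
    return disk_mtbf_hours / num_disks

# Illustrative assumption: each drive has an MTBF of 500,000 hours.
print(array_mtbf_hours(500_000, 10))   # 50000.0
```

Ten such drives together fail ten times as often as one, which is why redundancy is needed to make a large array usable.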

In the above article, five types (levels) of disk arrays were defined: RAID-1, RAID-2, ..., RAID-5. Each type provided fault tolerance as well as different advantages over a single drive. Along with these five types, the RAID-0 disk array, which is NOT redundant, has also gained popularity.

What RAID levels are there and which one should you choose?

RAID-0. Typically defined as a non-redundant group of disk drives without parity. RAID-0 is sometimes called “Striping” based on the way information is placed on the drives included in the array:

Since RAID-0 does not have redundancy, failure of one drive leads to failure of the entire array. On the other hand, RAID-0 provides maximum data transfer speed and efficient use of disk drive space. Because RAID-0 does not require complex math or logic calculations, its implementation costs are minimal.

Scope of application: audio and video applications requiring high-speed continuous data transfer, which cannot be provided by a single drive. For example, research conducted by Mylex to determine the optimal disk system configuration for a non-linear video editing station shows that, compared to a single disk drive, a RAID-0 array of two disk drives provides a 96% increase in write/read speed, and an array of three disk drives a 143% increase (according to the Miro VIDEO EXPERT Benchmark test).
The minimum number of drives in a "RAID-0" array is 2.

RAID-1. Better known as "Mirroring", this is a pair of drives that contain the same information and make up one logical drive.

Recording is performed on both drives in each pair. However, drives in a pair can perform simultaneous read operations. Thus, "mirroring" can double the read speed, but the write speed remains unchanged. RAID-1 has 100% redundancy and a failure of one drive does not lead to a failure of the entire array - the controller simply switches read/write operations to the remaining drive.
RAID-1 provides the highest speed of all types of redundant arrays (RAID-1 - RAID-5), especially in a multi-user environment, but the worst use of disk space. Because RAID-1 does not require complex math or logic calculations, its implementation costs are minimal.
The minimum number of drives in a "RAID-1" array is 2.
To increase write speed and ensure reliable data storage, several RAID-1 arrays can, in turn, be combined into RAID-0. This configuration is called “two-level” RAID or RAID-10 (RAID 0+1):


The minimum number of drives in a "RAID 0+1" array is 4.
Scope of application: cheap arrays in which the main thing is reliability of data storage.

RAID-2. Distributes data into sector-sized stripes across a group of disk drives. Some drives are dedicated to ECC (Error Correction Code) storage. Since most drives store ECC codes on a per-sector basis by default, RAID-2 does not offer much benefit over RAID-3 and is therefore not used in practice.

RAID-3. As in the case of RAID-2, data is distributed over stripes of one sector in size, and one of the array drives is allocated to store parity information:

RAID-3 relies on ECC codes stored in each sector to detect errors. If one of the drives fails, the information stored on it can be restored by calculating exclusive OR (XOR) using the information on the remaining drives. Each record is typically distributed across all drives, and therefore this type of array is good for disk-intensive applications. Because each I/O operation accesses all the disk drives in the array, RAID-3 cannot perform multiple operations simultaneously. Therefore, RAID-3 is good for single-user, single-tasking environments with long records. To work with short records, the rotation of the disk drives must be synchronized, since otherwise a decrease in the exchange speed is inevitable. It is rarely used, because it is inferior to RAID-5 in terms of disk space usage. Implementation requires significant costs.
The minimum number of drives in a "RAID-3" array is 3.

RAID-4. RAID-4 is identical to RAID-3 except that the stripe size is much larger than one sector. In this case, reads are performed from a single drive (not counting the drive that stores parity information), so multiple read operations can be performed simultaneously. However, since each write operation must update the contents of the parity drive, it is not possible to perform multiple write operations simultaneously. This type of array does not have any noticeable advantages over a RAID-5 array.

RAID-5. This type of array is sometimes called a "rotating parity array". It successfully overcomes the inherent disadvantage of RAID-4, the inability to perform several write operations simultaneously. This array, like RAID-4, uses large stripes, but, unlike RAID-4, parity information is stored not on one drive, but on all drives in turn.

Write operations access one drive with data and another drive with parity information. Since the parity information for different stripes is stored on different drives, multiple simultaneous writes are possible, except in the rare cases when either the data stripes or the parity stripes are on the same drive. The more drives in the array, the less often the locations of the data and parity stripes coincide.
Scope of application: reliable large-volume arrays. Implementation requires significant costs.
The minimum number of drives in a "RAID-5" array is 3.
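The rotating placement of parity, and its effect on simultaneous writes, can be sketched as follows. The direction of rotation used here (parity starts on the last drive and moves one drive to the left with each stripe) is an assumption; controllers use several different conventions, and all names are hypothetical.

```python
def parity_disk(num_disks, stripe):
    """Drive that holds parity for the given stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks

def raid5_layout(num_disks, num_stripes):
    """Return rows of labels: D<n> for data blocks, P<stripe> for parity."""
    rows, block = [], 0
    for stripe in range(num_stripes):
        row = []
        for disk in range(num_disks):
            if disk == parity_disk(num_disks, stripe):
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows

def can_write_in_parallel(num_disks, write_a, write_b):
    """Each write is (stripe, data_disk). Two small writes can proceed at
    the same time only if they touch completely different drives."""
    touched_a = {write_a[1], parity_disk(num_disks, write_a[0])}
    touched_b = {write_b[1], parity_disk(num_disks, write_b[0])}
    return touched_a.isdisjoint(touched_b)

for row in raid5_layout(4, 2):
    print("  ".join(row))
```

With four drives, stripe 0 keeps its parity on drive 3 and stripe 1 on drive 2, so `can_write_in_parallel(4, (0, 0), (1, 1))` is true, while a write to drive 3 in stripe 1 would collide with the parity update of stripe 0.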

RAID-1 or RAID-5?
RAID-5 uses disk space more economically than RAID-1, since for redundancy it stores not a “copy” of the information but a checksum. As a result, RAID-5 can combine any number of drives, and only one drive's worth of capacity will hold redundant information.
But higher disk space efficiency comes at the expense of lower information exchange rates. When writing information to RAID-5, the parity information must be updated each time. To do this, you need to determine which parity bits have changed. First, the old information to be updated is read. This information is then XORed with the new information. The result of this operation is a bit mask in which each bit equal to 1 means that the value at the corresponding position in the parity information must be inverted. The old parity information is then read and XORed with this mask, and the updated parity information is written to the appropriate location. Therefore, for each program request to write information, RAID-5 performs two reads, two writes, and two XOR operations.
There is a cost to using disk space more efficiently (storing a parity block instead of a copy of the data): additional time is required to generate and write parity information. This means that the write speed on RAID-5 is lower than on RAID-1 by a ratio of 3:5 or even 1:3 (i.e., the write speed on RAID-5 is 3/5 to 1/3 of the write speed of RAID-1). Because of this, there is little point in creating RAID-5 in software. RAID-5 also cannot be recommended in cases where write speed is critical.
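The two reads, two XOR operations and two writes described above can be traced in a short sketch. The function names are hypothetical, and the in-memory byte strings stand in for blocks on the drives.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def updated_parity(old_data, old_parity, new_data):
    """Parity after overwriting one data block, without touching the
    other data blocks of the stripe.
    Reads:  old_data, old_parity (2 reads)
    Writes: new_data, returned parity (2 writes, done by the caller)"""
    change_mask = xor_bytes(old_data, new_data)   # XOR 1: bits that changed
    return xor_bytes(old_parity, change_mask)     # XOR 2: flip them in parity

# A stripe of three data blocks and its parity.
d0, d1, d2 = b"\x0f\x0f", b"\x33\x33", b"\x55\x55"
parity = xor_bytes(xor_bytes(d0, d1), d2)

new_d1 = b"\xaa\x01"
new_parity = updated_parity(d1, parity, new_d1)

# Same result as recomputing parity from the whole stripe.
print(new_parity == xor_bytes(xor_bytes(d0, new_d1), d2))   # True
```

The shortcut avoids reading the untouched data blocks, but every small write still costs four disk operations instead of the two that a mirror needs.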

Which RAID implementation method should you choose - software or hardware?

After reading the descriptions of the various RAID levels, you will notice that nowhere is there any mention of any specific hardware requirements that are needed to implement RAID. From which we can conclude that all that is needed to implement RAID is to connect the required number of disk drives to the controller available in the computer and install special software on the computer. This is true, but not entirely!
Indeed, it is possible to implement RAID in software. An example is Microsoft Windows NT 4.0 Server, in which software implementation of RAID-0, -1 and even RAID-5 is possible (Microsoft Windows NT 4.0 Workstation provides only RAID-0 and RAID-1). However, this solution should be considered extremely simplified, and it does not allow the capabilities of a RAID array to be fully realized. It is enough to note that with software implementation of RAID, the entire burden of placing information on the disk drives, calculating control codes, etc. falls on the central processor, which naturally does not increase the performance and reliability of the system. For the same reasons, there are practically no service functions here, and all operations to replace a faulty drive, add a new drive, change the RAID level, etc. are carried out with complete loss of data and with a complete ban on performing any other operations. The only advantage of software implementation of RAID is its minimal cost.
Hardware implementation of RAID, using a specialized RAID controller, has the following advantages:
- a specialized controller frees the central processor from basic RAID operations, and the higher the complexity of the RAID level, the more noticeable the controller's effectiveness;
- controllers, as a rule, are equipped with drivers that allow you to create RAID for almost any popular OS;
- the built-in BIOS of the controller and the management programs included with it allow the system administrator to easily connect, disconnect or replace drives included in RAID, create several RAID arrays, even at different levels, monitor the status of the disk array, etc. With “advanced” controllers, these operations can be performed “on the fly”, i.e. without turning off the system unit. Many operations can be performed in the “background”, i.e. without interrupting current work, and even remotely, i.e. from any workplace (if you have access, of course);
- controllers can be equipped with buffer memory (a “cache”), in which the last few blocks of data are stored; with frequent access to the same files, this can significantly increase the performance of the disk system.
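The effect of the buffer memory mentioned in the last item can be illustrated with a small software model. This is only a sketch under assumptions: the class name and the `read_from_disk` callback are hypothetical, and the least-recently-used eviction rule is one possible policy, not necessarily what a given controller does.

```python
from collections import OrderedDict

class BlockCache:
    """Keeps the most recently used blocks in memory (LRU eviction assumed)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self._read_from_disk = read_from_disk   # slow path: real disk access
        self._blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self._blocks:
            self._blocks.move_to_end(block_no)  # mark as recently used
            self.hits += 1
            return self._blocks[block_no]
        self.misses += 1
        data = self._read_from_disk(block_no)
        self._blocks[block_no] = data
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)    # evict least recently used
        return data

cache = BlockCache(capacity=2, read_from_disk=lambda n: f"block-{n}".encode())
for block_no in (1, 2, 1, 3, 2):
    cache.read(block_no)
print(cache.hits, cache.misses)   # 1 4
```

Only the repeated read of block 1 is served from memory here; the more often the same blocks are requested, the larger the share of reads that never reach the drives.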
The disadvantage of hardware RAID implementation is the relatively high cost of RAID controllers. However, on the one hand, you have to pay for everything (reliability, speed, service). On the other hand, recently, with the development of microprocessor technology, the cost of RAID controllers (especially lower-end models) has begun to fall sharply and has become comparable to the cost of ordinary disk controllers, which makes it possible to install RAID systems not only in expensive mainframes, but also in entry-level servers and even in workstations.

How to choose a RAID controller model?

There are several types of RAID controllers depending on their functionality, design and cost:
1. Drive controllers with RAID functionality.
In essence, this is an ordinary disk controller, which, thanks to special BIOS firmware, allows you to combine disk drives into a RAID array, usually of level 0, 1 or 0+1.

Ultra (Ultra Wide) SCSI controller from Mylex KT930RF (KT950RF).
Externally, this controller is no different from an ordinary SCSI controller. All the “specialization” is located in the BIOS, which is divided into two parts: “SCSI Configuration” and “RAID Configuration”. Despite its low cost (less than $200), this controller has a good set of functions:

- combining up to 8 drives into RAID 0, 1 or 0+1;
- Hot Spare support for on-the-fly replacement of a failed disk drive;
- the ability to automatically (without operator intervention) replace a faulty drive;
- automatic control of data integrity and identity (for RAID-1);
- presence of a password to access the BIOS;
- RAIDPlus program that provides information about the state of drives in RAID;
- drivers for DOS, Windows 95, NT 3.5x, 4.0