Large-scale data recovery depends on many critical parameters: failure detection latency, recovery mechanism, redundancy scheme, design principles, algorithm selection, reliability of recovery, pre-recovery assumptions, disk bandwidth, disk space, drive replacement, and others. These parameters have to be evaluated before any data recovery procedure is started, since they need to be built into it. Professional data recovery experts can evaluate, plan, and execute the work based on the feasibility of these parameters. As a corporate or business entity, your aim is to shorten the recovery time and increase the efficiency of data recovery by adopting safer methods.
We have a team of experts with hands-on experience in disaster recovery management for small, medium, and large enterprises. Our infrastructure of hardware testing tools, repair and replacement utilities, component and parts stocks, and OEM and enterprise software applications makes it possible to carry out large-scale data recovery in a systematic manner. All procedures are documented at every stage of recovery, and the documentation can be used for technical reference. The corrective and preventive measures we take help protect network servers and workstations against recurring drive and hardware failures over the long term.
Failure detection latency
When a disk fails, or is about to fail, in a Windows blade server, the time gap between failure detection and corrective action matters a great deal to the network users. Failure detection itself may take about 15 to 20 seconds if the system is completely automated. Assessing the nature of the failure is one of the critical factors affecting the time required for corrective action. This requires a team of hardware experts equipped with the latest hardware and software utilities. They can connect these utilities to the server disk drive without removing the drive from the server cabinet. The assessment may take about 30 minutes to one hour, depending on the nature and severity of the failure and the sophistication of the diagnostic tools used. Once the nature of the failure is known, the next step is to plan the recovery and reconstruction process.
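To put the figures above side by side, the following minimal sketch adds the detection latency to the assessment time. The function name `time_to_diagnosis_seconds` is hypothetical, and the sketch assumes the two phases simply run one after the other with no overlap.

```python
def time_to_diagnosis_seconds(detection_s: float, assessment_min: float) -> float:
    """Total time from failure to a completed assessment, in seconds.

    Assumption: detection and assessment happen strictly in sequence.
    """
    return detection_s + assessment_min * 60


# Figures quoted in the text: 15 to 20 s detection, 30 to 60 min assessment.
best_case = time_to_diagnosis_seconds(15, 30)   # 1815 seconds
worst_case = time_to_diagnosis_seconds(20, 60)  # 3620 seconds
```

Under these assumptions the assessment dominates: even the best case is roughly half an hour, so the 15 to 20 seconds of automated detection is a small part of the total.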
This stage can be somewhat complex, because it must be decided whether to recover the data before or after correcting the hardware. This again depends on the nature of the failure. If the data region of the disk drive can be accessed, the technicians can mirror the disk image onto a secondary drive to keep the data safe. If the data areas cannot be accessed because of faults in the PCB or mechanical components of the HDD, the technicians will carry out the hardware corrections before the data recovery process begins.
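The ordering decision described above can be sketched as a small function. This is only an illustration of the reasoning, not a real diagnostic tool: the `DriveState` class, the `plan_recovery` function, and the step labels are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DriveState:
    """Hypothetical result of assessing a failed drive."""
    data_region_accessible: bool    # can the data areas still be read?
    pcb_fault: bool = False         # fault on the drive's circuit board
    mechanical_fault: bool = False  # fault in heads, motor, and so on


def plan_recovery(state: DriveState) -> List[str]:
    """Return the order of steps suggested by the assessment."""
    if state.data_region_accessible:
        # Data is readable: image it first so a copy exists before any repair.
        return ["image_to_secondary_drive", "repair_hardware_if_needed"]
    # Data is not readable: the hardware must be corrected before imaging.
    return ["repair_hardware", "image_to_secondary_drive"]
```

For example, `plan_recovery(DriveState(data_region_accessible=False, pcb_fault=True))` puts the hardware repair first, while a drive whose data region is still readable is imaged before anything else is touched.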
This introduces risks to the data on the server disk drive, as the hardware corrections could damage some or all of the data storage areas. In such cases, a disk platter exchange can be one of the best available solutions. However, one cannot rule out permanent data loss during this process due to mishandling of the platters or other mechanical failures during the operation. Risk is therefore always involved, no matter which data recovery procedure is adopted.
Meanwhile, service has to be kept running during data recovery, since hundreds of network users will be actively connected to the blade server. One practical way to solve this problem is to connect the network users to the backup server immediately after the primary server's disk drives fail. Most automated systems can do this immediately after the failure.
Data transfer bandwidth
Once the data recovery process starts, the technician pays attention to the bandwidth of the data transfer or disk imaging process. Two-way mirroring systems tend to take less time than other types of systems. This is where the Fast Recovery Mechanism (FARM) plays a critical role in raising the recovery bandwidth to optimum levels. The bandwidth cannot be increased indefinitely without considering the capacity of the target drive onto which the data is being transferred. When the storage capacity of the target drive is in the gigabyte range, the transfer rate has to be slowed down for compatibility. If the storage capacity is in the terabyte or petabyte range, the bandwidth can be increased to 2 GB/s or more. This is an approximate optimum speed that keeps the safety of the data and the reliability of the operation in view.
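The effect of bandwidth on recovery time can be estimated with simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and a constant, sustained transfer rate, which real imaging jobs rarely achieve. The function name `imaging_time_seconds` is hypothetical.

```python
GB = 10**9   # decimal gigabyte, an assumption of this sketch
TB = 10**12  # decimal terabyte


def imaging_time_seconds(capacity_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Estimate how long it takes to image a drive at a constant rate."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


# Imaging 1 TB at the 2 GB/s figure quoted in the text.
one_tb_at_2gbps = imaging_time_seconds(1 * TB, 2 * GB)  # 500.0 seconds
```

Under these assumptions, halving the bandwidth doubles the imaging time, which is why the transfer rate gets so much attention during large recoveries.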
Disk space usage
Directing the data recovery process from the main server to single disks may increase congestion in target disk space utilization. FARM technology eases this by channeling the recovery process to multiple disks, either within the same server or on remote disks. Here the placement algorithm plays a critical role in optimizing the transfer bandwidth as well as reducing congestion in the target disk space. As capacity utilization increases, the possibility of congestion decreases over time.
Regular drive replacement in large-scale servers can considerably reduce the chances of disk failure. In this way, large organizations avoid the loss of time and resources spent on emergency data recovery procedures.
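To illustrate why spreading recovery over several disks reduces congestion, here is a minimal sketch of a placement step. It uses a simple greedy "least-used disk first" heuristic chosen for illustration; it is not FARM's actual placement algorithm, and the function name `spread_chunks` and the disk names are hypothetical.

```python
import heapq
from typing import Dict, List


def spread_chunks(chunk_sizes: List[int], disk_used: Dict[str, int]) -> Dict[str, List[int]]:
    """Assign each chunk (by index) to the currently least-used target disk."""
    if not disk_used:
        raise ValueError("at least one target disk is required")
    # Min-heap of (bytes already used, disk name).
    heap = [(used, name) for name, used in disk_used.items()]
    heapq.heapify(heap)
    placement: Dict[str, List[int]] = {name: [] for name in disk_used}
    # Place the largest chunks first so the final load comes out more even.
    for idx in sorted(range(len(chunk_sizes)), key=lambda i: -chunk_sizes[i]):
        used, name = heapq.heappop(heap)
        placement[name].append(idx)
        heapq.heappush(heap, (used + chunk_sizes[idx], name))
    return placement
```

With chunk sizes `[4, 3, 2, 1]` and two empty disks `"a"` and `"b"`, this returns `{"a": [0, 3], "b": [1, 2]}`, so each disk receives 5 units instead of one disk receiving all 10.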
The design principles adopted in a data recovery system can make a big difference to the speed and accuracy of the salvaging process. If your data server runs on a RAID-5 system, it is always better to have an additional disk in the array that is kept in reserve. This can be used by the other disks to store user data as well as parity data and other metadata. In case of a disk failure, this vital information can be used to reconstruct the failed disk more quickly.
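RAID-5 reconstruction relies on XOR parity: the parity block of a stripe is the XOR of its data blocks, so any single missing block equals the XOR of all the surviving blocks. The sketch below shows that idea on in-memory byte strings. The function names are hypothetical, and a real array works on disk stripes with rotating parity rather than on small Python objects.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks: List[bytes]) -> bytes:
    """Rebuild the one missing block of a stripe from all surviving blocks.

    The surviving blocks must include the parity block.
    """
    return xor_blocks(surviving_blocks)


# Example stripe with three data blocks and one parity block.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks([d0, d1, d2])
# Pretend the disk holding d1 failed and rebuild it from the rest.
rebuilt_d1 = rebuild_missing_block([d0, d2, parity])
```

Here `rebuilt_d1` equals the original `d1`. With a reserved disk in the array, the rebuilt blocks have somewhere to be written as soon as the failure is detected.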