Diagnosis of hardware problems. Review: hardware and software technical diagnostic tools

All tools are divided into:

1. Test control and diagnostic tools. A test is a check with a known reference result. Test control means performing control and diagnostics using a test. When carrying out test control, the test object is first removed from the control loop.

2. Functional control tools. These control the system during operation.

Because functional control and test control are carried out under different conditions, the scope of the checks performed differs. Monitoring and diagnostic algorithms can be:

· Conditional.

· Unconditional.

Any control process is a management process. Its purpose is to determine the class of technical condition, or the condition itself, with the greatest confidence.

In practice, the depth of functional control is lower than the depth of test control and diagnostics. To make control possible, special means must be built in when the system is designed.

Monitoring and diagnostic tools and functional monitoring tools can be:

· Software.

· Hardware.

· Combined hardware and software.

Test control and diagnostics fall into the category of preventive means. Functional monitoring tools are designed to detect errors during system operation.

Generalized functional diagrams of test control and diagnostic tools and functional control.

OP – service personnel.

The controller is a device that generates a vector of input influences, as directed by the OP.

OK – object of control.

BRIRK – block for recognizing and recording control results.

Decision – a block that generates a decision based on the results of control.

IOC – model of the control object.

In most cases, the monitoring and diagnostic system operates under the control of the OP. The OP selects a check from the set of valid checks. The controller generates a vector of input influences. The BRIRK also recognizes this vector and selects the corresponding reference result from its memory. The control object processes the input influences and produces a result. The decision block compares the two results and concludes whether the check was passed.
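The flow described above can be sketched roughly as follows. This is only an illustration: the function `run_check`, the dictionaries of stimuli and references, and the toy `and_or_unit` object are hypothetical names introduced here, not part of any described system.

```python
from typing import Callable, Dict, Tuple

Vector = Tuple[int, ...]

def run_check(check_id: str,
              stimuli: Dict[str, Vector],
              references: Dict[str, Vector],
              control_object: Callable[[Vector], Vector]) -> bool:
    """Run one check; True means the response matches the reference."""
    stimulus = stimuli[check_id]          # controller: vector of input influences
    reference = references[check_id]      # BRIRK: reference result from memory
    response = control_object(stimulus)   # OK: the object processes the inputs
    return response == reference          # decision block: compare the results

# Hypothetical object of control: a unit computing AND and OR of two inputs.
def and_or_unit(v: Vector) -> Vector:
    a, b = v
    return (a & b, a | b)

stimuli = {"c1": (1, 0), "c2": (1, 1)}
references = {"c1": (0, 1), "c2": (1, 1)}
results = {cid: run_check(cid, stimuli, references, and_or_unit) for cid in stimuli}
```

Here the OP's role is reduced to choosing `check_id`; a real system would also record the result of every check.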

A reliably verified identical object of control, or a physical or mathematical model, can serve as the model of the control object. The peculiarity of functional control lies in its main conclusion: the reaction of the object of control and the reaction obtained from the model of the object of control do not contradict each other. An alarm is triggered only if the results contradict each other.

Hence, the decision block in the SFC is more complex: it needs a mathematical criterion for establishing that the results contradict each other.

Technical diagnostics and control tools (TDK) are the main part of the TDS: they determine the operational and technical characteristics of these systems and provide consumers with all the necessary information about the technical condition of the diagnosed RES. In diagnostics, they play the role of terminal devices, acting both as a source of information for the consumer and as a receiver and processor of diagnostic information. SDK belong to the wide class of information-measuring systems (IMS); as terminal devices, the SDK and IMS determine, through their parameters, all output parameters of the system. If the diagnostic object allows the location of a failure to be searched for to a certain depth but the SRDK is not adapted for this, then the operation cannot be carried out at the required level.

Technical diagnostic and control tools.

Thus, the main requirement for the SDK is that its capabilities and parameters correspond to the capabilities and parameters of the diagnostic object. In addition, modern information-measuring systems for monitoring and diagnosing RES are complex radio-mechanical systems characterized by a set of functional use (FU) parameters and technical and operational parameters. From this point of view, SDKs can themselves be considered objects of diagnostics and objects of metrological support.

Being an integral part of the STD, the SRDK determine the suitability of the diagnostic object for control, that is, the property of the product that characterizes its suitability for diagnostics and control by specified means. Consequently, when analyzing an STD for any complex RES, the SrDC must either be specified in advance or designed together with the diagnostic object.

Fig. 6. Classification of SrDC.

Criteria: 1 – by the nature of the tasks performed; 2 – by communication method and location; 3 – by purpose and type of information processing; 4 – by the mode of monitoring the diagnostic object and the frequency of use; 5 – by the method of processing information and presenting results; 6 – by type of programming, indication and registration; 7 – by the degree of unification and automation.

Criterion 1: 8 – performance monitoring; 9 – control and diagnostics; 10 – diagnostics; 11 – performance prediction; 12 – forecasting control; 13 – control control;

Criterion 2: 14 – built-in; 15 – external; 16 – mixed; 17 – stationary; 18 – mobile;

Criterion 3: 19 – operational; 20 – pre-launch; 21 – preventive; 22 – technological; 23 – specialized; 24 – universal;

Criterion 4: 25 – with static mode; 26 – with dynamic mode; 27 – with continuous monitoring; 28 – with periodic monitoring; 29 – with sequential fault finding; 30 – with combined fault finding;

Criterion 5: 31 – analog; 32 – discrete; 33 – analog-discrete; 34 – with tolerance assessment of results; 35 – with quantitative assessment of results;

Criterion 6: 36 – with external programming; 37 – with internal programming; 38 – with centralized indication and registration; 39 – with mixed indication and registration; 40 – with autonomous indication;

Criterion 7: 41 – unified; 42 – non-standardized; 43 – semi-automatic; 44 – automatic.

Treated as an integral part of the diagnostic facilities, SRDK can be divided into the following groups:

    universal application (computer-based) and specialized application equipment (diagnostic stands);

    built-in control and means with external control;

    automatic (over 90% of operations are performed automatically), automated (40% - 90% of operations are performed automatically) and manual.
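The automation thresholds in the last item can be expressed as a small helper. This is a sketch only: the function name is hypothetical, and treating everything below 40% as "manual" is an assumption, since the text gives no explicit lower bound for that class.

```python
def automation_class(automated_ops: int, total_ops: int) -> str:
    """Classify a tool by the share of operations performed automatically."""
    if total_ops <= 0:
        raise ValueError("total_ops must be positive")
    share = automated_ops / total_ops
    if share > 0.9:        # over 90% of operations are automatic
        return "automatic"
    if share >= 0.4:       # 40% to 90% of operations are automatic
        return "automated"
    return "manual"        # assumption: anything below 40%
```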

The STD classification makes it possible to describe the purpose of control means, methods of control and communication with the object, methods of obtaining and processing information.

The most widely used are STDs that assess the technical condition of an object at the time of inspection (STDs with performance prediction are promising).

SrDC parameters.

SRDK as a means of technical operation of RES can be classified into:

    information and measuring instruments for general use (voltmeters, ammeters, oscilloscopes, generators, etc.);

    simulators and system parameter meters (various testers);

    simulators of signals of certain types of RES;

    complex instruments for checking the operating condition of the REUiS;

    complex stands for diagnosing, monitoring, adjusting and restoring REUiS;

    diagnostic complexes for setting up complex systems;

    automatic and automated devices and computer-based control systems.

The main parameters of the SDK are: measurement accuracy, reproduction accuracy of emitted signals, information productivity, instrumental reliability, resolution, degree of automation. All of the listed parameters relate to the SrDC derivatives. The technical parameters of the SRDK are the same technical parameters that were considered for the REUiS (operating conditions and reliability parameters are taken into account).

Diagnostic tools are also objects of technical operation and diagnostic objects; for this purpose, they provide self-monitoring modes, which are implemented using built-in or external systems of monitoring and diagnostic tools.

The accuracy of measuring instruments can be assessed by the measure of accuracy $h = 1/(\sigma\sqrt{2})$, where $\sigma$ is the root-mean-square error. The main share of the measurement error comes from the primary transducers and the elements of the measuring path. In general, assuming the component errors are independent, it is determined by the expression $\sigma = \sqrt{\sigma_c^2 + \sigma_n^2 + \sigma_s^2 + \sigma_m^2}$, where $\sigma_c$ is the root-mean-square error of the converters, $\sigma_n$ is that of the normalizers, $\sigma_s$ is that of the switches, and $\sigma_m$ is that of the measuring device itself.
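As a numerical illustration, the sketch below combines the four component errors as a root sum of squares. This combination rule is an assumption (it is the usual one for independent error sources); the function and parameter names are hypothetical.

```python
import math

def total_rms_error(sigma_converters: float,
                    sigma_normalizers: float,
                    sigma_switches: float,
                    sigma_meter: float) -> float:
    """Root-sum-of-squares combination, assuming independent error sources."""
    return math.sqrt(sigma_converters ** 2 + sigma_normalizers ** 2
                     + sigma_switches ** 2 + sigma_meter ** 2)

# Example with made-up component errors, all in the same unit (e.g. percent).
example = total_rms_error(0.3, 0.4, 0.0, 0.0)
```

With these made-up values the largest component dominates the total, which matches the remark that the primary transducers and the measuring path contribute the main share of the error.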

The accuracy of reproduction of simulation signals is characterized by errors in electrical or technical and functional parameters. The productivity of the SRDK is given by the average operational duration of diagnosis or by the number of REUiS diagnosed in a given interval $T$: $N = T / t_d$, where $t_d$ is the duration of diagnosis. The performance of the SRDK depends on its input capacity, as well as on the time needed to make the tools ready for diagnostics. Input capacity is the maximum number of diagnostic indicators that can be determined during the diagnostic process. The resolution of the SDK characterizes the component of the output information that determines whether data from two different sources (signals of one block or signals about the state of two different blocks) can be reproduced separately. The degree of automation is the ratio of the number of automated operations to their total number: $K_a = n_a / n$. The technical utilization coefficient of the SRDC ($K_{tu}$) and its various modifications can be used as indicators of the SRDC.

Diagnostics of CS faults has two aspects: hardware and software. The hardware aspect involves the use of hardware diagnostic tools: standard KIA, special KIA, service boards, devices and complexes.

In the hardware diagnostic method, tools and instruments are used to measure voltages, signal parameters and logic levels in PC circuits. This method requires deep knowledge of PC operating logic, circuit design, radio electronics and electronic components, as well as certain skills in working with service test equipment. It should be noted that purely hardware diagnostics practically never occurs, except when diagnosing with fault dictionaries or tables of reference states; even then, the symptoms one has to rely on are generated by the OS, by a test program, or by a firmware test, so this is no longer purely hardware diagnostics. Diagnostics of individual computer nodes, such as thermal electronics, can be considered purely hardware when the nodes are checked not by automatically running APS verification tests but by feeding test sequences to the node under study directly from a service device, for example a UTK or a stimulus generator.

The software aspect of diagnostics involves the use of testing programs of various classes: firmware tests, built-in test programs, external test programs for general use, and finally, external test programs for in-depth testing. This should also include those small programs or examples that users of hardware and software systems have to write for specific cases of diagnosing faults of a separate PC node in a specific mode of its operation.

With the software diagnostic method, most of the diagnostic procedures are assigned to diagnostic software. This method requires some knowledge of various diagnostic programs, from the POST program to software tools for in-depth diagnostics of the components of the computing system.

An automatic diagnostic system is a complex of software, firmware, hardware and reference documentation (diagnostic manuals, instructions, tests). There are test and functional diagnostic systems. In test diagnostic systems, the stimuli applied to the device being diagnosed come from the diagnostic tools. In functional diagnostic systems, the stimuli received by the device being diagnosed are determined by its operating algorithm. The diagnostic process consists of separate parts (elementary checks), each of which is characterized by a test or operating stimulus applied to the device and a response taken from the device.

The resulting response value (signal values at control points) is called the result of an elementary check. The object of an elementary check is the part of the hardware of the device being diagnosed that the test or operating stimulus of that check is designed to verify. The set of elementary checks, their sequence and the rules for processing the results determine the diagnostic algorithm. A diagnostic algorithm is called unconditional if it specifies one fixed sequence of elementary checks. A diagnostic algorithm is called conditional if it specifies several different sequences of elementary checks, with the choice of the next check depending on the results already obtained.
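The difference between the two kinds of algorithm can be sketched as follows. The check names, the decision tree and the diagnosis strings are hypothetical; each elementary check is modeled as a function that returns True when the response matches the reference.

```python
from typing import Callable, Dict, List, Tuple, Union

Check = Callable[[], bool]
# A tree node is (check name, next step if passed, next step if failed);
# a leaf is a diagnosis string.
Tree = Union[str, Tuple[str, "Tree", "Tree"]]

def run_unconditional(order: List[str], checks: Dict[str, Check]) -> Dict[str, bool]:
    """One fixed sequence: every check runs regardless of earlier results."""
    return {name: checks[name]() for name in order}

def run_conditional(tree: Tree, checks: Dict[str, Check]) -> str:
    """The next check depends on the result of the previous one."""
    while not isinstance(tree, str):
        name, if_pass, if_fail = tree
        tree = if_pass if checks[name]() else if_fail
    return tree

# Hypothetical device whose memory check fails.
checks = {"power": lambda: True, "memory": lambda: False, "cpu": lambda: True}
tree = ("power",
        ("memory", ("cpu", "no fault found", "CPU fault"), "memory fault"),
        "power supply fault")

all_results = run_unconditional(["power", "memory", "cpu"], checks)
diagnosis = run_conditional(tree, checks)
```

In this sketch the conditional algorithm stops after two checks, while the unconditional one always runs all three.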

Windows XP obtains performance data from computer components. A running system component generates performance data. This data is represented as a performance object, which is usually named the same as the component that generates the data. For example, the Processor object is a collection of data about the performance of the processors present in the system.

The various performance objects built into the operating system typically correspond to major hardware components such as memory, processors, etc. Other programs may install their own performance objects. For example, services such as WINS provide performance objects that can be monitored using charts and logs. Each performance object contains counters that provide information about specific elements of the system or service. For example, the Memory object's Pages/sec counter tracks the rate at which memory pages are exchanged. Although a system may contain many more objects, the following objects, which are the ones most commonly used to monitor system components, are available by default: Cache, Memory, Objects, Paging File, PhysicalDisk, Process, Processor, Server, System, Thread.

The System Monitor and Performance Logs and Alerts components provide detailed data about the resources used by specific components of the operating system and by programs designed to collect performance data. Performance data is displayed in chart form. In addition, data is recorded in logs (Appendix B). The Alerts feature lets you notify users via the Windows Messenger service when a counter value reaches, exceeds, or falls below a specified threshold.

Performance monitoring results are often used by Microsoft technical support to help diagnose a problem. Therefore, monitoring system performance is recommended as one of the administrator's tasks.

Task Manager is another tool for getting information about the performance of a computer running Windows XP. Task Manager provides information about the programs and processes running on your computer, as well as a summary of CPU and memory usage.

The SiSoft Sandra package of diagnostic utilities (the abbreviation stands for System Analyzer Diagnostic and Reporting Assistant, that is, an assistant for analyzing and diagnosing the system) is one of the solutions for the non-professional user. The full version of the package includes about 70 modules for collecting information about all the main components of the PC. It can check the location and contents of the main configuration files. The program's graphical interface is quite clear and gives the fullest possible information about the computer, sometimes including undocumented information. The main window of the program resembles the Windows Control Panel, only with more shortcuts. Each of them corresponds to a separate utility responsible for collecting and displaying information about a specific device in the system, providing data about the manufacturer, version, date of manufacture, performance, etc. After installation, a shortcut to SiSoft Sandra appears on the Desktop and in the Control Panel. Double-clicking this icon opens the package shell, a window with icons of the utilities it includes. There are four icon display modes: information utilities, performance utilities, system file viewing, and testing utilities. A mode is selected through the icons on the toolbar at the top of the shell window. By default, the information utilities are displayed.

Summary information about the computer under test is presented in Appendix A. Stress testing of the computer system is presented in Appendix B.

Two main groups of methods are used to monitor and diagnose digital devices: test and functional. To implement them, hardware and software are used. During test control, special stimuli (tests) are applied, and the reactions of the controlled system (device, unit) are captured and analyzed at a time when, as a rule, it is not being used for its intended purpose. This determines the scope of application of this type of control: during system setup, during scheduled maintenance, and for autonomous testing of systems before the start of normal operation.

Functional control is designed to monitor and diagnose the system during its operation. However, if functional control means are available in the system, then, as a rule, they are also used during test control. Functional control means provide:

Detection of a fault at the moment of its first manifestation at the control point, which is especially important in the case when the action of the fault must be quickly blocked;

Providing information necessary to control the operation of the system in the presence of a malfunction, in particular, to change (reconfigure) the structure of the system;

Reduced troubleshooting time.

With hardware functional control, redundant equipment is introduced into the component or device and functions simultaneously with the main equipment. Signals arising during the operation of the main and control equipment are compared according to certain laws. As a result of such a comparison, information is generated about whether the monitored node (device) is functioning correctly. In the simplest case, a copy of the node being tested is used as redundant equipment (so-called structural redundancy), with the simplest control relation in the form of a comparison of two identical sets of codes. In the general case, simpler control devices are used, but the methods for obtaining control relationships become more complicated.

To monitor the functioning of the main and control devices, two comparison methods are used: comparison of input and output words, and comparison of internal states and transitions.

The first method covers duplication, majority voting, and control using prohibited code combinations. It also includes redundant coding methods. Redundant coding is based on introducing additional symbols into the input, processed and output information; together with the main symbols they form codes that have error detection (correction) properties. The second method is used primarily for monitoring digital control devices.



The following types of codes have become widespread for control: parity codes, Hamming codes, iterative codes, equilibrium codes, residue codes, cyclic codes.

A code with a parity (or odd-parity) check is formed by adding one redundant (control) bit to the group of information bits, which form a simple (non-redundant) code. With even parity, the check bit is "0" if the number of ones in the code is even and "1" if it is odd. Subsequently, during transmission, storage and processing, the word is carried together with its check bit. If the receiving device detects that the value of the check bit does not correspond to the parity of the number of ones in the word, this is taken as a sign of an error. Odd parity also detects the complete loss of information, since a code word consisting entirely of zeros is prohibited. A parity-check code has little redundancy and does not require much hardware to implement the check. It is used to control the transfer of information between registers, the reading of information from RAM, and exchanges between devices.
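A minimal sketch of the parity check follows; the function names are hypothetical and bits are represented as Python lists of 0 and 1.

```python
from typing import List

def add_parity(bits: List[int], odd: bool = False) -> List[int]:
    """Append a check bit so that the total number of ones is even (or odd)."""
    check = sum(bits) % 2      # even parity: 1 when the number of ones is odd
    if odd:
        check ^= 1             # odd parity inverts the check bit
    return bits + [check]

def parity_ok(word: List[int], odd: bool = False) -> bool:
    """Check a received word (information bits plus check bit)."""
    return sum(word) % 2 == (1 if odd else 0)

word = add_parity([1, 0, 1, 1])    # three ones, so the check bit is 1
corrupted = word.copy()
corrupted[0] ^= 1                  # a single-bit error is detected
```

With odd parity the all-zero word fails the check, which is why this variant also catches a complete loss of information. An even number of bit errors passes undetected in both variants.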

Iterative codes are used to control the transfer of code arrays between external memory and a processor, between two processors, and in other cases. An iterative code is formed by adding parity bits to each row and each column of the transmitted array of words (a two-dimensional code). In addition, parity can also be computed over the diagonal elements of the word array (a multidimensional code). The detection ability of the code depends on the number of additional control characters. It detects multiple errors and is easy to implement.
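The two-dimensional variant can be sketched as below, assuming even parity for both rows and columns. The function names are hypothetical, and the sketch only locates a single error, reporting any other detected pattern as an exception.

```python
from typing import List, Optional, Tuple

def encode_2d(rows: List[List[int]]) -> List[List[int]]:
    """Append an even-parity bit to every row, then a row of column parities."""
    with_row_parity = [r + [sum(r) % 2] for r in rows]
    column_parity = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [column_parity]

def locate_single_error(block: List[List[int]]) -> Optional[Tuple[int, int]]:
    """Return None if all parities hold, or (row, column) of a single error."""
    bad_rows = [i for i, r in enumerate(block) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*block)) if sum(c) % 2]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("multiple errors detected")

block = encode_2d([[1, 0, 1],
                   [0, 1, 1]])
damaged = [row.copy() for row in block]
damaged[1][2] ^= 1   # flip one bit in row 1, column 2
```

Because a single error violates exactly one row parity and one column parity, their intersection gives its position.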



Correlation codes are characterized by the introduction of an additional symbol for each digit of the information part of the word. If a digit of the word is 0, it is written in the correlation code as “01”; if it is 1, it is written as “10”. A sign of code corruption is the appearance of the combinations “00” and “11”.

A code with simple repetition (match control) is based on repeating the original code combination; decoding is done by comparing the first (information) and second (check) parts of the code. If these parts do not match, the received combination is considered incorrect.
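The correlation code rule can be sketched directly; the function names are hypothetical.

```python
from typing import List

def corr_encode(bits: List[int]) -> List[int]:
    """Write every 0 as the pair 0,1 and every 1 as the pair 1,0."""
    out: List[int] = []
    for b in bits:
        out += [1, 0] if b else [0, 1]
    return out

def corr_decode(code: List[int]) -> List[int]:
    """Recover the bits; the pairs 0,0 and 1,1 indicate corruption."""
    if len(code) % 2:
        raise ValueError("odd number of symbols")
    bits = []
    for first, second in zip(code[0::2], code[1::2]):
        if first == second:
            raise ValueError("prohibited pair detected")
        bits.append(first)   # the first symbol of a valid pair equals the bit
    return bits

encoded = corr_encode([1, 0])
```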

Equilibrium codes are used to control data transfers between devices, as well as when transferring data over communication channels. An equilibrium code is a code in which every word has the same fixed number of ones (the weight is the number of ones in the word). Examples are the "2 out of 5" code and similar "m out of 8" codes. There is an infinite number of equilibrium codes.

Control by prohibited combinations. Microprocessor devices use special circuits that detect the occurrence of prohibited combinations, for example, access to a non-existent address, access to a non-existent device, or an incorrect choice of address.
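As an illustration, the sketch below enumerates the words of a fixed-weight code and checks a received word by its weight. The function names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

def constant_weight_words(length: int, weight: int) -> List[Tuple[int, ...]]:
    """All binary words of the given length with exactly `weight` ones."""
    words = []
    for ones in combinations(range(length), weight):
        words.append(tuple(1 if i in ones else 0 for i in range(length)))
    return words

def weight_ok(word: Tuple[int, ...], weight: int) -> bool:
    """A received word is valid only if its weight is the expected one."""
    return sum(word) == weight

two_of_five = constant_weight_words(5, 2)
```

The "2 out of 5" code has exactly ten words, which is why it is convenient for encoding decimal digits. Any error that changes the number of ones is detected, while a 0-to-1 flip paired with a 1-to-0 flip is not.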

The Hamming correction code is constructed by adding a certain number of control bits to the information bits of the word D; the control bits are formed before the information is transmitted by calculating the parity of the number of ones in certain groups of information bits. The control device at the receiving end forms an error address from the received information and control bits through similar parity calculations; the erroneous bit is corrected automatically.

Cyclic codes are used in means of sequential transmission of the binary symbols that make up a word. A typical example of such means is a communication channel through which discrete data is transmitted. The peculiarity of cyclic codes, which gives them their name, is that if an N-digit code combination belongs to a given code, then the combination obtained by a cyclic permutation of its symbols also belongs to this code. The main element of encoding and decoding equipment for such codes is a shift register with feedback, which has the necessary cyclic properties. The cyclic code of an N-digit number, like any systematic code, consists of information symbols and control symbols, the latter always occupying the low-order digits. Since serial transmission is carried out starting from the most significant bit, the control symbols are transmitted at the end of the code.
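The error-address idea can be sketched for the common (7,4) case: four information bits and three control bits placed at positions 1, 2 and 4. The choice of this particular code size and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 information bits into positions 3, 5, 6, 7 of a 7-bit word."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(word: List[int]) -> Tuple[List[int], int]:
    """Return the corrected word and the error address (0 means no error)."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    address = s1 + 2 * s2 + 4 * s4   # 1-based position of the erroneous bit
    if address:
        w[address - 1] ^= 1
    return w, address

codeword = hamming74_encode([1, 0, 1, 1])
```

The three recomputed parities, read as a binary number, point directly at the damaged position, which is what allows automatic correction of a single error.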
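A small sketch of a systematic cyclic code follows. The (7,4) code with generator polynomial x^3 + x + 1 is an assumed example, words are held as Python integers with the most significant bit first, and all names are hypothetical.

```python
GEN = 0b1011   # assumed generator polynomial g(x) = x^3 + x + 1
N, K = 7, 4    # code length and number of information bits

def poly_mod(value: int, gen: int = GEN) -> int:
    """Remainder of value(x) divided by gen(x) over GF(2)."""
    gen_len = gen.bit_length()
    while value.bit_length() >= gen_len:
        value ^= gen << (value.bit_length() - gen_len)
    return value

def encode(data: int) -> int:
    """Systematic encoding: information bits high, control bits low."""
    shifted = data << (N - K)
    return shifted | poly_mod(shifted)

def is_codeword(word: int) -> bool:
    """A word belongs to the code if g(x) divides it without remainder."""
    return poly_mod(word) == 0

def rotate_left(word: int) -> int:
    """Cyclic shift of an N-bit word by one position."""
    return ((word << 1) | (word >> (N - 1))) & ((1 << N) - 1)

codeword = encode(0b1101)
```

In hardware the division in `poly_mod` is what the shift register with feedback performs bit by bit; the loop here is only a software stand-in for it.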

Software functional monitoring is used to improve the reliability of individual devices, systems and networks in cases where hardware error detection is not effective enough. Software methods of functional diagnostics are based on establishing certain relationships between the objects involved in the course of operation in order to ensure error detection. Objects can be individual commands, algorithms, program modules, and software packages (functional and service).

Control relationships are established at the system, algorithmic, software and firmware levels.

The formation of control states is based on two principles:

Implementation of functional diagnostic methods based on coding theory by software at various levels, i.e. information redundancy is used;

Drawing up special relationships according to various rules based on the use of time redundancy (double and multiple counting, comparison with pre-calculated limits, truncation of the algorithm, etc.) by transforming the computational process.

Both principles are used to diagnose all basic operations performed by processor means - input-output operations, storage and transmission of information, logical and arithmetic.

The advantage of software means of functional control is their flexibility and the ability to use any combination of them for prompt error detection. They play an important role in ensuring the required level of reliability of information processing. Their implementation, however, requires additional computer time and memory, additional programming work and the preparation of control data.

Control by the double or multiple counting method consists of solving the entire problem, or individual parts of it, two or more times. The results are compared, and their coincidence is considered a sign of correctness. More complex comparison rules are also used, for example majority rules, where the result produced by the largest number of runs is accepted as correct.

Double or multiple counting is implemented as follows: control points are defined at which the comparison will take place, dedicated memory areas are allocated to store the results of intermediate and final calculations, and comparison and conditional jump commands are used either to continue the calculation (if the results coincide) or to go to the next repetition (if they do not).

Control using the truncated algorithm method. Based on an analysis of the algorithms executed by the processor, a so-called truncated algorithm is constructed. The problem is solved using both the full algorithm, which provides the necessary accuracy, and the truncated algorithm, which gives a solution quickly, although with less accuracy. The exact and approximate results are then compared. An example of a truncated algorithm is solving differential equations with a changed (increased) solution step.
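Both variants can be sketched as follows. The function names and the retry limit are hypothetical; `compute` stands for the section of the program between two control points, and its results must be hashable for the majority vote.

```python
from collections import Counter
from typing import Callable, Hashable

def double_count(compute: Callable[[], Hashable], max_attempts: int = 3) -> Hashable:
    """Compute twice and compare; repeat the pair if the results differ."""
    for _ in range(max_attempts):
        first, second = compute(), compute()
        if first == second:          # coincidence is taken as a sign of correctness
            return first
    raise RuntimeError("results never coincided")

def majority_count(compute: Callable[[], Hashable], runs: int = 3) -> Hashable:
    """Compute several times and accept the result given by most runs."""
    results = [compute() for _ in range(runs)]
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= runs:
        raise RuntimeError("no majority among results")
    return value
```

A simulated transient failure shows the behavior: if `compute` returns 7, 8, 7, 7 on successive calls, `double_count` rejects the first pair and accepts the second, while `majority_count` over the first three calls returns 7. Neither variant catches a systematic error that repeats identically on every run.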

Substitution method. When solving systems of equations, including nonlinear and transcendental ones, the values found are substituted back into the original equations. The right and left sides of each equation are then compared to determine the residuals. If the residuals do not fall outside the specified limits, the solution is considered correct. The time spent on such control is always less than on a repeated solution. In addition, this method detects not only random but also systematic errors, which are often missed by double counting.
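A minimal sketch of the residual check is given below. The representation of an equation as a pair (left-hand-side function, right-hand-side value), the tolerance and the example system are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Equation = Tuple[Callable[..., float], float]   # (left side as a function, right side)

def residuals(equations: List[Equation], solution: Sequence[float]) -> List[float]:
    """Substitute the solution and return left side minus right side."""
    return [lhs(*solution) - rhs for lhs, rhs in equations]

def solution_ok(equations: List[Equation], solution: Sequence[float],
                tol: float = 1e-9) -> bool:
    """The solution is accepted if every residual stays within the limit."""
    return all(abs(r) <= tol for r in residuals(equations, solution))

# Hypothetical nonlinear system: x + y = 3, x * y = 2.
system = [(lambda x, y: x + y, 3.0),
          (lambda x, y: x * y, 2.0)]
```

For this system, `solution_ok(system, (1.0, 2.0))` holds, while a wrong candidate such as `(1.0, 2.5)` is rejected however it was produced, which is why the check also catches systematic errors.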

Limit testing method or the "forks" method. In most problems, you can find in advance the limits (“fork”) within which some of the required quantities should lie. This can be done, for example, based on an approximate analysis of the processes described by this algorithm. The program provides certain points where a check is implemented to ensure that variables are within specified limits. Using this method, you can detect gross errors that make continuation of work pointless.
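A limit check at a control point can be sketched as below. The variable names and the limits are made up for the example.

```python
from typing import Dict, Tuple

def check_limits(values: Dict[str, float],
                 limits: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Return the variables whose values fall outside their (low, high) fork."""
    violations = {}
    for name, (low, high) in limits.items():
        value = values[name]
        if not (low <= value <= high):
            violations[name] = value
    return violations

# Hypothetical limits known in advance from an approximate analysis.
limits = {"temperature_c": (-40.0, 85.0), "voltage_v": (4.5, 5.5)}
violations = check_limits({"temperature_c": 21.0, "voltage_v": 7.2}, limits)
```

An empty result lets the computation continue; a non-empty one signals a gross error at that control point.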

Validation using additional connections. In some cases, additional relationships between the desired quantities can be used for control. A typical example of such relationships is the well-known trigonometric identities. Correlation relationships can be used in tasks involving the processing of random processes and statistical processing. A variation of this approach is the so-called balance methods, whose essence is that individual groups of data satisfy certain relationships. The method makes it possible to detect errors caused by failures.

The method of redundant variables consists of introducing additional variables that are either related to the main variables by known relationships or whose values are known in advance under certain conditions.

Control by the counting back method. In this case, the initial data (arguments) are recovered from the obtained result (function values) and compared with the originally specified initial data. If they coincide (with a given accuracy), the result obtained is considered correct. Inverse functions are often used for counting back. The method is advisable when the inverse functions can be implemented with a small number of instructions and little computer time and memory.
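The idea can be sketched with a generic helper; the function name and the tolerance are assumptions, and the exponential with its inverse, the logarithm, serves only as an example pair.

```python
import math
from typing import Callable

def verify_by_inverse(forward: Callable[[float], float],
                      inverse: Callable[[float], float],
                      argument: float,
                      rel_tol: float = 1e-9) -> float:
    """Compute forward(argument), recover the argument and compare."""
    result = forward(argument)
    recovered = inverse(result)          # counting back from the result
    if not math.isclose(recovered, argument, rel_tol=rel_tol):
        raise ArithmeticError("recovered argument differs from the original")
    return result

value = verify_by_inverse(math.exp, math.log, 2.0)
```

If the forward computation is faulty, for example it returns `math.exp(x) + 1`, the recovered argument no longer matches and the error is reported.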

Checksum method. Separate arrays of code words (programs, source data, etc.) are assigned redundant control words, which are obtained in advance by summing all the words of a given array. To carry out control, the summation of all words of the array and bitwise comparison with the reference word are carried out. For example, when transmitting data over a communication channel, all encoded words, numbers and symbols of the transmitted group of records are summed at the input to obtain checksums. The checksum is recorded and transmitted along with the data.
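A minimal sketch of the checksum method is shown below. Truncating the sum to a fixed word width is an assumption (the text only says the words are summed), and the function names are hypothetical.

```python
from typing import Iterable

def checksum(words: Iterable[int], width: int = 16) -> int:
    """Sum all words of the array, keeping only the low `width` bits."""
    return sum(words) & ((1 << width) - 1)

def verify(words: Iterable[int], reference: int, width: int = 16) -> bool:
    """Recompute the sum and compare it with the stored control word."""
    return checksum(words, width) == reference

array = [0x1234, 0xFFFF, 0x0001]
control_word = checksum(array)          # computed in advance and stored
corrupted = [0x1234, 0xFFFE, 0x0001]    # one word changed in transit
```

Because addition is commutative, such a checksum does not notice words that have merely changed places.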

Control by the record counting method. A record is a precisely defined set of data characterizing an object or process. The number of records contained in individual arrays can be calculated in advance. This number is stored in memory. When the corresponding data set is processed, the control number is checked periodically to detect lost or unprocessed data.

Monitoring the time taken to solve problems and the frequency with which results are produced is one of the principles for determining the correctness of the computational process. An excessive increase in the duration of the solution indicates that the program is “looping.” The so-called marker pulses (or time stamps) used in real-time systems serve the same purpose. Marker pulses are used to prevent the processor from stopping or performing incorrect calculation cycles due to an error in the command sequence. They are used both for the entire algorithm and for individual sections.

These methods are implemented by determining the longest path through the commands, taking into account interruptions by other programs. The processor uses a program time counter, in which the maximum permissible time for program execution is set. When the counter reaches zero, a signal is generated that the permissible time has been exceeded, and this signal interrupts the program. The execution sequence of commands and program modules is controlled in two ways. In the first, the program is divided into sections, and for each section a convolution is calculated (by counting the number of operators, using signature analysis, or using codes). Then the trace of the program is taken, its convolution is calculated and compared with the previously calculated one. In the second, each section is assigned a specific code word (section key). This key is written to a selected RAM cell before execution of the section begins, and one of the last commands of the section checks for the presence of “its” key. If the code word does not match the section, an error has occurred. The nodes of branching programs are checked by repeated counting, and the selection of only one branch is checked using keys. Cyclic sections of a program are controlled by checking the number of loop repetitions with an additional program counter.

In test control, components, devices and the system as a whole are tested using special equipment: test stimulus generators and output reaction analyzers. The need for additional equipment and time (regular operation is impossible during testing) limits the use of test methods.
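The section-key technique can be sketched as follows. The class and function names are hypothetical, and a Python object stands in for the selected RAM cell; the time counter and convolution variants are not covered by this sketch.

```python
from typing import Callable, Optional

class ControlFlowError(RuntimeError):
    """Raised when a section finds a key other than its own."""

class KeyCell:
    """Stands in for the RAM cell that holds the key of the current section."""
    def __init__(self) -> None:
        self.value: Optional[int] = None

def run_section(cell: KeyCell, key: int, body: Callable[[], int]) -> int:
    cell.value = key                 # written before the section starts
    result = body()
    if cell.value != key:            # one of the last commands checks the key
        raise ControlFlowError("section key mismatch")
    return result

cell = KeyCell()
ok_result = run_section(cell, 0xA1, lambda: 5)

def stray_jump() -> int:
    # Simulates control passing through another section, which writes its own key.
    cell.value = 0xB2
    return 5
```

Calling `run_section(cell, 0xA1, stray_jump)` raises `ControlFlowError`, because the key found at the end of the section is no longer the one written at its start.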

Testing with standard program, the functional diagram of the organization of such testing includes a test generator containing a set of pre-prepared statistical tests and an analyzer that works on the principle of comparing the output reaction with a standard one, also obtained in advance by special means preparing tests.

At probabilistic testing as a test generator, a generator of pseudo-random influences is used, implemented, for example, by a shift register with feedback. The analyzer processes output reactions according to certain rules (determines the mathematical creation of the number of signals) and compares the obtained values ​​with the reference ones. Reference values ​​are calculated or obtained on a previously debugged and tested device.

In contact testing (comparison with a standard), any stimulation method can be used (software, or a generator of pseudo-random stimuli), and the reference responses are formed during testing by a duplicate device (the standard). The analyzer compares the output responses with the reference ones.

Syndromic testing (the method of counting the number of switchings). The functional diagram contains a test generator that steps through 2^N input sets at the input of the circuit, and at the output there is a counter that counts the number of switchings. If the number of switchings is not equal to the reference value, the circuit is considered faulty.
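The switching count can be sketched as follows, assuming the generator steps through all 2^N input sets in counting order and the circuit has a single output. The function `count_switchings` and the example circuits are hypothetical illustrations.

```python
def count_switchings(circuit, n_inputs):
    """Apply all 2**n_inputs input sets in counting order and count
    how many times the single output changes value."""
    previous = None
    switchings = 0
    for word in range(2 ** n_inputs):
        # Bit i of the counter drives input i of the circuit.
        inputs = [(word >> i) & 1 for i in range(n_inputs)]
        output = circuit(*inputs)
        if previous is not None and output != previous:
            switchings += 1
        previous = output
    return switchings


def and_gate(a, b):
    return a & b


def and_gate_output_stuck_at_0(a, b):
    # Simulated fault: the output is stuck at logical 0.
    return 0


reference = count_switchings(and_gate, 2)
faulty = count_switchings(and_gate_output_stuck_at_0, 2)
circuit_is_faulty = faulty != reference
```

A single count is a strong compression of the output sequence, so different circuits can produce the same number of switchings; in this sketch, for example, an OR gate gives the same count as the AND gate.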

In signature testing, the output responses obtained over a fixed time interval are processed by a shift register with feedback, a signature analyzer, which compresses long sequences into short codes (signatures). The signatures obtained in this way are compared with the reference ones, which are obtained by calculation or on a previously debugged device. The control object is stimulated by a generator of pseudo-random stimuli.
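A minimal sketch of a serial signature analyzer follows. The register width and tap positions are assumptions made for the example, not values taken from the text, and the bit streams are made-up data.

```python
def signature(bits, taps=(0, 2, 3, 5), width=16):
    """Compress a sequence of bits into a `width`-bit signature using
    a shift register with feedback.

    Each input bit is XORed with the tapped state bits, and the result
    is shifted into the top of the register.
    """
    state = 0
    for bit in bits:
        feedback = bit & 1
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = (state >> 1) | (feedback << (width - 1))
    return state


# Reference response, e.g. recorded on a previously debugged device.
reference_bits = [1, 0, 1, 1, 0, 0, 1, 0] * 4

# Response of the device under test with one corrupted bit.
observed_bits = list(reference_bits)
observed_bits[5] ^= 1

reference_signature = signature(reference_bits)
observed_signature = signature(observed_bits)
fault_detected = observed_signature != reference_signature
```

With this register a single corrupted bit always changes the signature. Longer error patterns can occasionally map to the same signature, which is the price of compression.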

In conclusion, it should be noted that there is no universal control method. The choice of method should be made depending on the functional purpose of the digital device, the structural organization of the system, and the required confidence and reliability indicators.

When carrying out routine maintenance or during pre-flight preparation of the IVK, the main control methods are test methods. During the flight, the main ones are functional control methods, and testing is mainly carried out with the aim of localizing faults if they occur.

6. PREDICTION OF THE STATE OF MEASURING AND COMPUTING COMPLEXES WHEN ACCOUNTING FOR THE INFLUENCE OF ELASTIC PROPERTIES ON THE OBJECT OF CONTROL

If you need to repair something, you first need to determine what has gone wrong, and that is what diagnostics are for. It is advisable to continue until you are about 90% sure of the cause of the breakdown.

Rather than reinstalling Windows, you can simply install a special program that diagnoses the computer and identifies problems in both the software and the internal components of the PC. You should always consider other reasons why your computer may behave strangely.

Infection with viruses or other malicious software:

This is one of the most common problems. Viruses can control the behavior of a PC directly or by damaging its operating system. Here everything can be solved with an antivirus program and a firewall.

PC not optimized or configured:

This is also a very common problem. An example is errors in disk sectors. Here everything is solved with PC optimization software.

Failure in hardware or programs:

That is, there are problems with PC components, for example with the motherboard, the video card, and so on. Here you need a computer diagnostic program. It will help identify all or most of the problems and, in some cases, suggest the best options for solving them.

Diagnostic programs:

Universal programs diagnose all PC systems. They are useful first of all to the ordinary user, because they also provide a complete description of all computer systems. They include an excellent suite for testing all PC components, both programs and devices.

These include:

  • 1) SiSoftware Sandra Lite
  • 2) PC Wizard
  • 3) AIDA64
  • 4) Everest Home Edition.

Special programs most often specialize in working with hard drives, flash drives, and other storage devices. Use them very carefully, and do not press anything unless you know exactly what it is for and how it works, because the consequences can be unpredictable.

Diagnosis of hardware problems.

First, it is worth understanding the causes of hardware failure. Both dust and unfavorable climatic conditions worsen the condition of PC components. Accordingly, hardware failure can be caused by oxidation of contacts, by dust (and therefore static electricity) on microcircuits and connectors, or by their overheating. Overheating can in turn be caused by poor cooling.

All these failures can also result from a voltage surge, an unstable power supply, or improper grounding. The first recommendation here is to use surge protectors, a UPS, and computer grounding. It is better not to ground the computer at all than to ground it incorrectly. The PC case and a modem connected to a telephone line must be grounded separately. Do not ground the case to a heating radiator, since other appliances, for example a refrigerator, a washing machine, or a hammer drill, may be grounded to the same radiator. In that case the "ground" becomes a phase with a potential difference. It is not advisable to ground several devices at the same point. It is not recommended to connect household appliances to the same surge protector as the computer, but the monitor, printer, and system unit are best powered from one surge protector.

Microcircuit failure can also be caused by a shorted wire or by power reaching the ground contact. Therefore, it is always worth monitoring the quality and condition of cable connections.

Typical problems:

First, determine whether there is a burning smell and where it is coming from. If there is none, check the reliability of the power connection. If that does not help, turn on the PC and check whether the fans of the power supply unit (PSU), the case, and the processor cooler are spinning (at the same time, check the cooler mounting). If they do not spin and the hard drive does not make the characteristic sound of the spindle spinning up, then the power supply has failed. The presence of voltage at its output can be checked with a tester by measuring the voltage at the contacts of the system board where the power supply's wiring harness is connected. It is worth connecting a new power supply and checking the integrity of the remaining components. First, they need to be visually inspected for burnt elements.

Although a working monitor breaks quite rarely, it is worth checking whether signals are being supplied to it from the video adapter. To do this, with the video adapter inserted into the motherboard, use an oscilloscope on pins 10 and 13 (ground and sync, respectively) of its 15-pin D-Sub connector to check for the presence of operating signals.

To make it easier to find a faulty component, the most common symptoms of breakdowns of various equipment are used. When a processor fails, traces of burning are most often visible on its pins.

Failed components can be identified by burnt pins and darkening in the affected area. There are also failures of clock generators and delay lines, as well as burnout of ports.

A broken contact on the board is also sometimes encountered. This may be caused by an expansion card not being fully seated in the slot, the board being bent, the contacts on the back of the board being shorted to the case, or the wires running from the power supply to the motherboard being too short.

In hard drives, the most vulnerable points are the controller, which is prone to overheating, and the IDE connector. A burnt-out controller can be identified by darkening near its mounting points. Overheating of the microcircuit also leads to deterioration of the contact between the HDD controller and the HDA (head-disk assembly). Mechanical problems with the hard drive motor can be recognized by strong vibration of the HDD case when the disks rotate. Widespread problems were noticed with IBM DTLA and Ericsson series drives (70GXP and 60GXP), Maxtor 541DX, Quantum Fireball 3, and the Fujitsu MPG series.

In CD drives, the optical-mechanical part fails most often, in particular the mechanism for laser positioning and disc detection. As a rule, such a breakdown is caused by a malfunction of the MCU (system control microprocessor), which generates the control signals, or of the laser reader motor driver, which is responsible for the excitation signal. To check them, measure the output signals at the corresponding contacts of the MCU. A characteristic symptom of a malfunctioning MCU is the lack of movement of the laser reader when the power is first turned on. Floppy disk drives most often experience mechanical failures associated with lifting and pressing the floppy disk.

Software and hardware diagnostics.

If none of the above helped to identify the breakdown, you will have to move on to software and hardware diagnostics. For it to be successful, you need to know exactly the order in which PC devices are turned on.

Computer boot order.

  • 1) after the power is turned on, the power supply performs a self-test. If all output voltages are within the required ranges, the PSU sends a Power_Good (P_G) signal to the motherboard on pin 8 of the 20-pin ATX power connector. About 0.1-0.5 s passes between turning on the PC and sending the signal.
  • 2) the timer chip receives the P_G signal and stops generating the Reset (initial setting) signal supplied to the microprocessor. If the processor is faulty, the system freezes.
  • 3) if the CPU is operational, it begins to execute the code stored in the BIOS ROM at address FFFF0h (the address of the system reboot program). This address contains an unconditional JMP command to the start address of the system boot program in the specific BIOS ROM (usually address F0000h).
  • 4) execution of the specific BIOS ROM code begins. The BIOS begins checking system components for functionality (POST, Power-On Self-Test). If an error is detected at this stage, the system reports it with beeps, because the video adapter has not yet been initialized. The chipset and DMA are checked and initialized, and a memory capacity test is performed. If the memory modules are not fully inserted or some memory banks are damaged, then either the system freezes or long, repeating beeps sound from the system speaker.
  • 5) the BIOS image is decompressed into RAM for faster access to the BIOS code.
  • 6) the keyboard controller is initialized.
  • 7) the BIOS scans the memory addresses of the video adapter, starting from C0000h and ending with C7800h. If the BIOS of the video adapter is found, the checksum (CRC) of its code is checked. If the checksums match, control is transferred to the Video BIOS, which initializes the video adapter and displays information about the Video BIOS version. If the checksum does not match, the message “C000 ROM Error” is displayed. If the Video BIOS is not found, the driver stored in the BIOS ROM is used to initialize the video card.
  • 8) the BIOS ROM scans the memory space starting from C8000h, looking for the BIOS of other devices such as network cards and SCSI adapters, and checks their checksums.
  • 9) the BIOS checks the value of the word at address 0472h to determine whether the boot should be hot or cold. If the word 1234h is written at this address, the POST procedure is not performed and a “hot” boot occurs.
  • 10) in case of cold boot, POST is performed. The processor is initialized and information about its make and model is displayed. One short signal is issued.
  • 11) the RTC (Real Time Clock) is tested.
  • 12) determining the CPU frequency, checking the type of video adapter (including built-in).
  • 13) testing standard and extended memory.
  • 14) assignment of resources to all ISA devices.
  • 15) initialization of the IDE controller. If a 40-conductor cable is used to connect an ATA/100 HDD, a corresponding message appears.
  • 16) initialization of the FDC controller.
  • 17) the BIOS ROM looks for a system floppy disk or the MBR of the hard drive, reads sector 1 on track 0 of side 0, and copies this sector to address 7C00h. This sector is then checked: if it ends with the signature 55AAh, the MBR code looks through the Partition Table for the active partition and then tries to boot from it. If the first sector ends with any other signature, interrupt Int 18h is called and the message “DISK BOOT FAILURE, INSERT SYSTEM DISK AND PRESS ENTER” or “Non-system disk or disk error” is displayed on the screen.
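The check in the last step can be reproduced on a saved copy of the first sector. The sketch below assumes the classic 512-byte MBR layout: a four-entry partition table at offset 446 with 16 bytes per entry, a status byte of 80h marking the active partition, and the bytes 55h, AAh at offsets 510 and 511. The function names and the sample sector are hypothetical.

```python
SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # 1BEh, start of the partition table
PARTITION_ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80             # status byte of a bootable partition


def has_boot_signature(sector):
    """Return True if the sector ends with the bytes 55h, AAh."""
    return (
        len(sector) == SECTOR_SIZE
        and sector[510] == 0x55
        and sector[511] == 0xAA
    )


def active_partition(sector):
    """Return the index (0-3) of the first active partition entry,
    or None if no entry is marked active."""
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        if sector[offset] == ACTIVE_FLAG:
            return index
    return None


# Build an artificial sector for the example: valid signature,
# second partition entry marked active.
sample = bytearray(SECTOR_SIZE)
sample[510] = 0x55
sample[511] = 0xAA
sample[PARTITION_TABLE_OFFSET + PARTITION_ENTRY_SIZE] = ACTIVE_FLAG

bootable = has_boot_signature(sample)
active = active_partition(sample)
```

A sector that fails `has_boot_signature` corresponds to the case in which the BIOS reports a disk boot failure.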

As for the last point, the errors indicated in it point to a malfunction of the hard drive (software or hardware). Now all that remains is to identify at what exact moment the computer stops working. If this occurs before messages appear on the monitor, the malfunction can be determined by sound signals. The most common sound signals are given in the tables.

Table 1 - IBM BIOS error beeps

Table 2 - IBM POST fault sound codes for AMI BIOS


It is worth noting that the sound signals may differ from those shown above because of differences in BIOS versions. If the sound signals do not help determine the malfunction, you can only rely on hardware diagnostics. It is carried out by several means.

Hardware diagnostics.

The operation of individual units can be checked by touching them with your hand to check their heating. A minute after the PC is turned on, the chipset, processor, memory chips, and video card units should warm up. If they feel warm, this is enough to conclude at least that power is being supplied to these elements. With a high degree of probability they will turn out to be working.

The second method is more scientific and requires some engineering training. It consists of measuring potentials on various elements. For this you need a tester and an oscilloscope. It is advisable to have a wiring diagram of the motherboard, since the board is multi-layered and the signal paths are not obvious. It is worth starting the measurements with the power elements of the input circuits and the stabilizing and shunt capacitors, checking for the presence of +3.3 and +5 V at the corresponding points of the motherboard, and checking the operation of the clock generators. After this, check for the presence of standard signals at the processor socket pins. Next, check for signals in the slots and ports. The last thing to deal with is the logic elements (although repairing them often turns out to be impractical). This will require knowledge of the layout of ports and slots. This information is shown in the tables below.

Table 3 - Power connector pinout

Table 4 - Port layout


The third diagnostic tool is professional diagnostic hardware. This includes diagnostic cards of the DP-1 type and the PC-3000 complex, created by the ROSC company. The diagnostic board is installed in a free slot on the motherboard, and after the PC is turned on, its indicator displays an error code in hexadecimal form. The use of such a board significantly increases the likelihood of localizing the fault. DP-1 relies on the processor operating correctly, but the CPU rarely fails.

At present, diagnostic cards, test ROM BIOS, and other diagnostic tools are produced in Russia by ACE Laboratory.

When doing hardware diagnostics, you should keep in mind that in most cases only one device fails, and the easiest way to identify it is to replace it with a similar one that is guaranteed to work.

As for power supplies and peripheral devices, diagnosing faults in them is a topic for a separate discussion, but a number of tips can be given regarding monitors. Quite often, the intermediate horizontal transformer, connected between the driver (pre-output) and output horizontal transistors, fails. Its main malfunction, as a rule, is a short circuit between turns. This transformer is part of the high-voltage horizontal scanning unit, and the high voltage is supplied to the CRT (Cathode Ray Tube). Therefore, the absence of a glow on the screen and the absence of a raster often indicate the absence of high voltage. Typically, a vertical line on the screen also indicates a failure of the horizontal scanning unit. You can check for the presence of high voltage on the CRT by running your hand over the surface of the screen. If high voltage is applied, you should feel some vibration or static crackling.

Software diagnostics.

If the computer still turns on but is unstable, freezes when loading, or crashes to a blue screen, then this is most often a consequence of overclocking, local overheating, or “glitchy” memory, as well as errors on the HDD (these include Windows crashes).

The stability of these components can be checked under DOS by booting from a system floppy disk or another boot disk. To do this, use the utilities CheckIT, PC Doctor, Memtest 86, Stress Linux, Norton Diagnostics, and The Troubleshooter. For professional testing and HDD recovery, use HDDUtility and MHDD, but they only work correctly under MS-DOS 6.22. The first thing to do with them is to check the SMART attributes that describe the state of the HDD. You can also use Norton Disk Doctor to diagnose, check, and mark bad sectors.

It should be remembered that a full hardware test can only be performed under Windows, by testing the stability of operation in burn-in tests for at least 24 hours. Such tests include CPU Hi-t Professional Edition, CPU Stability Test, Bionic CPU Keeper, CPU Burn, Hot CPU Tester Pro, HD_Speed, DiskSpeed 32, and MemTest.

It is much easier to prevent a failure than to correct its consequences. It is therefore worth regularly (at least once every few weeks) monitoring the voltages produced by the power supply, looking at the SMART parameters of the HDD (with the Active SMART, SMARTVision, or SMART Disk Monitor programs), checking the processor temperature, and making sure that cooling is good and that there are no extraneous sounds. It is also a good idea to lubricate the fans with machine oil at least once every six months.
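Regular monitoring of the power supply voltages can be reduced to a simple range check. The sketch below is only an illustration: the list of rails, the 5% tolerance, and the sample readings are assumptions and should be replaced with the figures from the documentation of the actual power supply. How the readings are obtained (a tester or a monitoring program) is outside the scope of the example.

```python
# Nominal values of the monitored rails, in volts (assumed for the example).
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}


def rails_out_of_range(measured, tolerance=0.05):
    """Return the names of rails whose measured voltage is missing or
    deviates from nominal by more than `tolerance` (a fraction)."""
    bad = []
    for rail, nominal in NOMINAL_RAILS.items():
        value = measured.get(rail)
        if value is None or abs(value - nominal) > tolerance * nominal:
            bad.append(rail)
    return bad


# Made-up readings: the +12 V rail sags under load.
readings = {"+3.3V": 3.31, "+5V": 5.10, "+12V": 11.20}
problems = rails_out_of_range(readings)
```

Logging the result of such a check every few weeks makes a gradual drift of the power supply visible before it leads to unstable operation.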