Purposes of using databases. Database organization


Introduction

The widespread use of information processing systems in management, in solving complex economic problems, and in scientific research places increased demands on the efficiency of data organization under conditions of collective use.

The traditional way of organizing data, as arrays tailored to specific tasks, is static and rigidly bound to the corresponding software. It leads to duplication in the accumulation and verification of data, wastes significant memory on storage, and forces frequent reorganization of the data whenever the tasks change.

One of the main ways to overcome these shortcomings is to create databases that ensure the maintenance of a dynamic information model of complex managed objects and processes in the system and collective access to it.

1. Database organization

A database (DB) is defined as a collection of interrelated data characterized by: usability by a large number of applications; the ability to quickly obtain and modify the necessary information; minimal redundancy of information; independence from application programs; a common, controlled search method.

The ability to use a database for many applications simplifies the implementation of complex queries, reduces redundancy, and increases the efficiency of information use in data processing systems. Minimal redundancy and the ability to modify data quickly keep all data at the same level of currency. Independence of the data from the application programs that use them is the main property of a database: changes to the data structure do not require changes to the application programs.

The traditional form of database organization that ensures such independence is a three-level structure: the application programmer's logical data structure (subschema); the general logical data structure (schema); and the physical data structure.

Database schemas and subschemas are often depicted as diagrams. Fig. 1 shows a general diagram of the logical structure of a database and the subschemas of two application programmers who have different views of the data. Solid lines represent connections in the schema. Simple relationships are indicated by a single arrow; one-to-many relationships are indicated by a double arrow. Dashed lines represent cross-references. Cross-references make it possible to avoid repeating the SUPPLIER and SPECIFICATIONS - LOT - PRODUCT records in each PURCHASE ITEM record.

Creating a database is not a one-time process; it extends over the entire period of its existence. The three-level organization makes it possible to change the structure of the database quickly as control systems are maintained and modified, as user tasks grow, and as hardware is improved and expanded. It ensures mutual independence between changes to the general logical structure of the database and the application programs (logical data independence), and it allows the physical placement and organization of data to be changed without changing the general logical data structure or the data structures of application programmers (physical data independence).

2. Database management systems

The use of database management systems (DBMS) makes it possible to exclude data descriptions and detailed data-management programming from application programs. Descriptions are replaced by references to the general logical data structure, and management programming is replaced by data manipulation commands executed by universal software.

The main function of a DBMS, along with updating data, is processing user requests to search for data and transfer it to application programs. For example, the sequence of basic actions a DBMS performs when reading a record consists of the following operations:

1) the application program issues a read request to the DBMS containing the value of the segment or record key;

2) the DBMS searches the application program's subschema for a description of the data for which the request was issued;

3) using the general logical data schema, the DBMS determines which type of logical data is needed;

4) from the description of the physical organization of the data, the DBMS determines which physical record must be read;

5) the DBMS issues a command to the operating system to read the required record;

6) the operating system reads the data into system buffers;

7) by comparing the schema and the subschema, the DBMS extracts the information requested by the application program;

8) the DBMS transfers the data from the system buffers to the work area of the application program.
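The read sequence above can be sketched as a toy model. All names here (the schema dictionaries, `read_record`, and the sample record) are illustrative inventions, not part of any real DBMS:

```python
# Toy model of the DBMS record-read sequence described above.
SUBSCHEMA = {"EMPLOYEE": ["emp_id", "name"]}          # fields visible to this program
SCHEMA = {"EMPLOYEE": ["emp_id", "name", "salary"]}   # full logical record
PHYSICAL = {                                          # "disk": (type, key) -> record
    ("EMPLOYEE", 101): {"emp_id": 101, "name": "Ivanov", "salary": 900},
}

def os_read(record_type, key):
    """Steps 5-6: the 'operating system' reads the record into a system buffer."""
    return dict(PHYSICAL[(record_type, key)])

def read_record(record_type, key):
    """Steps 1-8: a DBMS-mediated read into the application's work area."""
    if record_type not in SUBSCHEMA:                  # step 2: consult the subschema
        raise KeyError(f"{record_type} not in subschema")
    buffer = os_read(record_type, key)                # steps 4-6: physical read
    visible = SUBSCHEMA[record_type]                  # step 7: schema/subschema match
    return {f: buffer[f] for f in visible}            # step 8: copy to work area

workarea = read_record("EMPLOYEE", 101)
print(workarea)                                       # {'emp_id': 101, 'name': 'Ivanov'}
```

Note that the application never sees the `salary` field: the subschema restricts the view, which is exactly what gives the program its independence from the full schema.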

The actions a DBMS performs when updating data are similar to the read operations. The DBMS carries out transformations of the data in the system buffers that are the inverse of those performed when the data was read, and then issues a WRITE command to the operating system. The general architecture of a database management system is shown in Fig. 2. It is common to all DBMSs, which differ in their limitations and in the capabilities with which they perform the corresponding functions.

The process of comparing and selecting such systems for a specific application comes down to matching their capabilities with the specific requirements and limitations of the user.

To work with a database management system, several languages are needed: programming languages, schema and subschema description languages, and physical data description languages. An application programmer uses programming languages (COBOL, FORTRAN, PL/1, assembler) to write programs, and a subschema description language to describe data. The subschema description language may be the data description facility of a programming language, a facility provided by the DBMS, or an independent data description language. Many DBMSs use programming languages to describe data. For the application programmer, the DBMS must provide a means of transmitting commands and interpreting the messages issued by the system. The interface between the application program and the DBMS, the data manipulation language, is embedded in the programming language. A record is requested with a data manipulation language command and read into the application program's work area; similarly, to include a record in the database, the application program places it in the work area and issues a data manipulation language command. Typical data manipulation language commands are: open a file or record set; close a file or record set; locate and read the specified instance of a record; pass the contents of specified data elements from a given record instance; replace the values of specified elements of a record instance with values from the program work area; insert a record from the work area into a record set; remove a specific record instance from a sequence of records; store a new record instance in the database; delete a specific record instance from the database; reorder the records of a group in ascending or descending order of a specified key.
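The typical commands listed above can be imagined as a minimal programmatic interface. The following sketch is hypothetical; the class and method names are invented and do not correspond to any particular DBMS:

```python
# Hypothetical data-manipulation interface mirroring the typical commands above.
class RecordSet:
    def __init__(self):
        self._records = []                      # stored record instances
        self._open = False

    def open(self):                             # "open a file or record set"
        self._open = True

    def close(self):                            # "close a file or record set"
        self._open = False

    def store(self, record):                    # "store a new record instance"
        assert self._open, "record set must be opened first"
        self._records.append(dict(record))

    def find(self, key, value):                 # "locate and read a record instance"
        for rec in self._records:
            if rec.get(key) == value:
                return dict(rec)                # copy into the work area
        return None

    def replace(self, key, value, updates):     # "replace element values"
        for rec in self._records:
            if rec.get(key) == value:
                rec.update(updates)
                return True
        return False

    def delete(self, key, value):               # "delete a record instance"
        before = len(self._records)
        self._records = [r for r in self._records if r.get(key) != value]
        return len(self._records) < before

    def sort(self, key, descending=False):      # "reorder records by a key"
        self._records.sort(key=lambda r: r[key], reverse=descending)

rs = RecordSet()
rs.open()
rs.store({"id": 2, "name": "bolt"})
rs.store({"id": 1, "name": "nut"})
rs.sort("id")
print(rs.find("id", 1))   # {'id': 1, 'name': 'nut'}
rs.close()
```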


The systems programmer uses a language to describe the general logical data schema. This language may be an extension of a programming language, a DBMS facility, or an independent language. To describe the layout of data, the systems programmer uses some form of physical data description language, which defines the placement of data on physical devices, buffering control, and the addressing and search methods.

When new tasks involving the processing of large volumes of information arise in a management system, the problem arises of choosing a method of data organization that supports their solution. Here the possible ways of organizing the data, and the effectiveness of each for solving the problem, must be considered: organizing the data as files; using an existing database (without changes to the general logical data schema); developing a new logical database schema using a universal DBMS; developing a database together with a special-purpose DBMS.

Any analysis aimed at evaluating and selecting a database organization must begin with a thorough study of the user's needs. Based on that study, you should draw up a complete list of the data to be stored, indicating their characteristics and relationships; compile a list of the processes and users interacting with the database, indicating their priorities and data access parameters; and formulate requirements for the time and cost of database access. In addition, the available hardware must be analyzed (the size of the processor's main memory, the operating system configuration with respect to data transfer media, the available resources and expansion capabilities of external memory), as well as the number and qualifications of system and application programmers.

If data must be organized to solve a number of applied problems and no functioning database exists, the feasibility of using the database concept for the assigned problems should be assessed. The case for a database rests on the following factors: significant overlap of data arrays, duplication of data accumulation and verification, and the need to keep data elements consistent; a significant number of information consumers; a significant number of request types; relative stability of the set of requests. At the same time, when deciding to organize a database, remember that its creation may reduce the efficiency of individual application programs and queries, since the purpose of a database is to increase the efficiency of the system as a whole.

When new tasks arise in a control system with a functioning database, one should assess whether the common data structure can describe the new users' subject areas, i.e. their sets of objects and processes, in the subschema language of the functioning database. If the existing general logical schema needs modification, one should estimate the cost of the necessary changes and their impact on the overall characteristics of the system in terms of meeting the needs of all its users.

If a decision is made to develop a new database, the task arises of choosing a universal DBMS or developing a specialized one, as well as designing the database's general logical and physical structure. The choice of DBMS is, as a rule, the basis for designing and organizing the logical and physical structures of the database.

Existing DBMSs support three main approaches to data management: hierarchical, network, and relational (Fig. 3). The hierarchical approach is based on representing a hierarchy of objects; hierarchical relationships are directly supported in the physical design of the DBMS. Hierarchical relationships are a special case of network relationships: for example, a supplier may supply several types of goods, and each type of goods may have several suppliers. Relational systems do not distinguish between objects and relationships. Network and hierarchical relationships can be represented as two-dimensional tables, called relations, with the following properties: each table element represents one data element (there are no repeating groups); the elements of a column are all of the same kind, and columns are uniquely named; the table contains no two identical rows; rows and columns can be viewed in any order, regardless of their information content. A database built from relations is called relational and ideally has the following advantages: usability by untrained users; simplicity of the security system (access rights are specified per relation); data independence; the possibility of building a simple data manipulation language based on relational algebra.
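The properties of a relation listed above can be illustrated with a small sketch; the table contents are invented for illustration:

```python
# A relation modeled as a set of tuples plus uniquely named columns.
# A Python set cannot hold two identical rows, and tuple positions map to columns.
columns = ("supplier", "product")          # uniquely named columns
relation = {
    ("Acme", "bolt"),
    ("Acme", "nut"),                       # one supplier, several products
    ("Borg", "bolt"),                      # one product, several suppliers
}

# Adding a duplicate row leaves the relation unchanged (no identical rows):
relation.add(("Acme", "bolt"))
print(len(relation))                       # 3

# A relational-algebra style selection and projection: suppliers of "bolt".
bolt_suppliers = {row[columns.index("supplier")]
                  for row in relation
                  if row[columns.index("product")] == "bolt"}
print(sorted(bolt_suppliers))              # ['Acme', 'Borg']
```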

When choosing a universal DBMS for implementing a specific set of application programs over a database, you should evaluate the data description language offered to the user, the data manipulation language, and the facilities for maintaining the physical database. The characteristics that define a data description language are usually: clarity, ease of learning, degree of data independence, protection against unauthorized access, description elements (data types, sizes, names, etc.), and supported relationships (hierarchical, network, relational). Among the characteristics of data manipulation languages, the following should be highlighted: the access facilities provided (in physical sequence, by the value of a data element, by key), compatibility with basic (high-level) programming languages, ease of learning and use, data independence, and the capabilities and facilities for simultaneous use of the database by several application programs.

3. Selecting a DBMS

The selection is carried out taking into account the needs of the user, using one of the following methods: feasibility analysis, experimental verification, or simulation and modeling.

The capability analysis method is based on scoring the characteristics of a DBMS listed above from the point of view of user requirements. Each characteristic is examined from two perspectives: whether it is present in the proposed DBMS, and what its quality is. Quality is ranked on a standard scale. The rank is multiplied by the weight allocated to the given component, and the weighted scores of all components are summed.
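A minimal sketch of such weighted scoring follows; the characteristics, weights, and ranks are invented for illustration:

```python
# Weighted-scoring sketch for comparing DBMSs, as described above.
# Weights sum to 1.0; ranks are on an assumed 0-10 quality scale.
weights = {"data_independence": 0.5, "query_language": 0.3, "security": 0.2}

def weighted_score(scores):
    """Sum of rank * weight over all characteristics."""
    return sum(weights[name] * rank for name, rank in scores.items())

dbms_a = {"data_independence": 8, "query_language": 6, "security": 9}
dbms_b = {"data_independence": 6, "query_language": 9, "security": 7}

print(round(weighted_score(dbms_a), 2))   # 7.6
print(round(weighted_score(dbms_b), 2))   # 7.1
```

The DBMS with the higher total is preferred under the chosen weights; changing the weights to reflect different user priorities can change the ranking.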

The experimental verification method consists of creating a specific application environment and using it to obtain the operational characteristics of a given hardware and software system. For experimental verification, a generalized database must be designed and loaded; then, using the DBMS data manipulation language, the processing requirements of existing and expected application programs are modeled, and an experimental test of the DBMS under consideration is performed.

In simulation and modeling of DBMS operation, mathematical expressions are used that express one parameter as a function of others. For example, access time can be represented as a function of the number of disk accesses, the amount of information transferred, and the processor time needed to generate a response to a request. Since these parameters depend on the data storage method and the access method, different DBMSs require different models. Once such models are developed, they can be used to estimate processing time and cost for different DBMSs under a variety of conditions (varying database sizes, access methods, blocking factors, etc.).
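As a toy illustration of such a model, access time might be approximated as a linear combination of the parameters named above. The coefficients and workload figures here are invented, not measured:

```python
# Toy cost model: access time as a function of disk accesses,
# transferred volume, and CPU time. Coefficients are illustrative only.
SEEK_MS = 8.0                  # assumed average time per disk access, ms
TRANSFER_MS_PER_KB = 0.05      # assumed transfer time per kilobyte, ms

def access_time_ms(disk_accesses, kilobytes, cpu_ms):
    return disk_accesses * SEEK_MS + kilobytes * TRANSFER_MS_PER_KB + cpu_ms

# Comparing two storage organizations for the same hypothetical query:
indexed = access_time_ms(disk_accesses=3, kilobytes=16, cpu_ms=1.0)        # index walk
sequential = access_time_ms(disk_accesses=40, kilobytes=2048, cpu_ms=4.0)  # full scan

print(round(indexed, 1), round(sequential, 1))
```

Varying the inputs (database size, blocking factor, access method) reproduces the kind of comparison the text describes.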

An unskilled designer may impose limitations on a particular DBMS early in the database design process. In this case, user requirements are artificially specified by the hierarchical and network structures of a particular DBMS without considering other possible design solutions. This approach may lead to reduced system efficiency.

Brief characteristics of some universal DBMSs are given below.

The INES DBMS is oriented toward information retrieval problems, mainly in dialogue mode. It provides fast access to the database for reference data and efficient viewing of large volumes of data when compiling summaries and when generating initial arrays for solving economic problems.

The system allows access to data from user application programs written in assembler, PL/1, COBOL, ALGOL-60, and FORTRAN-4.

Special input languages are used (a language for entering economic indicators, a document input language), together with query languages that satisfy the basic requirements for data description and data manipulation languages.

A special access method is used to work with hierarchically structured data. The INES DBMS has an output message generation system and a message visualization system, which allow the user to define the structure of a document and its details; to search, change, and correct data; and to display them on a screen.

The KVANT-M DBMS is a real-time system designed to run on a minicomputer and used for information retrieval and reference tasks (factographic, bibliographic, order reservation, etc.).

User programs can be written in COBOL, FORTRAN, or BASIC-2 and access the database through the CAM interface.

The KVANT-M DBMS supports a database consisting of a set of arrays (files). The records of an array have the same structure and a unique internal sequence number (ISN). Records are made up of fields, the smallest units of data in the database. A field can be declared a key. To describe the data in files, a schema is created containing the names of the record fields, their types, and a flag indicating whether a field is a key. For users, one or more subschemas are created to which they have access.

The physical data structure uses inverted lists of key values and ensures that addressing is independent of the physical location of the data. The system provides the following types of access: sequential by ISN; in logical sequence; by query.
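An inverted list of key values, as used here, can be sketched as follows; the records and field names are invented:

```python
# Inverted list: for a key field, map each key value to the ISNs
# (internal sequence numbers) of the records holding that value.
records = {                                  # ISN -> record
    1: {"author": "Ivanov", "year": 1979},
    2: {"author": "Petrov", "year": 1980},
    3: {"author": "Ivanov", "year": 1980},
}

def invert(field):
    index = {}
    for isn, rec in records.items():
        index.setdefault(rec[field], []).append(isn)
    return index

by_author = invert("author")
print(by_author["Ivanov"])                   # [1, 3]

# Access by key value is now independent of physical record placement:
hits = [records[isn] for isn in by_author["Ivanov"]]
```

Queries go through the index to a list of ISNs and only then to the records themselves, which is why the records can be moved physically without rebuilding user views.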

The data manipulation language is KVANT SCRIPT-M, an English-like conversational language designed for efficient searching, selection, and display of records in the database.

Recently, production has expanded of high-speed personal computers with large main and external memory and the ability to be combined into a local area network. For these computers, universal DBMSs have appeared and continue to appear, offering the user facilities and conveniences sometimes richer than those of minicomputer and mainframe systems. This process has not yet stabilized, so it makes no sense to present the characteristics of these DBMSs in a textbook.

4. Specialized databases

The process of designing a specialized database includes logical design, physical design, and development of a specialized DBMS.

Logical design involves requirements analysis, modeling of the application programs' data and their integration, and development of a logical schema. The analysis of user requirements is carried out with standard methods of systems analysis: documentation, surveys, simulated reports. Requirements analysis produces systematized data sets and processing specifications using standard data definitions, which minimizes inconsistencies in the developed specifications.

The data model is designed to represent the user environment. Since the database is created for many users, the integration phase resolves conflicts between user requirements that arise during modeling.

At the schema creation stage, the user models must be combined into one from which all the alternative views can be derived. The task of synthesizing the optimal logical structure of a database is to determine the structure that is optimal in the sense of an accepted criterion while supporting the set of requests of all system users. Criteria used for synthesizing the logical structure of a database include: minimum cost of storing, updating, and transmitting information over a given period; minimum total information flows; minimum volume of stored information; minimum cost of updating information; maximum speed; maximum reliability. Whereas the requirements specifications and data models do not depend on the DBMS, the database schema is their description in a specific data description language. In addition to the schema, descriptions of users' database subsets (subschemas) are developed. Physical database design includes the physical representation of data, the selection and documentation of access methods, and data placement.

Based on the logical design, the physical designer must determine the representation of each data element, record, and array, and set the size of each physical array. When determining the physical representation of data, the following factors must be considered: saving memory; minimizing redundancy (using references or pointers); the processing method (sequential, random) and addressing; the activity coefficient of arrays and records (which determines the choice between direct-access and sequential-access devices); data independence; and the required response time to a request (which determines the organization of the data and the type of storage device).

The specifics of the tasks to be solved with the database, user requirements, and the logical design of the database form the basis for choosing the method of addressing and organizing data search. Addressing methods include sequential scanning, block search, binary search, the index-sequential method, direct addressing, hashing, and various combinations of these.

Sequential scanning involves checking the key of each record. The method can be effective only for batch processing of sequential arrays on magnetic tape.

Block search is used for sequential arrays ordered by key. Records are grouped into blocks, and each block is checked until the block that contains the desired record is found; the records of that block are then read.

In binary search, the search area is halved at each step, and the key of the record so reached is compared with the search key. Binary search is usually not suitable for direct-access devices and is used when searching array indexes.
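The halving step can be shown concretely; the key-ordered array below is invented:

```python
# Binary search over an array of records ordered by key:
# the search area is halved at each step.
def binary_search(records, key):
    lo, hi = 0, len(records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # middle of the remaining search area
        rec_key = records[mid][0]
        if rec_key == key:
            return records[mid][1]
        if rec_key < key:
            lo = mid + 1                  # keep the upper half
        else:
            hi = mid - 1                  # keep the lower half
    return None

ordered = [(10, "ten"), (20, "twenty"), (30, "thirty"), (40, "forty")]
print(binary_search(ordered, 30))   # thirty
print(binary_search(ordered, 25))   # None
```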

The index-sequential addressing method uses indexes. A top-level index specifies the location of a lower-level index, which specifies the location of a block of records. The block of records is then scanned, or searched by binary or block search. The method requires that records be ordered by key. For an unordered array, the index-random method is used instead; it requires a much larger index, since the index must contain one element per record rather than one per block of records.
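A one-level version of index-sequential lookup can be sketched as follows (real implementations add further index levels above this one); the blocks and keys are invented:

```python
# Sketch of index-sequential lookup: an index maps the highest key of
# each block to the block's position; the chosen block is then scanned.
blocks = [                       # records ordered by key, grouped into blocks
    [(5, "a"), (9, "b")],
    [(12, "c"), (20, "d")],
    [(31, "e"), (40, "f")],
]
index = [(block[-1][0], i) for i, block in enumerate(blocks)]  # (max key, block no.)

def lookup(key):
    for max_key, block_no in index:          # pick the first block that can hold the key
        if key <= max_key:
            for k, value in blocks[block_no]:   # scan within the block
                if k == key:
                    return value
            return None
    return None

print(lookup(12))   # c
print(lookup(13))   # None
```

Note the index holds one entry per block, not per record, which is exactly the size advantage over the index-random method mentioned above.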

Direct addressing involves some transformation of a key into an address. There are many methods for converting a key into an address within an array. The simplest is to specify the relative address of the record in the input message. In some applications, the address is computed from object identifiers.

The hashing (shuffle) method involves converting the key of a data item into a quasi-random number, which is used to determine the location of the record.
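A toy key-to-address transformation of this kind might look as follows; the transformation and keys are invented, and real systems use far better mixing functions:

```python
# Hashing a key into a record address: the key is turned into a
# quasi-random bucket number that fixes where the record is stored.
TABLE_SIZE = 8
table = [[] for _ in range(TABLE_SIZE)]   # buckets; chaining resolves collisions

def address(key):
    # Toy transformation: sum of character codes, reduced modulo table size.
    return sum(ord(c) for c in key) % TABLE_SIZE

def store(key, record):
    table[address(key)].append((key, record))

def fetch(key):
    for k, rec in table[address(key)]:    # scan the bucket for the exact key
        if k == key:
            return rec
    return None

store("A-101", {"name": "bolt"})
store("B-202", {"name": "nut"})
print(fetch("A-101"))   # {'name': 'bolt'}
```

Two different keys may hash to the same address (a collision); here the bucket list absorbs them, at the cost of a short scan.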

When designing the physical structure of a database, you must define and document how each record type is accessed: identify records accessed directly by key and records accessed via pointers from other records or indexes. Each array, depending on its access method, must be allocated space on physical devices (magnetic disks, tapes). Data are placed so as to give priority to frequently accessed data or to maximize the proximity of related stored data.

The most widely used is the index-sequential addressing method, for which two access methods are available: ISAM and VSAM. In the ISAM access method, records are grouped so that they fall on individual tracks of the cylinders of a disk module, and one track on each cylinder is reserved for indexes pointing to the records on that cylinder. If new data must be added, it is placed in an overflow area (pointers to overflow areas are also kept in the index track). The VSAM access method is similar to ISAM, but VSAM is independent of the equipment type and does not operate with notions such as tracks and cylinders. Instead of cylinders divided into tracks, it uses control areas, which in turn are divided into control intervals. In VSAM there is one set of pointers (an index) per control area.

Designing a specialized DBMS involves developing a data description language, a data manipulation language, and facilities for maintaining the physical database. The basic requirements that data description and data manipulation languages must satisfy were identified above, in the discussion of choosing a universal DBMS. The most common language for describing programmers' data (subschemas) is the COBOL data division; modern DBMSs, as a rule, develop their own data description languages to describe schemas and the physical structure of the database. The Conference on Data Systems Languages (CODASYL) has proposed a data description language that is used both to describe data logically and to describe its physical organization.

5. Distributed databases

In connection with the creation and development of automated control systems based on computer networks, the design of distributed databases (RDB) is currently a relevant problem. A distributed database is a system of information-interconnected and interacting local databases (LDB), each with its own information content and structure. In essence, an RDB is a distributed memory system storing all the data required by the corresponding automated control system. Its peculiarity is that fragments of the overall logical structure reside in geographically remote databases. The physical connectivity of the RDB is implemented by organizing information flows within each LDB and between LDBs over communication channels.

The main problem in creating an RDB is data placement, which determines such characteristics of the RDB as the volume of stored and updated data, the intensity of information flows, and the reliability of the system.

RDB design can take place under the following conditions:

a) the creation of an automated control system is just beginning and the task is to select the optimal structure of the RDB and the placement of individual LDBs;

b) there is a certain number of LDBs and computer centers and the task is to allocate an additional number of LDBs and optimally change the structure of connections in the system;

c) the formation of the geography and structure of the system is completed and the task is to optimally re-allocate arrays and change the topology of connections.

The most typical tasks when designing an RDB are determining the structure of the RDB, determining the topology of connections, choosing a strategy for searching and updating information, and choosing a control system for the RDB.

There are centralized, decentralized, and combined RDB structures. Most widely used are combined RDBs, characterized by a central database that stores system-wide information about the placement of arrays in the RDB. The number of LDBs at each level of the hierarchy is determined by restrictions on the volume of stored information and on the cost of creating an LDB. The placement of the LDBs depends on the location of information consumers and sources.

The choice of LDB network topology is determined by the nature of their information relationships, the direction and intensity of information flows, and the required reliability and fidelity of information transmission. Typically, users are assigned to one LDB and connect through it to the other databases in the RDB. The following types of LDB connection structures in an RDB are distinguished: radial, radial-nodal, ring, fully connected (each-with-each), and combined (Fig. 4, a-d). The most reliable structure, with the fastest information search, is the fully connected one. Information connections of this type are characteristic of objects that are subordinate to one another only functionally.

The search strategy influences the amount and placement of structural information and of query information flows. If the information requested by a user is not available in the nearest LDB, the following search strategies are possible:

1) using structural information about the placement of data in the RDB, the required LDB is found and accessed;

2) the search is carried out in the LDB of a higher rank; if the necessary information is missing, structural information about the contents of all subordinate LDBs is analyzed; if it is still missing, the search moves to the LDB of the next higher level of the hierarchy;

3) a request is made to the controlling LDB, where structural information about all LDBs is stored;

4) all LDBs are polled, either in parallel or sequentially.

Strategy 1 requires the minimum amount of query information, but each LDB must store structural information about the placement of arrays in the RDB.

Strategy 2 is typical for hierarchical systems in which top-down information flows predominate.

Strategy 3 minimizes structural information.

Strategy 4 is characterized by large flows of request information.

The functioning of an RDB presupposes flows of update information within it. Among update strategies, the following can be distinguished: all duplicated arrays in all LDBs are updated by the information source; the source updates information only in the nearest LDB, and all other duplicated arrays are updated at the initiative of that LDB; duplicated arrays are updated according to an algorithm (for example, one minimizing the total update flows). The update strategy must ensure the specified reliability, fidelity, and performance of the RDB. The development and implementation of effective RDB management systems is currently at an early stage. The main criterion in developing an RDB control system is minimizing the labor intensity of creating and implementing its software. The problem can be solved either by refining and adapting existing DBMSs or by creating effective special-purpose RDB management systems.

In its most general form, a database is defined as a collection of interrelated information. An important feature of the database follows from this definition: the database includes not only the data itself but also the connections between them. One of the main ideas of a database is the shared storage of data together with their descriptions. Thanks to this, the stored data become "open", understandable to any number of applications working with the database. This makes the database an independent information resource that can be reused by various applications while remaining independent of them.

In addition to describing the data and explaining their meaning and structure, the database supports certain constraints imposed on the data, for example their type, size, and so on.

Thus, a database is a structured information resource intended for multi-purpose, repeated use in specific subject areas.

Databases operate under the control of database management systems (DBMS), defined as the collection of languages and programs required to work with a database. A DBMS makes it possible to create databases, enter and change the information in them, and access that information. With a DBMS, the organization of software and information support is fundamentally different (Fig. 2.5).

Applications access the data stored in one or more databases through the DBMS. The organization of the software and information complex is then determined not by the software but by the information support. The data are independent of the applications, and the applications, in turn, can use any data contained in the database. The DBMS maintains data integrity, governs data sharing between different programs and users, and, to one degree or another, ensures information security.

Fig. 2.5. Scheme of software and information support organization using a DBMS

Upon a user request or a command received from an application, the DBMS also performs the most important information procedures on the data contained in the database.

When a compiler-type DBMS is used, applications are created that work directly with the databases; the DBMS itself is then virtually absent as a separate software tool for working with data.

In fact, at the compilation stage, the database core and application are merged (Figure 2.6).

Fig. 2.6. Scheme of merging the database core and applications

Currently, applications are developed, as a rule, by qualified programmers. At the same time, designing database structures is impossible without the participation of economists - specialists in the subject area, since the information capabilities of a database are largely determined by the quality of its design.

Database elements

The database is a system, i.e. it consists of a certain number of elements and relationships between them (Fig. 2.7)

Fig. 2.7. Structural units of the database

The smallest of them is the data element, which in some cases is also called a field or attribute and which corresponds to a single attribute of an object. A data element is thus the smallest semantically meaningful named unit of data. It is defined by the following characteristics: name (e.g. full name, date of birth, company name), type (character, numeric, calendar), length (the maximum possible number of characters, e.g. 15 bytes), and precision (the number of decimal places for numeric data).
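
As an illustration (a sketch only, not tied to any particular DBMS), these characteristics of a data element can be modeled directly; the element names and limits below are assumed for the example:

```python
from dataclasses import dataclass

@dataclass
class DataElement:
    """Metadata describing one data element (field/attribute)."""
    name: str           # e.g. "full name", "date of birth", "company name"
    dtype: str          # "character", "numeric", or "calendar"
    length: int         # maximum possible number of characters
    precision: int = 0  # decimal places; meaningful only for numeric data

# Two hypothetical data elements:
full_name = DataElement(name="full name", dtype="character", length=15)
salary = DataElement(name="salary", dtype="numeric", length=10, precision=2)
```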

Data elements are organized into records, which are also called tuples. A record in the general case corresponds to an indicator and carries data about one of a set of homogeneous objects, for example one account, one employee, etc. In some cases the concept of a data aggregate is used, which occupies an intermediate position between a data element and a record. A data aggregate can be simple (consisting only of data elements) or composite (consisting of elements and simple data aggregates).

A set of records of the same type is called a database file (or table). It should be noted that the database file does not always correspond to the physical file. In some cases, the information of a database file is contained in several physical files and, conversely, several database files can be contained within a single physical file.

A database is thus a collection of interconnected database files.

The meaning of these terms can be explained in the diagram (Fig. 2.8).

The data element contains one attribute, in this case the name of the city - Moscow. A data aggregate consists of several details considered as one whole. A record consists of one or more data elements and contains information about one object, in the example given - about one enterprise. A set of records of the same type makes up a database file; in the figure, this is a file with information about enterprises. The collection of such files, interconnected in one way or another, constitutes a database. The diagram shows three files of information interconnected as follows: enterprises are serviced by banks, they have accounts opened in these banks. If there is no connection between the files, then their collection cannot be considered a database.

Fig. 2.8. An example of the relationships between the structural units of a database

Database life cycle

Like any system, a database has its own life cycle, which represents the consistent development of the system over time. The following main stages of the life cycle are distinguished.

Planning. Its purpose is to determine and analyze the requirements for the database being created. At the same time, different views on data are considered depending on the functions performed, specific requirements for the database are determined depending on the types of queries, the volume of viewing database files, the mode of working with this database, etc.

Database design. Based on the analysis of user needs and the integration of these needs, a domain model is created. On the basis of this model, a logical database structure is built, oriented to a specific DBMS. The final step is to analyze and evaluate the capabilities of the database in terms of meeting the needs of different user groups.

Database materialization. The purpose of this stage of the life cycle is to fill and load the database using DBMS tools. This involves the following work: data preparation; transfer of input information from documents to machine-readable media; transformation of existing data files to the structure developed at the previous stage.

Database operation. The goals of this stage of the life cycle are: ensuring the implementation and continuous operation of the database; data safety in case of computer system failures and other emergencies; analysis of the functioning of the database system in terms of its efficiency and productivity. A system of quantitative and qualitative indicators serves as the criteria for evaluating the database. Quantitative indicators include response time to a query, external and main memory used, reliability, and various costs (for updating the database, creation, reorganization, restructuring). Qualitative indicators include flexibility, adaptability, dynamism, recoverability, the ability to maintain data integrity, etc.

Development and improvement of the database. When a database is in operation, new tasks arise whose solution involves new data elements and new connections between them, which leads to the need for a modifiable database. A distinction should be made between database reorganization and database restructuring. Reorganization involves changes to the stored data values while the structure is preserved; it accompanies the normal development and modification of the database. Restructuring involves changes to the structure itself and effectively entails the creation of an almost new database.

Database Design

Among the stages of the database life cycle, the design stage is of particular importance. This is because the quality of the database largely determines the quality of the entire information system. In addition, it is at this stage that the most active interaction between developers and experts in the subject area occurs and the basic ideas about the subject area are formed, which will form the basis of the economic information system (EIS). Consequently, design is of a research nature and addresses problems associated not only with defining the subject area and creating its model, but also with the principles of constructing logical structures.

Database design goes through several stages. It is customary to distinguish the stages of conceptual, logical and physical design, although sometimes they are called differently, and in some cases some of them are actually absent.

At the first stage, a conceptual, or infological, data model of the subject area is created. It is characterized by the fact that it does not depend on the features of any specific DBMS. It describes the main objects of the subject area, their properties, and the connections between them. The subject area can be described in natural language, but for greater accuracy, clarity, and ease of further design, formalized means are used. Thus, a description made using natural language, mathematical formulas, tables, graphs, and other tools understandable to everyone working on the database design is called a conceptual data model.

There are many approaches to building a conceptual data model: graph models, semantic networks, the entity-relationship model, etc. The most popular of them is the entity-relationship model (ER model, from the English Entity-Relationship). The ER model was proposed by the American scientist Peter Pin-Shan Chen in 1976. To date, several of its varieties have been developed, but they are all based on the graphic diagrams proposed by Chen.

The entity-relationship model is depicted graphically in the form of an ER diagram (Entity-Relationship Diagram), which consists of sets of values describing the properties of entities and the relationships between them.

The advantages of this model include:

    ease of formalization;

    ease of understanding;

    description by graphic means;

    clarity in depicting various types of relationships;

    ease of conversion to a database schema supported by a particular DBMS.

The main components of the entity-relationship model are entity, attributes and relationships (Figure 2.9).

Fig. 2.9. ER diagram

An entity (Entity) is a real or imaginary object that is significant for the subject area under consideration and about which information is to be stored. Each entity must have a unique name and one or more attributes that uniquely identify each instance of the entity. An attribute is any characteristic of an entity that is significant for the subject area under consideration. It serves to qualify, identify, classify, quantify, or express the state of an entity.

A relationship (Relationship) is a named association between two entities that is significant for the subject area under consideration.

Each relationship is given a name, and the name of each relationship between two given entities must be unique.
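
Although ER models are normally drawn as diagrams, the same information can be captured in code. The sketch below is purely illustrative; the entity, attribute, and relationship names are assumptions modeled on the employee/department example used later in this chapter:

```python
from dataclasses import dataclass

@dataclass
class Entity:
    name: str              # unique entity name
    attributes: list       # at least one attribute must identify instances

@dataclass
class Relationship:
    name: str              # must be unique between the two entities it connects
    left: Entity
    right: Entity
    cardinality: str       # "1:1", "1:N", or "M:N"

employee = Entity("EMPLOYEE", ["personnel number", "last name", "department code"])
department = Entity("DEPARTMENT", ["code", "name"])
works_in = Relationship("works in", employee, department, "N:1")
```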

The conceptual model is compiled on the basis of interviews and surveys of experts - specialists in the subject area and must satisfy a number of requirements:

    must be independent of a specific DBMS;

    must be understandable to both information system developers and subject matter experts;

    should minimize further design efforts, i.e. its structures must be easily convertible into the structures of the logical model;

    should, if possible, exist in computer-readable form, in which case it will be suitable for computer-aided database design.

So, the goal of conceptual modeling is to create an accurate and complete representation of the subject area, i.e. to define the objects, their properties, and the relationships between them.

At the second design stage, a logical, or datalogical, model is created. Such a model is built no longer in terms of objects and relationships but in terms of the specific DBMS in which the database is to be implemented. This model is also called a database schema.

Currently, three classical logical data models are known: hierarchical, network, and relational.

Hierarchical and network data models began to be used in database management systems in the early 1960s. In the early 1970s, the relational data model was proposed.

The main elements of any of these models are objects and the relationships between them, and the distinguishing feature is the difference in the ways of representing the relationships between objects.

The third stage is physical design. Its result is a physical model, which includes a description of the database in terms of its physical structure. The physical model involves selecting storage media, determining how to place the database and its components on these media, describing the possibilities and feasibility of data compression, and optimizing access to the data at the physical level.

Thus, if the conceptual model is independent of the DBMS and, possibly, even of the data model, and the logical model depends on a specific DBMS, then the physical model depends not only on the DBMS, but also on technical and partly system software.

The current stage of development of information systems makes some changes to the classical database design scheme:

    at the conceptual design stage, graphic methods are widely used;

    new methodologies make it quite easy to translate a conceptual model into a logical model for different DBMSs. In some cases, the transition from a conceptual model to a logical one can be automated or even completely automatic;

    modern DBMSs and other software make it possible to greatly simplify the physical organization of databases; therefore the physical design stage is now significantly reduced, and sometimes almost completely automated.

Types of Logical Data Models

As noted above, the main types of logical data models are: hierarchical, network and relational.

The hierarchical data model allows you to build databases with a tree structure, in which each node contains its own type of data (entity). At the top level of the tree there is a single node, the root; at the next level there are nodes associated with the root, then nodes associated with the nodes of the previous level, etc., and each node may have only one ancestor (Fig. 2.10). Such databases support one-to-many relationships.

Searching for data in a hierarchical system always starts from the root. Then a descent is made from one level to another until the desired level is reached. Moving through the system from one record to another is carried out using links.

The main advantages of the hierarchical model are the simplicity of describing the hierarchical structures of the real world and the fast execution of queries that correspond to the data structure. However, such models often contain redundant data and are poorly suited to represent many-to-many relationships. In addition, it is not always convenient to start searching for the necessary data from the root every time, and there is no other way to move through the database in hierarchical structures. Hierarchical systems are the oldest generation of database systems; they were developed for large computers.

Fig. 2.10. Hierarchical structure of the database model
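
The root-first search through a hierarchy can be sketched as follows; the node names here are invented purely for the illustration:

```python
# A minimal hierarchy: every node has exactly one ancestor.
# (Hypothetical node names, chosen only for the illustration.)
tree = {
    "root": ["bank A", "bank B"],
    "bank A": ["branch A1", "branch A2"],
    "bank B": ["branch B1"],
}

def find(node, target):
    """Search starts from the root and descends level by level via links."""
    if node == target:
        return True
    return any(find(child, target) for child in tree.get(node, []))
```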

The network data model is also based on a graphical representation of the relationships between objects. However, here, in addition to vertical connections, there are also horizontal ones, i.e. the subordination of one object to many objects is allowed. Thus, unlike hierarchical models, network models support many-to-many relationships: each dependent element may have more than one ancestor (Fig. 2.11).

Fig. 2.11. Network structure of the database model

However, network systems are quite complex and require substantial software support. In them, as in hierarchical systems, the transition from record to record is made using links embedded in each record. At one time they were quite popular and were used on minicomputers and mainframes.
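
A network structure can be sketched in a similar spirit: the only difference from a tree is that a node may list several ancestors (again, the names are invented for the illustration):

```python
# In a network model, a record may be owned by several ancestors at once.
# (Hypothetical names, for illustration only.)
ancestors = {
    "joint project": ["dept A", "dept B"],  # two owners: impossible in a tree
    "dept A": ["company"],
    "dept B": ["company"],
    "company": [],
}
```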

The relational model of database organization is currently the most popular. The fundamental shortcomings of the hierarchical and network models made them ineffective in modern conditions, while the capabilities of the relational model make it possible to overcome these shortcomings and to describe hierarchical and network structures as well.

A relational database is a database that is perceived by its user as a collection of tables. Presentation of data in the form of tables is fully consistent with traditional “non-computer” information processing technologies; it is visual and understandable to most users. Relational database tables can be viewed on a computer screen or printed. Note also that the relational data model best matches the structure of economic information.

Basic concepts of relational databases

The relational data model, developed by E. F. Codd in the 1970s, is based on the mathematical theory of relations and relies on the system of concepts of relational algebra, the most important of which are table (relation), row (tuple), column (attribute), the set of allowable column values (domain), primary key, and foreign key.

In the relational model, data is presented in the form of interconnected tables. Each table consists of rows (records of the same type, or tuples) and columns (fields, or attributes) and has a name that is unique within the database. "Of the same type" means that all records have the same set of attributes, or fields, although each attribute can have its own value. Each table column has a name unique within its table, and the columns are arranged in the order in which their names were given when the table was created. Each attribute takes its values from a specific set of allowable values (its domain). Unlike columns, rows have no names, their order in the table is not defined, and their number is not limited.

Let us explain the above concepts using the example of a relational database model containing information about employees of a certain company. Let's look at the table with data about the company's employees (Table 2.6).

Table 2.6

Employees

Personnel Number | Last Name I.O.  | Department Code | Work Phone
008976           | PETROV A.V.     | …               | …
008980           | ROMANENKO S.T.  | …               | …
…                | STEPANOVA I.S.  | …               | —

You can see that all three records have the same attributes, but the attributes take different values. Thus, for record No. 1 the "personnel number" attribute takes the value 008976, and for record No. 2 it takes 008980, etc. The values of some attributes may coincide across records; for example, records No. 1 and No. 2 have the same value of the "department code" attribute. However, every table must have an attribute (or collection of attributes) whose value is never repeated and which uniquely identifies each row of the table. This is necessary so that, when working with the database, one record can be distinguished from another. Such attributes are called unique. A unique attribute of a table, or a set of its attributes that is unique in combination, is called the primary key, or key field. In this table the key is the "personnel number" attribute.

When a record is uniquely determined only by the values of several fields taken together, the table has a composite (concatenated) key.

In some cases an attribute may have no value: for example, employee No. 3 has no work phone, and the corresponding attribute is not filled in. In this case the attribute is said to have a null value. A key cannot have a null value.
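
These rules can be tried out directly in SQL. The sketch below uses Python's built-in sqlite3 module; the column names follow the example table, while the phone number and the third personnel number are invented for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        personnel_number TEXT PRIMARY KEY,  -- the unique key attribute
        last_name        TEXT NOT NULL,
        work_phone       TEXT               -- ordinary attribute: NULL allowed
    )
""")
conn.execute("INSERT INTO employees VALUES ('008976', 'PETROV A.V.', '12-34-56')")
# A null value is fine in a non-key field:
conn.execute("INSERT INTO employees VALUES ('008999', 'STEPANOVA I.S.', NULL)")

# A duplicate primary key, however, is rejected:
duplicate_rejected = False
try:
    conn.execute("INSERT INTO employees VALUES ('008976', 'SOMEONE ELSE', NULL)")
except sqlite3.IntegrityError:
    duplicate_rejected = True
```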

A simple collection of tables cannot be considered a database unless certain relationships exist between them. In relational databases, relationships indicate the correspondence between records of two tables. Let us consider a second table containing information about departments (Table 2.7).

Table 2.7

Departments

Between the two tables above you can establish the relationship "EMPLOYEE - works in - DEPARTMENT". If you want to find out about the department in which employee No. 2 works, you take the value of the "department code" attribute in the EMPLOYEES table and find the matching value of the "code" attribute in the DEPARTMENTS table. Thus, two records from different tables are merged:

ROMANENKO S.T.

HUMAN RESOURCES DEPARTMENT

You can see that a relationship between two tables is established on the basis of matching attribute values in the two tables; in our case these are the "department code" attribute of the EMPLOYEES table and the "code" attribute of the DEPARTMENTS table. Such attributes are called link attributes. The link attribute in one of the tables must be a key. In the example above, the "code" attribute is the key of the DEPARTMENTS table. If this were not so, and department codes in that table were repeated, it would be impossible to determine which department the first table refers to. The second link attribute, in this case the "department code" of the EMPLOYEES table, is called a foreign key, since it refers to the key of another (external) table (Fig. 2.12).

Fig. 2.12. Linking two tables through a primary key and a foreign key

Post-relational databases

As already mentioned, relational databases consist of two-dimensional tables related to each other. Thus, when designing a relational database, all information is divided into many two-dimensional arrays. In some cases a table corresponds to a set of real objects, such as "departments", "employees", "accounts", etc. But sometimes, when hierarchical information is involved, the same object has to be "decomposed" into several tables.

An example would be multi-line documents such as an invoice. Each of these documents contains general details, such as number, date, name of the supplier and recipient. In the “Invoices” table, such details make up one record. However, an invoice is a multi-line document, and to store each line containing the name of the product, its quantity, price, and amount, a separate record will also be required. Thus, it is necessary to create an additional table “Invoice Lines” associated with the previous one. The data for each invoice will be contained in one record in the first table and in one or more records in the second.

This approach has a number of disadvantages. First, the number of tables and the relationships between them increase, which, across the entire database, slows query execution. Second, the hierarchy and logical unity of the tables are not taken into account. In this example, the "Invoice Lines" table can be considered subordinate to the "Invoices" table, since it cannot exist without it. And only together do these two tables describe a so-called business object, the analogue of a real document. Splitting business objects into multiple tables complicates the database structure and its understanding by users.


These shortcomings are overcome in the post-relational data model, which, in essence, is a development of the relational model with the difference that it removes the restriction on the atomicity (indivisibility) of attributes.

The restriction on attribute atomicity means that in a relational database, an attribute (field) of each record can contain only one value. In a post-relational model, on the other hand, a field can contain multiple values ​​or even an entire table. Thus, it becomes possible to “nest” one table into another.

This allows you to operate business objects more efficiently, each of which becomes logically integral, being represented by just one record.
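
The nesting idea can be imitated even in an ordinary relational DBMS by packing the subordinate table into one non-atomic field; a genuinely post-relational DBMS supports this natively. The invoice data below is invented for the illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# The non-atomic "lines" field holds a whole nested table, serialized as JSON.
conn.execute("CREATE TABLE invoices (number TEXT PRIMARY KEY, lines TEXT)")

lines = [
    {"product": "paper", "qty": 10, "price": 2.5},
    {"product": "pens",  "qty": 3,  "price": 1.2},
]
conn.execute("INSERT INTO invoices VALUES (?, ?)", ("N-1", json.dumps(lines)))

# The whole business object comes back as a single record:
stored = json.loads(
    conn.execute("SELECT lines FROM invoices WHERE number = 'N-1'").fetchone()[0]
)
```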

According to the developers of post-relational DBMSs, the speed of query execution in them increases up to 20 times compared to relational DBMSs. However, the transition from relational databases, which have become ubiquitous, to post-relational ones is associated with significant costs and is still limited.

Data store

A data warehouse is a system designed to provide an organization with a unified information space for analyzing and optimizing its business.

The activity of any economic entity involves the use and processing of information, which is the most important economic resource for achieving high economic performance. A feature of economic information systems (EIS), however, is the need to process two types of data, operational and analytical. Therefore, in the course of the functioning of an EIS, two classes of problems have to be solved:

    ensuring the daily work of the enterprise in entering and processing information;

    organization of an information repository for the purpose of complex multidimensional analysis and study of data to identify development trends, forecast conditions, assess and manage risks, etc. and ultimately to facilitate decision making.

Tasks of the first class, the automation of operational activities, are fully solved by OLTP systems (Online Transaction Processing), which include automated banking systems, accounting systems, etc. Work with analytical data is the province of OLAP systems (Online Analytical Processing), which are built using data warehouse technology and are designed for the aggregated analysis of large volumes of data. These systems are an integral part of decision-support systems for middle and top management, i.e. systems intended for the middle and higher levels of company management.

Comparative characteristics of the use of data in operational and analytical processing systems are given in Table 2.8.

Table 2.8

Characteristics of operational and analytical information systems

Property             | Operational processing system                   | Analytical processing system
Purpose              | Operational search, simple types of processing  | Analytical processing, forecasting, modeling
Aggregation level    | Detailed data                                   | Aggregated data
Storage time         | From several months to one year                 | Several years to decades
Update frequency     | High; updates in small portions                 | Low; updates in large chunks, up to several million records at a time
Efficiency criterion | Number of transactions per unit of time         | Speed of execution of complex queries and transparency of the storage structure for users

Thus, a modern EIS is a system based on the joint use of transactional OLTP systems and a data warehouse (Data Warehouse).

The term "data warehouse" became popular in the 1990s. As defined by William Inmon, who is the founder of this technology, a data warehouse (DW) is a domain-specific, integrated, immutable, historical collection of data organized for the purpose of supporting decision making.

Distinctive features of the data warehouse are:

    orientation to a subject area, i.e. only information that can be useful for the operation of analytical systems is placed in the data warehouse;

    support for historical data, reflecting the fact that analysis requires information accumulated over a long period of time;

    integration into a single repository of previously separate data coming from various sources, together with their verification, reconciliation, and reduction to a single format;

    aggregation, which provides for the simultaneous storage of aggregated and primary data in the warehouse so that queries for total values are answered quickly enough.

Thus, a data warehouse is a specialized database in which information necessary for managers to prepare management decisions is collected, integrated and accumulated; for example, in the banking industry, this is information about bank clients, credit matters, interest rates, exchange rates, stock quotes, the state of the investment portfolio, operating days of branches, etc.

In this regard, the data warehouse is intended not so much for entering information as for quickly searching and analyzing it. Therefore, systems built on a data warehouse have a database architecture that ensures the fast execution of queries against huge volumes of information, and a different design of the user interface, which offers special means of searching for information, summarizing it, and drilling down into detail.

Currently, such repositories are being developed in which not only data is collected, but also the possibility of changing it is provided: arranging analytical features, making management adjustments, supplementing missing data.

The data warehouse management system consists of several functional blocks.

Database. This is the core of the entire system. The distinguishing feature of a warehouse database is its well-thought-out structure, with an optimal composition of entities specific to the financial industry and an optimal structure and set of attributes for these entities; as a result, the structure of the warehouse database is optimized for fast loading and fast data retrieval rather than for fast transaction execution.

Database configuration and metadata management tools. These tools are used to configure the information model of the data warehouse during implementation and to change this model during operation as the warehouse's functionality is gradually expanded. In addition, to ensure maximum flexibility and adaptability of the data warehouse, metadata, i.e. data about data, should be stored in the database as separate information. Metadata includes information about the physical data, technical and business processes, rules, and data structure. It thus plays the role of a reference book containing information about the sources of primary data, the processing algorithms to which the original data were subjected, descriptions of data structures and their relationships, etc.

Data collection technology. Contradictory and inconsistent data, fragmented and isolated sources, differences in storage methods, and the inaccessibility of some sources prevent the full use of the data. Therefore a special data collection technology is needed that ensures the regular and uninterrupted receipt of data from various sources, in particular from remote branches, additional offices, and other information systems, and makes it possible to turn the entire variety of data accumulated by the bank into information valuable for the business. This technology covers data formats, the technology for generating them, the business rules governing the extraction of data from external sources, the distribution of metadata (regulatory and reference information), and much more.

Two approaches are used to collect data: ETL systems and a corporate standard for the data exchange format. The first approach uses ETL tools (Extract, Transform, Load): special systems that extract data from other databases, transform it according to rules described in the system, and load it into the warehouse. The second approach uses a standard format for data collection, with data upload procedures developed on the source side. This makes it possible to collect homogeneous data from heterogeneous systems and to decentralize the development of upload procedures, entrusting this task to specialists who know the source system.
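
A toy extract-transform-load pass might look as follows; the source format, field names, and cleaning rules are all assumptions made for the illustration, not the API of any real ETL tool:

```python
# Extract: rows as they arrive from a hypothetical source system.
source_rows = [
    {"Name": " Petrov ", "Amount": "1 200,50"},  # source-specific number format
    {"Name": "Romanenko", "Amount": "300,00"},
]

# Transform: bring every row to a single corporate standard.
def transform(row):
    return {
        "name": row["Name"].strip().upper(),
        "amount": float(row["Amount"].replace(" ", "").replace(",", ".")),
    }

# Load: append the cleaned rows to the warehouse (a plain list here).
warehouse = [transform(r) for r in source_rows]
```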

Data cleaning and loading system. This system provides input data control, automatic error correction, elimination of duplication and incorrect values, bringing data to uniform standards, loading large amounts of data, multi-level logging.

Calculation apparatus. A special calculation apparatus provides:

    data aggregation - calculation of generalized indicators, for example calculation of monthly, quarterly and annual balance;

    data consolidation - summing up data along the organizational hierarchy, for example calculating the consolidated balance sheet of a bank;

    calculation of derivative indicators, such as actual budget execution, liquidity, margin, etc.

User interfaces and reports. The data warehouse, while accumulating valuable information, must ensure its maximum use by employees. To achieve this, it has special user interfaces designed to quickly retrieve data, and advanced technology for creating and issuing reports.

Interfaces for external systems. The data warehouse provides information to external analytical systems and report generators. For this purpose, industry standards for data access are used.

The architecture of the data warehouse management system is shown in Fig. 2.15.

Fig. 2.15. Data warehouse management system architecture

Data is loaded into the warehouse from operational data processing systems and from external sources, for example, from official reports of enterprises and banks, results of exchange trading, etc. When loading data into the repository, the integrity, comparability, and completeness of the loaded data are checked, and their necessary conversion and transformation is also carried out.

The data warehouse is aimed at senior and middle management of the company, responsible for decision making and business development. These are heads of structural, financial and client departments, as well as marketing departments, analysis and planning departments, for example, reporting department, general accounting, treasury, analytical department, marketing department, etc.

The data warehouse is based on the concept of a multidimensional information space, or hypercube (multidimensional cube). The values stored in the cells of this cube, called facts, are quantitative indicators characterizing the company's activities: for example, data on turnover and account balances, the structure of expenses and income, the status and flow of funds, etc. The dimensions of the cube, each forming one of its axes, are sets of data of the same type used to describe the facts, for example branches, products, suppliers, customers, currencies, etc.; that is, they describe the qualitative characteristics of the company's activities. Data aggregation is performed along the cube's dimensions, so dimension elements are usually grouped into hierarchical structures: branches are often grouped by territory, clients by industry, dates into weeks, months, quarters, and years. Each cell of the cube is "responsible" for a specific combination of values of the individual dimensions, for example the turnover of balance-sheet accounts for a day, a quarter, or a year, by branch.
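
The fact/dimension idea can be sketched with a dictionary keyed by dimension values; the branches, products, and figures below are invented for the illustration:

```python
# Facts (turnover) keyed by three dimensions: (branch, product, quarter).
cube = {
    ("North", "loans",    "Q1"): 120.0,
    ("North", "deposits", "Q1"): 80.0,
    ("South", "loans",    "Q1"): 95.0,
}

def aggregate(branch=None, product=None, quarter=None):
    """Sum the facts over all cells matching the fixed dimension values."""
    return sum(
        value
        for (b, p, q), value in cube.items()
        if branch in (None, b) and product in (None, p) and quarter in (None, q)
    )
```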

Currently, three types of data models used in the construction of data warehouses are most widespread: multidimensional, relational and combined.

The multidimensional model is characterized by the use of non-relational databases in the form of hypercubes, which provide multidimensional storage, processing, and presentation of data. Its main advantage is retrieval speed: the data sits at the intersection of the hypercube's dimensions, so no joins between tables are needed, as in relational DBMSs. As a result, the average response time to a complex (ad hoc) query in a multidimensional model is one to two orders of magnitude lower than in a relational one.

However, a hypercube requires large amounts of disk space, because it reserves room in advance for every possible combination of dimension values; this volume grows dramatically as data granularity increases. The data also becomes difficult to modify, since adding another dimension requires a complete rebuild of the hypercube.

Thus, a multidimensional warehouse model is advisable when the data volume is small (no more than 10-20 gigabytes) and the hypercube has a set of dimensions that is stable over time.

In the relational model, the multidimensional structure of the data warehouse is implemented with relational tables, using both standard layout schemas such as the "star" and the "snowflake" and more complex layouts defined by SQL query templates. Warehouses built on the relational model can store huge amounts of information but are inferior to multidimensional models in query speed: in the relational model the hypercube is emulated by the DBMS at the logical level.
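A rough sketch of the "star" layout, using SQLite for brevity (the table and column names are invented): a central fact table holds the measures plus foreign keys into small dimension tables, and one "face" of the emulated hypercube is obtained with a join and GROUP BY:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Dimension tables (the "points" of the star)
cur.execute("CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
# Fact table in the middle: foreign keys plus the measure
cur.execute("""CREATE TABLE turnover (
    branch_id  INTEGER REFERENCES branch,
    product_id INTEGER REFERENCES product,
    day        TEXT,
    amount     REAL)""")
cur.executemany("INSERT INTO branch VALUES (?, ?)",
                [(1, "North"), (2, "South")])
cur.executemany("INSERT INTO product VALUES (?, ?)",
                [(1, "Deposits"), (2, "Loans")])
cur.executemany("INSERT INTO turnover VALUES (?, ?, ?, ?)",
                [(1, 1, "2024-01-10", 120.0),
                 (1, 2, "2024-01-11", 300.0),
                 (2, 1, "2024-01-12", 80.0)])
# Emulating one face of the cube: total turnover per branch
rows = cur.execute("""
    SELECT b.name, SUM(t.amount)
    FROM turnover t JOIN branch b ON t.branch_id = b.branch_id
    GROUP BY b.name ORDER BY b.name""").fetchall()
print(rows)  # [('North', 420.0), ('South', 80.0)]
```

The join between the fact and dimension tables is exactly the extra work that a native multidimensional store avoids.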

In recent years, combined data warehouses have come into use, in which a relational DBMS is paired with a whole set of multidimensional ones. The relational database serves as the central repository and accumulates huge volumes of information; the data required by specific analytical applications is extracted from the central repository into multidimensional databases, each of which stores information on one area of activity (Fig. 2.16).

Fig. 2.16. Logical diagram of a combined data warehouse

Data marts

One way to implement these ideas in practice is to build data marts. A data mart is a subject-oriented collection of data with a specific organization. Its content is typically intended to solve a certain range of homogeneous problems in one or several related subject areas, to perform specific business functions, or to serve specific departments. For example, one mart might serve the analysis of a bank's credit services, and another the analysis of its activities on the stock market.

A data mart is therefore a relatively small, specialized repository containing only subject-specific data and intended for use by a particular functional unit. Function-oriented data marts provide solutions to analytical problems in a specific functional area or division of the company — profitability management, market analysis, resource analysis, cash-flow analysis, customer-base analysis, asset and liability management, and so on. Data marts can thus be viewed as small thematic repositories created to supply information to the analytical tasks of specific management departments.

The organization of data in a mart is driven by the need to analyze data from a particular subject area with the most suitable means.

Data marts and data warehouses differ substantially. A data warehouse is created to solve corporate-wide problems within the corporate information system. Warehouses are usually created and owned by organizations with centrally managed information technology, such as banks, and the warehouse serves the entire corporation.

A data mart, by contrast, is developed to meet the needs of a specific, homogeneous range of problems. One company can therefore have many different data marts, each with its own appearance and content.

The next difference is data granularity: a data mart contains already aggregated data, whereas the warehouse contains the most detailed data.

Because the level of aggregation in data marts is higher than in the warehouse, mart-level data cannot easily be decomposed back into warehouse-level granularity. The opposite direction, however, always works: individual data can be aggregated into generalized indicators.
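Going in that opposite direction — rolling detailed warehouse records up into the coarser indicators a mart stores — can be sketched as follows (invented sample data, Python standard library only):

```python
from itertools import groupby

# Detailed, day-level warehouse records: (client_industry, date, amount)
detail = [
    ("Retail", "2024-01-03", 10.0),
    ("Retail", "2024-02-14", 20.0),
    ("Energy", "2024-01-20", 100.0),
]

def to_month(date):
    """Coarsen the date dimension from days to months."""
    return date[:7]

# Mart-level indicator: turnover by industry and month
key = lambda r: (r[0], to_month(r[1]))
mart = {k: sum(r[2] for r in g)
        for k, g in groupby(sorted(detail, key=key), key=key)}
print(mart)
# {('Energy', '2024-01'): 100.0, ('Retail', '2024-01'): 10.0,
#  ('Retail', '2024-02'): 20.0}
```

The day-level detail cannot be recovered from `mart`, which is exactly why the decomposition only works in one direction.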

Unlike a warehouse, a data mart contains only a small amount of historical information, covering only a short period and relevant only to the problem at hand. Data marts can be thought of as logically or physically separated subsets of a data warehouse. Fig. 2.17 shows the relationship between data marts and the data warehouse, using banking as an example.

Data marts are typically hosted on technology that is optimal for flexible analysis but not for large volumes of data.

The structure of data marts is likewise oriented toward a multidimensional, cube-style organization of data. Because a mart covers a limited information range serving one functional area, however, it is much simpler and cheaper to build.

Fig. 2.17. Relationship between data marts and the data warehouse

There are two types of data marts: dependent and independent. A dependent data mart is one whose source is the data warehouse; the source of an independent data mart is the operational application environment. Dependent data marts are stable and have a robust architecture; independent marts are unstable and have an unstable architecture, at least where data transfer is concerned.

It should be noted that data marts are an ideal solution to the most significant conflict in data warehouse design: performance versus flexibility. In general, the more normalized and flexible a warehouse model is, the slower it answers queries, because queries against a generically designed schema require far more preliminary operations than queries against an optimally tuned one. By directing all user queries at data marts while keeping a flexible warehouse model, developers gain flexibility and long-term stability in the warehouse design along with optimal performance for user queries.

Once in the warehouse, data can be distributed among many data marts for user queries to access. These marts can take many forms, from client-server and desktop databases to OLAP cubes or even dynamic spreadsheets. The choice of query tools can be broad and reflect the preferences and experience of specific users; the variety and ease of use of such tools make them the least expensive part of a data warehouse project. If the data in the warehouse is well structured and of proven quality, transferring it to the marts becomes a routine, low-cost operation.

Data mart technologies, both dependent and independent, solve the problem of consolidating data from various sources for the most effective analysis. The sources may be accounting and reference systems that differ in architecture and functionality, including geographically dispersed ones.

There are several ways to share an Access database, depending on your needs and resource availability. This article describes the available options and the benefits of each, and provides resources with additional information about working methods.

To change the database structure, you must have Access installed on your computer.


Sharing data using network folders

This is the simplest option, with minimal requirements, but it provides the least functionality. The database file is stored on a shared network drive, and all users work with it simultaneously. Because all database objects are shared, multiple users can change the data at the same time, which limits reliability and availability. Performance may also suffer, because all database objects are sent across the network.

This option is suitable if the database will be used by several people at the same time and users will not need to change the database structure.

Note: This method is less secure than other database sharing methods because each user has a complete copy of the database file, which increases the risk of unauthorized access.

Sharing a Database Using a Network Folder

    If there is no shared network folder, you need to configure it.

    For more information, see the help for the operating system of the computer that will host the shared database. If the shared folder is on a network server, you may need help from your network administrator.

    Access must be set to open in shared mode on all users' computers. This is the default, but it should be checked: if a user opens the database in exclusive mode, other users cannot work with the data. Follow these steps on each computer.

    1. Start Access and, on the File tab, select Options.

    2. In the Access Options window, select Client Settings.

    3. In the Advanced section, under Default open mode, make sure Shared is selected.

    Copy the database file to the shared folder. Then make sure the file attributes allow read/write access: users must have both read and write permissions to use the database.

    On each user's computer, create a shortcut to the database file. In the shortcut's properties, specify the path to the database file in the Target box using a UNC path rather than a mapped drive letter: for example, instead of F:\sample.accdb, use \\computername\shared.accdb.

    Note: Users can perform this action themselves.

Sharing a Partitioned Database

This method is useful if you don't have a SharePoint site or a database server. A split database can be shared over a network or through a SharePoint site. When a database is split, it is reorganized into two files: the back-end database, which contains the data tables, and the front-end (client) database, which contains all other database objects (for example, queries, forms, and reports). Each user interacts with the data through a local copy of the front-end database.

Benefits of Database Partitioning

    Improved performance. Only the data is shared over the network, not the queries, forms, reports, macros, or modules.

    Increased availability. Database transactions, such as record edits, complete more quickly.

    Improved security. Users access the back-end database through linked tables, so it is less likely that intruders can gain unauthorized access to the data through the front-end database.

    Increased reliability. If a user encounters a problem and the database closes unexpectedly, any database file damage is usually limited to the copy of the front-end database that the user had open.

    Flexible development environment. Each user can independently develop queries, forms, reports, and other database objects without affecting other users. You can also develop and distribute a new version of the front-end database without disrupting access to the data stored in the back end.

If this method works for you, go to the instructions in the article Partitioning an Access database.

Share a database on a SharePoint site

If you have a server running SharePoint, especially one with Access Services, several good options are available. SharePoint integration provides easier access to the database. When you publish a web database, Access Services creates a SharePoint site that contains the database; all database objects and the data itself are moved to SharePoint lists on that site.

The published database is hosted on the web. You can create web forms and reports that run in a browser window, as well as standard Access objects (sometimes called client objects to distinguish them from web objects). Using Access client objects requires Access to be installed, but all database objects stored on SharePoint are shared.

Note: If you have Access installed on your computer, you can use client objects from a web database, not just web database objects.

Access Services provides a platform for creating databases that can be used on the web. Web databases are designed and published with Access 2010 and SharePoint, and the web database can then be used through a web browser.

Forms, reports, and interface macros run inside the browser.

If you are using a web database, the data is stored in SharePoint lists: all tables become SharePoint lists, and records become list items. This lets you control access to the web database through SharePoint permissions.

Queries and data macros run on the server: all SQL processing happens server-side, which improves network performance because only result sets travel over the network.

Saving a Database to a Document Library

The database can be saved in any SharePoint document library. This is similar to saving it to a network folder and offers a convenient way to control access to the database. When linking to SharePoint lists, only the data is shared, not the database objects; each user gets their own copy of the database.

For example, if your SharePoint site has lists that track customer service issues and store employee data, you can create a database in Access that serves as the front end to those lists. You can build Access queries to analyze those issues and Access reports to format and publish written reports for team meetings. If users have Access installed, you can make the Access queries and reports for a SharePoint list available through the View menu: when viewing a list on the SharePoint site, users can find and open queries, reports, and other Access objects from that menu. If users don't have Access, they can still use the list data through SharePoint views.

    Open the database you want to share.

    On the File tab, select Save As.

    Note: If you're using Access 2010, select File > Save & Publish > Save Database As > SharePoint.

    In the Save to SharePoint dialog box, browse to the appropriate document library.

    Check the database file name and type, change them if necessary, and click Save.

For more information, see Publish to Access Services and Import and link data to a SharePoint list.

Share a database by linking to SharePoint lists

This method offers the same benefits as using a split database, and it also lets each user modify their own copy of the database, because the shared data is accessed through a SharePoint site. Although it lacks the advantages of publishing the database to a SharePoint site, it does centralize the data. Because the data resides in SharePoint lists, it can be shared across the network using SharePoint features.

This method includes three main steps.

    Moving data to SharePoint lists.

    Creating links to these lists.

    Distributing the database file.

You can use the SharePoint Site Migration Wizard to complete the first two steps, and you can complete the last step using any available tools.

Use the Export Tables to SharePoint Wizard

    On the Database Tools tab, in the Move Data group, click SharePoint.

    Note: This element is only available if the database file is saved in ACCDB format.

    Follow the wizard's steps to export the tables to SharePoint; in particular, specify the location of the SharePoint site. To cancel the process, click Cancel.

    To see more information about the migration, on the last page of the wizard, select the Show Details check box.

    This page lists which tables are associated with which lists, along with the backup location and the database URL. It also displays a warning if there were migration problems and gives the location of the log table where you can find more information about them.

    When the wizard finishes, click Finish.

    If the wizard displays a warning, review the log table and take any necessary action. For example, some fields may need to be moved or converted to data types that are compatible with SharePoint lists.

Note: To view your lists on the SharePoint site, click Lists on the Quick Launch, or click View All Site Content. You may need to refresh the page in your web browser. To display lists on the Quick Launch of your SharePoint site, or to change other settings such as version tracking, you can change the list settings on the SharePoint site. For more information, see the Help for your SharePoint site.

Sharing a Database Using a Server

You can share a database using an Access application and a database server (such as SQL Server). This method provides many advantages, but it requires additional software - a database server.

This method is similar to splitting a database, because the tables are stored on the network and each user has a local copy of a Microsoft Access database file containing links to the tables, along with queries, forms, reports, and other database objects. Use this option if a database server is available and all users have Access installed. The benefits of this method vary with the database server software, but generally include user accounts with selective access to data, excellent data availability, and convenient built-in data management tools. Moreover, most database server products work fine with earlier versions of Access, so not all users need to run the same version. Only the tables are shared.

Benefits of Sharing a Database Using a Database Server

    High performance and scalability. In many cases, a database server provides better performance than a single Access database file. Many database server products also support very large, terabyte-sized databases, roughly 500 times the limit for an Access database file (two gigabytes). Database server products typically work very efficiently, processing queries in parallel (using multiple native threads within a single process) and minimizing additional memory requirements as new users are added.

    Increased availability. Most database server products let you back up the database while it is in use, so users do not have to be forced out of the database for backups. Moreover, database servers typically handle concurrent editing and record locking very efficiently.

    Increased security. No database can be protected completely, but database server products offer robust security that helps protect data from unauthorized use. Most offer account-based security, letting you specify who can see which tables. Even if the Access front end is improperly obtained, unauthorized use of the data is prevented by account-based security.

    Automatic recovery. In the event of a system failure, such as an operating system crash or a power outage, some database server products have automatic recovery mechanisms that restore the database to its last consistent state within minutes, with no database administrator involvement.

    Processing on the server. Using Access in a client-server configuration helps reduce network traffic: database queries are processed on the server before the results are sent to the client. Server-side processing is usually more efficient, especially with large data sets.

Basic steps for using Access with a database server

Factors to consider when choosing a method

Method requirements

  • Database server software is required only for the database server method.
  • SharePoint is required only for the SharePoint site method.
  • Access Services on SharePoint Server: needed or not depending on the scenario — linking to lists and saving to a document library do not require Access Services; publishing as a web database or a web application does.

Data availability

  • Network folder: suitable for small groups if the data changes little.
  • SharePoint site: best; suitable for offline use cases.
  • Database server: best.

Security

  • Split database: depends on additional measures.
  • Network folder: the least secure method.
  • SharePoint site: best.
  • Database server: best.

Flexibility

  • Split database: flexible — new database features can be developed without disruption, and users can change the structure in their own copy.
  • Network folder: less flexible — development can be done on an offline copy of the database that then replaces the shared file; users cannot individually change the database structure.
  • SharePoint site: flexible — SharePoint permissions control access and design changes, and some database objects, such as forms, can be used in a browser.
  • Database server: flexible — new database features can be developed without disruption, and users can change the structure of objects in their own copy.


By following the principles described in this article, you can create a database that works as expected and can adapt to new requirements in the future. We will look at the basic principles of database design, as well as ways to optimize it.

Database Design Process

A properly structured database:

  • Saves disk space by eliminating redundant data;
  • Maintains data accuracy and integrity;
  • Provides convenient access to data.

Database development includes the following stages:

  1. Requirements analysis or database purpose determination;
  2. Organizing data in tables;
  3. Specifying primary keys and analyzing relationships;
  4. Normalization of tables.

Let's consider each database design stage in more detail. Note that this tutorial covers Edgar Codd's relational database model, implemented in SQL (rather than hierarchical, network, or object models).

Requirements Analysis: Determining the Purpose of the Database

For example, if you are creating a database for a public library, you need to consider how both readers and librarians should access the database.

Here are some ways to collect information before creating a database:

  • Interviewing people who will use it;
  • Analysis of business forms such as invoices, schedules, surveys;
  • Review of all existing data systems (including physical and digital files).

Start by gathering the existing data that will be included in the database. Then determine the types of data you want to store and the entities that this data describes. For example:

Clients

  • Address;
  • City, State, Zip Code;
  • E-mail address.

Goods

  • Name;
  • Price;
  • Quantity in stock;
  • Quantity to order.

Orders

  • Order number;
  • Sales representative;
  • Date;
  • Product;
  • Quantity;
  • Price;
  • Total.

When designing a relational database, this information later becomes part of the data dictionary, which describes the database's tables and fields. Break the information down into the smallest useful parts. For example, consider storing the state separately from the postal address, so you can later filter people by the state in which they live.
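The payoff of splitting fields this way is easy to show in miniature (the sample records are invented): once the state is its own field, filtering is a simple comparison rather than string parsing:

```python
# One combined field: filtering by state would require fragile parsing
combined = ["12 Oak St, Austin, TX 73301", "9 Elm Ave, Reno, NV 89501"]

# The same data split into atomic fields: filtering is trivial and reliable
clients = [
    {"street": "12 Oak St", "city": "Austin", "state": "TX", "zip": "73301"},
    {"street": "9 Elm Ave", "city": "Reno",   "state": "NV", "zip": "89501"},
]
texans = [c["city"] for c in clients if c["state"] == "TX"]
print(texans)  # ['Austin']
```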

Once you have decided what data will be included in the database, where the data will come from, and how it will be used, you can begin planning the actual database.

Database structure: building blocks

The next step is to visually represent the database. To do this, you need to know exactly how relational databases are structured. Within a database, related data is grouped into tables, each of which consists of rows and columns.

To convert lists of data into tables, start by creating a table for each type of object, such as products, sales, customers, and orders. Here's an example:

Each row in a table is called a record. Records include information about something or someone, such as a specific customer. Columns (also called fields or attributes) contain the same type of information that is displayed for each record, for example, the addresses of all customers listed in the table.


To ensure consistency across records, assign an appropriate data type to each column when designing your database model. Common data types include:
  • CHAR — text of a fixed length;
  • VARCHAR — text of varying length;
  • TEXT — large amounts of text;
  • INT — a positive or negative integer;
  • FLOAT, DOUBLE — floating-point numbers;
  • BLOB — binary data.

Some DBMSs also offer an Autonumber data type, which automatically generates a unique number for each row.
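Putting the common types together, a customers table might be declared as follows. This is a sketch using SQLite, which accepts these type names but maps them to its own storage classes; the column names are invented, and INTEGER PRIMARY KEY AUTOINCREMENT plays the role of an Autonumber column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- autonumber-style key
    name        VARCHAR(100),   -- text of varying length
    state       CHAR(2),        -- fixed-length text
    notes       TEXT,           -- large amount of text
    orders_made INT,            -- integer
    balance     FLOAT,          -- floating-point number
    photo       BLOB)           -- binary data
""")
cur.execute("INSERT INTO customers (name, state) VALUES ('Ann', 'TX')")
cur.execute("INSERT INTO customers (name, state) VALUES ('Bob', 'NV')")
print(cur.execute("SELECT customer_id, name FROM customers").fetchall())
# [(1, 'Ann'), (2, 'Bob')]
```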

In a visual representation of the database, each table will be represented by a block in the diagram. The header of each block should state what the data in that table describes, and the attributes should be listed below:


When designing the database, you need to decide which attributes, if any, will serve as the primary key for each table. A primary key (PK) is a unique identifier for an entity: with it, you can select a specific customer's data even if you know only that value.

Attributes chosen as primary keys must be unique and immutable, and cannot be NULL (they can't be empty). For this reason, order numbers and usernames make suitable primary keys, while phone numbers or addresses do not. You can also use several fields together as a primary key (this is called a composite key).
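A composite key can be sketched like this in SQLite (the order_items table is an invented example): the pair of order number and product ID is unique even though neither column is unique on its own:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE order_items (
    order_no   INTEGER,
    product_id INTEGER,
    quantity   INTEGER,
    PRIMARY KEY (order_no, product_id))  -- composite primary key
""")
cur.execute("INSERT INTO order_items VALUES (1, 10, 2)")
cur.execute("INSERT INTO order_items VALUES (1, 11, 1)")  # same order: fine
try:
    cur.execute("INSERT INTO order_items VALUES (1, 10, 5)")  # duplicate pair
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```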

When it comes time to create the actual database, you implement both the logical and physical structure through the data definition language supported by your DBMS.

You also need to evaluate the size of your database to ensure that you can get the level of performance you require and that you have enough space to store the data.

Creating relationships between entities

Now that the data has been converted into tables, we need to analyze the relationships between them. Cardinality is the number of elements of one table that interact with elements of a related table; determining cardinality helps ensure that you have divided the data into tables in the most efficient way.

Each object can be interconnected with another using one of three types of relationships:

One-to-one relationships

When there is exactly one instance of object A for each instance of object B, they are said to have a one-to-one relationship (often written 1:1). In an ER diagram, this type of relationship is indicated by a line with a dash at each end:


If you have no particular reason to keep such data separate, a 1:1 relationship usually indicates that it is better to combine the two tables into one.

But under certain circumstances, it makes more sense to create tables with 1:1 relationships. If you have an optional data field, such as "description", that is left blank for many records, you can move all the descriptions into a separate table, eliminating empty fields and improving database performance.

To ensure that the data is correlated correctly, you will need to include at least one identical column in each table. Most likely this will be the primary key.

One-to-many relationships

These relationships occur when a record in one table is related to multiple records in another. For example, one customer might place many orders, or a library reader might borrow several books. One-to-many (1:M) relationships are indicated with the so-called "crow's foot" notation, as in this example:


To implement a 1:M relationship, add the primary key of the "one" table as an attribute in the other table; a primary key stored in another table this way is called a foreign key. The table on the "one" side of the relationship is the parent of the child table on the other side.
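A minimal sketch of a 1:M link in SQLite (the tables are invented; note that SQLite enforces foreign keys only when the pragma is enabled):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this per connection
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id))  -- foreign key
""")
cur.execute("INSERT INTO customers VALUES (1, 'Ann')")
cur.execute("INSERT INTO orders VALUES (100, 1)")  # Ann's first order
cur.execute("INSERT INTO orders VALUES (101, 1)")  # one customer, many orders
try:
    cur.execute("INSERT INTO orders VALUES (102, 99)")  # no such customer
except sqlite3.IntegrityError:
    print("rejected: unknown customer")
```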

Many-to-many relationships

When several records of one table can be related to several records of another, the tables have a many-to-many (M:N) relationship — for example, students and courses, since a student can take many courses and each course can be attended by many students.

In an ER diagram, these relationships are represented using the following lines:


In a relational database structure, this kind of relationship cannot be implemented directly. Instead, you need to break it into two one-to-many relationships.

To do this, create a new entity between the two tables. If there is an M:N relationship between sales and products, this new object could be called "sold_products", since it contains data for each sale. Both the sales table and the products table then have a 1:M relationship with sold_products. This kind of intermediate object is variously called a link table, an associative entity, or a junction table.

Each entry in the link table corresponds to a pair of entities from the neighboring tables — for example, one student paired with one course.
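In SQL terms, such a link table holds one foreign key per side, resolving the M:N relationship into two 1:M ones; a sketch with invented names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT)")
# Link (junction) table: one row per student-course pair
cur.execute("""CREATE TABLE enrollments (
    student_id INTEGER REFERENCES students,
    course_id  INTEGER REFERENCES courses,
    PRIMARY KEY (student_id, course_id))""")
cur.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
cur.executemany("INSERT INTO courses VALUES (?, ?)", [(10, "SQL"), (11, "OLAP")])
cur.executemany("INSERT INTO enrollments VALUES (?, ?)",
                [(1, 10), (1, 11), (2, 10)])  # Ann takes two courses
rows = cur.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON e.student_id = s.student_id
    JOIN courses  c ON e.course_id  = c.course_id
    ORDER BY s.name, c.title""").fetchall()
print(rows)  # [('Ann', 'OLAP'), ('Ann', 'SQL'), ('Bob', 'SQL')]
```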

Mandatory or not?

Another way to analyze relationships is to consider which side of the relationship must exist for the other side to exist. The optional side can be marked with a circle on the line. For example, a country must exist in order to have a representative at the United Nations, but not vice versa:


Two objects can also be interdependent (one cannot exist without the other).

Recursive relationships

Sometimes a table points to itself. For example, an employees table might have a "manager" attribute that refers to another person in the same table. This is called a recursive relationship.

Redundant relationships

A redundant relationship is one that is expressed more than once. Typically, you can delete one of the relationships without losing any important information. For example, if the "students" entity has a direct relationship with "teachers" but is also related to teachers indirectly through "subjects", remove the direct relationship between "students" and "teachers": the only way students are assigned teachers is through subjects.

Database Normalization

After preliminary database design, you can apply normalization rules to ensure that the tables are structured correctly.

Not all databases need to be normalized, however. In general, databases for online transaction processing (OLTP) should be normalized.

Databases for online analytical processing (OLAP), which enable easier and faster data analysis, can be more effective with some degree of denormalization; the main criterion here is calculation speed. Each normal form includes the rules of the lower forms.

First normal form

First normal form (1NF) states that each cell of a table may hold only one value, never a list of values. A table like the one below therefore does not conform to 1NF:


You might be tempted to get around this limitation by splitting the data into additional columns, but that also breaks the rules: a table with groups of repeating or closely related attributes does not satisfy first normal form. The table below, for example, does not conform to 1NF:

Instead, divide the data into multiple tables or records until each cell holds only one value and there are no extra columns. Such data is said to be broken down to its smallest useful size. In the table above, you could create an additional table, "Sales details", that matches specific products with sales; "Sales" would then have a 1:M relationship with "Sales details".

Second normal form

Second normal form (2NF) requires that every attribute depend on the primary key in its entirety, and directly rather than indirectly through another attribute.

For example, the attribute "age" depends on "birthday", which in turn depends on "student ID"; its dependence on the key is therefore only indirect. A table containing these attributes does not conform to second normal form.

In addition, a table whose primary key consists of several fields violates second normal form if one or more attributes do not depend on every part of that key.

Thus, a table with the following fields does not conform to second normal form, because the "product name" attribute depends on the product ID but not on the order number:

  • Order number (primary key);
  • Product ID (primary key);
  • Product Name.
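
A minimal sketch of the 2NF fix for the list above, again using sqlite3 with illustrative names: "product name" depends only on the product ID, so it moves into its own "products" table, and the order table keeps just the composite key plus attributes that depend on the whole key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT
    );
    CREATE TABLE order_items (              -- composite primary key
        order_no   INTEGER,
        product_id INTEGER REFERENCES products(product_id),
        quantity   INTEGER,
        PRIMARY KEY (order_no, product_id)
    );
""")
conn.execute("INSERT INTO products VALUES (7, 'Widget')")
conn.execute("INSERT INTO order_items VALUES (1001, 7, 3)")
# The name is stored once and joined in when needed:
name, qty = conn.execute("""
    SELECT p.product_name, i.quantity
    FROM order_items i JOIN products p ON p.product_id = i.product_id
    WHERE i.order_no = 1001
""").fetchone()
print(name, qty)   # Widget 3
```

Renaming a product now means updating one row in "products" rather than every order line that mentions it.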

Third normal form

Third normal form (3NF) requires that every non-key column be independent of every other non-key column. If, in a relational database design, changing a value in one non-key column causes a value in another to change, the table does not conform to third normal form.

Under 3NF you also cannot store derived data in a table, such as a "Tax" column that, in the example below, depends directly on the total cost of the order:


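The point about derived data can be sketched as follows; the "orders" table and the 10% tax rate are assumptions for illustration only. Rather than storing a "tax" column that depends on the order total, the tax is computed at query time, so it can never drift out of sync with the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 200.0)")
# Tax is derived on read instead of being stored in its own column:
total, tax = conn.execute(
    "SELECT total, total * 0.10 AS tax FROM orders WHERE order_no = 1"
).fetchone()
print(total, tax)   # 200.0 20.0
```

If the total is later updated, the derived tax is automatically correct on the next query, with no second column to keep consistent.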
Additional normal forms have also been proposed, including Boyce-Codd normal form, fourth through sixth normal forms, and domain-key normal form, but the first three remain the most common.

Multidimensional data

Some users may need access to multiple views of the same data, especially in OLAP databases. For example, they might want to know sales by customer, by country, and by month. In this situation, it is better to create a central table that the customer, country, and month tables can reference. For example:
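
The "central table" idea (a fact table, in OLAP terms) can be sketched like this; the table and dimension names are illustrative. One central sales table carries customer, country, and month, so any of the three requested views is just a different GROUP BY over the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_facts (
        customer TEXT,
        country  TEXT,
        month    TEXT,
        amount   REAL
    )
""")
conn.executemany(
    "INSERT INTO sales_facts VALUES (?, ?, ?, ?)",
    [("Acme", "US", "2024-01", 100.0),
     ("Acme", "US", "2024-02", 150.0),
     ("Borg", "DE", "2024-01", 80.0)],
)
# Sales by country, one of the three views; by customer or by month
# would be the same query with a different GROUP BY column.
by_country = dict(conn.execute(
    "SELECT country, SUM(amount) FROM sales_facts GROUP BY country"
).fetchall())
print(by_country)   # {'DE': 80.0, 'US': 250.0}
```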

Data Integrity Rules

Using database design tools, you should also configure the database to check data for compliance with certain rules. Many DBMSs, such as Microsoft Access, apply some of these rules automatically.

The entity integrity rule states that a primary key can never be NULL. If the key consists of multiple columns, none of them may be NULL; otherwise the key could fail to identify a record unambiguously.

The referential integrity rule requires that every foreign key in one table match a primary key in the table it references. If a primary key is changed or deleted, the change must be propagated to every object in the database that references that key.

Business logic integrity rules ensure that data conforms to certain logical parameters. For example, the meeting time must be within standard business hours.
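
All three kinds of rule can be expressed as declarative constraints; the sketch below uses sqlite3 with invented "rooms" and "meetings" tables. Entity integrity becomes a non-NULL primary key, referential integrity a foreign key, and the business rule about business hours a CHECK constraint, so the DBMS itself rejects invalid rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE rooms (
        room_id INTEGER PRIMARY KEY NOT NULL                    -- entity integrity
    );
    CREATE TABLE meetings (
        meeting_id INTEGER PRIMARY KEY NOT NULL,
        room_id    INTEGER NOT NULL REFERENCES rooms(room_id),  -- referential
        hour       INTEGER CHECK (hour BETWEEN 9 AND 17)        -- business rule
    );
""")
conn.execute("INSERT INTO rooms VALUES (1)")
conn.execute("INSERT INTO meetings VALUES (1, 1, 10)")   # valid row

rejected = []
for bad_row in [(2, 99, 10),    # room 99 does not exist: referential violation
                (3, 1, 23)]:    # 23:00 is outside business hours: CHECK violation
    try:
        conn.execute("INSERT INTO meetings VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError:
        rejected.append(bad_row[0])
print(rejected)   # [2, 3]
```

Both bad rows are rejected by the engine, so no application code needs to re-check these rules.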

Adding Indexes and Views

An index is a sorted copy of one or more columns, with values in ascending or descending order. Adding an index lets you find records faster: instead of re-sorting for each query, the system can access records in the order specified by the index.

Although indexes speed up data retrieval, they can slow down adding, updating, and deleting data because the index must be rebuilt whenever a record changes.
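
A short sketch of adding an index, with an invented "customers" table. After CREATE INDEX, the query planner's EXPLAIN QUERY PLAN output confirms that an equality lookup goes through the index rather than scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, surname TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ivanov"), (2, "Petrov"), (3, "Sidorov")])
# The index is a sorted copy of the "surname" column:
conn.execute("CREATE INDEX idx_customers_surname ON customers(surname)")
# The plan's detail text names the index used for the lookup:
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE surname = 'Petrov'"
).fetchone()[-1]
print(plan)   # e.g. "SEARCH customers USING INDEX idx_customers_surname (surname=?)"
```

The trade-off mentioned above applies here too: every INSERT into "customers" now also updates idx_customers_surname.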

A view is a saved query. Views can combine data from multiple tables or display part of a single table.
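
A view as a saved query can be sketched as follows, with an invented "employees" table: the view hides the salary column and filters to a single department, so readers of the view see only part of the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ann", "Sales", 50000.0), ("Bob", "IT", 60000.0)])
# The view stores the query, not a copy of the data:
conn.execute("""
    CREATE VIEW sales_staff AS
    SELECT name FROM employees WHERE dept = 'Sales'
""")
names = [row[0] for row in conn.execute("SELECT name FROM sales_staff")]
print(names)   # ['Ann']
```

Because the view stores the query rather than the data, it always reflects the current contents of "employees".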

Advanced Properties

After designing the database model, you can refine the database with advanced properties such as help text, input masks, and formatting rules that apply to a particular schema, view, or column. The advantage of this approach is that, because these rules are stored in the database itself, the data is presented consistently across the multiple programs that access it.

SQL and UML

Unified Modeling Language (UML) is another visual way of expressing complex systems built with object-oriented languages. Some of the concepts mentioned in this tutorial are known by different names in UML; for example, an object in UML is known as a class.

UML is not used that often these days; it survives mainly in academia and in communication between software developers and their clients.

Databases in Business

In the modern world, data is becoming increasingly important; without exaggeration, we can say the world is driven by data. Much attention is therefore paid to collecting, storing, analyzing, and selling data. The key to successfully developing a modern business is accumulating, systematizing, and using information about your customers, such as their needs and purchasing preferences. Such information can help in making informed decisions about the practical effectiveness of promotional offers, identifying unprofitable business segments, analyzing demand for goods or services, tracking sales dynamics for individual items, and reviewing other key factors. Used correctly, databases give us these advantages over competitors.

If you own a small business and do not yet use a customer relationship management (CRM) system, which automates the generation of customer-interaction strategies, then the success of your business is at risk. Remember that your competitors are not asleep!

Let's look at the software you can use to build a database tailored to the unique needs of your small business, one that will collect daily, weekly, monthly, or yearly data.

FileMaker Pro 12

This database long and undeservedly fell out of the sight of database administrators and developers, but the business community has loved it since its creation. FileMaker Pro, created by Apple's FileMaker subsidiary, runs both on the Mac operating system and on Windows. The program is an intuitive and very easy-to-use tool for creating your own databases: it supports publishing data on the Internet, generates reports in normal and advanced modes, and can be integrated with other database systems.

Microsoft Access 2010

For a long time, the Access database management system from the Microsoft Office suite was the most popular solution for most small businesses. However, it now faces competition from other DBMSs that are easier to use, integrate better with cloud systems, and require less expertise in creating and maintaining databases and in software development.

If you already have a database, chances are it was built with Microsoft Access. The new 2010 version looks and works better and is easier to use than previous versions, such as the widely used 2003 release. Although competitors have begun to crowd this DBMS out, it still holds a leading position in its segment of the software market.

Oracle Application Express (APEX)

APEX is a database management system built on the hugely successful Oracle database engine. It is available completely free if you are already an Oracle customer, and it provides a more advanced system for building business applications than Microsoft Access or FileMaker Pro. However, using APEX is not as simple as just entering data into tables, as is done in an Access database.

If you are already using Oracle or anticipate needing more advanced database management capabilities, such as integration with other data systems in the future, or handling very large volumes of data with fast performance, then APEX is the right choice.

Zoho Creator

Zoho Creator is a relative newcomer to the database world and offers an intuitive cloud-based database system. The Zoho developers have created a genuinely reliable, easy-to-use system in which you can quickly build a simple database application without much preparation. This is possible thanks to data entry forms, a very good report builder, and integration with other systems, which is often necessary when you already have a database created in another DBMS or when you use your partners' databases.

When making decisions, a manager at any level relies on the information available to him about the subject of management; the effectiveness of his work therefore depends directly on qualitative characteristics of that information such as adequacy, completeness, reliability, timeliness, and consistency.

In modern conditions, information systems play and will play an increasingly important role in achieving the strategic goals of the organization. This leads to new requirements for information systems and their functions. Such systems are no longer just a tool that provides information processing to departments and end users within an organization. They must now produce information-based products and services that will provide the organization with a competitive advantage in the marketplace.

Information systems, and the information technologies used within them, are the result of managers' decisions in an organization. In turn, however, systems and technologies dictate their own specific conditions for doing business and change organizations.

No matter which consultants the manager engages in this area, the final decisions must be made by him personally. A manager must be able to extract the maximum benefit from the potential of information technology. He must have enough knowledge to direct the overall process of applying and developing information technology in the organization, and to understand when additional resources or the help of outside specialists are required.

Since the invention of writing, humanity has been faced with the task of storing data. Record keeping has a long history, but despite the evolution from clay tablets to papyrus, then to parchment and finally to paper, all this time it had one thing in common - information was processed manually.

With the development of civilization, document management threatened to consume all of a specialist's time: by the end of the 20th century, many companies had entire floors dedicated to storing documents, which, you will agree, is not so far from storing clay tablets in Sumerian archives.

With the advent of computers, document management became simpler: storing documents in electronic form turned out to be easy, cheap, and convenient. The key component of this new technology was software. Computers became relatively easy to program and use, and sorting, analyzing, and processing data became much easier. Standard packages emerged for common business applications such as accounting, payroll, inventory, subscription management, banking, and document library management.

The reaction to the emergence of these new technologies was quite predictable: large businesses stored even more information and demanded ever faster equipment.

In the course of their activities, industrial enterprises, corporations, departmental structures, government and management bodies have accumulated large volumes of data. They contain enormous opportunities for extracting useful analytical information, on the basis of which you can identify hidden trends, build a development strategy, and find new solutions.

Obviously, providing quick access to most of the data is not that difficult. However, any of us has encountered a situation where finding the necessary document, so wisely saved last month (or year), turns out to be disproportionately labor-intensive. At this point, it becomes clear that the traditional capabilities of file systems are no longer enough for success in the modern world - the world of information technology.

Today, in order to obtain additional competitive advantages, most domestic companies need serious IT support for their business - a company transformation system based on the use of information technology and aimed at increasing the efficiency of the organization.

In the second half of the 1990s, many enterprises began to realize that the data at their disposal was a valuable asset whose correct use could create competitive advantages. Large companies had been accumulating data about their customers, suppliers, products, and services for decades. However, they realized this data was stored in disparate systems and needed to be integrated to develop the business further. The need to integrate corporate information gave rise to the creation of databases.

This is especially true now when, thanks to the rapid development of e-commerce, Internet-based firms can turn into huge enterprises within a few months, or even weeks. And, as a result, their databases will grow rapidly.

Therefore, a prudent manager should begin investing in IT support before the enterprise reaches the point where his company hits a cost ceiling. The real problem facing the company's top management is organizing the accumulated data archives so that the required information is easy to find. Finding patterns, trends, anomalies, and relevant information in a large database is one of the newest and most exciting areas of data management.

For those who have already taken this path, it is obvious that databases can radically change the nature of the work of any organizations positioned in different subject areas, relieving managers from performing routine procedures associated with searching for information in numerous files, paper documents, reference books and standards. This is a new stage in the company’s development, leading it to the next stage of evolution, although often using revolutionary methods.

Reducing time costs is only a side effect of automation. The main purpose of developing information technologies lies elsewhere: in the organization's acquisition of fundamentally new qualities that give it significant competitive advantages. And that is worth a great deal.

Moreover, installing and administering databases today is a much less complicated process than it was just a few years ago; database design and management are largely automated. The software that solves this task, namely creating a database, updating the information stored in it, and providing convenient access for viewing and searching, is called a database management system (DBMS).

The database management system creates a specific working environment for the user on the computer screen (the user interface); it also has particular operating modes and a command system. Information systems are created and operated on the basis of a DBMS. It is worth recalling that database management systems are among the most successful technologies in the entire computer industry: billions of dollars are spent on database systems and applications every year, and DBMSs play an exceptional role in the organization of modern industrial, instrumental, and research information systems.

Typical modes of working with a DBMS are creating, editing, manipulating, and searching the database. Each mode has its own system of DBMS commands, and any user's work with the database is built as an algorithm composed of these commands. Such algorithms can run in direct execution mode, where each command is executed immediately, or in automatic execution mode, i.e., program mode.

When creating a database, the user strives to organize information according to various characteristics and quickly retrieve a sample with an arbitrary combination of characteristics. Database users can be different application programs, software systems, as well as domain experts who act as consumers or sources of data (called end users).

The main benefit of a well-built database is a sharp reduction in the labor costs of processing almost all of the organization's internal and external business information. A database designed this way lets every department use information entered by one user and frees company divisions from duplicating data, which is what drives the reduction in labor costs. For example, information about a sold product becomes equally available, from the moment it leaves the warehouse, to the sales manager and to the general accounting and payroll applications.

Meanwhile, despite enormous progress in making DBMSs easier to install, manage, and use, especially those that run on personal computers or workstations, many people still prefer the file system. There is a tacit assumption that a DBMS must be handled by well-trained, full-time personnel and that most database users have no training in database technology. Users still find it difficult to connect to a DBMS, find the right directory or database names where data is stored, formulate queries, and update the database. The file system's paradigm of connection and access still seems significantly easier to understand.

However, when thinking about the success of his business, a manager should not give in to such sentiments. It is worth remembering that creating a database is now easy and inexpensive; millions of people do it, and you do not have to become a computer operator to join them. A competent IT engineer and a few training workshops for staff are all you need to turn piles of files with hard-to-access information into a modern database. By contrast, a manager who gives up the advantages of a DBMS for the sake of the staff's immediate comfort and their reluctance to change the established order risks remaining the last user of cuneiform in a world that has switched to the phonetic alphabet.