DATABASE SYSTEMS
1.1 DATABASE MANAGEMENT SYSTEMS
You know that a database is a collection of logically related data elements that may be structured in various ways to meet the multiple processing and retrieval needs of organizations and individuals. There's nothing new about databases: early ones were chiseled in stone, penned on scrolls, and written on index cards. But now databases are commonly recorded on magnetizable media, and computer programs are required to perform the necessary storage and retrieval operations.
You'll see in the following pages that complex data relationships and linkages may be found in all but the simplest databases. The system software package that handles the difficult tasks associated with creating, accessing, and maintaining database records is called a database management system (DBMS). The programs in a DBMS package establish an interface between the database itself and the users of the database. (These users may be applications programmers, managers and others with information needs, and various operating system programs.)
This interface enables decision makers to search, probe, and query database contents in order to extract answers to nonrecurring and unplanned questions that aren't available in regular reports. These questions might initially be vague or poorly defined, but people can "browse" through the database until they have the needed information. In short, the DBMS will "manage" the stored data items and assemble the needed items from the common database in response to the queries of those who aren't programmers. In a file-oriented system, users needing special information may communicate their needs to a programmer, who, when time permits, will write one or more programs to extract the data and prepare the information. The availability of a DBMS, however, offers users a much faster alternative communications path.
1.2 DBMS STRUCTURING TECHNIQUES
Sequential, direct, and other file processing approaches are used to organize and structure data in single files. But a DBMS is able to integrate data elements from several files to answer specific user inquiries for information. This means that the DBMS is able to access and retrieve data from nonkey record fields. That is, the DBMS is able to structure and tie together the logically related data from several large files.
Logical Structures. Identifying these logical relationships is a job of the data administrator. A data definition language is used for this purpose. The DBMS may then employ one of the following logical structuring techniques during storage, access, and retrieval operations:
1. List structures. In this logical approach, records are linked together by the use of pointers. A pointer is a data item in one record that identifies the storage location of another logically related record. Records in a customer master file, for example, will contain the name and address of each customer, and each record in this file is identified by an account number. During an accounting period, a customer may buy a number of items on different days. Thus, the company may maintain an invoice file to reflect these transactions. A list structure could be used in this situation to show the unpaid invoices at any given time. Each record in the customer file would contain a field that would point to the record location of the first invoice for that customer in the invoice file. This invoice record, in turn, would be linked to later invoices for the customer. The last invoice in the chain would be identified by the use of a special character as a pointer.
2. Hierarchical (tree) structures. In this logical approach, data units are structured in multiple levels that graphically resemble an "upside down" tree with the root at the top and the branches formed below. There is a superior-subordinate relationship in a hierarchical (tree) structure. Below the single-root data component are subordinate elements or nodes, each of which, in turn, "owns" one or more other elements (or none). Each element or branch in this structure below the root has only a single owner. Thus, for example, a customer owns an invoice, and the invoice has subordinate items. The branches in a tree structure are not connected.
3. Network structures. Unlike the tree approach, which does not permit the connection of branches, the network structure permits the connection of the nodes in a multidirectional manner. Thus, each node may have several owners and may, in turn, own any number of other data units. Data management software permits the extraction of the needed information from such a structure by beginning with any record in a file.
4. Relational structures. A relational structure is made up of many tables. The data are stored in the form of "relations" in these tables. For example, relation tables could be established to link a college course with the instructor of the course, and with the location of the class. To find the name of the instructor and the location of the English class, the course/instructor relation is searched to get the name ("Fitt"), and the course/location relation is searched to get the class location ("Main 142"). Many other relations are, of course, possible. This is a relatively new database structuring approach that's expected to be widely implemented in the future.
Physical Structures. People visualize or structure data in logical ways for their own purposes. Thus, records R1 and R2 may always be logically linked and processed in sequence in one particular application. However, in a computer system it's quite possible that these records that are logically contiguous in one application are not physically stored together. Rather, the physical structure of the records in media and hardware may depend not only on the I/O and storage devices and techniques used, but also on the different logical relationships that users may assign to the data found in R1 and R2. For example, R1 and R2 may be records of credit customers who have shipments sent to the same block in the same city every 2 weeks. From the shipping department manager's perspective, then, R1 and R2 are sequential entries on a geographically organized shipping report. But in the A/R application, the customers represented by R1 and R2 may be identified, and their accounts may be processed, according to their account numbers, which are widely separated. In short, then, the physical location of the stored records in many computer-based information systems is invisible to users.
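As a rough illustration of two of the logical structures described above, the following Python sketch follows a pointer chain from a customer record through that customer's unpaid invoices (the list structure) and then searches two relation tables for the English class (the relational structure). It is a sketch under stated assumptions, not how any particular DBMS stores data: list positions stand in for storage locations, and the account numbers, invoice numbers, customer names, and the History course are hypothetical sample data. Only "Fitt" and "Main 142" come from the text.

```python
# Sketch only: all sample values are hypothetical except "Fitt" and "Main 142".

END_OF_CHAIN = None  # plays the role of the special end-of-chain pointer

# Customer master file, keyed by account number. "first_invoice" is a pointer:
# the position of the customer's first unpaid invoice in the invoice file.
customers = {
    1001: {"name": "Customer A", "first_invoice": 0},
    1002: {"name": "Customer B", "first_invoice": 1},
    1003: {"name": "Customer C", "first_invoice": END_OF_CHAIN},
}

# Invoice file. "next" points to the same customer's next unpaid invoice.
invoices = [
    {"invoice_no": "INV-1", "next": 2},
    {"invoice_no": "INV-2", "next": END_OF_CHAIN},
    {"invoice_no": "INV-3", "next": END_OF_CHAIN},
]


def unpaid_invoices(account_no):
    """Follow the pointer chain and collect the customer's invoice numbers."""
    found = []
    location = customers[account_no]["first_invoice"]
    while location is not END_OF_CHAIN:
        record = invoices[location]
        found.append(record["invoice_no"])
        location = record["next"]
    return found


# Relational structure: each relation is a table of (course, value) rows.
course_instructor = [("English", "Fitt"), ("History", "Instructor B")]
course_location = [("English", "Main 142"), ("History", "Room B")]


def search(relation, course):
    """Return the value paired with the course, or None if no row matches."""
    for row_course, value in relation:
        if row_course == course:
            return value
    return None


print(unpaid_invoices(1001))
print(search(course_instructor, "English"))
print(search(course_location, "English"))
```

Note that the list structure reaches later invoices only by walking the chain from the customer record, whereas the relational lookup needs nothing but the course name and the table to search.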
1.3 DATA INDEPENDENCE, INTEGRITY AND SECURITY
DATA INDEPENDENCE
An important point about database systems is that the database should exist independently of any of the specific applications. Traditional data processing applications are data dependent. COBOL programs contain file descriptions and record descriptions that carefully describe the format and characteristics of the data.
Users should be able to change the structure of the database without affecting the applications that use it. For example, suppose that the requirements of your applications change. A simple example would be expanding ZIP codes from five digits to nine digits. In a traditional approach using COBOL programs, each individual COBOL application program that used that particular field would have to be changed, recompiled, and retested. The programs would be unable to recognize or access a file that had been changed and contained a new data description; this, in turn, might cause disruption in processing unless the change were carefully planned.
Most database programs provide the ability to change the database structure by simply changing the ZIP code field and the data-entry form. In this case, data independence allows for minimal disruption of existing applications. Users can continue to work and can even ignore the nine-digit code if they choose. Eventually, the file will be converted to the new nine-digit ZIP code, but the ease with which the changeover takes place emphasizes the importance of data independence.
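One way to picture data independence is to keep the record layout in a single description that applications consult, instead of hard-coding field positions in every program. The Python sketch below illustrates the idea under that assumption. The two schema lists, the `parse_record` and `mailing_label` functions, and the sample records are all hypothetical.

```python
# Sketch only: the layouts, function names, and sample records are hypothetical.

# The record layout lives in one place as (field name, width) pairs.
OLD_SCHEMA = [("name", 10), ("zip", 5)]
NEW_SCHEMA = [("name", 10), ("zip", 9)]  # ZIP code widened to nine digits


def parse_record(schema, line):
    """Split a fixed-width line into named fields according to the schema."""
    record = {}
    position = 0
    for field, width in schema:
        record[field] = line[position:position + width].strip()
        position += width
    return record


def mailing_label(record):
    """Application code: refers to fields by name, never by position."""
    return f"{record['name']} {record['zip']}"


old_line = "Smith     12345"
new_line = "Smith     123456789"

print(mailing_label(parse_record(OLD_SCHEMA, old_line)))
print(mailing_label(parse_record(NEW_SCHEMA, new_line)))
```

Only the schema changes when the ZIP code field grows; `mailing_label` runs unmodified against both layouts, which is the property the COBOL programs described above lack.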
DATA INTEGRITY
Data integrity refers to the accuracy, correctness, or validity of the data in the database. In a database system, data integrity means safeguarding the data against invalid alteration or destruction. In a large on-line database system, data integrity becomes a more severe problem and two additional complications arise. The first has to do with many users accessing the database concurrently. For example, if thousands of travel agents and airline reservation clerks are accessing the same database at once, and two agents book the same seat on the same flight, the first agent's booking may be overwritten by the second and lost. In such cases the technique of locking the record or field provides the means for preventing one user from accessing a record while another user is updating the same record.
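The effect of locking can be sketched with Python's standard `threading` module. This is an in-memory illustration, not a real reservation system: the `Flight` class, the seat labels, and the agent names are hypothetical, and a single lock guards the whole flight rather than one record or field.

```python
import threading


class Flight:
    """Hypothetical in-memory flight whose seat records are guarded by a lock."""

    def __init__(self, seats):
        self._lock = threading.Lock()
        self._open_seats = set(seats)
        self.bookings = {}  # seat -> agent who booked it

    def book(self, seat, agent):
        # Only one agent at a time may check and update the seat records.
        with self._lock:
            if seat not in self._open_seats:
                return False  # unknown seat, or already booked by someone else
            self._open_seats.remove(seat)
            self.bookings[seat] = agent
            return True


flight = Flight(seats=["12A", "12B"])
results = []


def try_to_book(agent):
    results.append((agent, flight.book("12A", agent)))


# Five agents try to book the same seat at the same time.
threads = [
    threading.Thread(target=try_to_book, args=(f"agent-{n}",)) for n in range(5)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

winners = [agent for agent, succeeded in results if succeeded]
print(winners)  # exactly one agent; which one depends on thread scheduling
```

Because the check and the update happen while the lock is held, no second agent can slip in between them, so no booking is silently overwritten.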
The second complication relates to hardware, software, or human error during the course of processing. Database transactions are used to keep the database in a consistent state despite such errors. A database transaction is a group of database modifications treated as a single unit. For example, an agent booking an airline reservation involves several database updates (e.g., adding the passenger's name and address and updating the seats-available field), which comprise a single transaction. The database transaction is not considered to be completed until all updates have been completed; if any update fails, none of the updates is allowed to take effect.
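The all-or-nothing behavior of a transaction can be tried out with Python's standard `sqlite3` module, whose connection object commits when used as a context manager and rolls back if an exception is raised inside the block. The table names, column names, flight number, and passenger data below are hypothetical, and the `CHECK` constraint is just one assumed way of making the second update fail.

```python
import sqlite3

# Sketch only: the schema and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights ("
    " flight_no TEXT PRIMARY KEY,"
    " seats_available INTEGER NOT NULL CHECK (seats_available >= 0))"
)
conn.execute("CREATE TABLE passengers (name TEXT, address TEXT, flight_no TEXT)")
conn.execute("INSERT INTO flights VALUES ('XY100', 1)")  # one seat left
conn.commit()


def book_seat(name, address, flight_no):
    """Add the passenger and decrement the seat count as one transaction."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "INSERT INTO passengers VALUES (?, ?, ?)",
                (name, address, flight_no),
            )
            conn.execute(
                "UPDATE flights SET seats_available = seats_available - 1"
                " WHERE flight_no = ?",
                (flight_no,),
            )
        return True
    except sqlite3.IntegrityError:
        # The seat count would have gone negative; the insert is undone too.
        return False


first = book_seat("Passenger A", "1 Example St", "XY100")
second = book_seat("Passenger B", "2 Example St", "XY100")
passenger_count = conn.execute("SELECT COUNT(*) FROM passengers").fetchone()[0]
seats_left = conn.execute("SELECT seats_available FROM flights").fetchone()[0]
print(first, second, passenger_count, seats_left)
```

The second booking fails at the seat-count update, and the passenger row inserted a moment earlier is rolled back with it, so the database never shows a passenger without a seat.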
DATA SECURITY
Data security refers to the protection of a database against unauthorized or illegal access or modification. This usually involves one or more levels of password protection that are specified in the data dictionary. For example, a high-level password might allow a user to read from the database, write to it, and modify its structure, whereas a low-level password might only allow a user to read from the database.
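The two protection levels in this example can be pictured as a small lookup table. The sketch below is only an illustration: the level names, the operation names, and the `is_allowed` function are hypothetical, and password checking itself is left out.

```python
# Sketch only: level and operation names are hypothetical, and this table
# stands in for the access rules a data dictionary might hold.
ACCESS_LEVELS = {
    "high": {"read", "write", "modify_structure"},
    "low": {"read"},
}


def is_allowed(level, operation):
    """Return True if the given access level permits the operation."""
    return operation in ACCESS_LEVELS.get(level, set())


print(is_allowed("high", "modify_structure"))
print(is_allowed("low", "write"))
```

An unknown level is granted nothing, so a mistyped or missing level fails closed rather than open.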
Often an audit trail (the recorded history of the modifications to a database) can be used to identify where and when a database was tampered with, and it can also be used to restore the file to its original condition.
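A minimal sketch of an audit trail, assuming the database is a simple in-memory dictionary: every modification is recorded with its time, user, and old and new values, and the original condition is restored by undoing the entries in reverse order. The field names, user names, and functions are hypothetical.

```python
from datetime import datetime, timezone

# Sketch only: a dictionary stands in for the database; all names are
# hypothetical.
_MISSING = object()  # marks a field that did not exist before the change

database = {"seats_available": 10}
audit_trail = []


def update(user, field, new_value):
    """Record who changed what and when, then apply the change."""
    audit_trail.append({
        "when": datetime.now(timezone.utc),
        "who": user,
        "field": field,
        "old": database.get(field, _MISSING),
        "new": new_value,
    })
    database[field] = new_value


def restore_original():
    """Undo every recorded change, most recent first."""
    while audit_trail:
        entry = audit_trail.pop()
        if entry["old"] is _MISSING:
            del database[entry["field"]]
        else:
            database[entry["field"]] = entry["old"]


update("agent-1", "seats_available", 9)
update("agent-2", "seats_available", 0)
update("agent-2", "gate", "B7")

# Who touched the seat count, in order?
changed_by = [e["who"] for e in audit_trail if e["field"] == "seats_available"]
print(changed_by)

restore_original()
print(database)
```

Storing the old value alongside the new one is what makes the restore step possible; a log of new values alone would show the tampering but could not reverse it.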
1.4 MANAGEMENT INFORMATION SYSTEM (MIS)
The management information system (MIS) concept has been defined in dozens of ways. Since one organization's model of an MIS is likely to differ from that of another, it's not surprising that their MIS definitions would also vary in scope and breadth. For our purposes, an MIS can be defined as a network of computer-based data processing procedures developed in an organization and integrated as necessary with manual and other procedures for the purpose of providing timely and effective information to support decision making and other necessary management functions.