In 1970, E. F. Codd, an IBM researcher looking for better ways to access data than those
then in use, proposed (and C. J. Date later helped to popularize) a RELATIONAL
model which would eliminate redundant data and the huge overheads involved in maintaining
and synchronizing multiple copies of it. Despite strong resistance at first, and some very
poor and inefficient early attempts to implement the model in software, within 15 years
the generally accepted repository for storing important information was the Relational Database.
COBOL people are familiar with the use of tables, defined through the OCCURS
clause. The "tables" that comprise a Relational database represent "relations"
between a key and its data; each row consists of columns which relate to
a particular primary key. COBOL people will see a "row" as a "record" and the
columns which make it up as "fields"; this is not strictly accurate, but it is
certainly a good enough starting point to help you get the idea of relational
tables.
In Relational terminology, elements that "occur" are called "Repeating Groups".
COBOL deals with them by assigning a fixed area of memory (even if you use
"OCCURS ... DEPENDING ON", the maximum amount of memory is still allocated), but Relational
Database Management Systems (Oracle, SQL Server, Access, DB2 etc.) will not tolerate
fixed, predefined limits on data. Instead, a COBOL "table" within a record definition must be
separated out into a new DB table, where it can grow to whatever size it needs.
(see diagram, above)
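As a rough illustration of moving a repeating group out to its own table, here is a minimal sketch. It uses SQLite through Python's standard `sqlite3` module purely because it runs anywhere; the COBOL layout in the comment and the names `customer`, `customer_phone`, `cust_id` and `phone` are all hypothetical, not taken from any real system.

```python
import sqlite3

# Hypothetical COBOL layout being normalized (illustration only):
#   01 CUSTOMER-REC.
#      05 CUST-ID      PIC 9(6).
#      05 CUST-NAME    PIC X(30).
#      05 CUST-PHONE   PIC X(15) OCCURS 3 TIMES.

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        cust_id   INTEGER PRIMARY KEY,
        cust_name TEXT NOT NULL
    );
    -- The repeating group becomes its own table: one row per occurrence.
    CREATE TABLE customer_phone (
        cust_id INTEGER NOT NULL REFERENCES customer(cust_id),
        phone   TEXT    NOT NULL,
        PRIMARY KEY (cust_id, phone)
    );
""")

conn.execute("INSERT INTO customer VALUES (1, 'Smith')")

# Five phone numbers: more than the OCCURS 3 limit would have allowed.
phones = ["555-0101", "555-0102", "555-0103", "555-0104", "555-0105"]
conn.executemany(
    "INSERT INTO customer_phone (cust_id, phone) VALUES (?, ?)",
    [(1, p) for p in phones],
)

phone_count = conn.execute(
    "SELECT COUNT(*) FROM customer_phone WHERE cust_id = ?", (1,)
).fetchone()[0]
print(phone_count)
```

A customer with no phone numbers simply has no rows in `customer_phone`, so nothing is reserved for occurrences that are never used.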
This process of removing repeating groups out to separate attached tables is part
of the "normalization" of the Relational Database (RDB). "Normalization" is the
process of optimizing your data so that it conforms with the Relational Model. A
properly normalized RDB has no redundant data, just as the mathematical
purity of the Relational Model promised.
Here's a quick summary of some RDB jargon you are likely to encounter:
| TERM | Simple explanation |
|---|---|
| CONSTRAINT | Defining an explicit relationship between tables. See CASCADE |
| REFERENTIAL INTEGRITY | Guarantees that constrained tables will be treated as a single unit: a row in an attached table cannot refer to a base row that does not exist. |
| FOREIGN KEY | The key in an attached table which links it back to the base table, or another table it is constrained with. |
| PRIMARY KEY | The unique identifier for a given table row. It may become a FOREIGN KEY in linked tables |
| CASCADE | The base and attached tables are constrained on one or more keys and if you delete or update the base relation, any attached relations are also updated or deleted. |
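To make these terms a little more concrete, here is a small sketch using SQLite through Python's standard `sqlite3` module. The tables `orders` and `order_line` and their columns are hypothetical. Note that the `PRAGMA foreign_keys = ON` line is specific to SQLite, which leaves foreign key enforcement off by default; other products differ in how constraints are declared and enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement must be switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE orders (                 -- base table
        order_id INTEGER PRIMARY KEY,     -- primary key
        customer TEXT NOT NULL
    );
    CREATE TABLE order_line (             -- attached table
        order_id INTEGER NOT NULL,
        line_no  INTEGER NOT NULL,
        item     TEXT    NOT NULL,
        PRIMARY KEY (order_id, line_no),
        -- Constraint: foreign key back to the base table, with cascade.
        FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE
    );
    INSERT INTO orders     VALUES (100, 'Smith');
    INSERT INTO order_line VALUES (100, 1, 'Widget');
    INSERT INTO order_line VALUES (100, 2, 'Gadget');
""")

# Referential integrity: a line for a non-existent order is rejected.
try:
    conn.execute("INSERT INTO order_line VALUES (999, 1, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# CASCADE: deleting the base row also deletes its attached rows.
conn.execute("DELETE FROM orders WHERE order_id = 100")
lines_left = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]

print(orphan_rejected, lines_left)
```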
Apart from providing random and sequential access to discrete or linked data (which you
could do with your indexed or relative files anyway), the Relational Database Management
Systems carry out a number of other very useful functions that help to ease the load
for the programmer.
Transactional isolation means that a transaction which updates a number of tables
can have its updates deferred until the transaction ends successfully, when all
pending updates are applied together. If the transaction aborts or is cancelled, the database
remains in the state it was in before the transaction started. COMMITs and ROLLBACKs
can be issued explicitly under program control, or applied automatically when transactions are run.
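The commit and rollback behaviour described above can be sketched as follows, again using SQLite through Python's standard `sqlite3` module as a stand-in for whichever RDBMS you actually use. The `account` table, the `transfer` and `balances` helpers and the `CHECK` rule are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        acct_id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO account VALUES (1, 100);
    INSERT INTO account VALUES (2, 50);
""")

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; True if committed."""
    try:
        # Credit first, so a failing debit leaves a pending update to undo.
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acct_id = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acct_id = ?",
            (amount, src),
        )
        conn.commit()      # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # the credit already applied is undone as well
        return False

def balances(conn):
    return conn.execute(
        "SELECT acct_id, balance FROM account ORDER BY acct_id"
    ).fetchall()

ok = transfer(conn, 1, 2, 30)        # succeeds and is committed
after_good = balances(conn)
failed = transfer(conn, 1, 2, 500)   # debit violates the CHECK, so rolled back
after_bad = balances(conn)

print(ok, after_good, failed, after_bad)
```

The second transfer leaves both balances exactly as they were after the first one, even though its credit had already been applied when the debit failed.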
By using a separate subsystem to manipulate data it is possible to do performance tuning in
the subsystem, rather than in the application.
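One common form of tuning in the subsystem is adding an index. The sketch below, which assumes SQLite through Python's standard `sqlite3` module and a hypothetical `orders` table, runs the same unchanged query text before and after an index is created; only the access plan chosen by the database is expected to change. The exact wording of the plan varies between SQLite versions, so the code only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 10, i * 5) for i in range(1, 101)],
)
conn.commit()

# The application's query: this text is never changed.
QUERY = "SELECT order_id, total FROM orders WHERE cust_id = ? ORDER BY order_id"

def plan(conn):
    """Return the database's access plan for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (3,)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the description

before_rows = conn.execute(QUERY, (3,)).fetchall()
before_plan = plan(conn)

# Tuning happens entirely inside the database.
conn.execute("CREATE INDEX idx_orders_cust ON orders (cust_id)")

after_rows = conn.execute(QUERY, (3,)).fetchall()
after_plan = plan(conn)

print(before_rows == after_rows)
print(before_plan)
print(after_plan)
```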
PRIMA ESQL for COBOL tutorial
There is a document that covers the use of embedded SQL (ESQL) with COBOL and
shows how to include SQL into your programs so you can manipulate RDBs. A
separate Appendix describes and explains Normalization in simple language.
DOWNLOAD the package by clicking the icon; you can READ the normalization
document by clicking here...