Chapter 4 Database Security
Security Requirements
The basic security requirements of database systems are not unlike those of other computing systems we have studied. The basic problems (access control, exclusion of spurious data, authentication of users, and reliability) have appeared in many contexts so far. Following is a list of requirements for database security:
Physical database integrity. The data of a database are immune to physical problems, such as power failures, and someone can reconstruct the database if it is destroyed through a catastrophe.
Logical database integrity. The structure of the database is preserved. With logical integrity of a database, a modification to the value of one field does not affect other fields, for example.
Element integrity. The data contained in each element are accurate.
Auditability. It is possible to track who or what has accessed (or modified) the elements in the database.
Access control. A user is allowed to access only authorized data, and different users can be restricted to different modes of access (such as read or write).
User authentication. Every user is positively identified, both for the audit trail and for permission to access certain data.
Availability. Users can access the database in general and all the data for which they are authorized.
***IMP*** Security Requirements
Security requirements for databases and DBMSs:
a. Physical database integrity: DB is immune to physical problems (e.g., power failure, flood)
b. Logical database integrity: DB structure is preserved (e.g., an update of one field doesn't affect another)
c. Element integrity: values of elements are accurate
d. Auditability: able to track who accessed (read, wrote) what
e. Access control: restricts DB access (read, write) to legitimate users
f. User authentication: only authorized users can access the DB
g. Availability: DB info is available to all authorized users 24x7
Integrity of the Database
The data must be protected from corruption. Two situations can affect the integrity of a database:
1) When the whole database is damaged (as happens, for example, if its storage medium is damaged)
2) When individual data items are unreadable
Integrity of the database as a whole is the responsibility of the DBMS, the operating system, and the (human) computing system manager. From the perspective of the operating system and the computing system manager, databases and DBMSs are files and programs, respectively. Therefore, one way of protecting the database as a whole is to regularly back up all files on the system.
Element Integrity
The integrity of database elements is their correctness or accuracy. Ultimately, authorized users are responsible for entering correct data into databases. However, users and programs make mistakes collecting data, computing results, and entering values. Therefore, DBMSs sometimes take special action to help catch errors as they are made and to correct errors after they are inserted.
Auditability
For some applications it may be desirable to generate an audit record of all access (read or write) to a database. Such a record can help to maintain the database's integrity, or at least to discover after the fact who had affected which values and when. A second advantage, as we see later, is that an audit trail helps deal with incremental access to protected data: no single access reveals protected data, but a set of sequential accesses viewed together reveals the data, much like discovering the clues in a detective novel. The audit record shows which clues a user has already been given.
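As a rough illustration of such an audit record, the sketch below logs each access and can report everything a given user has read so far. The class name, method names, and element names are hypothetical, chosen only for this example; a real DBMS keeps its audit trail in its own format.

```python
from datetime import datetime, timezone

class AuditLog:
    """Append-only record of who accessed which element, how, and when."""

    def __init__(self):
        self.entries = []

    def record(self, user, mode, element):
        # mode is "read" or "write"; the timestamp is taken at logging time.
        self.entries.append((datetime.now(timezone.utc), user, mode, element))

    def elements_seen_by(self, user):
        """Return every element the user has read so far (the accumulated clues)."""
        return {element for _, u, mode, element in self.entries
                if u == user and mode == "read"}

log = AuditLog()
log.record("alice", "read", "mine.longitude")
log.record("alice", "read", "mine.latitude")
log.record("bob", "write", "employee.salary")
```

A policy check could consult elements_seen_by before answering a new query, to judge whether one more answer would complete a sensitive picture.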
Access Control
Databases are often separated logically by user access privileges. For example, all users can be granted access to general data, but only the personnel department can obtain salary data and only the marketing department can obtain sales data. Databases are very useful because they centralize the storage and maintenance of data. Limited access is both a responsibility and a benefit of this centralization. The database administrator specifies who should be allowed access to which data, at the view, relation, field, record, or even element level. The DBMS must enforce this policy, granting access to all specified data or no access where prohibited. Furthermore, the number of modes of access can be many. A user or program may have the right to read, change, delete, or append to a value, add or delete entire fields or records, or reorganize the entire database.
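A minimal sketch of such a policy, assuming a hypothetical table that maps a user group and a data object to the permitted access modes. The group names, object names, and the is_allowed function are illustrative only.

```python
# Hypothetical policy table: (user group, data object) -> set of permitted modes.
POLICY = {
    ("everyone", "general"): {"read"},
    ("personnel", "salary"): {"read", "write"},
    ("marketing", "sales"): {"read"},
}

def is_allowed(groups, obj, mode):
    """Return True if any of the user's groups grants `mode` on `obj`."""
    return any(mode in POLICY.get((g, obj), set()) for g in groups)
```

For example, is_allowed({"everyone", "marketing"}, "salary", "read") is False because no marketing or general entry covers salary data. Anything not listed is denied by default.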
User Authentication
The DBMS can require rigorous user authentication. For example, a DBMS might insist that a user pass both specific password and time-of-day checks. This authentication supplements the authentication performed by the operating system. Typically, the DBMS runs as an application program on top of the operating system. This system design means that there is no trusted path from the DBMS to the operating system, so the DBMS must be suspicious of any data it receives, including user authentication. Thus, the DBMS is forced to do its own authentication.
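The combined password and time-of-day check might look like the sketch below. It assumes a salted PBKDF2 hash from the Python standard library and a hypothetical 09:00 to 17:00 access window; the function names and parameters are illustrative and not taken from any particular DBMS.

```python
import hashlib
import hmac
import os
from datetime import time

def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def authenticate(password, salt, stored_hash, now, start=time(9, 0), end=time(17, 0)):
    """Accept the user only if the password matches AND `now` is inside the window."""
    password_ok = hmac.compare_digest(hash_password(password, salt), stored_hash)
    time_ok = start <= now <= end
    return password_ok and time_ok

salt = os.urandom(16)
stored = hash_password("s3cret", salt)
```

Here authenticate("s3cret", salt, stored, time(22, 0)) fails even though the password is correct, because the request falls outside the permitted hours.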
Availability
A DBMS has aspects of both a program and a system. It is a program that uses other hardware and software resources, yet to many users it is the only application run. Users often take the DBMS for granted, employing it as an essential tool with which to perform particular tasks. But when the system is not available (busy serving other users, or down to be repaired or upgraded), the users are very aware of a DBMS's unavailability.
Reliability and Integrity
Databases amalgamate data from many sources, and users expect a DBMS to provide access to the data in a reliable way. When software engineers say that software has reliability, they mean that the software runs for very long periods of time without failing. Database concerns about reliability and integrity can be viewed from three dimensions:
Database integrity: concern that the database as a whole is protected against damage, as from the failure of a disk drive or the corruption of the master database index. These concerns are addressed by operating system integrity controls and recovery procedures.
Element integrity: concern that the value of a specific data element is written or changed only by authorized users. Proper access controls protect a database from corruption by unauthorized users.
Element accuracy: concern that only correct values are written into the elements of a database. Checks on the values of elements can help prevent insertion of improper values. Also, constraint conditions can detect incorrect values.
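Checks on element values can be sketched as small monitor functions that run before an update is accepted. The example below assumes two hypothetical rules: an age that must fall in a fixed range, and a marital-status field whose changes must follow an allowed transition table. The field names and the rules are illustrative only.

```python
def check_range(value, low, high):
    """Range comparison: the value must lie in [low, high]."""
    return low <= value <= high

# Hypothetical transition rule: which new values may follow a given old value.
ALLOWED_TRANSITIONS = {
    "single": {"married"},
    "married": {"divorced", "widowed"},
    "divorced": {"married"},
    "widowed": {"married"},
}

def check_transition(old, new):
    """Transition constraint: the change old -> new must be listed as allowed."""
    return new in ALLOWED_TRANSITIONS.get(old, set())

def validate_update(record, field, new_value):
    """Reject an update that violates either monitor; otherwise apply it."""
    if field == "age" and not check_range(new_value, 0, 130):
        raise ValueError("age out of range")
    if field == "status" and not check_transition(record["status"], new_value):
        raise ValueError("illegal status transition")
    record[field] = new_value
```

An attempt to set age to 200, or to move a record from "married" back to "single", raises ValueError and leaves the record unchanged.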
***IMP*** Reliable software runs for a long time without failures.
A reliable DBMS preserves: DB integrity / element integrity / element accuracy.
Basic protection is provided by the OS underlying the DBMS:
a) File backups
b) Access controls
c) Integrity checks
The DBMS needs more CIA controls, e.g.:
a) Two-phase commit protocols for updates
b) Redundancy/internal consistency controls
c) DB recovery
d) Concurrency/consistency control
e) Monitors to enforce DB constraints (range, state, and transition constraints), which control structural DB integrity
TWO-PHASE UPDATE
1) Intent phase
a) Check the value of COMMIT-FLAG
b) Gather resources: data, dummy records, open files, lock out others
c) Calculate final answers
d) Write COMMIT-FLAG
2) Permanent change phase
a) Update made
Rollback ability at each phase.
Redundancy/internal consistency controls:
a) Error detection / error correction: Hamming codes, parity bits, cyclic redundancy check
b) Shadow fields
DB recovery: uses the DBMS access log.
Concurrency control: checks/enforcement.
Monitors for DB constraints: range comparisons, state constraints, transition constraints.
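The two-phase update outlined above can be sketched roughly as follows. This is a simplified, single-process illustration: the database is a plain dict, the class and attribute names are hypothetical, and locking and persistent storage of the commit flag are left out.

```python
class TwoPhaseUpdate:
    """Sketch of the intent phase / permanent change phase split for one update."""

    def __init__(self, database):
        self.db = database          # a plain dict standing in for stored data
        self.commit_flag = False
        self.shadow = {}            # final values computed during the intent phase

    def intent_phase(self, changes):
        # Nothing permanent happens here, so a failure leaves the database untouched.
        if self.commit_flag:
            raise RuntimeError("a committed update is still pending")
        shadow = {key: compute(self.db.get(key)) for key, compute in changes.items()}
        self.shadow = shadow
        self.commit_flag = True     # writing the flag ends the intent phase

    def commit_phase(self):
        # Copy the shadow values into the database. Repeating this phase after a
        # failure is safe because it writes the same precomputed values again.
        if not self.commit_flag:
            raise RuntimeError("nothing to commit")
        self.db.update(self.shadow)
        self.shadow = {}
        self.commit_flag = False

inventory = {"paper": 100}
update = TwoPhaseUpdate(inventory)
update.intent_phase({"paper": lambda qty: qty - 30})   # inventory is still unchanged
update.commit_phase()                                   # now the change is permanent
```

Computing all final values before the flag is written is what makes the second phase repeatable: it only copies values and never recomputes them.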
Sensitive Data
Sensitive data are data that should not be made public. Two cases are simple: 1) nothing is sensitive; 2) everything is sensitive.
Several factors can make data sensitive.
Inherently sensitive. The value itself may be so revealing that it is sensitive. Examples are the locations of defensive missiles or the median income of barbers in a town with only one barber.
From a sensitive source. The source of the data may indicate a need for confidentiality. An example is information from an informer whose identity would be compromised if the information were disclosed.
Declared sensitive. The database administrator or the owner of the data may have declared the data to be sensitive. Examples are classified military data or the name of the anonymous donor of a piece of art.
Part of a sensitive attribute or a sensitive record. In a database, an entire attribute or record may be classified as sensitive. Examples are the salary attribute of a personnel database or a record describing a secret space mission.
Sensitive in relation to previously disclosed information. Some data become sensitive in the presence of other data. For example, the longitude coordinate of a secret gold mine reveals little, but the longitude coordinate in conjunction with the latitude coordinate pinpoints the mine.
Access Decisions
The decision to permit access depends on three factors: 1) Availability of Data. 2) Acceptability of Access. 3) Assurance of Authenticity.
Availability of Data
One or more required elements may be inaccessible. For example, if a user is updating several fields, other users' accesses to those fields must be blocked temporarily. This blocking ensures that users do not receive inaccurate information, such as a new street address with an old city and state, or a new code component with old documentation. Blocking is usually temporary. When performing an update, a user may have to block access to several fields or several records to ensure the consistency of data for others.
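The temporary blocking described above can be sketched with a lock that covers every field of the record. This is a simplified in-memory illustration; the class and field names are hypothetical, and a real DBMS uses its own locking mechanisms.

```python
import threading

class AddressRecord:
    """Street, city, and state are updated and read under one lock."""

    def __init__(self, street, city, state):
        self._lock = threading.Lock()
        self._fields = {"street": street, "city": city, "state": state}

    def update(self, **new_values):
        # Readers are blocked until every field in the update has been written.
        with self._lock:
            self._fields.update(new_values)

    def read(self):
        # Returns a consistent snapshot: never a new street with an old city.
        with self._lock:
            return dict(self._fields)
```

Because read and update take the same lock, a reader sees either the complete old address or the complete new one.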
Acceptability of Access
Deciding what is sensitive is not as simple as it sounds, because the fields may not be directly requested. A user may have asked for certain records that contain sensitive data, but the user's purpose may have been only to project the values from particular fields that are not sensitive.
Assurance of Authenticity
Certain characteristics of the user external to the database may also be considered when permitting access. For example, to enhance security, the database administrator may permit someone to access the database only at certain times, such as during working hours. Previous user requests may also be taken into account; repeated requests for the same data or requests that exhaust a certain category of information may be used to find out all elements in a set when a direct query is not allowed.
Types of Disclosures
Data can be sensitive, but so can their characteristics. The following are forms of disclosure: 1) Exact data 2) Bounds 3) Negative result 4) Existence 5) Probable value
Exact Data
The user may know that sensitive data are being requested, or the user may request general data without knowing that some of it is sensitive. A faulty database manager may even deliver sensitive data by accident, without the user's having requested it. In all of these cases the result is the same: The security of the sensitive data has been breached.
Bounds
Another exposure is disclosing bounds on a sensitive value, that is, indicating that a sensitive value, y, is between two values, L and H. Sometimes, by using a narrowing technique not unlike binary search, the user may first determine that L ≤ y ≤ H, then see whether L ≤ y ≤ (L + H)/2, and so forth, thereby permitting the user to determine y to any desired precision. In another case, merely revealing that a value such as the athletic scholarship budget or the number of CIA agents exceeds a certain amount may be a serious breach of security. Sometimes, however, bounds are a useful way to present sensitive data. It is common to release upper and lower bounds for data without identifying the specific records. For example, a company may announce that its salaries for programmers range from $50,000 to $82,000. If you are a programmer earning $79,700, you can presume that you are fairly well off, so you have the information you want; however, the announcement does not disclose who are the highest- and lowest-paid programmers.
Negative Result
Sometimes we can word a query to determine a negative result. That is, we can learn that z is not the value of y. For example, knowing that 0 is not the total number of felony convictions for a person reveals that the person was convicted of a felony. The distinction between 1 and 2 or 46 and 47 felonies is not as sensitive as the distinction between 0 and 1. Therefore, disclosing that a value is not 0 can be a significant disclosure. Similarly, if a student does not appear on the honors list, you can infer that the person's grade point average is below 3.50. This information is not too revealing, however, because the range of grade point averages from 0.0 to 3.49 is rather wide.
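The narrowing technique described under Bounds can be sketched as a binary search driven only by yes/no answers. The function below is illustrative: is_at_most stands in for any query that reveals whether the hidden value y is at most some threshold, and the salary figure is a made-up example.

```python
def narrow(is_at_most, low, high, tolerance=1):
    """Locate a hidden value in [low, high] using only yes/no bound queries.

    `is_at_most(m)` stands in for a query that reveals whether y <= m.
    """
    queries = 0
    while high - low > tolerance:
        mid = (low + high) / 2
        queries += 1
        if is_at_most(mid):
            high = mid      # y lies in the lower half
        else:
            low = mid       # y lies in the upper half
    return low, high, queries

secret_salary = 79_700          # hypothetical sensitive value
low, high, queries = narrow(lambda m: secret_salary <= m, 50_000, 82_000)
```

Each query halves the interval, so the number of queries grows only with the logarithm of the range. This is why even coarse bound answers can leak a value quickly.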
Existence
In some cases, the existence of data is itself a sensitive piece of data, regardless of the actual value. For example, an employer may not want employees to know that their use of long distance telephone lines is being monitored. In this case, discovering a LONG DISTANCE field in a personnel file would reveal sensitive data.
Probable Value
Finally, it may be possible to determine the probability that a certain element has a certain value.
Inference (The Inference Problem in Databases)
Inference attack: inferring sensitive data from nonsensitive data.
Types of inference attacks: 1) Direct attack 2) Indirect attack
1) Direct attack
Infer sensitive data from the results of queries run by the attacker.
n-item k-percent rule: data are withheld if n items represent more than k percent of the result reported.
Most obvious case is the 1-item 100-percent case: one person represents 100% of the result reported.
2) Indirect attack
Infer sensitive information from statistics (sum, count, median) and also from information external to the attacked DB.
a) Tracker attacks (intersection of sets)
b) Linear system vulnerability: use the algebra of multiple equations to infer values
Indirect attacks can be carried out with the help of the following statistics: sum, count, mean.
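The n-item k-percent rule mentioned under direct attacks can be sketched as a check applied before a statistic is released. The function name is hypothetical, and the sketch assumes the contributions are non-negative numbers.

```python
def should_withhold(values, n, k):
    """Return True if the n largest contributions exceed k percent of the total."""
    total = sum(values)
    if total == 0:
        return False
    top_n = sum(sorted(values, reverse=True)[:n])
    return 100 * top_n / total > k
```

With n = 1 and a single contributing record, that record accounts for 100 percent of the result, so should_withhold([5000], 1, 90) is True: the 1-item 100-percent case.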
Tracker Attacks
A tracker attack can fool the database manager into locating the desired data by using additional queries that produce small results. The tracker adds additional records to be retrieved for two different queries; the two sets of records cancel each other out, leaving only the statistic or data desired.
The approach is to use intelligent padding of two queries. In other words, instead of trying to identify a unique value, we request n - 1 other values (where there are n values in the database). Given n and n - 1, we can easily compute the desired single element. For instance, suppose we wish to know how many female Caucasians live in Holmes Hall. A query posed might be count((SEX=F) ^ (RACE=C) ^ (DORM=Holmes)). The DBMS may refuse to answer such a query when the result is so small that a single record dominates it. However, further analysis of the query allows us to track sensitive data through nonsensitive queries. The query
q=count((SEX=F) ^(RACE=C) ^(DORM=Holmes))
is of the form
q = count(a ^ b ^ c)
By using the rules of logic and algebra, we can transform this query to
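q = count(a ^ b ^ c) = count(a) - count(a ^ ¬(b ^ c))
Each query on the right-hand side can return a result large enough to be released, yet their difference is the protected value q.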
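The identity can be tried out on a small made-up table. In the sketch below, the student records and the rule of refusing any count below 2 are assumptions chosen for illustration; a, b, and c are the predicates SEX=F, RACE=C, and DORM=Holmes.

```python
STUDENTS = [  # hypothetical records: (sex, race, dorm)
    ("F", "C", "Holmes"),
    ("F", "B", "Holmes"),
    ("F", "C", "Grey"),
    ("F", "A", "West"),
    ("M", "C", "Holmes"),
    ("M", "B", "Grey"),
]

def count(predicate, minimum=2):
    """Answer a count query, but refuse results smaller than `minimum`."""
    result = sum(1 for s in STUDENTS if predicate(s))
    if result < minimum:
        raise PermissionError("result too small to release")
    return result

a = lambda s: s[0] == "F"
b = lambda s: s[1] == "C"
c = lambda s: s[2] == "Holmes"

# Both of these queries return results large enough to be released ...
females = count(a)
females_not_both = count(lambda s: a(s) and not (b(s) and c(s)))
# ... yet their difference is exactly the protected count(a ^ b ^ c).
q = females - females_not_both
```

Asking for count(a ^ b ^ c) directly raises PermissionError on this table, while the two padded queries are answered and reveal the same number.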
Linear System Vulnerability
A tracker is a specific case of a more general vulnerability. With a little logic, algebra, and luck in the distribution of the database contents, it may be possible to construct a series of queries that returns results relating to several different sets. For example, a system of five linear queries over five unknown values c1 through c5 (for instance, sums over different subsets of records) need not overtly reveal any single c value from the database. However, the queries' equations can be solved for each of the unknown c values, revealing them all.
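As an illustration of the linear system vulnerability, the sketch below builds a hypothetical system of five sum queries over five hidden values and solves it exactly. The particular queries and hidden values are assumptions made up for this example, not taken from these notes. The solver is a plain Gauss-Jordan elimination using the standard-library Fraction type.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Hypothetical system: each query sums a subset of c1..c5 (1 = included).
QUERIES = [
    [1, 1, 1, 1, 1],   # q1 = c1 + c2 + c3 + c4 + c5
    [1, 1, 0, 1, 0],   # q2 = c1 + c2 + c4
    [0, 0, 1, 1, 0],   # q3 = c3 + c4
    [0, 0, 0, 1, 1],   # q4 = c4 + c5
    [0, 1, 0, 0, 1],   # q5 = c2 + c5
]
hidden = [12, 7, 30, 4, 9]     # values the attacker never sees directly
answers = [sum(w * c for w, c in zip(row, hidden)) for row in QUERIES]
recovered = solve(QUERIES, answers)
```

No single answer equals any hidden value, yet recovered matches hidden element by element because the five equations are linearly independent.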
Inference Controls
***IMP***
1) Query controls: applied to queries
Primarily against direct attacks.
a) Query analysis to prevent inferences
b) Query inventory (history) per person
2) Data item controls: applied to individual DB items
Useful against indirect attacks. Two types:
a) Suppression: data not provided to the querying user
i) Suppress combinations of rows and columns
ii) Combine results (to hide actual answers)
b) Concealing: close answers, not exact ones, given to the querying user
i) Rounding
ii) Present a range of results
iii) Present random sample results
iv) Perturb data randomly (generate a small positive or negative error)
Suppression and concealing are two controls applied to data items.
Suppression: sensitive data values are not provided; the query is rejected without response.
Concealing: the answer provided is close to but not exactly the actual value.
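The data item controls above can be sketched as small functions applied to a statistic before it is returned. The function names, the suppression threshold of 5, the rounding base, the range width, and the error bound are all hypothetical choices for illustration.

```python
import random

def suppress(count, threshold=5):
    """Suppression: return None instead of a count small enough to identify someone."""
    return None if count < threshold else count

def round_to(value, base=10):
    """Concealing by rounding to the nearest multiple of `base`."""
    return base * round(value / base)

def as_range(value, width=10):
    """Concealing by reporting the interval that contains the value."""
    low = (value // width) * width
    return low, low + width - 1

def perturb(value, max_error=2, rng=random):
    """Concealing by adding a small random positive or negative error."""
    return value + rng.randint(-max_error, max_error)
```

For a true count of 47, rounding reports 50 and the range form reports 40 to 49. Both stay close to the real answer without giving it exactly, whereas suppress(1) gives no answer at all.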