Database security
Computer Security: Principles
and Practice
Chapter 5 – Database Security
by William Stallings and Lawrie Brown
Outline
1. The need for Database Security
2. Introduction to databases and DBMS
3. Relational databases
4. database access control issues
5. Inference in databases
6. Statistical database security issues
7. database encryption
8. Cloud security
The need for Database security
Organizational databases tend to concentrate
sensitive information in a single logical system.
Corporate financial data
Confidential phone records
Customer and employee information, such as
name, Social Security number, bank account
information, credit card information
Proprietary product information
Health care information and medical record
The need for Database security
Many businesses and other organizations,
need to provide customers, partners, and
employees with access to this information.
But such information can be targeted by
internal and external threats of misuse or
unauthorized change.
So, security specifically tailored to databases is
an increasingly important component of an
overall organizational security strategy.
The need for database security
1. Dramatic imbalance between complexity of modern
DBMS’s and the security techniques used to
protect DBMS’s.
2. SQL is a very sophisticated and complex protocol.
Effective database security requires a thorough
understanding of SQL.
3. Most database admins are not expert in security.
4. Heterogeneous database platforms (Oracle, IBM
DB1, Microsft, Sybase, etc), OS platforms (Unix,
Linux, z/OS, Windows, ets)
Database Security
DB security
requirements are
beyond the
capability of OS-
security
mechanisms.
Databases
structured collection of data stored for use by one or
more applications
contains the relationships between data items and
groups of data items
can sometimes contain sensitive data that needs to be
secured
database management system (DBMS)
suite of programs for constructing and maintaining the
database
offers ad hoc query facilities to multiple users and
applications
query language
provides a uniform interface to the database
Relational Databases
Constructed from tables of data
– each column holds a particular type of data
– each row contains a specific value of each column
– ideally has one column where all values are
unique, forming an identifier/key for that row
have multiple tables linked by identifiers
use a query language to access data items
meeting specified criteria
Relational Database Example
Relational Database Elements
relation / table / file
tuple / row / record
attribute / column / field
primary key
– uniquely identifies a row
foreign key
– links one table to attributes in another
view / virtual table
Relational Database Elements
Structured Query Language
Structure Query Language (SQL)
– originally developed by IBM in the mid-1970s
– standardized language to define, manipulate, and
query data in a relational database
– several similar versions of ANSI/ISO standard
CREATE TABLE department ( CREATE VIEW newtable (Dname, Ename, Eid, Ephone)
Did INTEGER PRIMARY KEY, AS SELECT D.Dname E.Ename, E.Eid, E.Ephone
Dname CHAR (30), FROM Department D Employee E
Dacctno CHAR (6) )
WHERE E.Did = D.Did
CREATE TABLE employee (
Ename CHAR (30),
Did INTEGER,
SalaryCode INTEGER,
Eid INTEGER PRIMARY KEY,
Ephone CHAR (10),
FOREIGN KEY (Did) REFERENCES department (Did) )
Database Access Control
DBMS provide access control for database
Assumes user has been authenticated
DBMS provides specific access rights to portions of
the database
– e.g. create, insert, delete, update, read, write
– to entire database, tables, selected rows or columns
– possibly dependent on contents of a table entry
Database Access Control
Can support a range of policies:
1. Centralized administration:
– A small number of privileged users may grant and revoke
access rights.
2. Ownership-based administration:
– The owner (creator) of a table may grant and
revoke access rights to the table.
3. Decentralized administration:
– The owner of a table can grant and revoke access rights to
a table, the owner of the table may also grant and revoke
authorization to other users, allowing them to grant and
revoke access rights to the table.
SQL Access Controls
Two commands:
– GRANT { privileges | role }
– [ON table]
– TO { user | role | PUBLIC }
– [IDENTIFIED BY password]
– [WITH GRANT OPTION]
• e.g. GRANT SELECT ON ANY TABLE TO ricflair
– REVOKE { privileges | role }
– [ON table]
– FROM { user | role | PUBLIC }
• e.g. REVOKE SELECT ON ANY TABLE FROM ricflair
Typical access rights are:
– SELECT, INSERT, UPDATE, DELETE, REFERENCES
SQL Access Controls
GRANT SELECT ON ANY TABLE TO ricflair
– Enables user ricfair to query any tables in the
database
REVOKE SELECT ON ANY TABLE FROM ricflair
– Revokes the access rights ricfair has on any tables of
the database
The GRANT command
Let A be the creator of the EMPLOYEE relation
What happens after the chronological execution of the
following grant commands:
1) A: GRANT READ, INSERT ON EMPLOYEE TO B WITH GRANT
OPTION
– B will possess the read and insert privileges on the
EMPLOYEE relation.
2) B: GRANT READ, INSERT ON EMPLOYEE TO X
– B is attempting to grant these privileges on EMPLOYEE to
user X
– the database system will refuse the command, for B has
not been given the GRANT option by A
Now, assume the following sequences of grants:
1. A: GRANT READ, INSERT ON EMPLOYEE TO B WITH GRANT
OPTION
2. A: GRANT READ ON EMPLOYEE TO X WITH GRANT OPTION
3. B: GRANT READ, INSERT ON EMPLOYEE TO X
What will happen after each grant execution?
Grant Failure/s why
operation uccess
1
2
3
Now, assume the following sequences of grants:
1. A: GRANT READ, INSERT ON EMPLOYEE TO B WITH GRANT
OPTION
2. A: GRANT READ ON EMPLOYEE TO X WITH GRANT OPTION
3. B: GRANT READ, INSERT ON EMPLOYEE TO X
What will happen after each grant execution?
Grant Failure/s why
operation uccess
1 S A can grant the privileges because A
created the table
2 S Same reason
3 S B has been given the grant option by A
The grant command
What privileges can user X grant after the execution of
the above three grants?
1. X will have READ and INSERT privileges on
EMPLOYEE,
2. but his INSERT privilege is not grantable. (why?)
3. X’s source of INSERT privilege is B,
4. and B did not give X the grant option ( the right, to
further grant it)
IMPLEMENTATION- SYSTEM r
System R maintains two relations for the use
of the authorization subsystem:
1. SYSAUTH
2. and SYSCOLAUTH.
RELATION SYSAUTH
The following columns :
1. USERID : indicates the user who is authorized to perform
these actions on this table.
2. TNAME : specifies the table.
3. TYPE : is ‘R’ if this table is a base relation, ‘V’ if it is a view.
4. A column for each of the privileges READ, INSERT, . . . that
may be granted on a table, excluding update, containing a ‘Y’
or an ‘N’ to indicate whether the user has that privilege on the
named table.
5. UPDATE : indicates authorization for column update.
6. GRANTOPT: indicates whether the privileges in this row are
grantable to other users.
REVOKE COMMANDE FORMAT
Any user who has granted a privilege may subsequently
withdraw it, by issuing the REVOKE command.
Privileges on the named table are denied to the
revokee,
Unless the revokee has another (independent) source
of the privilege?????
The need to REVOKE a previously granted privilege
significantly complicates the authorization mechanism.
Cascading Authorizations
What happens if Ann revokes the access rights to Bob & Chris?
X X X X
X X X
Cascading Authorizations
What happens if Ann revokes the access rights to Bob
& Chris?
What happens if Bob revokes the access rights to David?
IF
a user A revokes an access right:
THEN
Any cascaded access right is also revoked
Unless that access right would exist even
if the original access right from A had never
occurred
Revocation
Let the sequence of grants of a specific privilege on a
given table by any user before any REVOKE commands
be represented by:
Each Gi is the grant of a single privilege.
IF i < j THEN grant Gi occurred earlier than Gj.
If Gi is revoked, we have the following sequence:
Revocation
Formal definition of the semantics of the revocation
Gi : it is like if Gi has never occurred.
So the sequence becomes:
Example
Consider the following sequence:
1. A: GRANT READ, INSERT, UPDATE ON EMPLOYEE TO X
2. B: GRANT READ, UPDATE ON EMPLOYEE TO X
3. A: REVOKE INSERT, UPDATE ON EMPLOYEE FROM X
What are the privileges of X after the revoke?
X will retain READ and UPDATE privileges on EMPLOYEE. Why?
General rule:
IF the revokee possesses other grants of the revoked
privilege from an independent source,
THEN he/she retains these privileges,
Implementation –system R-
the SYSAUTH table is modified
Rather than just, ‘Y’ or ‘N’, each authorization column contains a
timestamp indicating the relative time of a grant.
no two GRANT commands are tagged with the same
timestamp.
(Privileges granted in the same command are tagged with the
same timestamp.)
An authorization column contains:
A. 0, signifying no possession of that right,
B. or some positive value T, signifying that the right was
granted at time T
Implementation example
SYSAUTH table
Example
at time t=35,
B issues the command REVOKE ALL RIGHTS
ON EMPLOYEE FROM X.
Clearly the (X, EMPLOYEE, B) tuple must be
deleted from SYSAUTH.
Example –more
In order to determine which of X’s grants of EMPLOYEE must be
revoked, we form a list of X’s remaining incoming grants:
We also form a list of x grants to others.
The grant of the DELETE privilege by X at time t=25 must be
revoked because his earliest remaining DELETE privilege was
received at time t= 30.
But X’s grants of READ and INSERT are allowed to remain
because they are still “supported” by incoming grants which
occurred earlier in time.
Example more
When B revokes all rights on Employee from X at time t = 25
We must revoke the privileges granted by X at time t=25 because all
those privileges were granted after B granted them to X
Role-Based Access Control
(RBAC)
role-based access control eases administrative burden and improves
security
• application owner
• an end user who owns database objects as part of an application
• end user
• an end user who operates on database objects via a particular
application but does not own any of the database objects
• administrator
• user who has administrative responsibility for part or all of the
database
a database RBAC needs to provide the following capabilities:
create and delete roles
define permissions for a role
assign and cancel assignment of users to roles
Role-Based Access Control
DB RBAC must manage roles and their users (cf.
RBAC on Microsoft’s SQL Server)
1. Fixed server roles (roles defined at the server
level)- independent of any user database.; meant
for administrative tasks.
2. Fixed database roles. Roles operating at the level
of an individual user database.
Table 5.2
Fixed Roles
in
Microsoft
SQL
Server
Inference
the process of
performing queries
and deducing
unauthorized
information from
the legitimate
responses received
inference channel
is the information
transfer path by
which unauthorized
data is obtained
Inference techniques
Two inference techniques can be used to
derive additional information:
1. Analyzing functional dependencies between
attributes within a table or across tables;
2. Merging views with the same constraints.
Inference across tables
Can you link employee name
and salary?
Inference across tables.
A user who knows:
– the structure of the Employee table
– and who knows that the view tables maintain the
same row order as the Employee table
is then able to merge the two views to
construct the table 5.c
Inference Countermeasures: two approaches
1. Inference detection at database design
– alter database structure or access controls
2. Inference detection at query time
– by monitoring and altering or rejecting queries
Both approaches Need some inference
detection algorithm
– a difficult problem
– cf. employee-salary example
Example: Employee /salary
Database:
1. Table= Name, address, salary
2. Available to a clerk in HR dpt.
3. But association between name and salary
restricted to superior role
Example (1): Employee /salary
Solution : Split the table into three tables
Clerck’s role has access to :
–Employees (Employee#, Name, Address)
–Salaries (S#, salary)
Administrative’s role has access to
–Emp-Salary (EMP#, S#)
Example (2): Employee /salary
Solution : Split the table into three tables
But a nonsensitive attribute is added (start-date to
the Salaries table)
Clerck’s role has access to :
– Employees (Employee#, Name, Address)
– Salaries (S#, salary, start-date)
Administrative role has access to
– Emp-Salary (EMP#, S#)
What might happen?
Example (2): Employee /salary
Solution : Split the table into three tables
But a nonsensitive attribute is added (start-date to
the employees table)
Clerck’s role has access to :
– Employees (Employee#, Name, Address start-date,)
– Salaries (S#, salary)
Administrative role has access to
– Emp-Salary (EMP#, S#)
Statistical Databases
Provides data of a statistical nature
– e.g. counts, averages
Two types:
– pure statistical database
– ordinary database with statistical access
• some users have normal access, others statistical
Access control objective to allow statistical
use without revealing individual entries
security problem is one of inference
Statistical Database Security
Use a characteristic formula C
– a logical formula over the values of attributes
– e.g. (Sex=Male) AND ((Major=CS) OR (Major=EE))
Query set X(C) of characteristic formula C, is
the set of records matching C
A statistical query is a query that produces a
value calculated over a query set
There are N individuals, or entities, in the
table and M attributes. Abstract Model of a
Each attribute Aj has |Aj| possible values,
with xij denoting the value of attribute j for Relational Database
entity i .
Statistical Database Example
C = (GP > 3.7)
specifies all students
whose grade point
average exceeds 3.7.
C = Female CS, X(C)
consists of records 1 and 4,
the records for Allen and
Davis
The statistic sum(EE Female, GP) = 2.5
compromises the database if the user Statistical Database Example
knows that Baker is the only female EE
student
Consider the following
sequence of two queries:
1. count(EE Female) = 1
2. sum(EE Female, GP) = 2.5
Inference in Statistical
database is called
compromise
Protecting
Against
Inference in
SDB
Query restriction
1. Rejects a query that can lead to a compromise.
2. The answers provided are accurate.
3. Restriction in this context simply means that
some queries are denied.
Query restriction
The simplest form of query restriction is query size
restriction.
For a database of size N (number of rows, or records),
a query q(C) is permitted only if the number of records
that match C satisfies:
k <= |X(C)| <= N-k, where k is a fixed integer greater
than 1.
Thus, the user may not access any query set of less
than k records.
Why an upper bound?
Query restriction
The upper bound of N – k guarantees that the
user does not have access to statistics on query
sets of less than k records.
In practice, queries of the form q(All) are
allowed, enabling users to easily access
statistics calculated on the entire database.
Assume that the restriction is only k <= |X(C)|
Assume no upper bound limit
Query restriction
What happens when you compute: q(all),
then q(~C)
q(C)= q(all)-q(~C)
Query size restriction counters attacks based
on very small query sets
Formal query answer Informal statement
Count (M.CS) 3 Number of males in CS dpt
COUNT(F.Prof.(CS+Math)) 2 Nbr of female prof. in either CS or Math Dpt
SUM(M+~(CS);Sal) $176K Total salaries among either males or NonCS staff
SUM($15K Sal;Contr) $150 Total of contributions by persons earning $15K.
Compromise= two types
Compromise: questioner deduces , from the response to one or
more queries, confidential information of which he/she was not
previously aware.
Negative compromise
Positive compromise
IF the questioner can deduce a
IF the questioner can deduce value is not in a given
the value in a given category or category or data field of a
data field of a given individual given individual
Compromise example with a small
query set
Positive compromise Negative Compromise
Questioner knows that Dodd Questioner knows that Dodd
is a female CS professor, poses is a female CS professor,
two queries in table I: poses two queries in table I:
1. Count(F.CS.PROF) =1 1. Count(F.CS.PROF) =1
2. COUNT(F.CS.Prof.$15KSal)=1 2. COUNT(F.CS.Prof.$15KSal)=0
3. Dodd’ salary is revealed 3. The questioner would
because she is the only
deduce that Dodd’s salary in
possible individual satisfying
the characteristics of both
not 15K
queries
Compromise with large query sets
COUNT C n COUNT C
Compromises can also be achieved with
large query sets
Compromise with large query set-example
Determine n with the following query with a tautology
COUNT prof prof 12
Ask the query: COUNT F CS prof 11
12-11=1 is the number of female CS professors
SUM prof prof ; Sal $194 K
SUM F CS prof ; Sal $179 K
Subtracting the two queries yield the salary of this
person ( 194-179=$15K)
Query size restriction
For a database of size n ( number of records, or rows)
A query C is permitted Only IF:
X(C) the number of records that match C satisfies :
k X C n k
k ; k 1;
Query size restriction can prevent trivial
compromise attacks
But compromise is still possible for even large
values of k
Tracker
A tracker attack is a particular type of indirect attack
which is present in most statistical databases
A linear system vulnerability is an algebraic relatio
between query responses
Individual tracker compromise
Counting queries answerable for query set size
in the range [k, n-k]; 1< k ≤ n/2
Questioner knows from external sources that
individual I, whose record is in the database, is
uniquely characterized by the formula C.
The questioner wants to know if I has a
characteristic a.
First solution
Questioner can evaluate COUNT (C.a)
But COUNT (C.a) ≤ COUNT (C) =1
This method cannot be used
Schlorer- Individual tracker
The questioner divides his or her knowledge of an individual
into parts such that queries can be made without violating
the query restriction
If the questioner divides C into two parts he may be able to
compute COUNT(C.a) from two answerable queries
involving the two parts.
Individual tracker
Assume the formula C that can identify I can be
decomposed into C= A.B such that:
COUNT A B and COUNT A are both answerable
k COUNT A B COUNT A n k
The formula T A B is called
the indvidual tracker of I
Example of a tracker
C=AB
T
A B
T= A~B
Individual tracker compromise
COUNT (C ) COUNT ( A) COUNT (T )
COUNT (C a) COUNT (T A a) COUNT (T )
COUNT(C.a)=?
0 1
Negative COUNT(C )
Arbitrary statistics about
compromise I can be computed from
I does not have Positive q(C )=q( A) – q(T)
characteristic a compromise
I does have
characteristic a
Individual tracker
Compromise (proof)
Q(A)?
Q(T+A.a)?
Example of individual tracker with k=2, n=12
table 1
Query set size restriction
2 ≤ COUNT (C) ≤ 12-2=10
Questioner believes C= “ F.CS.PROF” characterizes Dodd
But the restriction k=2 prevents the use of the previous methods (1&2)
But questioner can make a tracker T
T A B
A F " and B CS prof
Proof :
COUNT ( F CS prof ) COUNT ( F ) COUNT ( F CS prof )
5 4 1
So, Dodd is the only individual characterized by C
Individual tracker k=2
How about Dodd salary?
Apply same methods with different salary values
C A B " F CS prof "
A " F "; B " CS . prof "
T A B
a $25 KSal
We apply :
COUNT (C a ) COUNT (T A a ) COUNT (T )
COUNT ( F CS prof 25KSal ) Dodd’s salary is not 25 K.
COUNT ( F CS prof F $25KSal )
COUNT ( F CS prof ) 4 4 0
Tracker Attacks
divide queries into parts
– C = C1.C2
– count(C.D) = count(C1) - count (C1. ~C2)
combination is called a tracker
each part acceptable query size
overlap is desired result
Other Query Restrictions
1. Query set overlap control
– limit overlap between new & previous queries
– has problems and overheads
2. Partitioning –extreme overlap control
– cluster records into exclusive groups
– only allow queries on entire groups, no subsets of
groups.
3. Query denial and information leakage
– denials can leak information
– to counter must track queries from user
Query set overlap control
A query size restriction can be defeated by issuing a succession
of queries with considerable overlap in the query set.
IDEA:
1. Check successive queries against Overlap
2. Look at the number of common records in any query.
IF that number exceeds a given threshold, the query is not
allowed
A query q(C) is only allowed if the n umber of records that match
C satisfies
|X (C) X (D) | ≤ r, r > 0
For all q(D) that have been answered for this user, and r is a
fixed integer greater than zero.
Query set overlap control
Several vulnerabilities
1. Users can cooperate to compromise the database
2. Statistics for both a set and its subset ( ex. All
patients, and patients undergoing a giving
treatment) cannot be released- limiting the
usefulness of the database
3. For each user, a profile has to be kept up to date.
Partitioning
can be viewed as taking query set overlap control to its logical
extreme, by not allowing overlapping queries at all.
, the records in the database are clustered into a number of
mutually exclusive groups, e.g. values for a given year.
The user may only query the statistical properties of each group
as a whole.
Partitioning solves some security problems, but has some
drawbacks.
The users ability to extract useful statistics is reduced, and there
is a design effort in constructing and maintaining the partitions.
Query denial and information
leakage
A general problem with query restriction techniques:
the denial of a query may provide sufficient clues that an
attacker can deduce/leak underlying information.
One approach to counter this threat:
A. the system monitors all queries from a given source
B. and decides on the basis of the queries so far posed whether
to deny a new query.
C. The decision is based solely on the history of queries and
answers and the new query.
Example- query denial and info
leakage
DB consists of real-valued entries
A query is denied only if it would enable the requestor to
deduce a value.
Assume that:
a) the requestor query is sum(x1,x2,x3), and the response is 15.
b) when the requester queries max (x1,x2,x3) the query is denied
What can the requestor deduce?
– Because sum(x1,x2,x3)=15, we know that max (x1,x2,x3) cannot be
less than 5.
– But if max (x1,x2,x3) > 15, the query cannot be denied because no
information would be leaked
– So max (x1,x2,x3) =5.
Auditing= Countering-query denial
with information leakage
Keeping up-to-date logs of all queries made by each user
Constantly checking for compromise whenever a new query is
issued.
Auditing mechanisms avoid linear systems attacks
x1
x
2
x 3
q
1
x1
x
2
x 4
q
2
x1
x 3
x 4
q
3
x1
x 3
x 4
q
4
1
x1
q q q 2 q
1 2 3 4
3
Perturbation
Add noise to statistics generated from data
– will result in differences in statistics
Data perturbation techniques
– data swapping
– generate statistics from probability distribution
Output perturbation techniques
– random-sample query
– statistic adjustment
Must minimize loss of accuracy in results
– Too little an error: user can infer a close approximation to
protected values
– Too great an error: resulting statistics are unusable
Data swapping
D D'
Record Sex Major GP Sex Major GP
1 Female Bio 4.0 Male Bio 4.0
2 Female CS 3.0 Male CS 3.0
3 Female EE 3.0 Male EE 3.0
4 Female Psy 4.0 Male Psy 4.0
5 Male Bio 3.0 Female Bio 3.0
6 Male CS 4.0 Female CS 4.0
7 Male EE 4.0 Female EE 4.0
8 Male Psy 3.0 Female Psy 3.0
Database D is transformed into database D’
Database Encryption
The database: the most valuable information resource for any
organization. IT protected by multiple layers of security:
a. firewalls, authentication,
b. O/S access control systems,
c. DB access control systems,
d. database encryption
Encryption- often implemented with particularly sensitive
data. can be applied to:
a. the entire database ,
b. at the record level (selected records)
c. the attribute level, (selected columns)
d. or level of the individual field
Database Encryption has also its disadvantages
Disadvantages to encryption:
a. Key management
• Users who are authorized to access the database need the
encryption/decryption key
• Several categories of users and several applications may need
to access the database
• Secure keys must be provided to each categories of users or
applications
b. Inflexibility
• if an entire database, or part of a database is encrypted,
• How would you perform a search??
• What happens when a record is updated?
One example of database
encryption scheme
As a small or medium sized organization,
Outsourcing the DBMS and the database to a
service provider can be a cost/effective
solution.
But what about the confidentiality of your
data?
Encrypting the entire database?
How feasible a solution is it?
One example of database
encryption scheme
Your data is secure
The service provider will
not be able to access the
confidential data
If you encrypt the entire
database
and you don’t provide the
decryption key(s) to the BUT ……….
service provider
One example of database
encryption scheme
Your data is secure
The service provider will
not be able to access the
If you encrypt the entire confidential data
database
and you don’t provide the
decryption key(s) to the
service provider BUT
Authorized users will not
be able to access
individual data based on
searches or indexing on
key parameters
One example of database
encryption scheme
Every time the need to use the database
They would have to:
1. Download the entire database,
2. Decrypt the tables,
3. And work with the decrypted tables
What if the database gets updated?
What may happen?
Inflexible solution
An alternate solution
Data owner
User
Client
server
Data owner –
organization that
produces data to be
Database
made available for
controlled release Encryption
User – human entity
that presents queries to
the system
Client – frontend that
transforms user queries
into queries on the
encrypted data stored
on the server
Server – an
organization that
receives the encrypted
data from a data owner
and makes them
available for distribution
to clients
Scenario
1. The user issues an SQL query for fields from one or
more records with a specific value of the primary key.
2. The query processor at the client encrypts the
primary key, modifies the SQL query accordingly, and
transmits the query to the server.
3. The server processes the query using the encrypted
value of the primary key and returns the appropriate
record or records.
4. The query processor decrypts the data and returns
the results.
5. Are we done yet?
Example - look at following query
SELECT Ename, Eid, Ephone
FROM Employee
WHERE Did 15
eid ename salary addr did
23 Tom 70K Maple 45
860 Mary 60K Main 83
320 John 50K River 50
875 Jerry 55K Hopew 92
ell
Example - look at following query
Assuming the encryption key k is used
And the encryption value the department id 15
is E(k,15)= 1000110111001110
The query becomes
SELECT Ename, Eid, Ephone
FROM Employee
WHERE Did 1000110111001110
eid ename salary addr did
How 23 Tom 70K Maple 45
flexible is 860 Mary 60K Main 83
this 320 John 50K River 50
solution? 875 Jerry 55K Hopew 92
ell
How flexible is this scheme?
Assume that the table employee contains a
salary attribute
And assume that a user wants to retrieve all
records for salaries less than $70K.
How would you perform such operation?
Remember that the attribute value for salary in
each records is encrypted.
And the set of encrypted records no not
preserve the ordering of values of the original
attribute
Encryption Scheme for Database of
Figure 5.7
Alternate solution
Each record (row) of a table in the database is
encrypted as a block.
Each row Ri is treated as a contiguous block
Bi = ( xi1 || xi2 ||… || xiM ).
Thus, each attribute value in Ri , regardless of
whether it is text or numeric, is treated as a
sequence of bits,
all of the attribute values for that row are
concatenated together to form a single binary
block.
The entire row is encrypted, expressed as
E( k , Bi ) = E( k , ( xi1 || xi2 || … || xiM )).
Alternate solution
To assist in data retrieval, attribute indexes
are associated with each table.
For some or all of the attributes an index
value is created.
For each row Ri of the unencrypted
database,
the mapping is as follows
( xi1 , xi2 , … , xiM ) [E( k , Bi ), Ii1 , Ii2 , … , IiM ]
Alternate solution
For each row in the original database, there is one row in
the encrypted database.
The index values are provided to assist in data retrieval.
We can proceed as follows.
1. For any attribute, the range of attribute values is divided
into a set of non-overlapping partitions that encompass all
possible values,
2. and an index value is assigned to each partition.
Example
Suppose that employee ID ( eid ) values lie in the range
[1, 1000].
We can divide these values into five partitions:
[1, 200], [201, 400], [401, 600], [601, 800], and [801,
1000];
and then assign index values 1, 2, 3, 4, and 5,
respectively.
For a text field, we can derive an index from the first letter
of the attribute value.
For the attribute ename , let us assign index
1 to values starting with A or B, index 2 to values starting
with C or D, and so on.
Example
Similar partitioning schemes can be used for each of the
attributes.
The values in the first column represent the encrypted
values for each row. The actual values depend on the
encryption algorithm and the encryption key.
The remaining columns show index values for the
corresponding attribute values.
The mapping functions between attribute values and
index values constitute metadata that are stored at the
client and data owner locations but not at the server.
Example
This arrangement provides for more efficient data retrieval.
Suppose, for example, a user requests records for all
employees with eid < 300.
1. The query processor requests all records with I( eid ) ≤ 2.
2. These are returned by the server.
3. The query processor decrypts all rows returned, discards
those that do not match the original query, and returns the
requested unencrypted data to the user.
Example
The indexing scheme just described does provide a certain
amount of information to an attacker, namely a rough
relative ordering of rows by a given attribute.
To obscure such information, the ordering of indexes can
be randomized.
For example, the eid values could be partitioned by
mapping
[1, 200], [201, 400], [401, 600], [601, 800], and [801, 1000]
into 2, 3, 5, 1, and 4, respectively.
Example
Because the metadata are not stored at the server, an
attacker could not gain this information from the server
Other features may be added to this scheme.
1. To increase the efficiency of accessing records by means
of the primary key, the system could use the encrypted
value of the primary key attribute values, or a hash value..
2. Different portions of the database could be encrypted with
different keys, so that users would only have access to
that portion of the database for which they had the
decryption key.
3. This latter scheme could be incorporated into a role-based
access control system.
Cloud Security
NIST defines cloud computing as follows [MELL11]:
“A model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable
computing resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly provisioned
and released with minimal management effort or service
provider interaction. This cloud model promotes
availability and is composed of five essential
characteristics, three service models, and four deployment
models.”
Cloud Computing Elements
Cloud computing elements (NIST SP800-144)
Enterprises loose
control over
resources,
services, and
application
Figure 5.13
Cloud Computing Context
Cloud Security Risks
The Cloud Security Alliance (CSA10) lists the
following as the top cloud specific security
threats:
abuse and insecure
malicious
nefarious use of interfaces and
insiders
cloud computing APIs
shared
data loss or account or
technology
leakage service hijacking
issues
unknown risk
profile
Cloud computing Risks
Abuse and nefarious use of cloud computing:
it is relatively easy to register and begin using cloud services,
anyone with a valid credit card can register and immediately
begin using cloud services
Some CP even offering free limited trial periods.
Attackers can get inside the cloud to conduct various attacks,
such as spamming, malicious code attacks, and denial of service.
Cloud computing Risks
Insecure interfaces and APIs:
CPs expose a set of software interfaces or APIs that customers
use to manage and interact with cloud services.
The security and availability of general cloud services is
dependent upon the security of these basic APIs.
From authentication and access control to encryption and
activity monitoring, these interfaces must be designed to protect
against both accidental and malicious attempts to circumvent
policy.
CP risks
Malicious insiders:
Under the cloud computing paradigm, an organization
relinquishes direct control over many aspects of security
It confers an unprecedented level of trust onto the CP.
One grave concern is the risk of malicious insider activity.
Cloud architectures necessitate certain roles that are extremely
high-risk.
Examples include CP system administrators and
managed security service providers.
Shared technology issues:
IaaS vendors deliver their services in a scalable way by
sharing infrastructure.
Often, the underlying components that make up this
infrastructure (CPU caches, GPUs, etc.) were not
designed to offer strong isolation properties for a multi-
tenant architecture.
CPs typically approach this risk by the use of isolated
virtual machines for individual clients.
This approach is still vulnerable to attack, by both insiders
and outsiders, and so can only be a part of an overall
security strategy.
Data loss or leakage:
For many clients, the most devastating impact from a
security breach is the loss or leakage of data.
Account or service hijacking:
Account and service hijacking, usually with stolen
credentials, remains a top threat.
With stolen credentials, attackers can often access
critical areas of deployed cloud computing services,
Allowing them to compromise the confidentiality, integrity,
and availability of those services.
Unknown risk profile:
In using cloud infrastructures, the client necessarily
cedes control to the cloud provider on a number of
issues that may affect security.
Thus the client must pay attention to and clearly define
the roles and responsibilities involved for managing
risks.
For example, employees may deploy applications and
data resources at the CP without observing the normal
policies and procedures for privacy, security, and
oversight.
Data Protection in the Cloud
the threat of data compromise increases in
the cloud
risks and challenges
that are unique to the
multi-instance model
cloud
provides a unique DBMS
running on a virtual
machine instance for
each cloud subscriber multi-tenant model
architectural or provides a predefined environment for the gives the appearance of exclusive use of
operational gives the subscriber cloud subscriber that is shared with other the instance but relies on the cloud
characteristics of the complete control over tenants typically through tagging data with provider to establish and maintain a secure
cloud environment administrative tasks a subscriber identifier database environment
related to security
Summary
database role-based access control (RBAC)
structured collection of data
application owner/end user other than
database management system application owner/administrator
(DBMS)
programs for constructing and maintaining inference channel
the database information transfer path by which
unauthorized data is obtained
structured query language (SQL)
language used to define statistical database (SDB)
schema/manipulate/query data in a
relational database query restriction/perturbation/data
swapping/random-sample query
relational database
table of data consisting of rows (tuples) and database encryption
columns (attributes)
multiple tables tied together by a unique cloud computing/security/
identifier that is present in all tables
data protection
database access control multi-instance/
centralized/ownership-based/decentralized multi-tenant model
administration
Database Encryption
Databases typical a valuable info resource
– protected by multiple layers of security: firewalls,
authentication, O/S access control systems, DB access
control systems, and database encryption
Can encrypt
– entire database - very inflexible and inefficient
– individual fields - simple but inflexible
– records (rows) or columns (attributes) - best
• also need attribute indexes to help data retrieval
varying trade-offs
Database Encryption
Summary
introduced databases and DBMS
relational databases
database access control issues
– SQL, role-based
inference
statistical database security issues
database encryption