ENCRYPTION
Encryption refers to the process of transforming
data into a form that is unreadable unless the
reverse process of decryption is applied.
Terms used:
Encryption
Decryption
Encryption algorithm
In the context of databases, encryption is used to
store data in a secure way.
Many databases today store sensitive customer
information, such as credit card numbers, names,
fingerprints, signatures, and identification
numbers such as social-security numbers.
OVERVIEW
Encryption techniques
Encryption Support in Databases
Encryption and authentication
1. ENCRYPTION TECHNIQUES
There are a vast number of techniques for the encryption of
data. Simple encryption techniques may not provide
adequate security, since it may be easy for an unauthorized
user to break the code.
As an example of a weak encryption technique, consider the
substitution of each character with the next character in
the alphabet. Thus,
Perryridge
becomes
Qfsszsjehf
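The substitution above can be sketched in a few lines of Python. The function name `shift` and the decision to leave non-letters unchanged are assumptions made for illustration. The last line shows why the technique is weak: an attacker only has to try 25 possible shifts.

```python
import string

def shift(text: str, k: int = 1) -> str:
    """Replace each letter with the letter k positions later, wrapping at 'z'."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + k) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + k) % 26 + ord("A")))
        else:
            out.append(ch)  # assumption: non-letters pass through unchanged
    return "".join(out)

ciphertext = shift("Perryridge")  # "Qfsszsjehf"

# An attacker who does not know k simply tries every possible shift.
candidates = [shift(ciphertext, -k) for k in range(1, 26)]
```

The original text appears among the 25 candidates, so the "key" offers almost no protection.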
A good encryption technique has the following properties:
It is relatively simple for authorized users to encrypt and
decrypt data.
It depends not on the secrecy of the algorithm, but rather
on a parameter of the algorithm called the encryption key,
which is used to encrypt data. In a symmetric-key
encryption technique, the encryption key is also used to
decrypt data. In contrast, in public-key (also known as
asymmetric-key) encryption techniques, there are two
different keys, the public key and the private key, used to
encrypt and decrypt the data.
Its decryption key is extremely difficult for an intruder to
determine, even if the intruder has access to encrypted
data. In the case of asymmetric-key encryption, it is
extremely difficult to infer the private key even if the public
key is available.
ADVANCED ENCRYPTION STANDARD (AES)
The Advanced Encryption Standard (AES) is a
symmetric-key encryption algorithm that was selected as
an encryption standard by the U.S. government in 2000,
formally published in 2001, and is now widely used.
The standard is based on the Rijndael algorithm (named
for the inventors V. Rijmen and J. Daemen).
The algorithm operates on a 128-bit block of data at a time,
while the key can be 128, 192, or 256 bits in length.
The algorithm runs a series of steps to jumble up the bits in a
data block in a way that can be reversed during decryption,
and performs an XOR operation with a 128-bit round key
that is derived from the encryption key.
These steps are repeated for several rounds (10, 12, or 14,
depending on the key length), and a different round key is
derived from the encryption key for each round.
During decryption, the round keys are generated again from
the encryption key and the encryption process is reversed to
recover the original data. An earlier standard, the Data
Encryption Standard (DES), adopted in 1977, was very widely
used before AES.
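The XOR step in the AES description is reversible because XORing twice with the same value restores the original bits. The sketch below shows only that one property on a 128-bit block. It is not AES: the random `round_key` is a stand-in for a key derived by the real key schedule, and all names are hypothetical.

```python
import secrets

BLOCK_SIZE = 16  # 128 bits, the AES block size

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

block = b"sixteen byte msg"                   # exactly 16 bytes
round_key = secrets.token_bytes(BLOCK_SIZE)   # stand-in for a derived round key

mixed = xor_bytes(block, round_key)      # step applied during encryption
restored = xor_bytes(mixed, round_key)   # the same step undoes it during decryption
```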
For any symmetric-key encryption scheme to work,
authorized users must be provided with the encryption key
via a secure mechanism.
This requirement is a major weakness, since the scheme is no
more secure than the security of the mechanism by which the
encryption key is transmitted.
Public-key encryption is an alternative scheme that avoids
some of the problems faced by symmetric-key encryption
techniques. It is based on two keys: a public key and a private
key. Each user Ui has a public key Ei and a private key Di .
All public keys are published: They can be seen by anyone.
Each private key is known to only the one user to whom the
key belongs. If user U1 wants to store encrypted data, U1
encrypts them using public key E1. Decryption requires the
private key D1.
Because the encryption key for each user is public, it is
possible to exchange information securely by this scheme. If
user U1 wants to share data with U2, U1 encrypts the data
using E2, the public key of U2. Since only user U2 knows how
to decrypt the data, information can be transferred securely.
For public-key encryption to work, there must be a scheme for
encryption such that it is infeasible (that is, extremely hard) to
deduce the private key, given the public key.
Such a scheme does exist and is based on these conditions:
There is an efficient algorithm for testing whether or not a
number is prime.
No efficient algorithm is known for finding the prime
factors of a number.
For purposes of this scheme, data are treated as a collection of
integers. We create a public key by computing the product of two
large prime numbers: P1 and P2. The private key consists of the
pair (P1, P2).
The decryption algorithm cannot be used successfully if only the
product P1P2 is known; it needs the individual values P1 and P2.
Since all that is published is the product P1P2, an unauthorized
user would need to be able to factor P1P2 to steal data. By
choosing P1 and P2 to be sufficiently large (over 100 digits each,
and 300 or more digits in current practice), we
can make the cost of factoring P1P2 prohibitively high (on the
order of years of computation time, on even the fastest computers).
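The description above is an outline only. RSA is one well-known scheme built on these two conditions, and the toy sketch below uses it with deliberately tiny primes. In RSA the published key also includes an exponent (here `E`), and the private exponent `D` is computed from P1 and P2. The values 61, 53, and 17 are illustrative assumptions, and the function names are hypothetical.

```python
from math import gcd

# Toy primes; real keys use primes hundreds of digits long.
P1, P2 = 61, 53
N = P1 * P2                  # published product
PHI = (P1 - 1) * (P2 - 1)    # computable only if P1 and P2 are known
E = 17                       # public exponent, must be coprime with PHI
assert gcd(E, PHI) == 1
D = pow(E, -1, PHI)          # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    """Anyone can encrypt with the public key (N, E)."""
    return pow(m, E, N)

def decrypt(c: int) -> int:
    """Only the holder of D, which is derived from P1 and P2, can decrypt."""
    return pow(c, D, N)

def factor(n: int) -> tuple:
    """Trial division: instant for a toy N, hopeless for a product of large primes."""
    f = 2
    while n % f:
        f += 1
    return f, n // f

message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
```

With a product this small, `factor(N)` returns the two primes immediately, which is exactly what large primes are meant to prevent.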
Although public-key encryption by this scheme is secure, it
is also computationally very expensive. A hybrid scheme
widely used for secure communication is as follows:
a symmetric encryption key (based, for example, on AES) is
randomly generated and exchanged in a secure manner
using a public-key encryption scheme, and symmetric-key
encryption using that key is used on the data transmitted
subsequently.
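A minimal sketch of the hybrid scheme, under loud assumptions: the public-key part is textbook RSA with a toy modulus built from two known Mersenne primes, and `keystream_xor` is a toy symmetric cipher standing in for AES. Neither is suitable for real use, and all names are hypothetical. The point is the flow: only the short session key goes through the expensive public-key step.

```python
import hashlib
import secrets

# Recipient's toy key pair (two known Mersenne primes; far too small for real use).
P1, P2 = 2**61 - 1, 2**89 - 1
N, E = P1 * P2, 65537
D = pow(E, -1, (P1 - 1) * (P2 - 1))

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher standing in for AES: XOR with a SHA-256 keystream.
    Applying it twice with the same key restores the input."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

# Sender: pick a random session key, wrap it with the recipient's public key,
# then encrypt the bulk data symmetrically.
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
payload = keystream_xor(session_key, b"a long message " * 100)

# Recipient: unwrap the session key with the private key, then decrypt the data.
unwrapped = pow(wrapped_key, D, N).to_bytes(16, "big")
plaintext = keystream_xor(unwrapped, payload)
```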
Encryption of small values, such as identifiers or names, is
made complicated by the possibility of dictionary attacks,
particularly if the encryption key is publicly available. For
example, if date-of-birth fields are encrypted, an attacker
trying to decrypt a particular encrypted value e could try
encrypting every possible date of birth until he finds one
whose encrypted value matches e. Even if the encryption key
is not publicly available, statistical information about data
distributions can be used to figure out what an encrypted
value represents in some cases, such as age or zip code.
For example, if the age 18 is the most common age in a
database, the encrypted age value that occurs most often
can be inferred to represent 18.
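The inference can be sketched as follows. A keyed hash is used here purely as a stand-in for any deterministic encryption (equal inputs give equal outputs). It is not reversible and is not the scheme the text describes. The sample ages and all names are made up for illustration.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"known only to the database"  # hypothetical key, hidden from the attacker

def deterministic_token(age: int) -> str:
    """Stand-in for deterministic encryption: equal inputs give equal outputs."""
    return hmac.new(SECRET_KEY, str(age).encode(), hashlib.sha256).hexdigest()

ages = [18] * 5 + [19] * 3 + [20] * 2 + [21]  # made-up sample data
stored_column = [deterministic_token(a) for a in ages]

# The attacker sees only stored_column, plus the public fact that 18 is the most common age.
most_common_token, _ = Counter(stored_column).most_common(1)[0]

# Every row holding that token can now be labelled "age 18" without the key.
rows_inferred_as_18 = [i for i, t in enumerate(stored_column) if t == most_common_token]
```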
Dictionary attacks can be deterred by adding extra random
bits to the end of the value before encryption (and removing
them after decryption).
Such extra bits, referred to as an initialization vector in
AES, or as salt bits in other contexts, provide good
protection against dictionary attack.
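The following sketch contrasts a dictionary attack on date-of-birth values with and without extra random bits. It assumes textbook RSA with a toy public key (two known Mersenne primes, far too small for real use), dates encoded as integers such as 19990704, and 64 random bits appended. All of these choices and names are illustrative assumptions.

```python
import secrets
from datetime import date, timedelta

# Toy key pair (assumed values); the attacker knows N and E but not D.
P1, P2 = 2**61 - 1, 2**89 - 1
N, E = P1 * P2, 65537
D = pow(E, -1, (P1 - 1) * (P2 - 1))
SALT_BITS = 64

def encode(d: date) -> int:
    return d.year * 10000 + d.month * 100 + d.day  # e.g. 19990704

def encrypt_plain(d: date) -> int:
    """Deterministic: the same date always yields the same ciphertext."""
    return pow(encode(d), E, N)

def encrypt_salted(d: date) -> int:
    """Append 64 random bits to the value before encrypting."""
    salted = (encode(d) << SALT_BITS) | secrets.randbits(SALT_BITS)
    return pow(salted, E, N)

def decrypt_salted(c: int) -> int:
    """Decrypt with the private key, then drop the random bits."""
    return pow(c, D, N) >> SALT_BITS

def dictionary_attack(target: int):
    """Encrypt every date in a plausible range with the public key and compare."""
    d = date(1990, 1, 1)
    while d < date(2010, 1, 1):
        if encrypt_plain(d) == target:
            return d
        d += timedelta(days=1)
    return None

birthday = date(1999, 7, 4)
found = dictionary_attack(encrypt_plain(birthday))       # recovers the date
not_found = dictionary_attack(encrypt_salted(birthday))  # None: no candidate matches
```

With the random bits in place, the attacker would have to guess all 64 of them for every candidate date, while the legitimate holder of the private key simply discards them after decryption.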
2. ENCRYPTION SUPPORT IN DATABASES
Many file systems and database systems today support
encryption of data. Such encryption protects the data from
someone who is able to access the data, but is not able to
access the decryption key. In the case of file-system
encryption, the data to be encrypted are usually large files
and directories containing information about files.
In the context of databases, encryption can be done at
several different levels.
At the lowest level, the disk blocks containing database
data can be encrypted, using a key available to the
database-system software.
When a block is retrieved from disk, it is first decrypted
and then used in the usual fashion. Such disk-block level
encryption protects against attackers who can access the
disk contents but do not have access to the encryption key.
At the next higher level, specified (or all) attributes of a
relation can be stored in encrypted form. In this case, each
attribute of a relation could have a different encryption key.
Encryption of specified attributes minimizes the overhead
of decryption, by allowing applications to encrypt only
attributes that contain sensitive values such as credit-card
numbers. However, when individual attributes or relations
are encrypted, databases typically do not allow primary
and foreign key attributes to be encrypted, and do not
support indexing on encrypted attributes.
Attribute-level encryption must also add extra random bits
to each value to prevent dictionary attacks, as described earlier.
A decryption key is obviously required to get access to
encrypted data. A single master encryption key may be
used for all the encrypted data; with attribute level
encryption, different encryption keys could be used for
different attributes.
In this case, the decryption keys for different attributes can
be stored in a file or relation (often referred to as a wallet),
which is itself encrypted using a master key.
A connection to the database that needs to access
encrypted attributes must then provide the master key;
unless this is provided, the connection will not be able to
access encrypted data. The master key would be stored in
the application program (typically on a different computer),
or memorized by the database user, and provided when the
user connects to the database.
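The wallet arrangement can be sketched as a two-level key hierarchy. This is an illustration under assumptions, not how any particular database implements it: `toy_cipher` is a toy stand-in for AES (and reuses its keystream, which a real system must not do), and the attribute names are hypothetical.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a SHA-256 keystream); a stand-in for AES.
    Applying it twice with the same key restores the input."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

master_key = secrets.token_bytes(16)  # supplied by the connection, not stored in the database

# One key per encrypted attribute (attribute names are hypothetical).
attribute_keys = {"credit_card": secrets.token_bytes(16), "ssn": secrets.token_bytes(16)}

# The wallet stores each attribute key encrypted under the master key.
wallet = {name: toy_cipher(master_key, key) for name, key in attribute_keys.items()}

def open_wallet(supplied_master_key: bytes, attribute: str) -> bytes:
    """Recover an attribute key; a wrong master key yields a useless key."""
    return toy_cipher(supplied_master_key, wallet[attribute])

stored_value = toy_cipher(attribute_keys["credit_card"], b"4111 1111 1111 1111")
key = open_wallet(master_key, "credit_card")
value = toy_cipher(key, stored_value)
```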
Encryption at the database level has the advantage of
requiring relatively low time and space overhead, and does
not require modification of applications.
For example, if data in a laptop computer database need to
be protected from theft of the computer itself, such
encryption can be used. Similarly, someone who gets access
to backup tapes of a database would not be able to access
the data contained in the backups without knowing the
decryption key.
An alternative to performing encryption in the
database is to perform it before the data are sent to
the database. The application must then encrypt the
data before sending it to the database, and decrypt
the data when it is retrieved.
This approach to data encryption requires significant
modifications to be done to the application, unlike
encryption performed in a database system.
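A minimal sketch of application-level encryption, using Python's built-in `sqlite3` module as the database. The table layout, the function names, and the toy cipher (a stand-in for AES, not suitable for real use) are all assumptions for illustration. Every code path that touches the sensitive column has to encrypt or decrypt explicitly, which is the modification cost described above.

```python
import hashlib
import secrets
import sqlite3

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a SHA-256 keystream); a stand-in for AES."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

APP_KEY = secrets.token_bytes(16)  # held by the application, never sent to the database

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, card BLOB)")

def insert_customer(cid: int, name: str, card: str) -> None:
    """Encrypt the sensitive attribute before it leaves the application."""
    conn.execute("INSERT INTO customer VALUES (?, ?, ?)",
                 (cid, name, toy_cipher(APP_KEY, card.encode())))

def fetch_card(cid: int) -> str:
    """Decrypt after retrieval; the database only ever sees ciphertext."""
    (blob,) = conn.execute("SELECT card FROM customer WHERE id = ?", (cid,)).fetchone()
    return toy_cipher(APP_KEY, blob).decode()

insert_customer(1, "Alice", "4111 1111 1111 1111")
raw = conn.execute("SELECT card FROM customer WHERE id = 1").fetchone()[0]
card = fetch_card(1)
```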
3. ENCRYPTION AND AUTHENTICATION
Password-based authentication is used widely by operating systems
as well as databases. However, the use of passwords has some
drawbacks, especially over a network.
If an eavesdropper is able to sniff the data being sent over the
network, she may be able to find the password as it is being sent
across the network. Once the eavesdropper has a user name and
password, she can connect to the database, pretending to be the
legitimate user.
A more secure scheme involves a challenge-response system. The
database system sends a challenge string to the user. The user
encrypts the challenge string using a secret password as encryption
key and then returns the result.
The database system can verify the authenticity of the user
by decrypting the string with the same secret password and
checking the result with the original challenge string.
This scheme ensures that no passwords travel across the
network.
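A minimal sketch of such an exchange. One substitution is made and should be noted plainly: instead of encrypting the challenge with the password, this version computes a keyed hash (HMAC-SHA256) of the challenge, so the server recomputes and compares rather than decrypts. The password value and function names are hypothetical.

```python
import hashlib
import hmac
import secrets

PASSWORD = b"correct horse battery staple"  # hypothetical shared secret

def make_challenge() -> bytes:
    """Server side: a fresh random challenge for every login attempt."""
    return secrets.token_bytes(16)

def respond(challenge: bytes, password: bytes) -> bytes:
    """Client side: key the challenge with the password; only the result is sent."""
    return hmac.new(password, challenge, hashlib.sha256).digest()

def verify(challenge: bytes, response: bytes, password: bytes) -> bool:
    """Server side: recompute the expected response and compare in constant time."""
    expected = hmac.new(password, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

challenge = make_challenge()
ok = verify(challenge, respond(challenge, PASSWORD), PASSWORD)
rejected = verify(challenge, respond(challenge, b"wrong guess"), PASSWORD)
```

Because each challenge is random, an eavesdropper who records one response cannot replay it against a later challenge.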
Public-key systems can be used for encryption in
challenge-response systems.
The database system encrypts a challenge string using the
user's public key and sends it to the user.
The user decrypts the string using her private key, and
returns the result to the database system. The database
system then checks the response.
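The public-key variant can be sketched with textbook RSA and a toy key pair. The modulus is built from two known Mersenne primes and is far too small for real use. All values and names are illustrative assumptions.

```python
import secrets

# User's toy key pair (assumed values).
P1, P2 = 2**61 - 1, 2**89 - 1
N, E = P1 * P2, 65537                 # public key, stored by the database
D = pow(E, -1, (P1 - 1) * (P2 - 1))   # private key, held only by the user

def db_issue_challenge():
    """Database: pick a random challenge and encrypt it with the user's public key."""
    challenge = secrets.randbelow(N - 2) + 2
    return challenge, pow(challenge, E, N)

def user_respond(encrypted_challenge: int, private_exponent: int) -> int:
    """User: decrypt the challenge with the private key and send the result back."""
    return pow(encrypted_challenge, private_exponent, N)

def db_check(challenge: int, response: int) -> bool:
    """Database: the response must equal the challenge it originally chose."""
    return challenge == response

challenge, encrypted = db_issue_challenge()
genuine = db_check(challenge, user_respond(encrypted, D))
impostor = db_check(challenge, user_respond(encrypted, D + 2))  # wrong private key
```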
This scheme has the added benefit of not storing the secret password
in the database, where it could potentially be seen by system
administrators.
Storing the private key of a user on a computer (even a personal
computer) has the risk that if the computer is compromised, the key
may be revealed to an attacker who can then masquerade as the user.
Smart cards provide a solution to this problem. In a smart card, the
key can be stored on an embedded chip; the operating system of the
smart card guarantees that the key can never be read, but
allows data to be sent to the card for encryption or decryption, using
the private key.