Introduction to SQL
Overview of the SQL Query Language:
   IBM developed original version of SQL (Sequel) language developed as
    part of System R project at the IBM San Jose Research Laboratory in the
    early 1970s.
   Renamed Structured Query Language (SQL)
   The SQL language has several parts:
       o Data Definition Language (DDL): SQL DDL provides commands for
           defining relation schemas, deleting relations, and modifying
           relation schemas.
        o Data Manipulation Language (DML): SQL DML provides the ability
          to query information from the database and to insert tuples into,
          delete tuples from, and modify in the database.
        o Integrity: The SQL DDL includes commands for specifying integrity
          constraints that the data stored in the database must satisfy.
          Updates that violate integrity constraints are disallowed.
        o View definition: The SQL DDL includes commands for defining
          views.
        o Transaction control: SQL includes commands for specifying the
          beginning and ending of transactions.
        o Embedded SQL and dynamic SQL: Embedded and dynamic SQL
          define how SQL statements can be embedded within general
          purpose programming languages, such as C, C++, and Java.
        o Authorization: The SQL DDL includes commands for specifying
          access rights to relations and views.
SQL Data Definition:
The SQL Data Definition Language (DDL) allows the specification of
information about relations, including:
     The schema for each relation.
     The types of values associated with each attribute.
     The Integrity constraints.
     The set of indices to be maintained for each relation.
     Security and authorization information for each relation.
     The physical storage structure of each relation on disk.
  Basic types:
      char(n). Fixed length character string, with user-specified length n.
      varchar(n). Variable length character strings, with user-specified
         maximum length n.
      int. Integer (a finite subset of the integers that is machine-
         dependent).
      numeric(p,d). Fixed point number, with user-specified precision of p
         digits, with n digits to the right of decimal point.
      Each type may include a special value called the null value.
      DDL Commands:
       Create
       Alter
       truncate
       Drop
      DML Commands
       Select
       Insert
       Update
       Delete
      DCL Commands:
       Grant
       Revoke
      TCL Commands:
 Commit
 Rollback
 Savepoint
Basic Schema Definition:
 Create table: In SQL relation is defined using thecreate table
   command.
 The general form of create table command:
   o r is the name of the relation
   o each Ai is an attribute name in the schema of relation r
   o Di is the data type of values in the domain of attribute Ai
Example:
Drop and Truncate:
 Drop table <tablename>:Deletes the table and its contents.
 TRUNCATE table <tablename>:It also deletes the contents of table.
Alter:
The SQL ALTER TABLE command is used to add, delete or modify
columns in an existing table.
 The syntax of an ALTER TABLE command to add a New Column in an
  existing table is:
     o alter table<tablename> add columnname Datatype;
 The syntax of an ALTER TABLE command to DROP COLUMN in an
  existing table is:
     o alter table <tablename> drop column columnname;
 The basic syntax of an ALTER TABLE command to change the DATA
  TYPE of a column in a table is:
     o alter table<tablename> modify columnname datatype;
Basic structure of SQL queries:
     Select clause:
      The SQL select provides the ability to retrieve information.
      Syntax:
        select <attribute-list>
        from <table-list>
        where <condition>;
        or
        select A1, A2, …, An
        from r1, r2, …, rm
        where P;
             o Ai represents an attribute
             o Rirepresents a relation
             o P is a predicate.
             The result of an SQL query is a relation.
      The select clause list the attributes desired in the result of a query
          o corresponds to the projection operation of the relational
             algebra.
      Example: find the names of all instructors:
                select name
                from instructor;
 SQL allows duplicates in relations as well as in query results.
 To force the elimination of duplicates, insert the keyword distinct
  after select.
 Find the names of all departments with instructor, and remove
  duplicates:
                     select distinct dept_name
                     from instructor;
 The keyword all specifies that duplicates not be removed.
                    select alldept_name
                    from instructor;
   An asterisk in the select clause denotes “all attributes”
          select *
          from instructor;
   The select clause can contain arithmetic expressions involving the
    operation, +, –, *, and /, and operating on constants or attributes of
    tuples.
    Example:
                        selectid, name, salary* 1.1
                        from instructor;
     Would return a relation that is the same as the instructor relation,
     except that the value of the attribute salary is multiplied by 1.1.
where clause:
   The where clause specifies conditions that the result must satisfy
        o Corresponds to the selection predicate of the relational algebra.
   To find all instructors in Computer Sciencedepartment with salary >
    70000.
           select name
           from instructor
           where dept_name ='Comp. Sci.'and salary > 70000
    Comparison results can be combined using the logical connectives and,
     or, and not.
    Comparisons can be applied to results of arithmetic expressions.
    The operands of the logical connectives can be expressions involving the
     comparison operators <, <=, >, >=, =, and <>.
Queries on multiple relations:
Retrieve the names of all instructors, along with their department names and
department building name.
To answer the query, each tuple in the instructor relation must be matched
with the tuple in the department relation whose dept_name value matches
the dept_name value of the instructor tuple.
Note that the attribute dept_name occurs in both the relations instructor and
department, and the relation name is used as a prefix (in
instructor.dept_name, and department.dept_name) to make clear to which
attribute we are referring. In contrast, the attributes name and building appear
in only one of the relations, and therefore do not need to be prefixed by the
relation name.
 The meaning of an SQL query can be understood as follows:
Natural join:
The natural join operation operates on two relations and produces a relation
as the result.
Example: For all instructors in the university who have taught some course,
find their names and the course ID of all courses they taught.
The example of the relations instructor and teaches, computing instructor
natural join teaches considers only those pairs of tuples where both the tuple
from instructor and the tuple from teaches have the same value on the
common attribute, ID.
A from clause in an SQL query can have multiple relations combined using
natural join, as shown here:
where each Ei can be a single relation or an expression involving natural joins.
Additional basic operations:
The rename operation:
SQL provides a way of renaming the attributes of a result relation. It uses the
as clause, taking the form:
                             old-name as new-name
For example, if we want the attribute name to be replaced with the name
instructor_name, we can rewrite the query as
It is also possible to renamea relation. One reason to rename a relation is to
replace a long relation name with a shortened version that is more convenient
to use elsewhere in the query.
select name as instructor_name, course_id
frominstructor i, teaches t
wherei.id=t.id;
In the above query, i and t can be thought of as copies of the relation
instructor, but more precisely, they are declared as aliases, that is as
alternative names, for the relation instructor. An identifier, such as i and t, that
is used to rename a relation is referred to as a correlation name in the SQL
standard, but is also commonly referred to as a table alias, or a correlation
variable.
The string operations:
SQL specifies strings by enclosing them in single quotes, for example,
'Computer'. The SQL standard specifies that the equality operation on strings is
case sensitive; as a result the expression 'comp. sci.' = 'Comp. Sci.' evaluates
to false.
Pattern matching can be performed on strings, using the operator like. We
describe patterns by using two special characters:
    Percent (%): The % character matches any substring.
    Underscore ( _): The character matches any character.
Patterns are case sensitive; that is, uppercase characters do not match
lowercase characters, or vice versa. To illustrate pattern matching, we consider
the following examples:
    'Intro%' matches any string beginning with "Intro".
    '%Comp%' matches any string containing "Comp" as a substring, for
     example, 'Intro. to Computer Science', and 'Computational Biology'.
    '_ _ _ ' matches any string of exactly three characters.
    '_ _ _ %'matches any string of at least three characters.
   SQL expresses patterns by using the like comparison operator. Consider the
   query “Find the names of all departments whose building name includes
   the substring ‘Watson’.” This query can be written as:
   selectdept name
   from department
   where building like'%Watson%';
   Attribute specification in select clause:
   The asterisk symbol “ * ” can be used in the select clause to denote “all
   attributes.” Thus, the use of instructor.* in the select clause of the query:
   selectinstructor.*
   from instructor, teaches
   where instructor.ID= teaches.ID;
   indicates that all attributes of instructor are to be selected.
   Ordering the display of tuples:
   The order by clause causes the tuples in the result of a query to appear in
   sorted order.
   Example:
   To list in alphabetic order all instructors in the Physics department, we
   write:
              select name
              from instructor
           wheredept_name = 'Physics'
           order by name;
By default, the order by clause lists items in ascending order. To specify the
sort order, we may specify desc for descending order or asc for ascending
order.
Example: Suppose that we wish to list the entire instructor relation in
descending order of salary. If several instructors have the same salary, we
order them in ascending order by name. We express this query in SQL as
follows:
   select *
   from instructor
   order by salary desc, name asc;
where clause predicates:
SQL includes a between comparison operator to simplify where clauses that
specify that a value be less than or equal to some value and greater than or
equal to some other value.
Example: If we wish to find the names of instructors with salary amounts
between $90,000 and $100,000, we can use the between comparison to
write:
select name
from instructor
where salary between 90000 and 100000;
instead of:
select name
from instructor
where salary >= 90000 andsalary <= 100000;
Example:
Find the instructor names and the courses they taught for all instructors in
the Biology department who have taught some course.”
select name, course_id
from instructor, teaches
where instructor.ID= teaches.ID and dept name = 'Biology';
Set operations:
Assume that the relations obtained as the result of the preceding queries as c1
and c2, respectively.
The Union operation:
To find the set of all courses taught either in Fall 2009 or in Spring 2010, or
both, we write:
The union operation automatically eliminates duplicates, unlike the select
clause.
The Intersect operation:
To find the set of all courses taught in Fall 2009 as well as Spring 2010we
write:
The intersect operation automatically eliminates duplicates.
The Except (minus) operation:
To find the set of all courses taught in the Fall 2009 but notin Spring 2010we
write:
or
     (selectcourse_id
     from section
     where semester='Fall' and year=2009)
minus
(select course_id
from section
where semester='Spring' and year=2010);
Null values:
      It is possible for tuples to have a null value, denoted by null, for some
       of their attributes
      null signifies an unknown value or that a value does not exist.
      The result of any arithmetic expression involving null is null
            o Example: 5 + null returns null
      The predicate is null can be used to check for null values.
            o Example: Find all instructors whose salary is null.
                   select name
                      from instructor
                      where salary is null;
      Any comparison with null returns unknown
            o Example: 5 < null or null <> null or null = null
      Three-valued logic using the truth value unknown:
            o or: (unknownortrue) = true,
                   (unknownorfalse) = unknown
                   (unknown or unknown) = unknown
            o and: (true and unknown) = unknown,
                    (false and unknown) = false,
                    (unknown and unknown) = unknown
            o not: (not unknown) = unknown
            o “P is unknown”evaluates to true if predicate P evaluates to
               unknown
      Result of where clause predicate is treated as false if it evaluates to
       unknown.
Aggregate functions:
Aggregate functions are functions that take a collection (a set or multiset)
of values as input and return a single value. SQL offers five built-in
aggregate functions:
      Average: avg
      Minimum: min
      Maximum: max
      Total: sum
      Count: count
The input to sum and avg must be a collection of numbers, but the other
operators can operate on collections of nonnumeric data types, such as
strings, as well.
Basic aggregation:
Q. Find the average salary of instructors in the Computer Science
department.
The result of this query is a relation with a single attribute, containing a
single tuple with a numerical value corresponding to the average salary of
instructors in the Computer Science department.
Q. Find the total number of instructors who teach a course in the Spring
2010 semester.
selectcount(distinct id) as "No.of Instructors"
from teaches
where semester='Spring' and year=2010;
Q. count the number of tuples in a relation.
The notation for this function in SQL is count (*).
select count(*)
from course;
Aggregation with grouping:
Apply the aggregate function also to a group of sets of tuples; we specify
this wish in SQL using the group by clause. The attribute or attributes given
in the group by clause are used to form groups. Tuples with the same value
on all attributes in the group by clause are placed in one group.
Q. Find the average salary in each department.
Result:
Q. Find the number of instructors in each department who teach a course in
the Spring 2010 semester.
Result:
The Having clause:
    It is useful to state a condition that applies to groups rather than to
     tuples.
    For example, retrieve only those departments where the average
     salary of the instructors is more than $42,000.
Q. For each course section offered in 2009, find the average total credits (tot
cred) of all students enrolled in the section, if the section had at least 2
students.
Nested subqueries:
A subquery is a select-from-where expression that is nested within another
query. This another query is called outer query.
A common use of subqueries is to perform tests for set membership, make
set comparisons, and determine set cardinality, by nesting subqueries in the
where clause.
Set membership:
SQL allows testing tuples for membership in a relation. The in connective
tests for set membership, where the set is a collection of values produced
by a select clause. The not in connective tests for the absence of set
membership.
Q. Find all the courses taught in the both the Fall 2009 and Spring 2010
semesters using the in connective of SQL.
Q. Find all the courses taught in the Fall 2009 semester but not in the Spring
2010 semester.
Q. Find the names of instructors whose names are neither “Mozart” nor
“Einstein”.
Set comparison: The ability of a nested subquery to compare sets.
Q. Find the names of all instructors whose salary is greater than at least one
instructor in the Biology department.
The phrase “greater than at least one” is represented in SQL by > some.
generates the set of all salary values of all instructors in the Biology
department. The > some comparison in the where clause of the outer
select is true if the salary value of the tuple is greater than at least one
member of the set of all salary values for instructors in Biology.
SQL also allows < some, <= some, >= some, = some, and <> some
comparisons.
Q. Find the names of all instructors that have a salary value greater than
that of each instructor in the Biology department.
The construct > all corresponds to the phrase “greater than all.” SQL also
allows < all, <= all, >= all, = all, and <> all comparisons.
Q. Find the departments that have the highest average salary.
Test for empty relations (exists and not exists):
Whenever a condition in the where clause of a nested query references
some attribute of a relation declared in the outer query, the two queries
are said to be correlated.
Nested query is evaluated once for each tuple (or combination of tuples) in
the outer query.
SQL includes a feature for testing whether a subquery has any tuples in its
result. The exists construct returns the value true if the argument subquery
is nonempty.
Q. Find all courses taught in both the Fall 2009 semester and in the Spring
2010 semester.
We can test for the nonexistence of tuples in a subquery by using the not
exists construct.
Q. Find all students who have taken all courses offered in the Biology
department.
select distinct id,name
from student s
where not exists ((select course_id
                      from course
                      where dept_name='Biology')
                      minus
                     (select course_id
                     from takes t
                     where s.id=t.id));
Here, the subquery:
finds the set of all courses offered in the Biology department. The subquery:
finds all the courses that student S.ID has taken. Thus, the outer select takes
each student and tests whether the set of all courses that the student has
taken contains the set of all courses offered in the Biology department.
Modification of the Database:
Deletion: A delete request is expressed in much the same way as a query.
We can delete only whole tuples; we cannot delete values on only particular
attributes. SQL expresses a deletion by
where P represents a predicate and r represents a relation. The delete
statement first finds all tuples t in r for which P(t) is true, and then deletes
them from r. The where clause can be omitted, in which case all tuples in r
are deleted.
The delete request can contain a nested select that references the relation
from which tuples are to be deleted.
The delete statement first tests each tuple in the relation instructor to
check whether the salary is less than the average salary of instructors in the
university. Then, all tuples that fail the test—that is, represent an instructor
with a lower-than-average salary—are deleted.
Insertion: To insert data into a relation, we either specify a tuple to be
inserted or write a query whose result is a set of tuples to be inserted.
Obviously, the attribute values for inserted tuples must be members of the
corresponding attribute’s domain. Similarly, tuples inserted must have the
correct number of attributes.
The simplest insert statement is a request to insert one tuple.
In this example, the values are specified in the order in which the
corresponding attributes are listed in the relation schema.
For the benefit of users who may not remember the order of the attributes,
SQL allows the attributes to be specified as part of the insert statement. For
example, the following SQL insert statements are identical in function to the
preceding one:
Insertion of tuples on the basis of the result of a query:
Updates: Update statement can be used to change a value in a tuple
without changing all values in the tuple.
Suppose that annual salary increases are being made, and salaries of all
instructors are to be increased by 5 percent.
If a salary increase is to be paid only to instructors with salary of less than
$70,000, we can write:
Q. Give a 5 percent salary raise to instructors whose salary is less than
average
All instructors with salary over $100,000 receive a 3 percent raise, whereas
all others receive a 5 percent raise.