KEMBAR78
Database Best Practices for Developers | PDF
0% found this document useful (0 votes)
163 views8 pages

Database Best Practices for Developers

The document provides best practices for database programming to improve performance, maintainability and quality. Some key guidelines include: 1) Write descriptive comments in code for readability. Use column names instead of SELECT * and avoid cursors when possible for better performance. 2) Use derived tables, temporary tables, and table variables sparingly to reduce disk I/O. Prefix table names and use ANSI joins for clarity. 3) Minimize null values and define datatypes consistently to reduce errors and improve maintenance. Access tables in a consistent order within procedures to prevent deadlocks.

Uploaded by

api-3747051
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
163 views8 pages

Database Best Practices for Developers

The document provides best practices for database programming to improve performance, maintainability and quality. Some key guidelines include: 1) Write descriptive comments in code for readability. Use column names instead of SELECT * and avoid cursors when possible for better performance. 2) Use derived tables, temporary tables, and table variables sparingly to reduce disk I/O. Prefix table names and use ANSI joins for clarity. 3) Minimize null values and define datatypes consistently to reduce errors and improve maintenance. Access tables in a consistent order within procedures to prevent deadlocks.

Uploaded by

api-3747051
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 8

Database coding conventions, best practices,

programming guidelines
Databases are the heart and soul of many of the recent enterprise
applications and it is very essential to pay special attention to
database programming. There are many occasions where database
programming is overlooked, thinking that it's something easy and can
be done by anyone. This is wrong. If you don't use database specialists
during your development cycle, database often ends up becoming the
performance bottleneck. Below are some of the database programming
best practices

Here are some of the programming guidelines, best practices, keeping


quality, performance and maintainability, so DBAs and database
developers can benefit.

• Try not to use system tables directly. System table structures


may change in a future release. Wherever possible, use the
sp_help* stored procedures or INFORMATION_SCHEMA views.
There will be situattions where you cannot avoid accessing
system table though!

• Write comments in your stored procedures, triggers and SQL


batches generously, whenever something is not very obvious.
This helps other programmers understand your code clearly.
Don't worry about the length of the comments, as it won't impact
the performance, unlike interpreted languages like ASP 2.0.

• Do not use SELECT * in your queries. Always write the required


column names after the SELECT statement, like SELECT
CustomerID, CustomerFirstName, City. This technique results in
less disk IO and less network traffic and hence better
performance.

• Try to avoid server side cursors as much as possible. Always stick


to 'set based approach' instead of a 'procedural approach' for
accessing/manipulating data. Cursors can be easily avoided by
SELECT statements in many cases. If a cursor is unavoidable, use
a simple WHILE loop instead, to loop through the table. WHILE
loop is faster than a cursor most of the times. But for a WHILE
loop to replace a cursor you need a column (primary key or
unique key) to identify each row uniquely and every table has a
primary or unique key.
• Avoid the creation of temporary tables while processing data, as
much as possible, as creating a temporary table means more
disk IO. Consider advanced SQL or views or table variables of
SQL Server 2000 or derived tables, instead of temporary tables.

• Try to avoid wildcard characters at the beginning of a word while


searching using the LIKE keyword, as that results in an index
scan, which is defeating the purpose of having an index. The
following statement results in an index scan, while the second
statement results in an index seek:
1. SELECT LocationID FROM Locations WHERE Specialities LIKE
'%pples'
2. SELECT LocationID FROM Locations WHERE Specialities LIKE 'A%s'

Also avoid searching with not equals operators (<> and NOT) as
they result in table and index scans.

• Use 'Derived tables' wherever possible, as they perform better.


Consider the following query to find the second highest salary
from Employees table:

SELECT MIN(Salary)
FROM Employees
WHERE EmpID IN
(
SELECT TOP 2 EmpID
FROM Employees
ORDER BY Salary Desc
)

The same query can be re-written using a derived table as shown


below, and it performs twice as fast as the above query:

SELECT MIN(Salary)
FROM
(
SELECT TOP 2 Salary
FROM Employees
ORDER BY Salary Desc
) AS A

This is just an example, the results might differ in different


scenarios depending upon the database design, indexes, volume
of data etc. So, test all the possible ways a query could be
written and go with the efficient one.

• Prefix the table names with owner names, as this improves


readability, avoids any unnecessary confusions. Microsoft SQL
Server Books Online even states that qualifying tables names,
with owner names helps in execution plan reuse.

• Use SET NOCOUNT ON at the beginning of your SQL batches,


stored procedures and triggers in production environments, as
this suppresses messages like '(1 row(s) affected)' after
executing INSERT, UPDATE, DELETE and SELECT statements.
This inturn improves the performance of the stored procedures
by reducing the network traffic.

• Use the more readable ANSI-Standard Join clauses instead of the


old style joins. With ANSI joins the WHERE clause is used only for
filtering data. Where as with older style joins, the WHERE clause
handles both the join condition and filtering data. The first of the
following two queries shows an old style join, while the second
one shows the new ANSI join syntax:
SELECT a.au_id, t.title
FROM titles t, authors a, titleauthor ta
WHERE
a.au_id = ta.au_id AND
ta.title_id = t.title_id AND
t.title LIKE '%Computer%'

SELECT a.au_id, t.title


FROM authors a
INNER JOIN
titleauthor ta
ON
a.au_id = ta.au_id
INNER JOIN
titles t
ON
ta.title_id = t.title_id
WHERE t.title LIKE '%Computer%'

• Do not prefix your stored procedure names with 'sp_'. The prefix
sp_ is reserved for system stored procedure that ship with SQL
Server. Whenever SQL Server encounters a procedure name
starting with sp_,, it first tries to locate the procedure in the
master database, then looks for any qualifiers (database, owner)
provided, then using dbo as the owner. So, you can really save
time in locating the stored procedure by avoiding sp_ prefix. But
there is an exception! While creating general purpose stored
procedures that are called from all your databases, go ahead and
prefix those stored procedure names with sp_ and create them in
the master database.
• Views are generally used to show specific data to specific users
based on their interest. Views are also used to restrict access to
the base tables by granting permission on only views. Yet
another significant use of views is that, they simplify your
queries. Incorporate your frequently required complicated joins
and calculations into a view, so that you don't have to repeat
those joins/calculations in all your queries, instead just select
from the view.

• Use 'User Defined Datatypes', if a particular column repeats in a


lot of your tables, so that the datatype of that column is
consistent across all your tables.

• Use char data type for a column, only when the column is non-
nullable. If a char column is nullable, it is treated as a fixed
length column in SQL Server 7.0+. So, a char(100), when NULL,
will eat up 100 bytes, resulting in space wastage. So, use
varchar(100) in this situation. Of course, variable length
columns do have a very little processing overhead over fixed
length columns. Carefully choose between char and varchar
depending up on the length of the data you are going to store.

• Minimize the usage of NULLs, as they often confuse the front-end


applications, unless the applications are coded intelligently to
eliminate NULLs or convert the NULLs into some other form. Any
expression that deals with NULL results in a NULL output. ISNULL
and COALESCE functions are helpful in dealing with NULL values.
Here's an example that explains the problem:

Consider the following table, Customers which stores the names


of the customers and the middle name can be NULL.
CREATE TABLE Customers
(
FirstName varchar(20),
MiddleName varchar(20),
LastName varchar(20)
)

Now insert a customer into the table whose name is Tony Blair,
without a middle name:
INSERT INTO Customers
(FirstName, MiddleName, LastName)
VALUES ('Tony',NULL,'Blair')

The following SELECT statement returns NULL, instead of the


customer name:
SELECT FirstName + ' ' + MiddleName + ' ' + LastName FROM
Customers

To avoid this problem, use ISNULL as shown below:

SELECT FirstName + ' ' + ISNULL(MiddleName + ' ','') + LastName


FROM Customers

• Always use a column list in your INSERT statements. This helps in


avoiding problems when the table structure changes (like adding
a column). Here's an example which shows the problem.

Consider the following table:

CREATE TABLE EuropeanCountries


(
CountryID int PRIMARY KEY,
CountryName varchar(25)
)

Here's an INSERT statement without a column list , that works


perfectly:
INSERT INTO EuropeanCountries
VALUES (1, 'Ireland')

Now, let's add a new column to this table:

ALTER TABLE EuropeanCountries


ADD EuroSupport bit

Now run the above INSERT statement. You get the following error
from SQL Server:

Server: Msg 213, Level 16, State 4, Line 1


Insert Error: Column name or number of supplied values does not
match table definition.

This problem can be avoided by writing an INSERT statement


with a column list as shown below:
INSERT INTO EuropeanCountries
(CountryID, CountryName)
VALUES (1, 'England')

• Perform all your referential integrity checks, data validations


using constraints (foreign key and check constraints). These
constraints are faster than triggers. So, use triggers only for
auditing, custom tasks and validations that can not be performed
using these constraints. These constraints save you time as well,
as you don't have to write code for these validations and the
RDBMS will do all the work for you.

• Always access tables in the same order in all your stored


procedures/triggers consistently. This helps in avoiding
deadlocks. Other things to keep in mind to avoid deadlocks are:
Keep your transactions as short as possible. Touch as less data as
possible during a transaction. Never, ever wait for user input in
the middle of a transaction. Do not use higher level locking hints
or restrictive isolation levels unless they are absolutely needed.
Make your front-end applications deadlock-intelligent, that is,
these applications should be able to resubmit the transaction
incase the previous transaction fails with error 1205. In your
applications, process all the results returned by SQL Server
immediately, so that the locks on the processed rows are
released, hence no blocking.

• Consider adding a @Debug parameter to your stored procedures.


This can be of bit data type. When a 1 is passed for this
parameter, print all the intermediate results, variable contents
using SELECT or PRINT statements and when 0 is passed do not
print debug information. This helps in quick debugging of stored
procedures, as you don't have to add and remove these
PRINT/SELECT statements before and after troubleshooting
problems.

• Do not call functions repeatedly within your stored procedures,


triggers, functions and batches. For example, you might need the
length of a string variable in many places of your procedure, but
don't call the LEN function whenever it's needed, instead, call the
LEN function once, and store the result in a variable, for later
use. Also if you require to use the return current system date
thro’ GETDATE() function in the program multiple times then
declare a variable in the beginning and use that variable instead
wherever required.

• Make sure your stored procedures always return a value


indicating the status. Standardize on the return values of stored
procedures for success and failures. The RETURN statement is
meant for returning the execution status only, but not data. If
you need to return data, use OUTPUT parameters.

• If your stored procedure always returns a single row resultset,


consider returning the resultset using OUTPUT parameters instead
of a SELECT statement, as ADO handles output parameters faster
than resultsets returned by SELECT statements.

• Always check the global variable @@ERROR immediately after


executing a data manipulation statement (like
INSERT/UPDATE/DELETE), so that you can rollback the transaction
in case of an error (@@ERROR will be greater than 0 in case of an
error). This is important, because, by default, SQL Server will not
rollback all the previous changes within a transaction if a
particular statement fails. The @@ROWCOUNT variable also plays an
important role in determining how many rows were affected by a
previous data manipulation (also, retrieval) statement, and based
on that you could choose to commit or rollback a particular
transaction.

• To make SQL Statements more readable, start each clause on a


new line and indent when needed. Following is an example:
SELECT title_id, title
FROM titles
WHERE title LIKE 'Computing%' AND
title LIKE 'Gardening%'

• Always store 4 digit years in dates (especially, when using char


or int datatype columns), instead of 2 digit years to avoid any
confusion and problems. This is not a problem with datetime
columns, as the century is stored even if you specify a 2 digit
year. But it's always a good practice to specify 4 digit years even
with datetime datatype columns.

• In your queries and other SQL statements, always represent date


in yyyy/mm/dd format. This format will always be interpreted
correctly, no matter what the default date format on the SQL
Server is. This also prevents the following error, while working
with dates:

Server: Msg 242, Level 16, State 3, Line 2


The conversion of a char data type to a datetime data
type resulted in an out-of-range datetime value.

• Always be consistent with the usage of case in your code. On a


case insensitive server, your code might work fine, but it will fail
on a case sensitive SQL Server if your code is not consistent in
case. For example, if you create a table in SQL Server or
database that has a case-sensitive or binary sort order, all
references to the table must use the same case that was
specified in the CREATE TABLE statement. If you name the table
as 'MyTable' in the CREATE TABLE statement and use 'mytable' in
the SELECT statement, you get an 'object not found' or 'invalid
object name' error.

• Though T-SQL has no concept of constants (like the ones in C


language), variables will serve the same purpose. Using variables
instead of constant values within your SQL statements, improves
readability and maintainability of your code. Consider the
following example:
UPDATE dbo.Orders
SET OrderStatus = 5
WHERE OrdDate < GETDATE(()

The same update statement can be re-written in a more readable


form as shown below and will be very useful when GETDATE()
has to be used multiple times in your program:

DECLARE @ORDER_PENDING int


DECLARE @ORDER_DATE DATETIME

SET @ORDER_DATE = GETDATE()


SET @ORDER_PENDING =5

UPDATE dbo.Orders
SET OrderStatus = @ORDER_PENDING
WHERE OrdDate < @ORDER_DATE

• Do not use the column numbers in the ORDER BY clause as it


impairs the readability of the SQL statement. Further, changing
the order of columns in the SELECT list has no impact on the
ORDER BY when the columns are referred by names instead of
numbers. Consider the following example, in which the second
query is more readable than the first one:
SELECT OrderID, OrderDate
FROM Orders
ORDER BY 2

SELECT OrderID, OrderDate


FROM Orders
ORDER BY OrderDate

You might also like