DATA FLOW DIAGRAMS
1.5.1 Definition
1.5.2 Advantages and Disadvantages
1.5.3 Symbols
1.5.4 Guidelines for drawing DFDs
1.5.5 DFD Levels and examples
1.5.1 Definition
• A data flow diagram (DFD) is a graphical representation
of the "flow" of data through an information system,
modeling its process aspects. A DFD is often used as a
preliminary step to create an overview of the system,
which can later be elaborated. DFDs can also be used
for the visualization of data processing (structured
design).
• A DFD shows what kind of information will be input
to and output from the system, where the data will
come from and go to, and where the data will be
stored. It does not show information about the
timing of processes, or information about whether
processes will operate in sequence or in parallel
(which is shown on a flowchart).
• Data flow diagrams were proposed by Larry Constantine, the original
developer of structured design, based on Martin and Estrin's "Data Flow
Graph" model of computation.
• There are different notations for drawing data flow diagrams (Yourdon & Coad
and Gane & Sarson), defining different visual representations for processes,
data stores, data flows, and external entities.
• A physical DFD shows how the system is actually implemented, either at the
moment (Current Physical DFD), or how the designer intends it to be in the
future (Required Physical DFD). Thus, a Physical DFD may be used to
describe the set of data items that appear on each piece of paper that
moves around an office, and the fact that a particular set of pieces of paper
is stored together in a filing cabinet. It is quite possible that a Physical DFD
will include references to data that are duplicated, or redundant, and that
the data stores, if implemented as a set of database tables, would
constitute an un-normalized (or de-normalized) relational database. In
contrast, a Logical DFD attempts to capture the data flow aspects of a
system in a form that has neither redundancy nor duplication.
1.5.2 ADVANTAGES and DISADVANTAGES
ADVANTAGES
· A simple graphical technique that is easy to understand.
· It helps in defining the boundaries of the system.
· It is useful for communicating current system knowledge to the users.
· It is used as part of the system documentation file.
· It explains the logic behind the data flow within the system.
· DFDs can provide a detailed representation of system components.
DISADVANTAGES
· A data flow diagram undergoes a lot of alteration before going to users,
which makes the process somewhat slow.
· Physical considerations are left out.
· It can leave programmers somewhat confused about the system.
· Different DFD notations use different symbols: in Gane and Sarson a process
is drawn as a rounded rectangle, whereas in DeMarco and Yourdon it is
drawn as a circle or ellipse.
1.5.3 Symbols
Entities
· can be people, departments, other
companies, other systems…
· are called sources if they are external to the
system and provide data to the system, and
sinks if they are external to the system and
receive information from the system
Processes
· must have at least one input and at least one output
· at the primitive level (see below), are labeled with
verb + object (e.g. “print invoice” or “add
customer”); in the hierarchy below, none of the
processes are primitive
· at the non-primitive level, are labeled more
generally (e.g. “customer maintenance” or
“warehouse reports”)
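The rule that every process needs at least one input and one output can be checked mechanically. The following is a minimal sketch, assuming a DFD is recorded as a list of (source, target, label) flows; the `Flow` type, the function name, and the sample diagram are hypothetical and only illustrate the rule.

```python
from collections import namedtuple

# Hypothetical representation: one record per arrow on the diagram.
Flow = namedtuple("Flow", ["source", "target", "label"])

def processes_missing_io(processes, flows):
    """Return {process: [problems]} for processes lacking an input or an output."""
    problems = {}
    for p in processes:
        issues = []
        if not any(f.target == p for f in flows):
            issues.append("no input")
        if not any(f.source == p for f in flows):
            issues.append("no output")
        if issues:
            problems[p] = issues
    return problems

# Illustrative diagram: "Print invoice" reads data but produces nothing.
flows = [
    Flow("Customer", "Add customer", "customer details"),
    Flow("Add customer", "customer", "new customer record"),
    Flow("customer", "Print invoice", "customer record"),
]
print(processes_missing_io(["Add customer", "Print invoice"], flows))
```

In this sample, "Print invoice" is reported because no flow leaves it.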
Data Stores
· can be online or “hard copy” (see notes on logical vs.
physical DFDs below)
· are labeled with a noun (e.g. the label “customer”
indicates that information about customers is kept in
that data store)
· data is stored whenever more than one process
needs it and these processes don’t always
run one after the other (if the data is ever needed in
the future it must be stored)
Data flow
· must originate from and/or lead to a process (this means
that entities and data stores cannot communicate with
anything except processes; remember that it takes a
process to make the data flow)
· can go from process to process, but that does imply that no
data is stored at that point
· can have one arrowhead indicating the direction in which
the data is flowing
· can have two arrowheads when a process is altering
(updating) existing records in a data store.
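The first rule above (a data flow must begin or end at a process) lends itself to a simple check. This is a sketch only, assuming flows are stored as (source, target, label) tuples; the function name and the sample names are hypothetical.

```python
def illegal_flows(flows, processes):
    """Return the flows in which neither endpoint is a process."""
    procs = set(processes)
    return [
        (source, target, label)
        for (source, target, label) in flows
        if source not in procs and target not in procs
    ]

processes = ["Process order"]
flows = [
    ("Customer", "Process order", "order"),       # entity -> process: allowed
    ("Process order", "orders", "order record"),  # process -> data store: allowed
    ("Customer", "orders", "order"),              # entity -> data store: not allowed
]
print(illegal_flows(flows, processes))
```

Only the entity-to-data-store flow is returned, since it bypasses every process.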
1.5.4 Guidelines for Drawing Dataflow Diagrams
Naming conventions:
Processes: strong verbs
dataflows: nouns
datastores: nouns
external entities: nouns
• No more than 7 - 9 processes in each DFD.
• Dataflows must begin, end, or both begin & end with a process.
• Dataflows must not be split.
• A process is not an analog of a decision in a systems or programming flowchart. Hence, a
dataflow should not be a control signal. Control signals are modeled separately as
controlflows.
• Loops are not allowed.
• A dataflow can not be an input signal. If such a signal is necessary, then it must be a part of
the description of the process, and such process must be so labeled. Input signals as well as
their effect on the behavior of the system are incorporated in the behavioral model (say, state
transition graphs) of the information system.
• Decisions and iterative controls are part of process description rather than
dataflows.
• If an external entity appears more than once on the same DFD, then a diagonal
line is added to the north-west corner of the rectangle (representing such entity).
• Updates to datastores are represented in the textbook as double-ended arrows.
This is not, however, a universal convention. I would rather you did not use this
convention since it can be confusing. Writing to a datastore implies that you have
read such datastore (you can not write without reading). Therefore, datastore
updates should be denoted by a single-ended arrow from the updating process to
the updated datastore.
• Dataflows that carry a whole record between a datastore and a process are not
labeled in the textbook since there is no ambiguity. This is also not a universal
convention. I would rather you labeled such dataflows explicitly.
Conservation Principles:
Datastores & Dataflows: Datastores can not create (or
destroy) any data. What comes out of a datastore therefore
must first have got into a datastore through a process.
Processes: Processes can not create data out of thin air.
Processes can only manipulate data they have received from
dataflows. Data outflows from processes therefore must be
derivable from the data inflows into such processes.
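The datastore half of the conservation principle can be approximated by a check: any datastore that is read must also be written by some process. The sketch below assumes flows are (source, target, label) tuples and looks at a single diagram only; in a levelled set the write may appear on another diagram, so a hit is a prompt to look further, not proof of an error. All names are hypothetical.

```python
def read_but_never_written(flows, stores):
    """Return datastores that supply data but never receive any."""
    stores = set(stores)
    written = {target for (_, target, _) in flows if target in stores}
    read = {source for (source, _, _) in flows if source in stores}
    return sorted(read - written)

flows = [
    ("Add customer", "customer", "customer record"),
    ("customer", "Print invoice", "customer record"),
    ("product", "Print invoice", "price"),  # nothing ever writes to "product"
]
print(read_but_never_written(flows, ["customer", "product"]))
```

Whether a process's outputs are derivable from its inputs cannot be checked this way; that part still needs a reading of the process description.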
Levelling Conventions:
Numbering: The system under study in the context diagram is given the number
0. The processes in the top-level DFD are labelled consecutively by natural
numbers beginning with 1. When a process is exploded in a lower-level DFD,
the processes in that lower-level DFD are numbered consecutively, each prefixed
by the label of the parent process followed by a period or full stop (for
example 1.2, 1.2.3, etc.).
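The numbering convention can be stated as a small function. This is an illustrative sketch; the function name `child_numbers` is hypothetical, and it assumes the context-diagram process is labelled "0" as described above.

```python
def child_numbers(parent, count):
    """Labels for `count` processes in the DFD that explodes process `parent`."""
    if parent == "0":
        # Exploding the context diagram gives the top-level processes 1, 2, 3, ...
        return [str(i) for i in range(1, count + 1)]
    # Any other parent contributes its own label as a prefix.
    return [f"{parent}.{i}" for i in range(1, count + 1)]

print(child_numbers("0", 3))    # ['1', '2', '3']
print(child_numbers("1.2", 2))  # ['1.2.1', '1.2.2']
```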
Balancing: The set of DFDs pertaining to a system must be balanced in the
sense that corresponding to each dataflow beginning or ending at a process
there should be an identical dataflow in the exploded DFD.
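Balancing can be checked by comparing the flows that cross the boundary of a parent process with those that cross the boundary of its exploded DFD. The following is a minimal sketch under the assumption that flows are (source, target, label) tuples and that matching is done on the flow label and direction; the function names and the sample diagrams are hypothetical.

```python
def boundary_flows(inside_nodes, flows):
    """Flows crossing the boundary of a group of nodes, as ('in' | 'out', label)."""
    inside = set(inside_nodes)
    crossing = set()
    for source, target, label in flows:
        if source in inside and target not in inside:
            crossing.add(("out", label))
        elif target in inside and source not in inside:
            crossing.add(("in", label))
    return crossing

def check_balance(parent_id, parent_flows, child_nodes, child_flows):
    """Return (balanced, missing_in_child, extra_in_child)."""
    parent = boundary_flows([parent_id], parent_flows)
    child = boundary_flows(child_nodes, child_flows)
    return parent == child, parent - child, child - parent

# Parent diagram: process 1 receives "order", emits "invoice" and "order details".
parent_flows = [
    ("Customer", "1", "order"),
    ("1", "Customer", "invoice"),
    ("1", "2", "order details"),
]
# Exploded DFD of process 1: the "order details" output has been forgotten.
child_flows = [
    ("Customer", "1.1", "order"),
    ("1.1", "1.2", "valid order"),
    ("1.2", "Customer", "invoice"),
]
balanced, missing, extra = check_balance("1", parent_flows, ["1.1", "1.2"], child_flows)
print(balanced, missing, extra)
```

Datastores that are local to the lower-level DFD would be passed in `child_nodes` along with the child processes, so that flows to and from them are not counted as crossing the boundary.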
Datastores: Datastores may be local to a specific level in the set of DFDs. A
datastore is used only if it is referenced by more than one process.
External entities: Lower level DFDs can not introduce new external
entities. The context diagram must therefore show all external entities
with which the system under study interacts. In order not to clutter higher
level DFDs, detailed interactions of processes with external entities are
often shown in lower level DFDs but not in the higher level ones. In this
case, there will be some dataflows at lower level DFDs that do not appear
in the higher level DFDs. In order to facilitate unambiguous balancing of
DFDs, such dataflows are crossed out to indicate that they are not to be
considered in balancing. This convention of crossing is quite popular, but
this text does not follow it. I would rather you followed this convention.
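Both levelling rules on this slide can be expressed as simple checks. The sketch below is illustrative only: it assumes flows are (source, target, label) tuples, and the function names and sample data are hypothetical.

```python
def new_external_entities(context_entities, lower_level_entities):
    """Entities on a lower-level DFD that are absent from the context diagram."""
    return sorted(set(lower_level_entities) - set(context_entities))

def stores_with_single_user(flows, stores, processes):
    """Datastores referenced by fewer than two distinct processes."""
    processes = set(processes)
    users = {store: set() for store in stores}
    for source, target, _ in flows:
        if source in users and target in processes:
            users[source].add(target)
        if target in users and source in processes:
            users[target].add(source)
    return sorted(store for store, procs in users.items() if len(procs) < 2)

print(new_external_entities(["Customer", "Supplier"], ["Customer", "Bank"]))

flows = [
    ("1.1", "orders", "order record"),
    ("orders", "1.2", "order record"),
    ("1.2", "log", "audit entry"),  # "log" is touched by one process only
]
print(stores_with_single_user(flows, ["orders", "log"], ["1.1", "1.2"]))
```

Here "Bank" is flagged as an entity that the context diagram does not show, and "log" as a datastore used by a single process.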
1.5.5 DFD LEVELS and SAMPLES
DFD LEVELS
• A DFD may be used at any level of data
abstraction. DFDs can be partitioned into levels. Each
level shows more detail about information flow and
data functions than the previous level.
CONTEXT DIAGRAM
Some important points are:
① One bubble (process) represents the entire
system.
② Data arrows show input and output.
③ Data stores are NOT shown. They are within the
system.
LEVEL 0 DFD
Some important points are:
① Level 0 DFD must balance with the context
diagram it describes.
② Inputs going into a process are different from
the outputs leaving the process.
③ Data stores are first shown at this level.
LEVEL 1 DFD
Some important points are:
① Level 1 DFD must balance with the Level 0 it
describes.
② Inputs going into a process are different from
the outputs leaving the process.
③ Continue to show data stores.
EXAMPLES of CONTEXT DIAGRAMS
EXAMPLES of LEVEL 0 DFDS
EXAMPLES of LEVEL 1 DFDS