Key terms
Data integrity - the accuracy, completeness and consistency of data.
Validation - method used to ensure entered data is reasonable and meets certain input criteria.
Verification - method used to ensure data is correct by using double entry or visual checks.
Check digit - additional digit appended to a number to check if entered data is error free.
Modulo-11 - method used to calculate a check digit based on modulus division by 11.
Checksum - verification method used to check if data transferred has been altered or corrupted, calculated from the block of data to be sent.
Parity check - method used to check if data has been transferred correctly that uses even or odd parity.
Parity bit - an extra bit found at the end of a byte that is set to 1 if the parity of the byte needs to change to agree with sender/receiver parity protocol.
Odd parity - binary number with an odd number of 1-bits.
Even parity - binary number with an even number of 1-bits.
Parity block - horizontal and vertical parity check on a block of data being transferred.
6.2 Data integrity
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you read the second part of this chapter.
1. Look at the following validation screen from a spreadsheet.
Why is it important to have validation in applications such as
spreadsheets?
[Figure: spreadsheet Format Cells dialog, Number tab, with the Date category selected, a list of date formats and the locale set to English (United Kingdom)]
2. Why is proofreading not the same as verification?
3. Discuss one way online form designers can ensure that only certain
data can be input by a user. Use the date: 12 March 2019 as the
example.
Data stored on a computer should always be accurate, consistent and up to date.
Two of the methods used to ensure data integrity are validation and verification.
The accuracy (integrity) of data can be compromised
» during the data entry and data transmission stages
» by malicious attacks on the data, for example caused by malware and hacking
» by accidental data loss caused through hardware issues.
These risks - together with ways of mitigating them - are discussed in the rest
of this chapter.
6.2.1 Validation
Validation is a method of checking if entered data is reasonable (and within
given criteria), but it cannot check if data is correct or accurate. For example,
if somebody accidentally enters their age as 62 instead of 26, it is reasonable
but not accurate or correct. Validation is carried out by computer software; the
most common types are shown in Table 6.1.
6 SECURITY, PRIVACY AND DATA INTEGRITY
Key terms
Parity byte - additional byte sent with transmitted data to enable vertical parity checking (as well as horizontal parity checking) to be carried out.
Automatic repeat request (ARQ) - a type of verification check.
Acknowledgement - message sent to the sender to indicate that data has been received without error.
Timeout - time allowed to elapse before an acknowledgement is received.
Validation test | Description | Example of data failing validation test | Example of data passing validation test
type | checks whether non-numeric data has been input into a numeric-only field | typing sk.34 in a field which should contain the price of an item | typing 34.50 in a field which should contain the price of an item
range | checks whether data entered is between a lower and an upper limit | typing in somebody's age as -120 | typing in somebody's age as 48
format | checks whether data has been entered in the agreed format | typing in the date as 12-12-20 where the format is dd/mm/yyyy | typing in the date as 12/12/2020 where the format is dd/mm/yyyy
length | checks whether data has the required number of characters or numbers | typing in a telephone number as 012 345 678 when it should contain 11 digits | typing in a telephone number as 012 345 678 90 when it should contain 11 digits
presence | checks to make sure a field is not left empty when it should contain data | please enter passport number: (left blank) | please enter passport number: AB 1234567 CD
existence | checks if data in a file or a file name actually exists | data look up for car registration plate A123 BCD which does not exist | data look up for a file called books_in_stock which exists in a database
limit check | checks only one of the limits (such as the upper limit OR the lower limit) | typing in age as -25 where the data entered should not be negative | typing in somebody's age as 72 where the upper limit is 140
consistency check | checks whether data in two or more fields match up correctly | typing in Mr in the title field and then choosing female in the sex field | typing in Ms in the title field and then choosing female in the sex field
uniqueness check | checks that each entered value is unique | choosing the user name MAXIMUS222 in a social networking site but the username already exists | choosing the website name Aristooo.com which is not already used
A Table 6.1 Common vali
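A few of the checks in Table 6.1 can be sketched in Python. This is only an illustration: the function names, the age limits of 0 and 140 and the date pattern are assumptions chosen to match the examples in the table, not part of any particular application.

```python
import re
from datetime import datetime


def type_check(value: str) -> bool:
    """Type check: accept only input that can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def range_check(value: int, lower: int = 0, upper: int = 140) -> bool:
    """Range check: value must lie between the limits (0 and 140 are assumed)."""
    return lower <= value <= upper


def format_check(value: str) -> bool:
    """Format check: value must look like dd/mm/yyyy and be a real date."""
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", value):
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False


def length_check(value: str, required_digits: int = 11) -> bool:
    """Length check: count the digits, ignoring spaces."""
    digits = value.replace(" ", "")
    return digits.isdigit() and len(digits) == required_digits


def presence_check(value: str) -> bool:
    """Presence check: the field must not be left empty."""
    return value.strip() != ""


print(type_check("sk.34"), type_check("34.50"))                 # False True
print(range_check(-120), range_check(48))                       # False True
print(format_check("12-12-20"), format_check("12/12/2020"))     # False True
print(length_check("012 345 678"), length_check("012 345 678 90"))  # False True
print(presence_check(""), presence_check("AB 1234567 CD"))      # False True
```

Note that every one of these checks would accept an age of 62 typed in place of 26: the value is reasonable, so validation alone cannot show that it is wrong.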
6.2.2 Verification
Verification is a way of preventing errors when data is entered manually (using a
keyboard, for example) or when data is transferred from one computer to another.
Verification during data entry
When data is manually entered into a computer it needs to undergo verification
to ensure there are no errors. There are three ways of doing this: double entry,
visual check and check digits.
Figure 6.6 Barcode
Double entry
Data is entered twice, by two different people, and the two versions are then
compared (either after data entry or during the data entry process).
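As a minimal sketch, double entry comes down to comparing the two keyed-in versions and rejecting the data if they differ. The function name and the sample values are hypothetical.

```python
def double_entry_matches(first_entry: str, second_entry: str) -> bool:
    """Return True only if both keyed-in versions are identical."""
    return first_entry == second_entry


# Hypothetical example: two operators key in the same reference number.
print(double_entry_matches("8180", "8180"))  # True
print(double_entry_matches("8180", "8108"))  # False
```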
Visual check
Entered data is compared with the original document (in other words, what is
on the screen is compared to the data on the original paper documents).
Check digits
The check digit is an additional digit added to a number (usually in the rightmost
position). Check digits are often used in barcodes, ISBNs (found on the cover of a
book) and VINs (vehicle identification numbers). The check digit can be used to
ensure the barcode, for example, has been correctly inputted. The check digit
can catch errors including
» an incorrect digit being entered (such as 8190 instead of 8180)
» a transposition error where two numbers have been swapped (such as 8108
instead of 8180)
» digits being omitted or added (such as 818 or 81180 instead of 8180)
» phonetic errors such as 13 (thirteen) instead of 30 (thirty).
Figure 6.6 shows a barcode with an ISBN-13 code with check digit.
An example of a check digit calculation is modulo-11. The following algorithm
is used to generate the check digit for a number with seven digits:
1 Each digit in the number is given a weighting of 7, 6, 5, 4, 3, 2 or 1, starting
from the left.
2 The digit is multiplied by its weighting and then each value is added to
make a total.
3 The total is divided by 11 and the remainder subtracted from 11.
4 The check digit is the value generated; note if the check digit is 10 then
X is used.
For example:
Seven-digit number: 4 1 5 6 7 1 0
Weighting values: 7 6 5 4 3 2 1
Sum: (7 × 4) + (6 × 1) + (5 × 5) + (4 × 6) + (3 × 7) + (2 × 1) + (1 × 0)
= 28 + 6 + 25 + 24 + 21 + 2 + 0
Total: = 106
Divide total by 11: 9 remainder 7
Subtract remainder from 11: 11 - 7 = 4 (check digit)
Final number: 41567104
When this number is entered, the check digit is recalculated and, if the same
value is not generated, an error has occurred. For example, if 41576104
was entered, the check digit generated would be 3, indicating an error.
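The modulo-11 steps can be sketched in Python as follows. The function names are illustrative, and the handling of a remainder of 0 (which would give 11 - 0 = 11) is an assumption, since the algorithm above does not cover that case.

```python
def modulo11_check_digit(number: str) -> str:
    """Return the modulo-11 check digit for a string of digits.

    Weights count down to 1 from the left (7, 6, 5, 4, 3, 2, 1 for seven digits).
    """
    weights = range(len(number), 0, -1)
    total = sum(w * int(d) for w, d in zip(weights, number))
    check = 11 - (total % 11)
    if check == 11:
        # Assumption: the text does not say what to do here; 0 is used.
        return "0"
    return "X" if check == 10 else str(check)


def is_valid(full_number: str) -> bool:
    """Recalculate the check digit and compare it with the one entered."""
    return modulo11_check_digit(full_number[:-1]) == full_number[-1]


print(modulo11_check_digit("4156710"))  # 4
print(is_valid("41567104"))             # True
print(is_valid("41576104"))             # False (the recalculated digit is 3)
```

The last call shows the transposition error from the worked example being caught: swapping the 6 and the 7 changes the weighted total from 106 to 107, so the recalculated digit no longer matches.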
EXTENSION ACTIVITY 6D
1 Find out how the ISBN-13 method works and confirm that the number
978 034 098 382 has a check digit of 9.
2 Find the check digits for the following numbers using both modulo-11 and
ISBN-13.
a) 213 111 000 428
b) 909 812 123 544
3 Find a common use for the modulo-11 method of generating check digits.
Verification during data transfer
When data is transferred electronically from one device to another, there is
always the possibility of data corruption or even data loss. A number of ways
exist to minimise this risk.
Checksums
A checksum is a method to check if data has been changed or corrupted
following data transmission. Data is sent in blocks and an additional value, the
checksum, is sent at the end of the block of data.
To explain how this works, we will assume the checksum of a block of data is
1 byte in length. This gives a maximum value of 2^8 - 1 (= 255). The value
0000 0000 is ignored in this calculation. The following explains how a
checksum is generated.
If the sum of all the bytes in the transmitted block of data is ≤ 255, then the
checksum is this value. However, if the sum of all the bytes in the data block
is > 255, then the checksum is found using the following simple algorithm.
In the example we will assume the value of X is 1185.
1 Divide the sum, X, of the bytes by 256: (X = 1185) 1185/256 = 4.629
2 Round the answer down to the nearest whole number, Y: rounding down gives Y = 4
3 Multiply Y by 256 to give Z: Z = 4 × 256 = 1024
4 Calculate the difference (X - Z): (1185 - 1024) = 161
5 The value is the checksum: this gives the checksum 161
Figure 6.7
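The steps in Figure 6.7 can be sketched in Python. The function names and the sample block (five bytes chosen so that they sum to 1185) are assumptions made for illustration.

```python
def checksum(block: bytes) -> int:
    """One-byte checksum following the steps in Figure 6.7."""
    x = sum(block)      # X: sum of all the bytes in the block
    if x <= 255:
        return x        # the sum already fits in one byte
    y = x // 256        # Y: X / 256 rounded down
    z = y * 256         # Z: Y x 256
    return x - z        # checksum = X - Z


def received_ok(block: bytes, sent_checksum: int) -> bool:
    """Receiver side: recalculate the checksum and compare with the one sent."""
    return checksum(block) == sent_checksum


block = bytes([255, 255, 255, 255, 165])    # hypothetical block, bytes sum to 1185
sent = checksum(block)
print(sent)                                  # 161
print(received_ok(block, sent))              # True
corrupted = bytes([255, 255, 255, 254, 165]) # one byte altered in transit
print(received_ok(corrupted, sent))          # False, so a re-send is requested
```

The result of X - Z is the same as the remainder when X is divided by 256, so the whole calculation could also be written as `sum(block) % 256`.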
When a block of data is about to be transmitted, the checksum for the bytes
is first calculated. This value is transmitted with the block of data. At the
receiving end, the checksum is re-calculated from the block of data received.
This calculated value is compared to the checksum transmitted. If they are the
same, then the data was transmitted without any errors; if they are different,
then a request is sent for the data to be re-transmitted.
Parity checks
A parity check is another method to check whether data has been changed or
corrupted following transmission from one device or medium to another.
A byte of data, for example, is allocated a parity bit. This is allocated before
transmission. Systems that use even parity have an even number of 1-bits;
systems that use odd parity have an odd number of 1-bits.
Consider the following byte:
parity bit | 1 | 1 | 0 | 1 | 1 | 0 | 0
Figure 6.8
If this byte is using even parity, then the parity bit needs to be 0 since there is
already an even number of 1-bits (in this case, four).
If odd parity is being used, then the parity bit needs to be 1 to make the
number of 1-bits odd. Therefore, the byte just before transmission would be:
either (even parity): 0 1 1 0 1 1 0 0 (the leftmost bit is the parity bit)
or (odd parity): 1 1 1 0 1 1 0 0 (the leftmost bit is the parity bit)
Figure 6.9
Before data is transferred, an agreement is made between sender and receiver
regarding which of the two types of parity are used. This is an example of a
protocol.
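Setting the parity bit can be sketched in Python as follows. The function names are illustrative, and the parity bit is placed in the leftmost position to match the figures in this section.

```python
def parity_bit(data_bits: str, even: bool = True) -> int:
    """Return the parity bit for a string of 0s and 1s."""
    ones = data_bits.count("1")
    if even:
        return ones % 2        # 1 only if the count of 1-bits is odd
    return (ones + 1) % 2      # 1 only if the count of 1-bits is even


def add_parity(data_bits: str, even: bool = True) -> str:
    """Prefix the parity bit to the seven data bits."""
    return str(parity_bit(data_bits, even)) + data_bits


print(add_parity("1101100", even=True))   # 01101100
print(add_parity("1101100", even=False))  # 11101100
```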
EXTENSION ACTIVITY 6E
Find the parity bits for each of the following bytes:
a) 1101101 even parity used
b) 0001111 even parity used
c) 0111000 even parity used
d) 1110100 odd parity used
e) 1011011 odd parity used
If a byte has been transmitted from 'A' to 'B', and even parity is used, an error
would be flagged if the byte now had an odd number of 1-bits at the receiver's
end.
For example:
Sender's byte: 0 1 0 1 1 1 0 0 (the leftmost bit is the parity bit)
Receiver's byte: 0 1 0 0 1 1 0 0 (the leftmost bit is the parity bit)
Figure 6.10
In this case, the receiver's byte has three 1-bits, which means it now has odd
parity, while the byte from the sender had even parity (four 1-bits). This means
an error has occurred during the transmission of the data.
The error is detected by the computer re-calculating the parity of the byte
sent. If even parity has been agreed between sender and receiver, then a
change of parity in the received byte indicates that a transmission error has
occurred.
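The receiver's check can be sketched in Python by counting the 1-bits in the whole byte, parity bit included. The function name is illustrative; the two bytes are the sender's and receiver's bytes from Figure 6.10.

```python
def has_parity_error(byte_bits: str, even: bool = True) -> bool:
    """Return True if the received byte breaks the agreed parity."""
    ones_are_even = byte_bits.count("1") % 2 == 0
    return ones_are_even != even


print(has_parity_error("01011100"))  # False: sender's byte, four 1-bits
print(has_parity_error("01001100"))  # True: receiver's byte, three 1-bits
```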
EXTENSION ACTIVITY 6F
1 Which of the following bytes have an error following data transmission?
a) 11101101 even parity used
b) 10001111 even parity used
c) 00111000 even parity used
d) 11110100 odd parity used
e) 11011011 odd parity used
2 In each case where an error occurs, can you work out which bit is
incorrect?
Naturally, any of the bits in the above example could have been changed
leading to a transmission error. Therefore, even though an error has been
flagged, it is impossible to know exactly which bit is in error.
One of the ways around this problem is to use parity blocks. In this method,
a block of data is sent and the number of 1-bits is totalled horizontally and
vertically (in other words, a parity check is done in both horizontal and vertical
directions). As the following example shows, this method not only identifies
that an error has occurred but also indicates where the error is.
In this example, nine bytes of data have been transmitted. Agreement has
been made that even parity will be used. Another byte, known as the parity
byte, has also been sent. This byte consists entirely of the parity bits
produced by the vertical parity check. The parity byte also indicates the end
of the block of data.
Table 6.2 shows how the data arrived at the receiving end:
byte | parity bit | bit 2 | bit 3 | bit 4 | bit 5 | bit 6 | bit 7 | bit 8
byte 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0
byte 2 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1
byte 3 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0
byte 4 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
byte 5 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1
byte 6 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
byte 7 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1
byte 8 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0
byte 9 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0
parity byte | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1
Table 6.2
A careful study of the table shows that
» byte 8 (row 8) has incorrect parity (there are three 1-bits)
» bit 5 (column 5) also has incorrect parity (there are five 1-bits).
First, the table shows that an error has occurred following data transmission.
Second, at the intersection of row 8 and column 5, the position of the incorrect
bit value (which caused the error) can be found. This means that byte 8 should
have been:
0 0 0 1 0 0 1 0
which would also correct column 5 giving an even vertical parity (now has four
1-bits).
This byte could, therefore, be corrected automatically, as shown above, or an
error message could be relayed back to the sender asking them to re-transmit
the block of data. One final point: if two of the bits change value following
data transmission, it may be impossible to locate the error using the above
method.
For example, using the above example again:
0 1 0 1 1 1 0 0
This byte could reach the destination as:
0 1 1 1 1 1 0 1
or: 0 1 0 1 0 0 0 0
or: 0 1 0 1 0 1 1 0
All three are clearly incorrect, but they have retained even parity so will
not trigger an error message at the receiving end. Clearly, other methods to
complement parity when it comes to error checking of transmitted data are
required (such as checksum).
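The parity block check on the data in Table 6.2 can be sketched in Python. The function names are illustrative, even parity is assumed as in the example, and rows and columns are numbered from 1 to match the table.

```python
def find_parity_errors(rows: list[str]) -> tuple[list[int], list[int]]:
    """Return the 1-based rows and columns that fail even parity.

    rows holds every byte received, with the parity byte as the last entry.
    """
    bad_rows = [r for r, bits in enumerate(rows, start=1)
                if bits.count("1") % 2 != 0]
    bad_cols = [c for c in range(1, len(rows[0]) + 1)
                if sum(int(bits[c - 1]) for bits in rows) % 2 != 0]
    return bad_rows, bad_cols


def flip_bit(bits: str, position: int) -> str:
    """Return bits with the bit at the 1-based position inverted."""
    flipped = "1" if bits[position - 1] == "0" else "0"
    return bits[:position - 1] + flipped + bits[position:]


# Data as received in Table 6.2 (nine bytes followed by the parity byte).
received = [
    "11110110", "10010101", "01111110", "10000010", "01101001",
    "10001000", "10101111", "00011010", "00010010", "11010001",
]

bad_rows, bad_cols = find_parity_errors(received)
print(bad_rows, bad_cols)              # [8] [5]

# A single failing row and column pinpoint the bit, so it can be corrected.
row, col = bad_rows[0], bad_cols[0]
received[row - 1] = flip_bit(received[row - 1], col)
print(received[row - 1])               # 00010010
print(find_parity_errors(received))    # ([], [])
```

A horizontal check on its own would accept all three of the two-bit corruptions listed above, because each still has an even number of 1-bits.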
Automatic repeat request (ARQ)
Automatic repeat request (ARQ) is another method to check data following
data transmission. This method can be summarised as follows:
» ARQ uses acknowledgement (a message sent back to the sender indicating that
data has been received correctly) and timeout (the time interval allowed to
elapse before an acknowledgement is received).
» When the receiving device detects an error following data transmission, it
asks for the data packet to be re-sent.
» If no error is detected, a positive acknowledgement is sent to the sender.
» The sending device will re-send the data packet if
- it receives a request to re-send the data, or
- a timeout has occurred.
» The whole process is continuous until the data packet received is correct or
until the ARQ time limit (timeout) is reached.
» ARQ is often used by mobile phone networks to guarantee data integrity.
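The ARQ behaviour summarised above can be sketched as a small simulation in Python. Everything here is a simplifying assumption made for illustration: sender and receiver are modelled inside one function, a one-byte checksum is used for error detection, the limit on attempts stands in for the ARQ time limit, and the scripted channel is a hypothetical stand-in for a real link.

```python
from typing import Callable, Optional


def checksum(block: bytes) -> int:
    """One-byte checksum used here for error detection (an assumption)."""
    return sum(block) % 256


def send_with_arq(packet: bytes,
                  channel: Callable[[bytes], Optional[bytes]],
                  max_attempts: int = 5) -> int:
    """Send a packet, re-sending until a correct copy is acknowledged.

    channel(packet) models one transmission: it returns the bytes that
    arrive, or None if nothing arrives before the timeout. Returns the
    number of attempts used; raises TimeoutError if the limit is reached.
    """
    expected = checksum(packet)
    for attempt in range(1, max_attempts + 1):
        received = channel(packet)
        if received is None:
            continue              # timeout: no acknowledgement, so re-send
        if checksum(received) == expected:
            return attempt        # positive acknowledgement
        # error detected: the receiver asks for a re-send, so loop again
    raise TimeoutError("ARQ gave up: no correct copy was acknowledged")


def make_scripted_channel(outcomes):
    """Hypothetical channel that replays a fixed list of outcomes."""
    script = iter(outcomes)

    def channel(packet: bytes) -> Optional[bytes]:
        outcome = next(script)
        if outcome == "lost":
            return None
        if outcome == "corrupt":
            return bytes([packet[0] ^ 0x01]) + packet[1:]  # flip one bit
        return packet

    return channel


channel = make_scripted_channel(["corrupt", "lost", "ok"])
print(send_with_arq(b"DATA", channel))  # 3
```

In this run the first copy arrives corrupted, the second is lost and triggers a timeout, and the third arrives intact, so the packet is acknowledged on the third attempt.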