Assembly Language String Ops
Chapter 11: The String Instructions
11.1 The Direction Flag

In Chapter 5, we saw that the FLAGS register contains six status flags
and three control flags. We know that the status flags reflect the result of an
operation that the processor has done. The control flags are used to control
the processor's operations.
One of the control flags is the direction flag (DF). Its purpose is to
determine the direction in which string operations will proceed. These
operations are implemented by the two index registers SI and DI. Suppose, for
example, that the following string has been declared:
STRING1 DB 'ABCDE'
And this string is stored in memory starting at offset 0200h:
Offset address   Content   ASCII character
0200h            041h      A
0201h            042h      B
0202h            043h      C
0203h            044h      D
0204h            045h      E
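
The direction flag is changed with two instructions, CLD (clear direction
flag) and STD (set direction flag). A minimal sketch of how they are used,
assuming SI and DI already point into the strings being processed:

```asm
        CLD             ;DF = 0: string instructions increment SI and DI,
                        ;so a string is processed from left to right
        STD             ;DF = 1: string instructions decrement SI and DI,
                        ;so a string is processed from right to left
```

Neither instruction affects the other flags.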
11.2 Moving a String

Suppose we have defined two strings as follows:
.DATA
STRING1 DB 'HELLO'
STRING2 DB 5 DUP (?)
and we would like to move the contents of STRING1 (the source string) into
STRING2 (the destination string). This operation is needed for many string
operations, such as duplicating a string or concatenating strings (attaching
one string to the end of another string).
The MOVSB instruction
MOVSB ;move string byte
copies the contents of the byte addressed by DS:SI to the byte addressed by
ES:DI. The contents of the source byte are unchanged. After the byte has
been moved, both SI and DI are automatically incremented if DF = 0, or
decremented if DF = 1. For example, to move the first two bytes of STRING1
to STRING2, we execute the following instructions:
MOV AX,@DATA
MOV DS,AX ;initialize DS
MOV ES,AX ;and ES
LEA SI,STRING1 ;SI points to source string
LEA DI,STRING2 ;DI points to destination string
CLD ;clear DF
MOVSB ;move first byte
MOVSB ;and second byte
See Figure 11.1.
Figure 11.1

Before MOVSB
          SI
STRING1   'H' 'E' 'L' 'L' 'O'
Offset     0   1   2   3   4
          DI
STRING2    ?   ?   ?   ?   ?
Offset     5   6   7   8   9

After MOVSB
              SI
STRING1   'H' 'E' 'L' 'L' 'O'
Offset     0   1   2   3   4
              DI
STRING2   'H'  ?   ?   ?   ?
Offset     5   6   7   8   9

After MOVSB
                  SI
STRING1   'H' 'E' 'L' 'L' 'O'
Offset     0   1   2   3   4
                  DI
STRING2   'H' 'E'  ?   ?   ?
Offset     5   6   7   8   9

MOVSB is the first instruction we have seen that permits a memory-memory
operation. It is also the first instruction that involves the ES register.
The REP prefix causes MOVSB to be executed N times, where N is the initial
value of CX. After each MOVSB, CX is decremented until it becomes 0. For
example, to copy STRING1 of the preceding section into STRING2, we execute
CLD
LEA SI,STRING1
LEA DI,STRING2
MOV CX,5 ;no. of chars in STRING1
REP MOVSB
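
To make the effect of the prefix concrete, REP MOVSB can be thought of as
the following loop. This is only a sketch of the equivalent behavior; the
labels TOP and DONE are hypothetical, and SI, DI, CX, and DF are assumed to
be set up as in the code above.

```asm
        JCXZ  DONE      ;REP executes nothing when CX = 0
TOP:
        MOVSB           ;move one byte and update SI and DI
        LOOP  TOP       ;decrement CX and repeat until CX = 0
DONE:
```

The single prefixed instruction is shorter and needs no labels.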
Example 11.1 Write instructions to copy STRING1 of the preceding section
into STRING2 in reverse order.
Solution: The idea is to get SI pointing to the end of STRING1, DI to
the beginning of STRING2, then move characters as SI travels to the left
across STRING1.
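
One way to carry out this idea is sketched below. It assumes the five-byte
STRING1 and STRING2 declared earlier and that DS and ES already address the
data segment; the label MOVE and the final CLD are illustrative choices.

```asm
        LEA   SI,STRING1+4    ;SI pts to end of STRING1
        LEA   DI,STRING2      ;DI pts to beginning of STRING2
        STD                   ;right to left processing
        MOV   CX,5            ;no. of chars in STRING1
MOVE:
        MOVSB                 ;move a byte; SI and DI are both decremented
        ADD   DI,2            ;net effect: DI advances one byte to the right
        LOOP  MOVE
        CLD                   ;optional: leave DF = 0 for later code
```

Because MOVSB decrements DI as well as SI when DF = 1, 2 is added to DI
after each move so that the destination is filled from left to right.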
MOVSW
There is a word form of MOVSB. It is
MOVSW ;move string word
MOVSW moves a word from the source string to the destination string. Like
MOVSB, it expects DS:SI to point to a source string word, and ES:DI to point
to a destination string word. After a string word has been moved, both SI
and DI are increased by 2 if DF = 0, or are decreased by 2 if DF = 1.
MOVSB and MOVSW have no effect on the flags.
Solution: The idea is to move 40, 50, and 60 forward one position in
the array, then insert 30.
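
One way to carry out this idea is sketched below. It assumes an array
declared as ARR DW 10,20,40,50,60,? in which 30 is to be inserted between
20 and 40; the name ARR and the declaration are assumptions made for
illustration, and DS and ES are assumed to address the data segment.

```asm
        STD                   ;right to left processing
        LEA   SI,ARR+8        ;SI pts to 60, the last word in use
        LEA   DI,ARR+10       ;DI pts to the free word after it
        MOV   CX,3            ;3 words to move: 60, 50, 40
        REP   MOVSW           ;move each word forward one position
        MOV   WORD PTR [DI],30 ;DI now pts to ARR+4, the vacated slot
        CLD                   ;optional: leave DF = 0 for later code
```

The words are moved starting from the right end so that no word is
overwritten before it has been copied.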
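11.3 Store String

The STOSB instruction
STOSB ;store string byte
moves the contents of AL to the byte addressed by ES:DI. DI is then
incremented if DF = 0 or decremented if DF = 1.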
Before STOSB
          DI
STRING1   'H' 'E' 'L' 'L' 'O'       AL  'A'
Offset     0   1   2   3   4

After STOSB
              DI
STRING1   'A' 'E' 'L' 'L' 'O'       AL  'A'
Offset     0   1   2   3   4

After STOSB
                  DI
STRING1   'A' 'A' 'L' 'L' 'O'       AL  'A'
Offset     0   1   2   3   4

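
Instructions that would produce the sequence pictured above can be sketched
as follows, assuming STRING1 DB 'HELLO' is declared in the data segment:

```asm
        MOV   AX,@DATA
        MOV   ES,AX           ;initialize ES
        LEA   DI,STRING1      ;DI pts to STRING1
        CLD                   ;process from left to right
        MOV   AL,'A'          ;AL has the character to store
        STOSB                 ;store an 'A' at offset 0
        STOSB                 ;store another at offset 1
```

Only ES needs to be initialized here, because STOSB writes through ES:DI
and does not use DS.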
At line 23, the procedure READ_STR uses STOSB to store input characters in
the string. STOSB automatically increments DI; at line 24, the character
count in BX is incremented.
The procedure takes into account the possibility of typing errors. If
the user hits the backspace key, then at line 19 the procedure decrements
DI and BX. The backspace itself is not stored. When the next legitimate
character is read, it replaces the wrong one in the string. Note: if the last
characters typed before the carriage return are backspaces, the wrong
characters will remain in the string, but the count of legitimate characters
in BX will be correct.
We use READ_STR for string input in the following sections.
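
A procedure with the behavior described above might be sketched as
follows. This is not the numbered listing that the text cites, so the line
numbers mentioned above do not apply to it; the labels RS_READ, RS_STORE,
and RS_EXIT and the check for a backspace on an empty string are
assumptions added for illustration.

```asm
READ_STR PROC
;reads characters from the keyboard and stores them in a string
;input:  DI = offset of string (ES must address its segment)
;output: DI = offset of string, BX = no. of chars read
        PUSH  AX
        PUSH  DI
        CLD                   ;process from left to right
        XOR   BX,BX           ;no chars read yet
RS_READ:
        MOV   AH,1            ;DOS function: read a character
        INT   21H             ;character in AL
        CMP   AL,0DH          ;carriage return?
        JE    RS_EXIT         ;yes, done
        CMP   AL,8            ;backspace?
        JNE   RS_STORE        ;no, store the character
        OR    BX,BX           ;anything to erase?
        JZ    RS_READ         ;no, ignore the backspace
        DEC   DI              ;yes, move string pointer back
        DEC   BX              ;and decrement the count
        JMP   RS_READ
RS_STORE:
        STOSB                 ;store char in string, DI advanced
        INC   BX              ;one more character
        JMP   RS_READ
RS_EXIT:
        POP   DI              ;DI pts to start of string again
        POP   AX
        RET
READ_STR ENDP
```

A caller sets ES, points DI at a buffer large enough for the input, and
finds the number of characters in BX on return.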
11.4 Load String

The LODSB instruction
LODSB ;load string byte
moves the byte addressed by DS:SI into AL. SI is then incremented if DF =
0 or decremented if DF = 1. The word form is
LODSW ;load string word
it moves the word addressed by DS:SI into AX; SI is increased by 2 if DF =
0 or decreased by 2 if DF = 1.
LODSB can be used to examine the characters of a string, as shown
later.
LODSB and LODSW have no effect on the flags.
To illustrate LODSB, suppose STRING1 is defined as
STRING1 DB 'ABC'
The following code successively loads the first and second bytes of STRING1
into AL:
MOV AX,@DATA
MOV DS,AX ;initialize DS
LEA SI,STRING1 ;SI points to STRING1
CLD ;process left to right
LODSB ;load first byte into AL
LODSB ;load second byte into AL
Before LODSB
          SI
STRING1   'A' 'B' 'C'       AL   ?
Offset     0   1   2

After LODSB
              SI
STRING1   'A' 'B' 'C'       AL  'A'
Offset     0   1   2

After LODSB
                  SI
STRING1   'A' 'B' 'C'       AL  'B'
Offset     0   1   2

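
As an illustration of examining a string with LODSB, here is a sketch of a
procedure that displays a string one character at a time. The name
DISP_STR, its register interface, and the labels DS_TOP and DS_EXIT are
assumptions made for this example.

```asm
DISP_STR PROC
;displays a string
;input: SI = offset of string (DS must address its segment)
;       BX = no. of characters to display
        PUSH  AX
        PUSH  CX
        PUSH  DX
        PUSH  SI
        MOV   CX,BX           ;CX = no. of characters
        JCXZ  DS_EXIT         ;nothing to display
        CLD                   ;process from left to right
DS_TOP:
        LODSB                 ;character in AL, SI advanced
        MOV   DL,AL           ;DL = character to display
        MOV   AH,2            ;DOS function: display a character
        INT   21H
        LOOP  DS_TOP          ;repeat until all characters are shown
DS_EXIT:
        POP   SI
        POP   DX
        POP   CX
        POP   AX
        RET
DISP_STR ENDP
```

LODSB both fetches the character and advances SI, so the loop body needs
no separate pointer arithmetic.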
Sample execution:
C>PGM11_3
THIS PROGRAM TESTS TWO PROCEDURES
THIS PROGR
11.5 Scan String

The instruction
SCASB ;scan string byte
can be used to examine a string for a target byte. The target byte is contained
in AL. SCASB subtracts the string byte pointed to by ES:DI from the contents
of AL and uses the result to set the flags. The result is not stored. Afterward,
DI is incremented if DF = 0 or decremented if DF = 1.
The word form is
SCASW ;scan string word
In this case, the target word is in AX. SCASW subtracts the word addressed
by ES:DI from AX and sets the flags. DI is increased by 2 if DF = 0 or decreased
by 2 if DF = 1.
All the status flags are affected by SCASB and SCASW.
Figure 11.4

Before SCASB
          DI
STRING1   'A' 'B' 'C'       AL  'B'
Offset     0   1   2

After SCASB
              DI
STRING1   'A' 'B' 'C'       AL  'B'       ZF = 0 (not found)
Offset     0   1   2

After SCASB
                  DI
STRING1   'A' 'B' 'C'       AL  'B'       ZF = 1 (found)
Offset     0   1   2

For example, if the string
STRING1 DB 'ABC'
is defined, then these instructions examine the first two bytes of STRING1,
looking for "B":
MOV AX,@DATA
MOV ES,AX ;initialize ES
CLD ;left to right processing
LEA DI,STRING1 ;DI pts to STRING1
MOV AL,'B' ;target character
SCASB ;scan first byte
SCASB ;scan second byte
See Figure 11.4. Note that when the target "B" was found, ZF = 1 and, because
SCASB automatically increments DI, DI points to the byte after the target,
not the target itself.
In looking for a target byte in a string, the string is traversed until
the byte is found or the string ends. If CX is initialized to the number of
bytes in the string,
REPNE SCASB ;repeat SCASB while not equal (to target)
will repeatedly subtract each string byte from AL, update DI, and decrement
CX until there is a zero result (the target is found) or CX = 0 (the string
ends). Note: REPNZ (repeat while not zero) generates the same machine
code as REPNE.
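
To illustrate REPNE SCASB, the following sketch searches a five-byte
string for the first "B". It assumes a declaration such as
STRING1 DB 'ABCDE' and that ES addresses the data segment; the labels
FOUND, NOT_FOUND, and DONE are hypothetical.

```asm
        CLD                   ;left to right processing
        LEA   DI,STRING1      ;DI pts to string
        MOV   CX,5            ;no. of bytes in string
        MOV   AL,'B'          ;target character
        REPNE SCASB           ;scan until 'B' is found or CX = 0
        JNE   NOT_FOUND       ;ZF = 0: string ended without a match
FOUND:
        DEC   DI              ;back up so that DI pts to the target
        JMP   DONE
NOT_FOUND:
                              ;here the target is not in the string
DONE:
```

The test after the scan uses ZF rather than CX, because CX is also 0 when
the target happens to be the last byte of the string.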
As an example, let's write a program to count the number of vowels
and consonants in a string.
5: VOWELS DB 'AEIOU'
6: CONSONANTS DB 'BCDFGHJKLMNPQRSTVWXYZ'
7: OUT1 DB 0DH,0AH,'vowels = $'
8: OUT2 DB ', consonants = $'
9: VOWELCT DW 0
10: CONSCT DW 0
12: MAIN PROC
13: MOV AX,@DATA
14: MOV DS,AX ;initialize DS
15: MOV ES,AX ;and ES
16: LEA DI,STRING ;DI pts to string
17: CALL READ_STR ;BX = no. of chars read
18: MOV SI,DI ;SI pts to string
19: CLD ;left to right processing
20: REPEAT:
21: ;load a string character
22: LODSB ;char in AL
23: ;if it's a vowel
24: LEA DI,VOWELS ;DI pts to vowels
25: MOV CX,5 ;5 vowels
26: REPNE SCASB ;is char a vowel?
27: JNE CK_CONST ;no, is it a consonant?
28: ;then increment vowel count
29: INC VOWELCT
30: JMP UNTIL
31: ;else if it's a consonant
32: CK_CONST:
33: LEA DI,CONSONANTS ;DI pts to consonants
34: MOV CX,21 ;21 consonants
35: REPNE SCASB ;is char a consonant?
36: JNE UNTIL ;no
37: ;then increment consonant count
38: INC CONSCT
39: UNTIL:
40: DEC BX ;BX has no. chars left in str
41: JNE REPEAT ;loop if chars left
42: ;output no. of vowels
43: MOV AH,9 ;prepare to print
44: LEA DX,OUT1 ;get vowel message
45: INT 21H ;print it
46: MOV AX,VOWELCT ;get vowel count
47: CALL OUTDEC ;print it
48: ;output no. of consonants
49: MOV AH,9 ;prepare to print
50: LEA DX,OUT2 ;get consonant message
51: INT 21H ;print it
52: MOV AX,CONSCT ;get consonant count
53: CALL OUTDEC ;print it
54: ;dos exit
55: MOV AH,4CH
56: INT 21H
57: MAIN ENDP
58: ;READ_STR goes here
59: ;OUTDEC goes here
60: END MAIN
Because the program uses both LODSB, which loads the byte addressed by DS:SI,
and SCASB, which scans the byte addressed by ES:DI, both DS and ES must be
initialized. BX is used as a loop counter and is set to the number of bytes
in the string (CX is used elsewhere in the program).
Line 22. LODSB puts a string character in AL and advances SI to
the next one.
Line 26. To see if the character in AL is a vowel, the program
scans the string VOWELS by executing REPNE SCASB. This
instruction subtracts each byte of VOWELS from AL and sets the
flags. The instruction returns ZF = 1 if the character is a vowel
and ZF = 0 if it isn't.
Line 35. If the target was not a vowel, the program scans the string
CONSONANTS in exactly the same way it scanned VOWELS.
Sample execution:
C>PGM11_4
A,E,I,O,U ARE VOWELS.
vowels = 9, consonants = 5
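11.6 Compare String

The CMPSB instruction
CMPSB ;compare string byte
subtracts the byte addressed by ES:DI from the byte addressed by DS:SI and
sets the flags. The result is not stored. Afterward, both SI and DI are
incremented if DF = 0, or decremented if DF = 1.
Suppose, for example, that two strings are declared as
STRING1 DB 'ACD'
STRING2 DB 'ABC'
The following instructions compare the first two bytes of the preceding strings:
MOV AX,@DATA
MOV DS,AX ;initialize DS
MOV ES,AX ;and ES
CLD ;left to right processing
LEA SI,STRING1 ;SI pts to STRING1
LEA DI,STRING2 ;DI pts to STRING2
CMPSB ;compare first bytes
CMPSB ;compare second bytes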
Before CMPSB
          SI
STRING1   'A' 'C' 'D'
Offset     0   1   2
          DI
STRING2   'A' 'B' 'C'
Offset     3   4   5

After CMPSB
              SI
STRING1   'A' 'C' 'D'       RESULT = 041h - 041h = 0 (not stored)
Offset     0   1   2        ZF = 1, SF = 0
              DI
STRING2   'A' 'B' 'C'
Offset     3   4   5

After CMPSB
                  SI
STRING1   'A' 'C' 'D'       RESULT = 043h - 042h = 1 (not stored)
Offset     0   1   2        ZF = 0, SF = 0
                  DI
STRING2   'A' 'B' 'C'
Offset     3   4   5

With the REPE (repeat while equal) prefix and CX initialized to the number
of bytes or words to compare, the instruction
REPE CMPSB
or
REPE CMPSW
repeatedly executes CMPSB or CMPSW and decrements CX until (1) there
is a mismatch between corresponding string bytes or words, or (2) CX = 0.
The flags are set according to the result of the last comparison.
CMPSB may be used to compare two character strings to see which
comes first alphabetically, or if they are identical, or if one string is a
substring of the other (this means that one string is contained within the
other as a sequence of consecutive characters).
As an example, suppose STR1 and STR2 are strings of length 10. The
following instructions put 0 in AX if the strings are identical, put 1 in AX
if STR1 comes first alphabetically, or put 2 in AX if STR2 comes first
alphabetically (assume DS and ES are initialized).
MOV CX,10 ;length of strings
LEA SI,STR1 ;SI points to STR1
LEA DI,STR2 ;DI points to STR2
CLD ;left to right processing
REPE CMPSB ;compare string bytes
JL STR1_FIRST ;STR1 precedes STR2
JG STR2_FIRST ;STR2 precedes STR1
;here if strings are identical
MOV AX,0 ;put 0 in AX
JMP EXIT ;and exit
;here if STR1 precedes STR2
STR1_FIRST:
MOV AX,1 ;put 1 in AX
JMP EXIT ;and exit
;here if STR2 precedes STR1
STR2_FIRST:
MOV AX,2 ;put 2 in AX
EXIT:
11.6.1 Finding a Substring of a String

There are several ways to determine whether one string is a substring
of another. The following way is probably the simplest. Suppose we declare
SUB1 DB 'ABC'
SUB2 DB 'CAB'
MAINST DB 'ABABCA'
and we want to see whether SUB1 and SUB2 are substrings of MAINST.
Let's begin with SUB1. We can compare corresponding characters in
the strings
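SUB1    A B C
        | | x
MAINST  A B A B C A
The first two characters match, but the third does not ("C" against "A"), so
we slide SUB1 one position to the right and compare again. Continuing in
this way, SUB1 matches the three characters that begin at MAINST + 2, so
SUB1 is a substring of MAINST.
For SUB2, the comparisons fail at MAINST, MAINST + 1, and MAINST + 2. At
MAINST + 3 we have
SUB2          C A B
              x
MAINST  A B A B C A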
There is a mismatch, and there is no need to proceed further, for if we did
we would be trying to match the three characters of SUB2 with the two
remaining characters "CA" of MAINST. Thus SUB2 is not a substring of
MAINST.
Actually, we could have predicted the last place to search. It is
STOP = MAINST + length of MAINST - length of SUB2
     = MAINST + 6 - 3 = MAINST + 3
Here is an algorithm and a program that searches a main string
MAINST for a substring SUBST.
REPEAT
  compare corresponding characters in MAINST
  (from START on) and SUBST
  IF all characters match
    THEN
      SUBST found in MAINST
    ELSE
      START = START + 1
  END_IF
UNTIL (SUBST found in MAINST)
   OR (START > STOP)
Display results
After reading SUBST and MAINST, and verifying that neither string
is null and SUBST is not longer than MAINST, in lines 44-50 the program
computes STOP (the place in MAINST to stop searching), and initializes
START (the place to start searching) to the beginning of MAINST.
At line 51, the program enters a REPEAT loop where the characters of
SUBST are compared with the part of MAINST from START on. In lines 53-56,
CX is set to the length of SUBST, SI is pointed to SUBST, DI is pointed to
START, and corresponding characters are compared with REPE CMPSB. If ZF = 1,
then the match is successful and the program jumps to line 66 where the
message "SUBST is a substring of MAINST" is displayed. If ZF = 0, there was a
mismatch between characters and START is incremented at line 59. The search
continues until SUBST matches part of MAINST or START > STOP; in the latter
case, the message "SUBST is not a substring of MAINST" is displayed.
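
The search loop described above can be sketched as a small program. This is
not the numbered listing that the text refers to: the strings are defined
in the data segment rather than read from the keyboard, and the names
SUB_LEN, MAIN_LEN, NEXT_TRY, FOUND, DISPLAY, YESMSG, and NOMSG are
assumptions made for illustration.

```asm
TITLE SUBSTR: substring search sketch
.MODEL SMALL
.STACK 100H
.DATA
SUBST    DB 'ABC'
SUB_LEN  DW 3                 ;length of SUBST (assumed nonzero)
MAINST   DB 'XYZABABC'
MAIN_LEN DW 8                 ;length of MAINST (assumed >= SUB_LEN)
START    DW ?                 ;place to start comparing
STOP     DW ?                 ;last place worth comparing
YESMSG   DB 'SUBST IS A SUBSTRING OF MAINST$'
NOMSG    DB 'SUBST IS NOT A SUBSTRING OF MAINST$'
.CODE
MAIN PROC
        MOV   AX,@DATA
        MOV   DS,AX           ;initialize DS
        MOV   ES,AX           ;and ES
        LEA   AX,MAINST
        MOV   START,AX        ;START = beginning of MAINST
        ADD   AX,MAIN_LEN
        SUB   AX,SUB_LEN
        MOV   STOP,AX         ;STOP = MAINST + len MAINST - len SUBST
        CLD                   ;left to right processing
NEXT_TRY:
        LEA   SI,SUBST        ;SI pts to SUBST
        MOV   DI,START        ;DI pts to current place in MAINST
        MOV   CX,SUB_LEN      ;no. of chars to compare
        REPE  CMPSB           ;compare until mismatch or CX = 0
        JE    FOUND           ;ZF = 1: all characters matched
        INC   START           ;try the next position
        MOV   AX,START
        CMP   AX,STOP
        JBE   NEXT_TRY        ;continue while START <= STOP
        LEA   DX,NOMSG        ;START > STOP: not a substring
        JMP   DISPLAY
FOUND:
        LEA   DX,YESMSG
DISPLAY:
        MOV   AH,9            ;display the result
        INT   21H
        MOV   AH,4CH          ;DOS exit
        INT   21H
MAIN ENDP
        END   MAIN
```

START and STOP are offsets, so they are compared as unsigned numbers with
JBE rather than with a signed jump.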
Sample executions:
C>PGM11_5
ENTER SUBST
ABC
ENTER MAINST
XYZABABC
SUBST IS A SUBSTRING OF MAINST
C>PGM11_5
ENTER SUBST
ABO
ENTER MAINST
ABACAOACD
SUBST IS NOT A SUBSTRING OF MAINST
11.7 General Form of the String Instructions

Let us summarize the byte and word forms of the string instructions:
Instruction      Destination   Source      Byte form   Word form
Move string      ES:DI         DS:SI       MOVSB       MOVSW
Compare string   ES:DI         DS:SI       CMPSB       CMPSW
Store string     ES:DI         AL or AX    STOSB       STOSW
Load string      AL or AX      DS:SI       LODSB       LODSW
Scan string      ES:DI         AL or AX    SCASB       SCASW
The operands of these instructions are implicit; that is, they are not
part of the instructions themselves. However, there are forms of the string
instructions in which the operands appear explicitly. They are as follows:
Instruction                               Example
MOVS destination_string, source_string    MOVSB
CMPS destination_string, source_string    CMPSB
STOS destination_string                   STOS STRING2
LODS source_string                        LODS STRING1
SCAS destination_string                   SCAS STRING2
When the assembler encounters one of these general forms, it checks to see
if (1) the source string is in the segment addressed by DS and the destination
string is in the segment addressed by ES, and (2) in the case of MOVS and
CMPS, if the strings are of the same type; that is, both byte strings or word
strings. If so, then the instruction is coded as either a byte form, such as
MOVSB, or a word form, such as MOVSW, to match the data declaration of
the string. For example, suppose that DS and ES address the following
segment:
.DATA
STRING1 DB 'ABCDE'
STRING2 DB 'EFGH'
STRING3 DB 'IJKL'
STRING4 DB 'MNOP'
STRING5 DW 1,2,3,4,5
STRING6 DW 7,8,9
Then the following pairs of instructions are equivalent:
MOVS STRING2,STRING1    MOVSB
MOVS STRING6,STRING5    MOVSW
LODS STRING4            LODSB
LODS STRING5            LODSW
SCAS STRING1            SCASB
STOS STRING6            STOSW
It is important to note that if the general forms are used, it is still
necessary to make DS:SI and ES:DI point to the source and destination strings,
respectively.
There are advantages and disadvantages in using the general forms
of the string instructions. An advantage is that because the operands appear
as part of the code, program documentation is improved. A disadvantage is
that the operands can be misleading, because they do not determine which
memory locations are used. Suppose, for example, that SI and DI point to
STRING1 and STRING2 when MOVS STRING4,STRING3 is executed.
Even though the specified source and destination operands are STRING3 and
STRING4, respectively, when MOVS is executed the first byte of STRING1 is
moved to the first byte of STRING2. This is because the assembler translates
MOVS STRING4,STRING3 into the machine code for MOVSB, and SI and
DI are pointing to the first bytes of STRING1 and STRING2, respectively.
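
The situation can be sketched as follows, using the data segment declared
above. The sketch assumes that DS and ES both address that segment and that
the assembler has been told so with a suitable ASSUME directive.

```asm
        CLD                   ;left to right processing
        LEA   SI,STRING1      ;SI pts to STRING1
        LEA   DI,STRING2      ;DI pts to STRING2
        MOVS  STRING4,STRING3 ;assembled as MOVSB: moves the 'A' of
                              ;STRING1 to the first byte of STRING2
```

The operands only tell the assembler whether to generate the byte or the
word form; the addresses actually used come from SI and DI.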
Summary

Glossary
(memory) string    A byte or word array

New Instructions
CLD      LODSW    SCASW
CMPS     MOVS     STD
CMPSB    MOVSB    STOS
CMPSW    MOVSW    STOSB
LODS     SCAS     STOSW
LODSB    SCASB
Exercises
1. Suppose
SI contains 100h     Byte 100h contains 10h
DI contains 200h     Byte 101h contains 15h
AX contains 4142h    Byte 200h contains ~
DF = 0               Byte 201h contains 25h
Give the source, destination, and value moved for each of the
following instructions. Also give the new contents of SI and DI.
a. MOVSB
b. MOVSW
c. STOSB
d. STOSW
e. LODSB
f. LODSW
2. Suppose the following declarations have been made:
d. SCASB
e. CMPSB
6. Suppose the following string has been declared:
Write instructions that will cause each ..... to be replaced by "E".
7. Suppose the following string has been declared:
:.; I S A ':' ;:
(? j
Programming Exercises
8. A palindrome is a character string that reads the same forward or
backward. In deciding if a string is a palindrome, we ignore
blanks, punctuation, and letter case. For example, "Madam, I'm
Adam" or "A man, a plan, a canal, Panama!"
Write a program that (a) lets the user input a string, (b) prints it
forward and backward without punctuation and blanks on
successive lines, and (c) decides whether it is a palindrome and prints
the conclusion.
9. In spreadsheet applications, it is useful to display numbers
right-justified in fixed fields. For example, these numbers are
right-justified in a field of 10 characters:
l 3 ,, :,
n:,254 s
S6'
12. Write a procedure INSERT that will insert a string STRING1 into a
string STRING2 at a specified point.
Input
SI offset address of STRING1
DI offset address of STRING2
BX length of STRING1
CX length of STRING2
AX offset address at which to insert STRING1
Output
DI offset address of new string
BX length of new string
The procedure may assume that neither string has 0 length, and
that the address in AX is within STRING2.
Write a program that inputs two strings STRING1 and STRING2, a
nonnegative decimal integer N, 0 <= N <= 40, inserts STRING1
into STRING2 at position N bytes after the beginning of
STRING2, and displays the resulting string. You may assume that
N <= length of STRING2 and that the length of each string is less
than 40.
13. Write a procedure DELETE that will remove N bytes from a string
at a specified point and close the gap.
Input
DI offset address of string
BX length of string
CX number of bytes N to be removed
SI offset address within string at which to remove bytes
Output
DI offset address of new string
BX length of new string
The procedure may assume that the string has nonzero length,
the number of bytes to be removed is not greater than the length
of the string, and that the address in SI is within the string.
Write a program that reads a string STRING, a decimal integer S
that represents a position in STRING, a decimal integer N that
represents the number of bytes to be removed (both integers
between 0 and 80), calls DELETE to remove N bytes at position S,
and prints the resulting string. You may assume 0 <= N <= L - S,
where L = length of STRING.