KEMBAR78
3 Array | PDF | Matrix (Mathematics) | Pointer (Computer Programming)
0% found this document useful (0 votes)
72 views78 pages

3 Array

The document discusses arrays, structures, and polynomials. It defines arrays as a set of index-value pairs where each index is associated with a value. Structures (records) group together data of different types under one name. Polynomials can be represented as structures containing pairs of exponents and coefficients. The document presents two approaches to representing polynomials in memory: using separate data structures for each polynomial versus storing all polynomials in a single global array.

Uploaded by

ALBERT ZHANG
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
72 views78 pages

3 Array

The document discusses arrays, structures, and polynomials. It defines arrays as a set of index-value pairs where each index is associated with a value. Structures (records) group together data of different types under one name. Polynomials can be represented as structures containing pairs of exponents and coefficients. The document presents two approaches to representing polynomials in memory: using separate data structures for each polynomial versus storing all polynomials in a single global array.

Uploaded by

ALBERT ZHANG
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 78

Chapter 2

Arrays and structures

All the programs in this file are selected from


Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Outline
Arrays, Structures, and Unions
Polynomials
Sparse Matrices
Arrays, Structures,
and Unions
2.1, 2.2, 2.3; page 51 - 63
Arrays
Array: a set of index and value

data structure
For each index, there is a value associated with
that index.

representation
implemented by using consecutive memory.

Example: int list[5]: list[0], …, list[4] each contains an integer

element
Structure Array is
objects: A set of pairs <index, value> where for each value of index
there is a value from the set item. Index is a finite ordered set of
one or more dimensions, for example, {0, … , n-1} for one dimension,
{(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)} for two dimensions,
etc.
Functions:
for all A  Array, i  index, x  item, j, size  integer
Array Create(j, list) ::= return an array of j dimensions where list is a
Create(3,{2,2,1}) : A j-tuple whose ith element is the size of the
ith dimension. Items are undefined.
3D, 2x2x1 array
Item Retrieve(A, i) ::= if (i  index) return the item associated with
index value i in array A
else return error

Array Store(A, i, x) ::= if (i in index)


return an array that is identical to array
A except the new pair <i, x> has been
inserted else return error
end array

ADT2.1: Abstract Data Type Array


Arrays in C
int list[5], *plist[5];

list[5]: five integers


list[0], list[1], list[2], list[3], list[4]
*plist[5]: five pointers to integers
plist[0], plist[1], plist[2], plist[3], plist[4]

implementation of 1-D array


list[0] base address = 
list[1]  + 1*sizeof(int)
list[2]  + 2*sizeof(int)
list[3]  + 3*sizeof(int)
list[4]  + 4*size(int)
Arrays in C (Continued)

Compare int *list1 and int list2[5] in C.

Same: list1 and list2 are (pointers).


Difference:list2 reserves five locations.

Notations:
list2: a pointer to list2[0]
(list2 + i): a pointer to list2[i] ( &list2[i] )
*(list2 + i): ( list2[i] )
Example: 1-dimension array addressing
int one[] = {0, 1, 2, 3, 4};
Goal: print out address and value
void print1(int *ptr, int rows)
{
/* print out a one-dimensional array using a pointer */
int i;
printf(“Address Contents\n”);
for (i=0; i < rows; i++)
printf(“%4u%5d\n”, ptr+i, *(ptr+i));
printf(“\n”);
}
call print1(&one[0], 5)

Address Contents
1228 0
1230 1
1232 2
1234 3
1236 4
*Figure 2.1: One- dimensional array addressing
Structures (records)
Anonymous structure Named structure

struct { struct man {


char name[10]; char name[10];
int age; int age;
float salary; float salary;
} person; };

strcpy(person.name, “james”); struct man person


person.age=10;
person.salary=35000;
Create user-defined structure
data type typedef struct human_being {
char name[10];
int age;
float salary;
};

or

typedef struct {
char name[10];
int age;
float salary
} human_being;

human_being person1, person2;


Unions
Similar to struct, but only one field is active.
Example: Add fields for male and female.
typedef struct gender_type {
enum tag_field {female, male} gender;
union {
int children; #define FALSE 0
#define TRUE 1
int beard;
} u;
};
typedef struct human_being {
char name[10]; human_being person1, person2;
int age;
person1.gender_info.gender=male;
float salary; person1.gender_info.u.beard=FALSE;
date dob;
gender_type gender_info;
}
Self-Referential Structures
One or more of its components is a pointer to itself.

typedef struct list { Construct a list with three nodes


char data; item1.link=&item2;
list *link; item2.link=&item3;
} malloc(): obtain a node
free(): free memory
list item1, item2, item3;
item1.data=‘a’; a 1223 b c
item2.data=‘b’; Address: 1223

item3.data=‘c’;
item1.link=item2.link=item3.link=NULL;
Polynomials
2.4 page64 - 71
Ordered List Examples
ordered (linear) list: (item1, item2, item3, …, item n)
(MONDAY, TUEDSAY, WEDNESDAY, THURSDAY,
FRIDAY, SATURDAYY, SUNDAY)
(2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King,
Ace)
(1941, 1942, 1943, 1944, 1945)
(a1, a2, a3, …, an-1, an)
Operations on Ordered List
Find the length, n , of the list.
Read the items from left to right (or right to left).
Retrieve the ith element.
Store a new value into the ith position.
Insert a new element at the position i , causing elements
numbered i, i+1, …, n to become numbered
i+1, i+2, …, n+1
Delete the element at position i , causing elements
numbered i+1, …, n to become numbered i, i+1, …, n-1
array (sequential mapping)?
Polynomials A(X)=3X20+2X5+4, B(X)=X4+10X3+3X2+1
Structure Polynomial is

objects: p( x) = a1x + ... + an x ; a set of ordered pairs of <ei,ai>


e1 en

where ai in Coefficients and ei in Exponents, ei are integers >= 0

functions:
for all poly, poly1, poly2  Polynomial, coef  Coefficients, expon 
Exponents

Polynomial Zero( ) ::= return the polynomial, p(x) = 0


Boolean IsZero(poly) ::= if (poly) return FALSE
else return TRUE
Coefficient Coef(poly, expon) ::= if (expon  poly) return its
coefficient else return Zero
Exponent Lead_Exp(poly) ::= return the largest exponent in
poly
Polynomial Attach(poly,coef, expon) ::= if (expon  poly) return error
else return the polynomial poly
with the term <coef, expon>
inserted
Polynomial Remove(poly, expon) ::= if (expon  poly)
return the
polynomial poly with the
term whose exponent is
expon deleted
else return error
Polynomial SingleMult(poly, coef, expon) ::= return the polynomial
poly • coef • xexpon
Polynomial Add(poly1, poly2) ::= return the polynomial
poly1 +poly2
Polynomial Mult(poly1, poly2) ::= return the polynomial
poly1 • poly2

End Polynomial
*ADT2.2:Abstract data type Polynomial
Polynomial Addition
A(X)=3X20+2X5+4
B(X)=X4+10X3+3X2+1
C(X) = A(X) + B(X),C(X) = ?
Polynomial Addition
/* d =a + b, where a, b, and d are polynomials */ Example:
d = Zero( ) A(X)=3X20+2X5+4
while (! IsZero(a) && ! IsZero(b)) do { B(X)=X4+10X3+3X2+1
switch COMPARE (Lead_Exp(a), Lead_Exp(b)) {
case -1: d =
Attach(d, Coef (b, Lead_Exp(b)), Lead_Exp(b)); data structure 1:
b = Remove(b, Lead_Exp(b));
break;
case 0: sum = Coef (a, Lead_Exp (a)) + Coef ( b,
Lead_Exp(b));
if (sum) {
Attach (d, sum, Lead_Exp(a));
a = Remove(a , Lead_Exp(a));
b = Remove(b , Lead_Exp(b));
}
break; #define MAX_DEGREE 101 (100 + 1)
typedef struct {
int degree;
float coef[MAX_DEGREE];
} polynomial;
case 1: d =
Attach(d, Coef (a, Lead_Exp(a)), Lead_Exp(a));
a = Remove(a, Lead_Exp(a));
}
}

insert any remaining terms of a or b into d

advantage: easy implementation


disadvantage: waste space when sparse

*Program 2.5 :Initial version of padd function


data structure 2:

use one global array to store all


polynomials
startA finishA startB finishB avail

coef 2 1 1 10 3 1
exp 1000 0 4 3 2 0
0 1 2 3 4 5 6
specification representation A(X)=2X1000+1
poly <start, finish> B(X)=X4+10X3+3X2+1
A <0,1>
B <2,5>*Figure 2.3: Array representation of two polynomials
storage requirements: startA, finishA, 2*(finishA-startA+1)
nonparse: twice as much as (1)
when all the items are nonzero

MAX_TERMS 100 /* size of terms array */


typedef struct {
float coef;
int expon;
} polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;
Add two polynomials:
D=A+B
void padd (int starta, int finisha, int startb, int finishb,
int * startd, int *finishd)
{
/* add A(x) and B(x) to obtain D(x) */
float coefficient;
*startd = avail;
while (starta <= finisha && startb <= finishb)
switch (COMPARE(terms[starta].expon,
terms[startb].expon)) {
case -1: /* a expon < b expon */
attach(terms[startb].coef, terms[startb].expon);
startb++
break;
case 0: /* equal exponents */
coefficient = terms[starta].coef +
terms[startb].coef;
if (coefficient)
attach (coefficient, terms[starta].expon);
starta++;
startb++;
break;
case 1: /* a expon > b expon */
attach(terms[starta].coef, terms[starta].expon);
starta++;
}
/* add in remaining terms of A(x) */
for( ; starta <= finisha; starta++)
attach(terms[starta].coef, terms[starta].expon);
/* add in remaining terms of B(x) */
for( ; startb <= finishb; startb++)
attach(terms[startb].coef, terms[startb].expon);
*finishd =avail -1;
}

Analysis: O(n+m)
where n (m) is the number of nonzeros in A(B).
*Program 2.6: Function to add two polynomial
void attach(float coefficient, int exponent)
{
/* add a new term to the polynomial */
if (avail >= MAX_TERMS) {
fprintf(stderr, “Too many terms in the polynomial\n”);
exit(1);
}
terms[avail].coef = coefficient;
terms[avail++].expon = exponent;
}
*Program 2.7:Function to add anew term

Problem: Compaction is required


when polynomials that are no longer needed.
(data movement takes time.)
Sparse Matrices
2.5; page 72 – 84
Sparse Matrices
Matrix ➔ table of values

4 x 5 matrix
00304 4 rows
00570 Row 2
5 columns
00000 20 elements
02600 6 nonzero
Column 4 elements
Sparse Matrices
Sparse Matrices
Sparse Matrices
An n x n matrix may be stored as an n x n
array.
This takes O(n²) space.
The example structured sparse matrices may
be mapped into a 1D array so that a mapping
function can be used to locate an element
quickly; the space required by the 1D array is
less than that required by an n x n array.
Unstructured Sparse Matrices
Airline flight matrix.
airports are numbered 1 through n
flight(i,j) = list of nonstop flights from airport i to
airport j
n = 1000 (say)
n x n array of list pointers => 4 million bytes
total number of nonempty flight lists = 20,000 (say)
need at most 20,000 list pointers => at most 80,000
bytes
Unstructured Sparse Matrices
Web page matrix.
web pages are numbered 1 through n
web(i,j) = number of links from page i to page j

Web analysis.
authority page …
page that has many links to it
hub page …
links to many authority pages
Sparse Matrix
col0 col1 col2 col3 col4 col5
row0 15 0 0 22 0 − 15
row1
 0 11 3 0 0 0 
 
row2  0 0 0 −6 0 0
 
row3  0 0 0 0 0 0 
row4 91 0 0 0 0 0
 
row5  0 0 28 0 0 0
5*3 6*6
(a) 15/15 (b) 8/36

*Figure 2.4:Two matrices


sparse matrix
data structure?
SPARSE MATRIX ABSTRACT DATA
TYPE
Structure Sparse_Matrix is
objects: a set of triples, <row, column, value>, where row
and column are integers and form a unique combination, and
value comes from the set item.
functions:
for all a, b  Sparse_Matrix, x  item, i, j, max_col,
max_row  index

Sparse_Marix Create(max_row, max_col) ::=


return a Sparse_matrix that can hold up to
max_items = max _row * max_col and
whose maximum row size is max_row and
whose maximum column size is max_col.
Sparse_Matrix Transpose(a) ::=
return the matrix produced by interchanging
the row and column value of every triple.
Sparse_Matrix Add(a, b) ::=
if the dimensions of a and b are the same
return the matrix produced by adding
corresponding items, namely those with
identical row and column values.
else return error
Sparse_Matrix Multiply(a, b) ::=
if number of columns in a equals number of
rows in b
return the matrix d produced by multiplying
a by b according to the formula: d [i] [j] =
(a[i][k]•b[k][j]) where d (i, j) is the (i,j)th
element
else return error.
* Structure 2.3: Abstract data type Sparse-Matrix
(1) Represented by a two-dimensional array.
Sparse matrix wastes space.
(2) Each element is characterized by <row, col, value>.

# of rows (columns)
# of nonzero terms

row, column in ascending order


Sparse_matrix Create(max_row, max_col) ::=

#define MAX_TERMS 101 /* maximum number of terms +1*/


typedef struct {
int col;
int row;
int value;
} term; # of rows (columns)
# of nonzero terms
term a[MAX_TERMS]
Transpose a Matrix

(1) for each row i


take element <i, j, value> and store it
in element <j, i, value> of the transpose.

difficulty: where to put <j, i, value>


(0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
Move elements down very often.

(2) For all elements in column j,


place element <i, j, value> in element <j, i, value>
Transpose a Matrix

row col value row col value

a[0] 6 6 8 b[0] 6 6 8
[1] 0 0 15 [1] 0 0 15
[2] 0 3 22 [2] 0 4 91
[3] 0 5 -15 transpose [3] 1 1 11
[4] 1 1 11 [4] 2 1 3
[5] 1 2 3 [5] 2 5 28
[6] 2 3 -6 [6] 3 0 22
[7] 4 0 91 [7] 3 2 -6
[8] 5 2 28 [8] 5 0 -15

(a) (b)
void transpose (term a[], term b[])
/* b is set to the transpose of a */
{
int n, i, j, currentb;
n = a[0].value; /* total number of elements */
b[0].row = a[0].col; /* rows in b = columns in a */
b[0].col = a[0].row; /*columns in b = rows in a */
b[0].value = n;
if (n > 0) { /*non zero matrix */
currentb = 1;
for (i = 0; i < a[0].col; i++)
/* transpose by columns in a */
for( j = 1; j <= n; j++)
/* find elements from the current column */
if (a[j].col == i) {
/* element is in current column, add it to b */
columns

elements

b[currentb].row = a[j].col;
b[currentb].col = a[j].row;
b[currentb].value = a[j].value;
currentb++
}
}
}

* Program 2.8: Transpose of a sparse matrix

Scan the array “columns” times. ==> Time complexity


The array has “elements” elements. O(columns*elements)
Discussion: compared with 2-D array representation
for time complexity

O(columns*elements) vs. O(columns*rows)

elements --> columns * rows, when non-sparse


O(columns*columns*rows)

Problem: Scan the array “columns” times.

Solution: fastTranspose
Determine the number of elements in each column of
the original matrix.
==>
Determine the starting positions of each row in the
transpose matrix.
a[0] 6 6 8 After transpose
a[1] 0 0 15
a[2] 0 3 22
a[3] 0 5 -15
a[4] 1 1 11
a[5] 1 2 3
a[6] 2 3 -6
a[7] 4 0 91
a[8] 5 2 28

[0] [1] [2] [3] [4] [5]


row_terms = 2 1 2 2 0 1
starting_pos = 1 3 4 6 8 8
void fast_transpose(term a[ ], term b[ ])
{
/* the transpose of a is placed in b */
int row_terms[MAX_COL], starting_pos[MAX_COL];
int i, j, num_cols = a[0].col, num_terms = a[0].value;
b[0].row = num_cols; b[0].col = a[0].row;
b[0].value = num_terms;
if (num_terms > 0){ /*nonzero matrix*/
for (i = 0; i < num_cols; i++)
columns row_terms[i] = 0;
for (i = 1; i <= num_terms; i++)
elements row_term [a[i].col]++
starting_pos[0] = 1;
for (i =1; i < num_cols; i++)
columns starting_pos[i]=starting_pos[i-1] +row_terms [i-1];
for (i=1; i <= num_terms, i++) {
j = starting_pos[a[i].col]++;
elements b[j].row = a[i].col;
b[j].col = a[i].row;
b[j].value = a[i].value;
}
}
}
*Program 2.9:Fast transpose of a sparse matrix

Compared with 2-D array representation for time complexity


O(columns+elements) vs. O(columns*rows)

elements --> columns * rows when nonsparse


O(columns+elements) --> O(columns*rows)

Cost: Additional row_terms and starting_pos arrays are required.


Sparse Matrix
Definition: [D] =[A] * [B]
Multiplication
m*p m*n n*p
Procedure: Fix a row of A and find all elements in column j
of B for j=0, 1, …, p-1.
Alternative 1. Scan all of B to find all elements in column j.
Alternative 2. Compute the transpose of B.
(Put all column elements consecutively)

1 0 0 1 1 1 1 1 1
1 0 0 0 0 0 = 1 1 1
    
1 0 0 0 0 0 1 1 1
Col 0 Col 1 Col 2
Sparse Matrix Multiplication
Alternative 1. Scan all of B to find all elements in j.
Alternative 2. Compute the transpose of B.
(Put all column elements consecutively)

D=A*B
row
column 0 0 9 0 15 0 -1 0 5 3
15 0 -1 5 -1 0 4
0 3 5 3 0 1 5

b[0] 3 4 7 b[0] 4 3 7
a[0] 2 3 4 b[1] 0 1 5
b[1] 0 2 9
a[1] 0 0 15 b[2] 0 2 3
b[2] 1 0 5
a[2] 0 2 -1 b[3] 1 1 -1
b[3] 1 1 -1
a[3] 1 1 3 transpose b[4] 2 0 9
b[4] 1 3 4
a[4] 1 2 5 b[5] 2 2 1
b[5] 2 0 3
b[6] 2 2 1 b[6] 3 1 4
b[7] 2 3 5 b[7] 3 2 5
void mmult (term a[ ], term b[ ], term d[ ] )
/* multiply two sparse matrices */
{
int i, j, column, totalb = b[0].value, totald = 0;
int rows_a = a[0].row, cols_a = a[0].col,
totala = a[0].value; int cols_b = b[0].col,
int row_begin = 1, row = a[1].row, sum =0;
term new_b[MAX_TERMS];
if (cols_a != b[0].row){
fprintf (stderr, “Incompatible matrices\n”);
exit (1);
}
fast_transpose(b, new_b); cols_b + totalb
/* set boundary condition */

a[totala+1].row = rows_a;
new_b[totalb+1].row = cols_b;
new_b[totalb+1].col = 0;

for (i = 1; i <= totala; ) { at most rows_a times


column = new_b[1].row;
for (j = 1; j <= totalb+1;) {
/* mutiply row of a by column of b */
if (a[i].row != row) {
storesum(d, &totald, row, column, &sum);
i = row_begin;
for (; new_b[j].row == column; j++) b[0] 4 3 7
; a[0] 2 3 4 b[1] 0 1 5
column =new_b[j].row a[1] 0 0 15 b[2] 0 2 3
} a[2] 0 2 -1 b[3] 1 1 -1
a[3] 1 1 3 b[4] 2 0 9
a[4] 1 2 5 b[5] 2 2 1
b[6] 3 1 4
b[7] 3 2 5
else switch (COMPARE (a[i].col, new_b[j].col)) {
case -1: /* go to next term in a */
i++; break;
case 0: /* add terms, go to next term in a and b */
sum += (a[i++].value * new_b[j++].value);
break;
case 1: /* advance to next term in b*/
j++
}
} /* end of for j <= totalb+1 */
for (; a[i].row == row; i++)
;
row_begin = i; row = a[i].row;
} /* end of for i <=totala */
d[0].row = rows_a;
d[0].col = cols_b; d[0].value = totald;
}

*Program 2.10: Sparse matrix multiplication


void storesum(term d[ ], int *totald, int row, int column,
int *sum)
{
/* if *sum != 0, then it along with its row and column
position is stored as the *totald+1 entry in d */
if (*sum)
if (*totald < MAX_TERMS) {
d[++*totald].row = row;
d[*totald].col = column;
d[*totald].value = *sum;
}
else {
fprintf(stderr, ”Numbers of terms in product
exceed %d\n”, MAX_TERMS);
exit(1);
}
}
Program 2.11: storSum function
Analyzing the algorithm
cols_b * termsrow1 + totalb +
cols_b * termsrow2 + totalb +
…+
cols_b * termsrowp + totalb
= cols_b * (termsrow1 + termsrow2 + … + termsrowp) +
rows_a * totalb
= cols_b * totala + row_a * totalb

O(cols_b * totala + rows_a * totalb)


Compared with matrix
multiplication using array
for (i =0; i < rows_a; i++)
for (j=0; j < cols_b; j++) {
sum =0;
for (k=0; k < cols_a; k++)
sum += (a[i][k] *b[k][j]);
d[i][j] =sum;
}
O(rows_a * cols_a * cols_b) vs.
O(cols_b * total_a + rows_a * total_b)
optimal case: total_a < rows_a * cols_a
total_b < cols_a * cols_b
worse case: total_a --> rows_a * cols_a, or
total_b --> cols_a * cols_b
String
2.7; page 87 - 97
String
Usually string is represented as a character array.
General string operations include comparison, string
concatenation, copy, insertion, string matching, printing, etc.
String Matching:
Straightforward solution
Algorithm: Simple string matching e.g., String. indexAt

Input: P and T, the pattern and text strings; m, the length of P


and n, the length of T
The pattern is assumed to be nonempty.
Output: The return value is the index in T where a copy of P
begins, or -1 if no match for P is found.

Complexity: O(m*n)
Two Phases of KMP
Knuth, Morris, Pratt pattern matching algorithm

Phase 1:generate an array to indicate the moving direction.

Phase 2:make use of the array to move and match string


Example 1 for the KMP Algorithm
Example 2 Case for the KMP
Algorithm
The KMP Alogrithm

Failure array
String Matching The Knuth-
Morris-Pratt Algorithm

or 0
0
The prefix function, Π
Following pseudocode computes the prefix fucnction, Π:
Compute-Prefix-Function (p)
1 m  length[p] //’p’ pattern to be matched
2 Π[1]  0
3 k0
4 for q  2 to m
5 do while k > 0 and p[k+1] != p[q]
6 do k  Π[k]
7 If p[k+1] = p[q]
8 then k  k +1
9 Π[q]  k
10 return Π
Example: compute Π for the pattern ‘p’ below:
p a b a b a c a
Initially: m = length(p) = 7
Π[1] = 0 (or -1)
k=0
q 1 2 3 4 5 6 7
Step 1: q = 2, k=0 p a b a b a c a
Π[2] = 0
Π 0 0

q 1 2 3 4 5 6 7
Step 2: q = 3, k = 0, p a b a b a c a
Π[3] = 1
Π 0 0 1

q 1 2 3 4 5 6 7
Step 3: q = 4, k = 1 p a b a b a c a
Π[4] = 2 Π 0 0 1 2
Step 4: q = 5, k =2 q 1 2 3 4 5 6 7
Π[5] = 3 p a b a b a c a
Π 0 0 1 2 3

Step 5: q = 6, k = 0 q 1 2 3 4 5 6 7
Π[6] = 0 p a b a b a c a
Π 0 0 1 2 3 0

q 1 2 3 4 5 6 7
Step 6: q = 7, k = 0
p a b a b a c a
Π[7] = 1
Π 0 0 1 2 3 0 1

After iterating 6 times, the prefix q 1 2 3 4 5 6 7


function computation is p a b a b a c a
complete: →
Π 0 0 1 2 3 0 1
The KMP Alogrithm (cont’d)
The KMP Matcher
int pmatch(char *string, char *pat)
{ /* Knuth, Morris, Pratt的字串樣式比對演算法 */
int i = 0, j = 0;
int lens = strlen(string);
int lenp = strlen(pat);
while ( i < lens && j < lenp ) {
if (string[i] == pat[j]) {
i++;
j++;
} else if (j = = 0)
i++;
else
j = failure[j-1]+1;
}
return ( (j == lenp) ? (i-lenp) : -1);
}
Illustration: given a String ‘S’ and pattern ‘p’ as follows:

S b a c b a b a b a b a c a c a

p a b a b a c a

Let us execute the KMP algorithm to find whether


‘p’ occurs in ‘S’.
For ‘p’ the prefix function, Π was computed previously and is as follows:

q 1 2 3 4 5 6 7

p a b a b a c a

Π 0 0 1 2 3 0 1
Failure function
Initially: n = size of S = 15;
m = size of p = 7 q 1 2 3 4 5 6 7
p a b a b a c a
Step 1: i = 1 Π 0 0 1 2 3 0 1
comparing p[1] with S[1]
S b a c b a b a b a b a c a a b

p a b a b a c a
P[1] does not match with S[1]. ‘p’ will be shifted one position to the right.

Step 2: i = 2
comparing p[1] with S[2]
S b a c b a b a b a b a c a a b

p a b a b a c a
P[1] matches S[2]. Since there is a match, p is not shifted.
Step 3: i = 3
Comparing p[2] with S[3] p[2] does not match with S[3]
S b a c b a b a b a b a c a a b

p a b a b a c a
Backtracking on p, comparing p[1] and S[3] (Π[1] + 1 = 0 + 1 = 1)
Step 4: i = 3
comparing p[1] with S[3] p[1] does not match with S[3]
S b a c b a b a b a b a c a a b

p a b a b a c a
Step 5: i = 4
comparing p[1] with S[4] p[1] does not match with S[4]

S b a c b a b a b a b a c a a b

p a b a b a c a
Step 6: i = 5
comparing p[1] with S[5] p[1] matches with S[5]

S b a c b a b a b a b a c a a b

p a b a b a c a

Step 7: i = 6
Comparing p[2] with S[6] p[2] matches with S[6]
S b a c b a b a b a b a c a a b

p a b a b a c a
Step 8: i = 7
Comparing p[3] with S[7] p[3] matches with S[7]

S b a c b a b a b a b a c a a b

p a b a b a c a

Step 9: i = 8
Comparing p[4] with S[8] p[4] matches with S[8]

S b a c b a b a b a b a c a a b

p a b a b a c a
Step 10: i = 9
Comparing p[5] with S[9] p[5] matches with S[9]
S b a c b a b a b a b a c a a b

p a b a b a c a

Step 11: i = 10
Comparing p[6] with S[10] p[6] doesn’t match with S[10]

S b a c b a b a b a b a c a a b

p a b a b a c a
Backtracking on p, comparing p[4] with S[10] because after mismatch q = Π[5] = 3
(Π[5] + 1 = 3 + 1 = 4
Step 12: i = 11
Comparing p[4] with S[10] p[4] matches with S[10]

S b a c b a b a b a b a c a a b

p a b a b a c a
Step 13: i = 12
Comparing p[5] with S[11] p[5] matches with S[11]

S b a c b a b a b a b a c a a b

p a b a b a c a

Step 14: i = 13
Comparing p[6] with S[12] p[6] matches with S[12]

S b a c b a b a b a b a c a a b

p a b a b a c a
Step 15: i = 13
Comparing p[7] with S[13] p[7] matches with S[13]

S b a c b a b a b a b a c a a b

p a b a b a c a

Pattern ‘p’ has been found to completely occur in string


‘S’.
The analysis of the K.M.P.
Algorithm
O(m+n)
O(m) for computing function f
O(n) for searching P

You might also like