Data Structures for
VB programmers
Francesco Balena
Agenda
.NET basic types
Object, String, StringBuilder, Char classes
Arrays
creating, sorting, clearing, copying, searching arrays
Collections
BitArray, Stack, Queue, ArrayList, StringCollection, HashTable,
SortedList classes
Regular Expressions
2
The System.Object class
All .NET types inherit from System.Object
the only exception is interfaces
exposes five public members and two protected members
Equals: virtual method that returns True if current object
has same value as passed argument (often overridden)
GetHashCode: virtual method that returns a hash code for
this object (used when object is a key in a collection;
equal objects return the same code, but codes aren't unique)
GetType: a non-virtual method that returns a System.Type
that identifies the type of current object (Reflection)
ToString: a virtual method that returns the name of the
class (often overridden to provide textual representation)
ReferenceEquals: shared method that returns True if two references
point to the same object (same as the default Equals, but can't be overridden)
MemberwiseClone and Finalize protected methods
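The members listed above can be seen at work in a small sketch. The Point class is hypothetical and exists only for illustration; it overrides the three virtual members and shows how Equals differs from ReferenceEquals.

```vbnet
Imports System

' Hypothetical Point class, used only for illustration.
Class Point
    Public X, Y As Integer

    Sub New(ByVal px As Integer, ByVal py As Integer)
        Me.X = px
        Me.Y = py
    End Sub

    ' Two points are equal when their coordinates match.
    Public Overloads Overrides Function Equals(ByVal obj As Object) As Boolean
        If Not TypeOf obj Is Point Then Return False
        Dim other As Point = CType(obj, Point)
        Return X = other.X AndAlso Y = other.Y
    End Function

    ' Equal objects must return the same hash code.
    Public Overrides Function GetHashCode() As Integer
        Return X Xor Y
    End Function

    Public Overrides Function ToString() As String
        Return "(" & X.ToString() & ", " & Y.ToString() & ")"
    End Function
End Class

Module ObjectDemo
    Sub Main()
        Dim p1 As New Point(1, 2)
        Dim p2 As New Point(1, 2)
        Console.WriteLine(p1.Equals(p2))                   ' => True
        Console.WriteLine(Object.ReferenceEquals(p1, p2))  ' => False
        Console.WriteLine(p1.ToString())                   ' => (1, 2)
        Console.WriteLine(p1.GetType().Name)               ' => Point
    End Sub
End Module
```

Overriding Equals without overriding GetHashCode would break the class when it is used as a key in a Hashtable, which is why the two usually go together.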
3
The String class
An immutable sequence of characters
Read-only properties
Length same as Len(s), Chars(i) same as Mid(s, i + 1, 1) (the index is zero-based)
Padding methods
PadLeft(len) same as RSet, PadRight(len) same as LSet
Conversion and trimming methods
ToLower, ToUpper, TrimStart, TrimEnd, Trim
Search methods
IndexOf, LastIndexOf, StartsWith, EndsWith
Replace works with single chars or substrings
only case-sensitive comparisons
Substring methods
Substring, Insert, Remove
Mid command
4
String class
' Insert a substring
Dim s As String = "ABCDEFGHIJK"
s = s.Insert(3, "1234") ' => ABC1234DEFGHIJK
' trim leading whitespaces
Dim cArr() As Char = { " "c, Chr(9) }
s = s.TrimStart(cArr)
' pad with leading zeros
s = "12345".PadLeft(10, "0"c) ' => 0000012345
' compare two strings in case-insensitive mode
If String.Compare(s1, s2, True) = 0 Then ok = True
' strings support enumeration in For Each loops
Dim s As String = "ABCDE"
Dim c As Char
For Each c In s
Debug.Write(c & ".") ' => A.B.C.D.E.
Next
5
The Char class
Represents a single character
Exposes several useful shared methods for testing a
character
IsControl, IsDigit, IsLetter, IsLetterOrDigit, IsLower,
IsNumber, IsPunctuation, IsSymbol, IsUpper, and
IsWhiteSpace
these methods take a single char or a string+index
' Check an individual Char value
Debug.Write(Char.IsDigit("1"c)) ' => True
' Check the N-th character in a string
Debug.Write(Char.IsDigit("A123", 0)) ' => False
6
The StringBuilder class
The System.Text.StringBuilder class stores strings
uses an internal buffer that grows automatically
better performance than System.String, fewer GCs
also useful when passing strings to API functions
Conceptually similar to VB6 fixed-length strings
no need to discard trailing spaces
the result is never truncated
Default initial capacity is 16 chars, but can be overridden
Dim sb As New StringBuilder(1000)
Supports most string methods
Insert, Remove, Replace, ...
The Append(s) method
The Length property returns current internal length
7
The StringBuilder class
Imports System.Text
' a StringBuilder object with initial capacity of 1000 chars
Dim sb As New StringBuilder(1000)
' Create a comma-delimited list of the first 100 integers.
Dim n As Integer
For n = 1 To 100
' Note that two Append methods avoid a slow & operator
sb.Append(n)
sb.Append(",")
Next
Debug.WriteLine(sb) ' => 1,2,3,4,5,6,...
Debug.WriteLine("Length is " & sb.Length.ToString) ' => 292
8
Arrays
9
The Array class
Instance read-only properties
Rank returns the number of dimensions
Length returns the total number of elements
Instance methods
GetLength(n) number of elements along a given dimension
GetLowerBound(n), GetUpperBound(n)
like LBound, UBound, except n is zero-based, not 1-based
Clone method creates a copy
does a shallow copy (as opposed to a deep copy)
The Array.Reverse method reverses the order
Reverse(arr) or Reverse(arr, start, length) variations
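A minimal sketch of the properties and methods listed above; the array names and values are made up for the example.

```vbnet
Imports System

Module ArrayDemo
    Sub Main()
        ' A two-dimensional array with 3 rows and 4 columns
        Dim matrix(2, 3) As Integer
        Console.WriteLine(matrix.Rank)              ' => 2
        Console.WriteLine(matrix.Length)            ' => 12
        Console.WriteLine(matrix.GetLength(0))      ' => 3
        Console.WriteLine(matrix.GetUpperBound(1))  ' => 3

        ' Clone returns Object, so cast the result back
        Dim arr() As Integer = {1, 2, 3, 4, 5}
        Dim copy() As Integer = CType(arr.Clone(), Integer())
        Array.Reverse(copy)
        Console.WriteLine(copy(0))                  ' => 5
        Console.WriteLine(arr(0))                   ' => 1 (original is untouched)

        ' Reverse only the three elements starting at index 1
        Array.Reverse(arr, 1, 3)
        Console.WriteLine(arr(1))                   ' => 4
    End Sub
End Module
```

Because Clone does a shallow copy, an array of objects would share its elements with the clone; only the array itself is duplicated.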
10
Sorting arrays
Array.Sort shared method overloaded variations
Sort(arr)
Sort(arr, start, length) to sort subarrays
Indexed sorts
Sort(keys, arr) uses a parallel array that holds the keys
Sort(keys, arr, start, length) does it with a subarray
the keys array is sorted as well
The Sort method honors the IComparable interface
All the above can take an additional IComparer argument
Dim Employees() As Employee = { ... }
Dim HireDates() As Date
' ... fill HireDates() with data from employees ...
Array.Sort(HireDates, Employees, 0, N)
11
IComparable interface
A class implements this interface to make itself sortable
The CompareTo(obj) method
returns -1 (or any negative value) if current object is less than argument
returns 0 if current object is equal to argument
returns +1 (or any positive value) if current object is greater than argument
Simple to implement
Supports only one sort key
Dim Persons() As Person = { New Person("John", "Smith"), _
New Person("Robert", "Doe"), New Person("Joe", "Doe") }
Array.Sort(Persons)
Dim p As Person
For Each p In Persons
Debug.WriteLine(p.ReverseName)
Next
12
IComparable interface
Class Person
Implements IComparable
Public FirstName, LastName As String
Sub New(ByVal FName As String, ByVal LName As String)
Me.FirstName = FName
Me.LastName = LName
End Sub
Function ReverseName() As String
Return LastName & ", " & FirstName
End Function
Private Function CompareTo(ByVal Obj As Object) As Integer _
Implements IComparable.CompareTo
Dim other As Person = CType(Obj, Person)
Return StrComp(ReverseName, other.ReverseName, _
CompareMethod.Text)
End Function
End Class
13
IComparer interface
A class implements this interface to work as a comparer for
two objects
several .NET methods take an IComparer argument
typically a class can contain one or more nested classes that
implement this interface, one class for each sort criterion
The Compare(obj1, obj2) method returns -1, 0, or 1
Compare is a reserved word in VB.NET, use [Compare]
The CaseInsensitiveComparer class in
System.Collections
' Sort the array on name
Array.Sort(Persons, New Person.CompareByName)
' Sort the array on reversed name
Array.Sort(Persons, New Person.CompareByReverseName)
14
IComparer interface
Class Person
' ...
Class CompareByName
Implements IComparer
Function ComparePers(ByVal o1 As Object, ByVal o2 As Object) _
As Integer Implements IComparer.Compare
Dim p1 As Person = CType(o1, Person)
Dim p2 As Person = CType(o2, Person)
Return StrComp(p1.CompleteName, p2.CompleteName, _
CompareMethod.Text)
End Function
End Class
Class CompareByReverseName
Implements IComparer
' note how you can avoid name conflicts
Function [Compare](ByVal o1 As Object, ByVal o2 As Object) _
As Integer Implements IComparer.Compare
' note the on-the-fly casts
Return StrComp(CType(o1, Person).ReverseName, _
CType(o2, Person).ReverseName, CompareMethod.Text)
End Function
End Class
End Class
15
Deleting and inserting elements
Array.Copy shared method works inside the same array
can be used to shift elements in both directions (smart copy)
' Generic routines for deleting and inserting elements
Sub ArrayDeleteItem(ByVal arr As Array, ByVal ndx As Integer)
' Shift elements from arr(ndx+1) to arr(ndx)
Array.Copy(arr, ndx + 1, arr, ndx, arr.Length - ndx - 1)
' Clear the last element - works with any array type
Array.Clear(arr, arr.Length - 1, 1)
End Sub
Sub ArrayInsertItem(ByVal arr As Array, ByVal ndx As Integer)
' Shift elements from arr(ndx) to arr(ndx+1)
Array.Copy(arr, ndx, arr, ndx + 1, arr.Length - ndx - 1)
' Clear element at arr(ndx) - works with any array type
Array.Clear(arr, ndx, 1)
End Sub
16
Searching values
The Array.IndexOf method does case-sensitive searches
res = Array.IndexOf(arr, search)
res = Array.IndexOf(arr, search, start)
res = Array.IndexOf(arr, search, start, length)
returns the index of first match, -1 if not found
requires a loop to extract all matches
The Array.LastIndexOf method starts at the last element
Dim arr() As String = {"Robert", "Joe", "Ann", "Chris", "Joe"}
Dim Index As Integer = -1
Do
' Search next occurrence, exit if not found
Index = Array.IndexOf(arr, "Joe", Index + 1)
If Index = -1 Then Exit Do
Debug.WriteLine("Found at index " & Index.ToString)
Loop
17
Searching values
The Array.BinarySearch method works with sorted arrays
res = Array.BinarySearch(arr, search)
res = Array.BinarySearch(arr, start, length, search)
both forms support an additional IComparer argument
if not found, returns a negative value that is the bitwise complement (Not)
of the index where the element should be inserted
Dim Index As Integer = Array.BinarySearch(strArray, "David")
If Index >= 0 Then
Debug.WriteLine("Found at " & Index.ToString)
Else
' Complement the result (Not) to get the index for insertion point
Index = Not Index
Debug.WriteLine("Not Found. Insert at " & Index.ToString)
End If
18
Collections and Dictionaries
19
The System.Collections namespace
This namespace contains several collection-like classes
BitArray
Stack
Queue
ArrayList
Hashtable
SortedList
...
All these objects implement one or more of the following
interfaces
IEnumerable, ICloneable
ICollection, IList, IDictionary
All these objects can store Object (i.e. any type of value)
with the exception of BitArray
20
Main collection interfaces
ICollection interface: an unordered group of items
inherits from IEnumerable (For Each support)
Count read-only property
CopyTo copies data into a destination array
IList interface: items that can be individually indexed
inherits from ICollection (For Each support, Count, CopyTo)
Item(index) property sets or returns an element
Add(obj), Insert(index, obj) methods add an element
Remove(obj), RemoveAt(index), Clear remove elements
Contains(obj), IndexOf(obj) methods search an element
IDictionary interface: items that can be searched by a key
inherits from ICollection (For Each support, Count, CopyTo)
Item(key) property sets or returns an element
Add(key, obj) method adds an element
Remove(key), Clear methods remove one/all elements
Keys, Values properties return the ICollection of keys/values
Contains(key) method tests whether a key is there
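Coding against these interfaces rather than concrete classes lets one routine serve many collection types. The sketch below assumes nothing beyond System.Collections; the Describe helper is hypothetical.

```vbnet
Imports System
Imports System.Collections

Module InterfaceDemo
    ' Hypothetical helper: works with any ICollection
    ' (ArrayList, Hashtable, Queue, arrays, ...)
    Function Describe(ByVal col As ICollection) As String
        Return col.Count.ToString() & " item(s)"
    End Function

    Sub Main()
        Dim list As IList = New ArrayList()
        list.Add("Joe")
        list.Insert(0, "Ann")                    ' index first, then the value
        Console.WriteLine(list(0))               ' => Ann
        Console.WriteLine(list.IndexOf("Joe"))   ' => 1

        Dim dict As IDictionary = New Hashtable()
        dict.Add("Joe", 12000)
        dict("Ann") = 13000                      ' Item creates the element
        Console.WriteLine(dict.Contains("Ann"))  ' => True

        Console.WriteLine(Describe(list))        ' => 2 item(s)
        Console.WriteLine(Describe(dict))        ' => 2 item(s)
    End Sub
End Module
```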
21
The BitArray class
The BitArray class holds Boolean values in a compact form
Create the array, optionally initialize all items
Dim ba As New BitArray(1000)
Dim ba As New BitArray(1000, True)
Initialize from a Boolean array or another BitArray
Dim ba As New BitArray(boolArr)
Get and Set methods to read/write elements
SetAll(bool) method sets all the elements
Not method negates all elements
And(ba), Or(ba) combine values with another BitArray
CopyTo(dest, index) copies to a compatible array
e.g. Boolean, Byte, Integer
22
The BitArray class
' An array of 1000 values set to True
Dim ba As New BitArray(1000, True)
' Set element at index 100 and read it back.
ba.Set(100, True)
Debug.Write(ba.Get(100)) ' => True
' Copy all the elements to an Integer array (32 bits per Integer).
Dim intArr(1000) As Integer
' 2nd arg = index where the copy begins in target array.
ba.CopyTo(intArr, 0)
' AND all the bits with the complement of bits in ba2
' (note that Not also modifies ba2 itself).
ba1.And(ba2.Not)
' count how many True values are in the BitArray
Dim b As Boolean, Count As Integer
For Each b In ba
If b Then Count += 1
Next
Debug.Write("Found " & Count.ToString & " True values.")
23
The Stack class
The Stack class implements a Last-In-First-Out (LIFO)
structure
can specify the initial capacity in the constructor (the stack grows as needed)
Dim st As New Stack(1000)
Push(obj) method pushes a value on the stack
Pop method pops it
Peek method returns the top-on-stack without popping it
Count returns the number of items in the stack
Contains(obj) checks whether a value is on the stack
24
The Stack class
' Create a stack with an initial capacity of 100 elements
Dim st As New Stack(100)
' Push three values onto the stack
st.Push(10)
st.Push(20)
st.Push(30)
' Pop the value on top of the stack and display it
Debug.WriteLine(st.Pop) ' => 30
' Read the TOS without popping it
Debug.WriteLine(st.Peek) ' => 20
' Now pop it
Debug.WriteLine(st.Pop) ' => 20
' Determine how many elements are now in the stack
Debug.WriteLine(st.Count) ' => 1
Debug.WriteLine(st.Pop) ' => 10
' Is the value 10 somewhere in the stack?
If st.Contains(10) Then Debug.Write("Found")
25
The Queue class
The Queue class implements a First-In-First-Out (FIFO)
structure
you can specify an initial capacity in the constructor
plus an optional growth factor (default is 2)
Dim qu As New Queue(1000, 1.5)
Enqueue(obj) method inserts a new value
Dequeue method extracts the next value
Peek method reads the next value but doesn't dequeue it
Count property returns the number of elements in the
queue
Clear method clears the queue contents
Contains(obj) checks whether a value is in the queue
26
The Queue class
Dim qu As New Queue(100)
' Insert three values in the queue
qu.Enqueue(10)
qu.Enqueue(20)
qu.Enqueue(30)
' Extract the first value and display it
Debug.WriteLine(qu.Dequeue) ' => 10
' Read the next value but don't extract it
Debug.WriteLine(qu.Peek) ' => 20
' Extract it
Debug.WriteLine(qu.Dequeue) ' => 20
' Check how many items are still in the queue
Debug.WriteLine(qu.Count) ' => 1
' Extract the last element, check that the queue is now empty
Debug.WriteLine(qu.Dequeue) ' => 30
Debug.WriteLine(qu.Count) ' => 0
27
The ArrayList class
The ArrayList class can be considered the combination of
the Array and Collection classes
as an array: dim, reference by index, sort, reverse, search
as a collection: append and insert elements, remove them
Specify an initial capacity in the constructor (default is 16)
Dim al As New ArrayList(1000)
can also change by assigning to the Capacity property
al.Capacity = 2000
you can't control the growth factor (it is always 2)
the TrimToSize method shrinks the ArrayList
ArrayList implements IList
Add, Insert, Remove, RemoveAt, Clear, etc.
The ArrayList.Repeat method creates an ArrayList and
initializes its elements
Dim al As ArrayList = ArrayList.Repeat("", 100)
28
The ArrayList class
' An ArrayList with 100 elements equal to a null string
Dim al As ArrayList = ArrayList.Repeat("", 100)
al.Add("Joe")
al.Add("Ann")
al.Insert(0, "Robert") ' Insert at the beginning of the list
al.RemoveAt(0) ' Remove the first element ("Robert")
' Remove removes only one element - Use a loop to remove all
' (Remove doesn't throw if the element isn't found, so test IndexOf)
Dim ndx As Integer = al.IndexOf("element to remove")
Do While ndx >= 0
al.RemoveAt(ndx)
ndx = al.IndexOf("element to remove", ndx)
Loop
' More concise, but less efficient, solution
Do While al.Contains("element to remove")
al.Remove("element to remove")
Loop
29
The ArrayList class
AddRange(sourceCol) method appends all the elements in
an ICollection object to the current ArrayList
InsertRange(destIndex, sourceCol) inserts all the
elements in an ICollection object into the current ArrayList
RemoveRange(index, count) removes multiple elements
ToArray(type) returns an array of specified type
CopyTo(sourceIndex, destArr, destIndex, count) copies a
subset of the ArrayList into a compatible array
ArrayList.Adapter(ilist) creates an ArrayList wrapper for
an IList-only object
useful for sorting, searching, reversing, etc.
any changes are reflected in the inner IList object
30
The ArrayList class
' Extract all the elements to an Object array (never throws)
Dim objArr() As Object = al.ToArray()
' Extract elements to a String array (might throw)
Dim strArr() As String = CType(al.ToArray(GetType(String)), String())
' Copy items [10-100], starting at element 100 in strArr
al.CopyTo(10, strArr, 100, 91)
' Combine two ArrayLists into a third one
joinedAl = New ArrayList(al1.Count + al2.Count)
joinedAl.AddRange(al1)
joinedAl.AddRange(al2)
' copy items from a ListBox into another ListBox, sorted in reverse order
ListBox2.Items.AddRange(ListBox1.Items)
Dim ad As ArrayList = ArrayList.Adapter(ListBox2.Items)
ad.Sort
ad.Reverse
31
The StringCollection class
Dim sc As New System.Collections.Specialized.StringCollection
' Fill it with month names, in one operation
' (MonthNames returns a String array, which implements IList)
sc.AddRange(DateTimeFormatInfo.CurrentInfo.MonthNames())
Dim s As String ' Display the contents
For Each s In sc
Debug.Write(s)
Next
' A temporary ArrayList that wraps around the StringCollection
Dim al As ArrayList = ArrayList.Adapter(sc)
' Sort the inner StringCollection through the wrapper
al.Sort
al.Reverse
' Destroy the wrapper object, which isn't necessary any more
al = Nothing
32
The Hashtable class
The Hashtable class is a series of (key, value) pairs
implements IDictionary
similar to VB6 Dictionary, but the key can be anything
uses key's GetHashCode to derive the hash code
an initial capacity can be specified in the constructor
The load factor
the ratio between existent values and available buckets
the lower it is, the more memory is used, the fewer collisions
set in the constructor (the value 1 is "good enough")
Dim ht As New Hashtable ' Default capacity and load factor
Dim ht As New Hashtable(1000) ' Initial capacity
' Specified initial capacity and custom load factor.
Dim ht As New Hashtable(1000, 0.8) ' custom capacity and l.f.
' Decrease the load factor of the current Hashtable
ht = New Hashtable(ht, 0.5)
33
The Hashtable class
ht.Add("Joe", 12000) ' Syntax is .Add(key, value)
ht.Add("Ann", 13000)
' Referencing a new key creates an element
ht.Item("Robert") = 15000
ht("Chris") = 11000 ' Item is default member
ht("Ann") = CInt(ht("Ann")) + 1000
' keys are compared in case-sensitive mode
ht("ann") = 15000 ' creates a *new* element
' Reading a non-existing element doesn't create it.
Debug.WriteLine(ht("Lee")) ' returns Nothing
' Remove an element given its key
ht.Remove("Chris")
' How many elements are now in the hashtable?
Debug.WriteLine(ht.Count) ' => 4
' Adding an element that exists already throws an exception
ht.Add("Joe", 11500) ' ArgumentException
34
The Hashtable class
Enumerating the Hashtable
at each iteration you get a DictionaryEntry object
this object exposes a Key and Value property
Keys and Values properties return an ICollection object
' Display keys and values
Dim e As DictionaryEntry
For Each e In ht
Debug.Write("Key=" & e.Key.ToString)
Debug.WriteLine(" Value=" & e.Value.ToString)
Next
' Display all the keys in the hashtable
Dim o As Object
For Each o In ht.Keys ' use ht.Values to get values
Debug.WriteLine(o)
Next
35
The Hashtable class
Elements aren't sorted in any way
you can use anything as a key, including numbers
but ht(0) isn't the first element in the Hashtable
this is the main difference with the ArrayList
Comparisons are case-sensitive
can create a case-insensitive Hashtable as follows
Imports System.Collections
Dim ht As Hashtable = _
Specialized.CollectionsUtil.CreateCaseInsensitiveHashtable()
36
The SortedList class
The SortedList class is the most versatile in the group
all the features of the array and the collection objects
manages two internal arrays, for keys and for values
these arrays grow as required
values are kept sorted by their key
the key must support IComparable
Several constructors are available
can specify an initial capacity (default is 16)
can initialize with the values in an IDictionary (e.g. Hashtable)
Dim sl As New SortedList(1000) ' initial capacity
' initialize with all the elements in an IDictionary object
Dim ht As New Hashtable()
ht.Add("Robert", 100): ht.Add("Ann", 200): ht.Add("Joe", 300)
Dim sl As New SortedList(ht)
37
The SortedList class
Affect the sort order by providing an IComparer object
Class ReverseStringComparer
Implements IComparer
Function CompareValues(x As Object, y As Object) _
As Integer Implements IComparer.Compare
' Just change the sign of the Compare method
Return -String.Compare(x.ToString, y.ToString)
End Function
End Class
' A SortedList that sorts through a custom IComparer
Dim sl As New SortedList(New ReverseStringComparer)
' A SortedList that loads all the elements in a Hashtable and
' sorts them with a custom IComparer object
Dim sl As New SortedList(ht, New ReverseStringComparer)
38
The SortedList class
Other methods
ContainsKey(obj) checks whether a key exists
ContainsValue(obj) checks whether a value exists
GetKey(index) returns a key at given position
GetByIndex(index) returns a value at given position
IndexOfKey(obj) returns the index of a given key
IndexOfValue(obj) returns the index of a given value
SetByIndex(index, obj) assigns a new value to an item
similar to Item, but works with the index instead of key
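The index-based methods listed above are easier to follow in a short sketch; the names and numbers are illustrative only.

```vbnet
Imports System
Imports System.Collections

Module SortedListDemo
    Sub Main()
        Dim sl As New SortedList()
        sl.Add("Robert", 100)
        sl.Add("Ann", 200)
        sl.Add("Joe", 300)

        ' Items are kept sorted by key: Ann, Joe, Robert
        Console.WriteLine(sl.GetKey(0))            ' => Ann
        Console.WriteLine(sl.GetByIndex(0))        ' => 200
        Console.WriteLine(sl.IndexOfKey("Joe"))    ' => 1
        Console.WriteLine(sl.IndexOfValue(100))    ' => 2
        Console.WriteLine(sl.ContainsKey("Lee"))   ' => False

        ' Change the value stored at position 1 (the "Joe" item)
        sl.SetByIndex(1, 350)
        Console.WriteLine(sl("Joe"))               ' => 350
    End Sub
End Module
```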
39
String collections and dictionaries
StringCollection is a low-overhead collection of strings
in the System.Collections.Specialized namespace
Most properties and methods of ArrayList
Item, Count, Clear, Add, AddRange, Insert, Remove,
RemoveAt, IndexOf, Contains, and CopyTo
Capacity is missing, constructor takes no arguments
use ArrayList.Adapter to implement missing functionality
StringDictionary is a low-overhead dictionary of strings
Add, Remove, Clear, ContainsKey, ContainsValue, CopyTo
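A brief sketch of StringDictionary with made-up keys and values. Unlike Hashtable, it treats keys as case-insensitive.

```vbnet
Imports System
Imports System.Collections.Specialized

Module StringDictionaryDemo
    Sub Main()
        Dim sd As New StringDictionary()
        sd.Add("Joe", "Sales")
        sd("Ann") = "Marketing"
        ' StringDictionary compares keys in case-insensitive mode
        Console.WriteLine(sd.ContainsKey("JOE"))          ' => True
        Console.WriteLine(sd.ContainsValue("Marketing"))  ' => True
        Console.WriteLine(sd("ann"))                      ' => Marketing
        sd.Remove("Joe")
        Console.WriteLine(sd.Count)                       ' => 1
    End Sub
End Module
```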
40
Custom collections
41
Inheriting new collections
The easiest way to create a new collection class is inheriting
from an abstract class provided by the runtime
they implement much of the functionality, you just add the
missing pieces
ReadOnlyCollectionBase class: read-only collections
can refer to inner ArrayList through the InnerList property
only need to create the collection in the constructor and
implement the Item property (no need for Add and Remove)
can be read-only or read-write
CollectionBase class: strongly-typed collections
can refer to inner ArrayList through the InnerList property
must implement only Add and Item members
may add a Create method that works as a constructor
DictionaryBase class: strongly-typed dictionary
can refer to inner dictionary through the Dictionary property
must implement only Add and Item members
42
ReadOnlyCollectionBase class
Class PowersOfTwoCollection
Inherits System.Collections.ReadOnlyCollectionBase
Sub New(ByVal MaxExponent As Integer)
MyBase.New()
' Fill the inner ArrayList object
Dim Index As Integer
For Index = 0 To MaxExponent
InnerList.Add(2 ^ Index)
Next
End Sub
' Support for the Item element (read-only)
Default ReadOnly Property Item(ByVal ndx As Integer) As Long
Get
Return CLng(InnerList.Item(ndx))
End Get
End Property
End Class
43
CollectionBase class
' A collection object that can only store Square objects
Class SquareCollection
Inherits System.Collections.CollectionBase
Sub Add(ByVal Value As Square) ' strongly-typed Add
InnerList.Add(Value)
End Sub
Default Property Item(ByVal Index As Integer) As Square
Get
Return CType(InnerList.Item(Index), Square)
End Get
Set(ByVal Value As Square)
InnerList.Item(Index) = Value
End Set
End Property
Function Create(ByVal Side As Single) As Square
Create = New Square(Side)
Add(Create)
End Function
End Class
44
DictionaryBase class
Class SquareDictionary
Inherits System.Collections.DictionaryBase
Sub Add(ByVal Key As String, ByVal Value As Square)
Dictionary.Add(Key, Value)
End Sub
Function Create(Key As String, Side As Single) As Square
Create = New Square(Side)
Dictionary.Add(Key, Create)
End Function
Default Property Item(ByVal Key As String) As Square
Get
Return CType(Dictionary.Item(Key), Square)
End Get
Set(ByVal Value As Square)
Dictionary.Item(Key) = Value
End Set
End Property
End Class
45
IEnumerable and IEnumerator
interfaces
Together they add support for enumeration (For Each)
IEnumerable.GetEnumerator returns an object that
supports IEnumerator
IEnumerator exposes three members
MoveNext() As Boolean, returns True if there are more items
Current, returns the current value
Reset, resets the internal pointer
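To show how the three members cooperate, here is a sketch that drives an enumerator by hand, which is roughly what a For Each loop does behind the scenes. An ArrayList is used only because it is a convenient IEnumerable source.

```vbnet
Imports System
Imports System.Collections

Module EnumeratorDemo
    Sub Main()
        Dim al As New ArrayList()
        al.Add("Joe")
        al.Add("Ann")

        ' MoveNext must be called before reading the first item
        Dim en As IEnumerator = al.GetEnumerator()
        Do While en.MoveNext()
            Console.WriteLine(en.Current)
        Loop

        ' Reset rewinds the enumerator to just before the first item
        en.Reset()
        en.MoveNext()
        Console.WriteLine(en.Current)   ' => Joe
    End Sub
End Module
```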
46
IEnumerable and IEnumerator
interfaces
Class PowersOfTwo
Implements IEnumerable
Dim MaxValue As Integer
Sub New(ByVal MaxValue As Integer)
Me.MaxValue = MaxValue
End Sub
Function GetEnumerator() As IEnumerator _
Implements IEnumerable.GetEnumerator
Return New PowersOfTwoEnumerator(MaxValue)
End Function
End Class
47
IEnumerable and IEnumerator
interfaces
Class PowersOfTwoEnumerator
Implements IEnumerator
Dim MaxValue, CurValue As Integer
Sub New(ByVal MaxValue As Integer)
Me.MaxValue = MaxValue
Reset()
End Sub
Function MoveNext() As Boolean Implements IEnumerator.MoveNext
CurValue += 1
Return (CurValue <= MaxValue)
End Function
Sub Reset() Implements IEnumerator.Reset
CurValue = 0
End Sub
ReadOnly Property Current() As Object _
Implements IEnumerator.Current
Get
Return 2 ^ CurValue
End Get
End Property
End Class
48
Regular Expressions
49
Introduction to regular expressions
Regular expression classes provide you with a powerful
means to parse and process text files
replace or delete substrings
fill "gaps" in the String.Replace function
parse log files
extract information out of HTML files
The System.Text.RegularExpressions namespace contains
all the classes
Extremely sophisticated algorithms
the parser creates internal byte-code
can be optionally compiled to MSIL (and then to native code!)
comparable to Awk and Perl languages
50
The Regex hierarchy
Regex
Matches method
MatchCollection
Match
Groups property
Group
Captures property
CaptureCollection
Capture
51
The Regex hierarchy
The Regex class represents an immutable regular
expression
a pattern that can be later applied to the parsed string
this is the root of the class hierarchy
the Matches method returns a MatchCollection
the Replace method returns a modified string
also available as shared methods
so that you don't have to instantiate a Regex object
The MatchCollection class is the collection of matches
The Match class represents an individual match
main properties: Value, Index, Length
The Group class represents a group of chars in a match
a pattern defines groups using ( )
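The hierarchy can be walked top-down in a few lines. This is a sketch with a made-up pattern and source string: Regex produces a MatchCollection, each Match exposes its Groups, and each Group carries a Value.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexHierarchyDemo
    Sub Main()
        ' Two groups: a variable name and the number assigned to it
        Dim re As New Regex("(\w+)\s*=\s*(\d+)")
        Dim m As Match
        For Each m In re.Matches("x = 10: total=250")
            ' Groups(0) is the whole match, Groups(1) and Groups(2) the ( ) groups
            Console.WriteLine(m.Groups(1).Value & " -> " & m.Groups(2).Value & _
                " (match at index " & m.Index.ToString() & ")")
        Next
        ' => x -> 10 (match at index 0)
        ' => total -> 250 (match at index 8)
    End Sub
End Module
```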
52
Enumerating matches
' This regular expression defines any group of 2 characters,
' consisting of a vowel followed by a digit (\d)
Dim re As New Regex("[aeiou]\d")
' This source string contains 3 matches for the Regex
Dim source As String = "a1 = a1 & e2"
' Get the collection of matches.
Dim mc As MatchCollection = re.Matches(source)
' How many occurrences did we find?
Debug.Write(mc.Count) ' => 3
' Enumerate the occurrences
Dim m As Match
For Each m In mc
' Display text and position of this match
Debug.Write(m.Value & " at index " & m.Index.ToString)
Next
53
Replacing substrings
Dim source As String = "a1 = a1 & e2"
' Search for the "a" character followed by a digit
Dim re As New Regex("a\d")
' Drop the digit that follows the "a" character
Debug.Write(re.Replace(source, "a")) ' => a = a & e2
' same as above, but doesn't instantiate a Regex object
Debug.Write(Regex.Replace("a1 = a1 & e2", "a\d", "a"))
54
Character escapes
The regex language contains several constructs that can
represent special characters, groups of characters,
repeated characters or words, etc.
Character escapes match single (non printable) characters
or characters with special meaning
special characters are .$^{[(|)*+?\
any non-special character matches itself
\t tab (9)
\r carriage return (13)
\n newline (10)
\x20 any character in hex notation (\x20 = space)
\$ backslash is an escape character (\$ = dollar )
Example
\t\$\x20 matches a tab, followed by the $ symbol and a
space
55
Character classes
Character classes match a single character from a group
. the dot matches any character (except newline)
[aio] any character in square brackets
[^aio] any character not in square brackets
[A-Za-z] any character in a range
[^A-Z] any character not in a range
\w a word character, same as [a-zA-Z_0-9]
\W a non-word character, same as [^a-zA-Z_0-9]
\d a digit, same as [0-9]
\D a non-digit, same as [^0-9]
\s a whitespace char, same as [ \f\n\r\t\v]
\S a non-whitespace char, same as [^ \f\n\r\t\v]
Example
\s[aeiou]\w\w\W matches a space, followed by a 3-char
word, whose first char is a vowel
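A sketch that tries the example pattern above on a made-up sentence, plus a one-line use of \D to strip everything except digits.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CharClassDemo
    Sub Main()
        Dim m As Match
        For Each m In Regex.Matches("see an old tree", "\s[aeiou]\w\w\W")
            ' The match includes the surrounding separators, so trim them
            Console.WriteLine(m.Value.Trim() & " at index " & m.Index.ToString())
        Next
        ' => old at index 6

        ' \D matches any non-digit, so replacing it leaves only digits
        Console.WriteLine(Regex.Replace("Tel. 555-1234", "\D", ""))   ' => 5551234
    End Sub
End Module
```

Note that "an" is not matched: it has only two characters, so the third \w fails on the space that follows it.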
56
Atomic zero-width assertions
Atomic zero-width assertions indicate a special position in
the string where the pattern is to be searched
they don't consume characters
^ the beginning of the string
$ the end of the string, before the final \n char if present
\b the word boundary, between a \w and a \W char
\B not on a word boundary
The behavior may change when in multiline mode
^ and $ stand for the beginning/end of individual line
\A and \Z stand for the beginning/end of the entire string
Examples
abc$ matches the "abc" chars at the end of the string/line
\bDim\b matches the "Dim" word
\bDim\B matches all words that begin with "Dim" and have 4
or more chars
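The assertions above can be checked by counting matches. This is a sketch on a made-up two-line string; vbCrLf comes from the Microsoft.VisualBasic namespace, which VB projects import by default.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AssertionDemo
    Sub Main()
        Dim source As String = "Dim x" & vbCrLf & "Dimension y"

        ' Only the standalone "Dim" word
        Console.WriteLine(Regex.Matches(source, "\bDim\b").Count)   ' => 1
        ' Only the "Dim" that begins a longer word ("Dimension")
        Console.WriteLine(Regex.Matches(source, "\bDim\B").Count)   ' => 1

        ' Without Multiline, ^ matches only at the start of the string
        Console.WriteLine(Regex.Matches(source, "^Dim").Count)      ' => 1
        ' With Multiline, ^ also matches at the start of each line
        Console.WriteLine(Regex.Matches(source, "^Dim", _
            RegexOptions.Multiline).Count)                          ' => 2
    End Sub
End Module
```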
57
Quantifiers
A quantifier adds optional quantity data to a regular expr
applies to the character, character class, or group that
immediately precedes it
* zero or more matches, same as {0,}
+ one or more matches, same as {1,}
? zero or one matches, same as {0,1}
{N} exactly N matches
{N,} N or more matches
{N,M} between N and M matches
Examples
\bA\w* matches a word that begins with "A"
\b[aeiou]+\b matches a word that contains only vowels
\d{3,5} matches a run of 3 to 5 digits (e.g. 100-99999)
\d*\.?\d* matches a decimal number (eg 123.45)
\d*\.?\d*[EeDd]?\d* matches an exponential number
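A sketch of how some of these quantifiers behave on a made-up sentence. ShowMatches is a hypothetical helper that brackets each match. Note how \d{3,5} happily matches the first five digits of a longer number unless word boundaries are added.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuantifierDemo
    ' Hypothetical helper: returns all the matches, each one in brackets
    Function ShowMatches(ByVal source As String, ByVal pattern As String) As String
        Dim result As String = ""
        Dim m As Match
        For Each m In Regex.Matches(source, pattern)
            result &= "[" & m.Value & "]"
        Next
        Return result
    End Function

    Sub Main()
        Dim source As String = "Ann has 12 apples, 345 pears and 1234567 grapes"
        Console.WriteLine(ShowMatches(source, "\bA\w*"))       ' => [Ann]
        Console.WriteLine(ShowMatches(source, "\d{3,5}"))      ' => [345][12345]
        Console.WriteLine(ShowMatches(source, "\b\d{3,5}\b"))  ' => [345]
    End Sub
End Module
```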
58
Grouping constructs
You can group a subexpression between ( )
necessary to apply quantifiers to the entire subexpression
(the )+ matches "the the "
groups are automatically numbered, starting at 1
You can use \N to back-reference group number N
\b\w*(\w)\1\w*\b matches words that contain a repeated
char
You can assign names to groups with (?<name>expr)
and later reference them with \k<name>
\b(?<char>\w)\w+\k<char>\b matches words that begin and
end with same char
Regex and Match objects expose the Groups collection
each Group object has Value, Index, and Length properties
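Both back-reference patterns above can be tried on short made-up sentences. The sketch reads the captured character through the Groups collection, by number in the first loop and by name in the second.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupDemo
    Sub Main()
        Dim m As Match

        ' Numbered group: \1 must repeat whatever (\w) captured
        For Each m In Regex.Matches("a little green apple", "\b\w*(\w)\1\w*\b")
            Console.WriteLine(m.Value & " repeats " & m.Groups(1).Value)
        Next
        ' => little repeats t
        ' => green repeats e
        ' => apple repeats p

        ' Named group: \k<char> must repeat the first character
        For Each m In Regex.Matches("that level is high", _
            "\b(?<char>\w)\w+\k<char>\b")
            Console.WriteLine(m.Value & " starts and ends with " & _
                m.Groups("char").Value)
        Next
        ' => that starts and ends with t
        ' => level starts and ends with l
        ' => high starts and ends with h
    End Sub
End Module
```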
59
Alternating constructs
Inside the parentheses you can define alternative constructs
by separating them with the | character
(the |a |an)+ matches a repeated article
^\s*[A-Za-z]\w* ?= .*$ matches a variable assignment
^\s*(Dim|Public|Private)\s+[A-Za-z]\w*\s+
As String\s*$ matches declarations of String variables
in last two cases, must use multiline mode
60
Positional constructs
(?=subexpr): the match must precede something
\w+(?=,) matches a word followed by a comma, without
matching the comma
(?!subexpr): the match must NOT precede something
\w+\b(?![,:;]) matches a word not followed by a comma, a
colon, or a semicolon - without matching whatever the
following punctuation symbol is
(?<=subexpr): the match must follow something
(?<=\d+[EeDd])\d+ matches the exponent part of a number
in exponential format (eg "45" in "123D45")
(?<!subexpr): the match must not follow something
(?<!,)\b\w+ matches a word that doesn't follow a comma
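A sketch that exercises the four positional constructs on made-up strings. ShowMatches is a hypothetical helper that brackets each match, which makes it easy to see that the lookahead and lookbehind text is never part of the result.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaroundDemo
    ' Hypothetical helper: returns all the matches, each one in brackets
    Function ShowMatches(ByVal source As String, ByVal pattern As String) As String
        Dim result As String = ""
        Dim m As Match
        For Each m In Regex.Matches(source, pattern)
            result &= "[" & m.Value & "]"
        Next
        Return result
    End Function

    Sub Main()
        Dim source As String = "red, green; blue, white"
        ' Words followed by a comma (the comma isn't part of the match)
        Console.WriteLine(ShowMatches(source, "\w+(?=,)"))             ' => [red][blue]
        ' Words NOT followed by a comma, colon, or semicolon
        Console.WriteLine(ShowMatches(source, "\w+\b(?![,:;])"))       ' => [white]
        ' Digits that follow a mantissa and an exponent letter
        Console.WriteLine(ShowMatches("123D45", "(?<=\d+[EeDd])\d+"))  ' => [45]
        ' Words that don't immediately follow a comma
        Console.WriteLine(ShowMatches("a,b c", "(?<!,)\b\w+"))         ' => [a][c]
    End Sub
End Module
```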
61
Regex options
The Regex constructor takes an optional bit-coded
RegexOptions argument that affects parsing
IgnoreCase does case-insensitive searches
Multiline is for parsing text files
changes the meaning of ^ and $ special chars
Compiled compiles to MSIL
...
Examples
Dim re As New Regex("\b(\w)\w+\1\b", _
RegexOptions.IgnoreCase)
Dim re As New Regex("\s+Class (\w+)", _
RegexOptions.IgnoreCase Or _
RegexOptions.Multiline)
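A sketch of how the options change the outcome, using made-up source strings. The options are bit-coded values, so the example combines them with the Or operator.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexOptionsDemo
    Sub Main()
        Dim pattern As String = "\b(\w)\w+\1\b"
        ' Case-sensitive by default: "D" and "d" are different chars
        Console.WriteLine(Regex.IsMatch("Dad", pattern))   ' => False
        Console.WriteLine(Regex.IsMatch("Dad", pattern, _
            RegexOptions.IgnoreCase))                      ' => True

        ' Combine options with Or; Multiline makes ^ match at each line start
        Dim source As String = "Class One" & vbCrLf & "class Two"
        Dim re As New Regex("^class (\w+)", _
            RegexOptions.IgnoreCase Or RegexOptions.Multiline)
        Console.WriteLine(re.Matches(source).Count)        ' => 2
    End Sub
End Module
```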
62
Search variations
The Split method is similar to the String.Split method
except it defines the delimiter using a regular expression
rather than a single character
you can optionally specify max number of elements and
starting index
source = "123, 456,,789"
' this defines a comma, preceded or followed
' by zero or more spaces (result includes empty elements)
Dim re As New Regex("\s*,\s*")
' to discard empty elements and CRLF use the following instead
' Dim re As New Regex("[ ,\r\n]+")
Dim s As String
For Each s In re.Split(source)
' Note that the 3rd element is a null string.
Debug.Write(s & "-") ' => 123-456--789-
Next
63
Replace substrings
The Replace method lets you define a replace pattern
you can optionally specify max number of elements and
starting index
This pattern can include placeholders
$0 is the entire text of the current match
$1-$99 are numbered groups in the regular expression
${name} are named groups in the regular expression
' change date format from mm-dd-yy to dd-mm-yy
' works with / or - separators and 2 or 4 year digits
Dim source As String = "12-2-1999 10/23/2001 4/5/2001 "
Dim pattern As String = "\b(?<mm>\d{1,2})(?<sep>(/|-))" _
& "(?<dd>\d{1,2})\k<sep>(?<yy>(\d{4}|\d{2}))\b"
Dim re As New Regex(pattern)
Debug.Write(re.Replace(source, "${dd}${sep}${mm}${sep}${yy}"))
' => 2-12-1999 23/10/2001 5/4/2001
64
Replace with callback
For more sophisticated replace operations you can define a
callback function that is called for each match
the callback function receives a Match object
use the Groups collection to find substrings
Sub Main()
' define two integers separated by a + symbol
Dim re As New Regex("(\d+)\s*\+\s*(\d+)")
Dim source As String = "a = 100 + 234: b = 200+345"
' Replace all sum operations with their result.
Debug.WriteLine(re.Replace(source, AddressOf DoSum))
' => a = 334: b = 545
End Sub
Function DoSum(ByVal m As Match) As String
Dim n1 As Long = CLng(m.Groups(1).Value)
Dim n2 As Long = CLng(m.Groups(2).Value)
Return (n1 + n2).ToString
End Function
65
Questions ?
66