# String Matching Analysis¶

Consider a string of finite length Let it be . Finding whether a string of length exsists in is known as String Matching, Following is some of the comparision based String Matching Algorithms.

• Brute Force String Matching Algorithm
• Horspool String Matching
• Boyer - Moore String Matching

Before looking at the analysis part, we shall examine the Language in built methods to sorting

## The in operator and str.index()¶

We have already seen the in operator in several contexts. Let’s see the working of in operator again

In [1]:
x = 'this is some random text used for illustrative purposes'
In [2]:
x
Out[2]:
'this is some random text used for illustrative purposes'
In [3]:
'this' in x
Out[3]:
True
In [4]:
'not' in x
Out[4]:
False
In [5]:
x.index('is')
Out[5]:
2
In [6]:
x.index('not')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-a1f052cc5af7> in <module>()
----> 1 x.index('not')

## Standard import statement¶

In [7]:
from openanalysis.string_matching import StringMatchingAlgorithm,StringMatchingAnalyzer

StringMatchingAlgorithm is the base class providing the standards to implement sorting algorithms, SearchVisualizer visualizes and analyses the algorithm

## StringMatchingAlgorithm class¶

Any String Matching Algorithm, which has to be implemented, has to be derived from this class. Now we shall see data members and member functions of this class.

### Data Members¶

• name - Name of the Searching Algorithm
• count - Holds the number of basic operations performed

### Member Functions¶

• __init__(self, name): - Initializes algorithm with a name
• match(self, text, pattern) _ The base String Matching function. Sets count to 0.

## An example …. Horspool String Matching Algorithm¶

Now we shall implement the class Horspool

In [10]:
class Horspool(StringMatchingAlgorithm):                                   # Must derive from StringMatchingAlgorithm
def __init__(self):
StringMatchingAlgorithm.__init__(self, "Hosrpool String Matching")
self.shift_table = {}
self.pattern = ''

def generate_shift_table(self, pattern):                              # class is needed to include helper methods
self.pattern = pattern
for i in range(0, len(pattern) - 1):
self.shift_table.update({pattern[i]: len(pattern) - 1 - i})

def match(self, text: str, pattern: str):
StringMatchingAlgorithm.match(self, text, pattern)
self.generate_shift_table(pattern)
i = len(self.pattern) - 1
while i < len(text):
j = 0
while j < len(self.pattern) and text[i-j] == self.pattern[len(self.pattern)-1-j]:
j += 1
self.count += j                                         # Increment count here
if j == len(self.pattern):
return i-len(self.pattern)+1
if text[i] in self.shift_table:
i += self.shift_table[text[i]]
else:
i += len(self.pattern)
return -1

## StringMatchingAnalyzer class¶

This class provides the visualization and analysis methods. Let’s see its methods in detail

• __init__(self, matching): Initializes visualizer with a String Matching Algorithm.
• searcher is a class, which is derived from StringMatchingAlgorithm
• analyze(self):
• Plots the number of basic operations performed
• Both Text length and Pattern Length are varied
• Samples are chosen randomly from pre defined large data
In [11]:
StringMatchingAnalyzer(Horspool).analyze()