TheDeveloperBlog.com

Home | Contact Us

C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML

<< Back to VBNET

VB.NET Levenshtein Distance Algorithm

Implement Levenshtein distance, which tells us the number of String edits.
Levenshtein. In 1965 Vladmir Levenshtein introduced a distance algorithm. This tells us the number of changes or edits one string must go through to become another string. In VB.NET we compute Levenshtein distance with a Function.

Levenshtein: Returns the number of character edits that must occur to get from String A to String B.

Note: In Levenshtein distance, edits include removals, inserts and replacements.

Example. The internals of the Levenshtein distance Function are complex. Internally it allocates a two-dimensional array. This 2D array has dimensions equal to the lengths of the two argument strings.

For: A nested For-loop is executed. In the inner block of the For-loops, the cost of the change of two elements is computed.

Note: The cost is one for a different character and zero for the same character.

For Each, For

Then: Array elements are assigned, considering adjacent array elements and the cost. The ending array element is returned as the distance.

VB.NET program that computes Levenshtein distance Module Module1 Sub Main() Console.WriteLine(LevenshteinDistance("aunt", "ant")) Console.WriteLine(LevenshteinDistance("Sam", "Samantha")) Console.WriteLine(LevenshteinDistance("flomax", "volmax")) End Sub ''' <summary> ''' Compute LevenshteinDistance. ''' </summary> Public Function LevenshteinDistance(ByVal s As String, ByVal t As String) As Integer Dim n As Integer = s.Length Dim m As Integer = t.Length Dim d(n + 1, m + 1) As Integer If n = 0 Then Return m End If If m = 0 Then Return n End If Dim i As Integer Dim j As Integer For i = 0 To n d(i, 0) = i Next For j = 0 To m d(0, j) = j Next For i = 1 To n For j = 1 To m Dim cost As Integer If t(j - 1) = s(i - 1) Then cost = 0 Else cost = 1 End If d(i, j) = Math.Min(Math.Min(d(i - 1, j) + 1, d(i, j - 1) + 1), d(i - 1, j - 1) + cost) Next Next Return d(n, m) End Function End Module Output 1 5 3
Next, we review the output of this Function. The number returned by Levenshtein for the arguments "aunt" and "ant" is 1. This means only one edit is required to get from aunt to ant—the letter U is removed.
For the second result, we have five edits. The name "Sam" must be edited five times to get "Samantha"—five letters are added. And to get to "volmax" from "flomax" three edits must occur.

Note: Swapping two characters, as with the "ol" and "lo" in "volmax" and "flomax" is considered two separate edits, not one.

Summary. Aspects of the Levenshtein distance algorithm, such as its design and concepts, are not covered here. But the implementation here is correct on the example Strings tested. It is not fast.

Tip: Certain optimizations, and even caches, could be used to enhance its performance, such as memoization.

And: A data structure such as a Dictionary could store the two strings and also a cached distance. This would improve certain programs.

Dictionary
© TheDeveloperBlog.com
The Dev Codes

Related Links:


Related Links

Adjectives Ado Ai Android Angular Antonyms Apache Articles Asp Autocad Automata Aws Azure Basic Binary Bitcoin Blockchain C Cassandra Change Coa Computer Control Cpp Create Creating C-Sharp Cyber Daa Data Dbms Deletion Devops Difference Discrete Es6 Ethical Examples Features Firebase Flutter Fs Git Go Hbase History Hive Hiveql How Html Idioms Insertion Installing Ios Java Joomla Js Kafka Kali Laravel Logical Machine Matlab Matrix Mongodb Mysql One Opencv Oracle Ordering Os Pandas Php Pig Pl Postgresql Powershell Prepositions Program Python React Ruby Scala Selecting Selenium Sentence Seo Sharepoint Software Spellings Spotting Spring Sql Sqlite Sqoop Svn Swift Synonyms Talend Testng Types Uml Unity Vbnet Verbal Webdriver What Wpf