TheDeveloperBlog.com

Home | Contact Us

C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML

C# All Substring Method

This C# program uses an algorithm to generate all possible substrings in a string.

All Substrings. Is one string contained in another?

This can be accomplished with IndexOf. But we can also extract all possible substrings, put them in a Dictionary, and do hash lookups. This can be much faster.

Intro. I developed an algorithm that needed to tell if any one string is contained in any other string. The program had about 6000 to 100000 strings. This can be done by looping over every string, and then comparing it to every other string.

However: This decimates performance. The algorithm, written this way, was slow. I decided to optimize it.

So: I developed an algorithm that extracts all possible substrings from each string. It puts all of these in a Dictionary.

And: It then can replace a loop over thousands of strings with a single hash lookup. This made performance about 30 times better.

Example. To get started, I needed a method that generates all possible substrings. This example code loops through all possible lengths of a substring. It then loops over all possible starting indexes.

Then: It writes each substring to the console with WriteLine. In the actual algorithm, these substrings are added to a Dictionary.

Key and value: In the Dictionary, the key is the substring, and the value is the original string.

C# program that generates all substrings

using System;

class Program
{
    static void Main()
    {
	string value = "abcdefghi";
	// Avoid full length.
	for (int length = 1; length < value.Length; length++)
	{
	    // End index is tricky.
	    for (int start = 0; start <= value.Length - length; start++)
	    {
		string substring = value.Substring(start, length);
		Console.WriteLine(substring);
	    }
	}
    }
}

Output

a
b
c
d
e
f
g
h
i
ab
bc
cd
de
ef
fg
gh
hi
abc
bcd
cde
def
efg
fgh
ghi
abcd
bcde
cdef
defg
efgh
fghi
abcde
bcdef
cdefg
defgh
efghi
abcdef
bcdefg
cdefgh
defghi
abcdefg
bcdefgh
cdefghi
abcdefgh
bcdefghi

Discussion. When writing this algorithm, I at first tried to use a nested loop. This was simpler to develop. Its initialization time was also lower. It needed no fancy collections like a Dictionary.

However, the simpler algorithm was worse. It was O(N squared) because it had a nested loop. With changes, I reduced this. I avoided shorter strings. Still, performance was poor. And it would become worse as data was added.

So: I had to rethink the algorithm to solve the performance disaster. As usual, the Dictionary saved the day.

Tip: An algorithm that uses nested loops is sometimes optimized by one that creates a data structure in a Dictionary.

Summary. Micro-optimizations often lead to small performance benefits. By redesigning an algorithm, the advances are sometimes more severe. And with an effective algorithm, micro-optimizations fade in importance or become unnecessary.


Related Links

Adjectives Ado Ai Android Angular Antonyms Apache Articles Asp Autocad Automata Aws Azure Basic Binary Bitcoin Blockchain C Cassandra Change Coa Computer Control Cpp Create Creating C-Sharp Cyber Daa Data Dbms Deletion Devops Difference Discrete Es6 Ethical Examples Features Firebase Flutter Fs Git Go Hbase History Hive Hiveql How Html Idioms Insertion Installing Ios Java Joomla Js Kafka Kali Laravel Logical Machine Matlab Matrix Mongodb Mysql One Opencv Oracle Ordering Os Pandas Php Pig Pl Postgresql Powershell Prepositions Program Python React Ruby Scala Selecting Selenium Sentence Seo Sharepoint Software Spellings Spotting Spring Sql Sqlite Sqoop Svn Swift Synonyms Talend Testng Types Uml Unity Vbnet Verbal Webdriver What Wpf