If your strings contain only ASCII characters, you can store them as byte arrays that use one byte per character. A .NET string uses two bytes per character, so this saves one byte per letter.
Example. The concept behind this benchmark is simple. It allocates an array of 10,000 strings and measures the memory this requires. Then another method (Compress) changes each string in a second such array into a byte array, and the memory of the resulting jagged array is measured.
C# program that changes string representation

using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        // Baseline before allocating the strings.
        long a = GC.GetTotalMemory(true);
        string[] array = Get();
        long b = GC.GetTotalMemory(true);
        // Use the array after measuring so it is still reachable at b.
        array[0] = null;

        // Baseline before allocating the byte arrays.
        long c = GC.GetTotalMemory(true);
        byte[][] array2 = Compress(Get());
        long d = GC.GetTotalMemory(true);
        // Use the array after measuring so it is still reachable at d.
        array2[0] = null;

        Console.WriteLine(a);
        Console.WriteLine(b);
        Console.WriteLine(c);
        Console.WriteLine(d);
    }

    // Creates 10,000 random ASCII strings.
    static string[] Get()
    {
        string[] output = new string[10000];
        for (int i = 0; i < 10000; i++)
        {
            output[i] = Path.GetRandomFileName();
        }
        return output;
    }

    // Converts each string into an ASCII byte array.
    static byte[][] Compress(string[] array)
    {
        byte[][] output = new byte[array.Length][];
        for (int i = 0; i < array.Length; i++)
        {
            output[i] = ASCIIEncoding.ASCII.GetBytes(array[i]);
        }
        return output;
    }
}

Output

39128
479800
39784
320056
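In this run, total managed memory was about 480,000 bytes while the string[] was live, and about 320,000 bytes while the byte[][] (a jagged array of byte arrays) was live. After subtracting the baselines (39,128 and 39,784 bytes), the strings account for roughly 440,000 bytes and the byte arrays for roughly 280,000 bytes, a saving of about 16 bytes per string. There was no data loss because the strings were ASCII-only.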
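The no-data-loss condition can be checked before converting. The following is a minimal sketch, not part of the original benchmark: the AsciiGuard class and its IsAscii and ToAsciiOrNull methods are hypothetical names introduced here for illustration. It also shows what happens without the check: the default ASCII encoder substitutes '?' for any character it cannot represent.

```csharp
using System;
using System.Text;

static class AsciiGuard
{
    // Hypothetical helper: true when every char is in the 7-bit ASCII range.
    public static bool IsAscii(string value)
    {
        foreach (char c in value)
        {
            if (c > 127)
            {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: converts only when no data would be lost,
    // and returns null otherwise so the caller can keep the string.
    public static byte[] ToAsciiOrNull(string value)
    {
        return IsAscii(value) ? Encoding.ASCII.GetBytes(value) : null;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(AsciiGuard.IsAscii("report.txt"));      // True
        Console.WriteLine(AsciiGuard.IsAscii("caf\u00e9.txt"));   // False

        // Without the guard, the non-ASCII letter is replaced by '?'.
        byte[] lossy = Encoding.ASCII.GetBytes("caf\u00e9");
        Console.WriteLine(Encoding.ASCII.GetString(lossy));       // caf?
    }
}
```

Strings that fail the check can simply stay as strings. The saving then applies only to the ASCII subset.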
Converting back to strings. You can convert the byte arrays back into strings by calling ASCIIEncoding.ASCII.GetString. Note that each call allocates a new string, which costs both time and memory.
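A round trip might look like the sketch below. The AsciiRoundTrip class and its Pack and Unpack methods are hypothetical wrappers added for illustration. Only Encoding.ASCII.GetBytes and Encoding.ASCII.GetString come from the framework (Encoding.ASCII is the same object the article reaches through ASCIIEncoding.ASCII).

```csharp
using System;
using System.Text;

static class AsciiRoundTrip
{
    // Hypothetical wrapper: string -> one byte per character.
    public static byte[] Pack(string value)
    {
        return Encoding.ASCII.GetBytes(value);
    }

    // Hypothetical wrapper: bytes -> newly allocated string.
    public static string Unpack(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }
}

class Program
{
    static void Main()
    {
        string original = "data.bin";
        byte[] packed = AsciiRoundTrip.Pack(original);
        string restored = AsciiRoundTrip.Unpack(packed);

        Console.WriteLine(packed.Length);          // 8
        Console.WriteLine(restored == original);   // True

        // The restored string is a separate object from the original.
        Console.WriteLine(object.ReferenceEquals(restored, original)); // False
    }
}
```

The contents match, but the restored string is a fresh allocation. If the same entry is decoded repeatedly, those temporary strings can offset part of the memory saved.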
Discussion. Is this useful? In most programs, probably not. However, if a program must keep a huge number of rarely accessed ASCII strings in memory, this could be a useful optimization.
Note: There is an additional cost each time you need to convert back into strings.
Summary. We looked at an optimization that stores ASCII strings using one byte per character instead of two. In some cases, this alternative representation could save a significant amount of memory.