Question: You are given an array of strings and you are asked to display all the anagrams within the array. For those who don’t know, two words are anagrams if they contain the same characters.
Answer: The brute force solution would be to take each string and compare it to every other string in the array one character at a time ( O(n^4) solution, assuming there are n strings with maximum n characters each). This will surely give you the answer but almost certainly not the job unless you improve it. If you have no idea what the O(?) notation is, this might be a good time to read up on it (wikipedia).
Let’s use this example to improve on the solution {“abc”, “albert”, “cat”, “gate”, “cab”, “grow”, “act”}
One way we can quicken the string comparison process is by pre-sorting each string alphabetically (any sorting would be fine actually). Pre-sorting helps you avoid comparing each character in a string to each character in another string. You only need to compare every character to the character in the corresponding position of the other string. This reduces the order of the solution ( O(n^2 log n (sorting n strings with n characters each) + n^3 (n^2 comparisons of n characters each)) = O(n^3)) assuming the sorting algorithm is O(n log n)). However, this increases memory usage since you now need to store the original strings so you can print them out later.
After sorting the characters in the string, the array would look like this {“abc”, “abelrt”, “act”, “aegt”, “abc”, “gorw”, “act”}.
Currently, we are comparing each string with every other string. That seems inefficient. How about in addition to sorting the characters in each string, we sort each string in the array alphabetically (again any other sorting will do). Sorting the strings in the array means you do not have to compare each string to every other string, you only have to compare it to the next string in line. If they happen to be the same (i.e. found an anagram), then you can compare with the one after that. Whichever the case is, the order of comparison in this case will be of the order O(n) instead of O(n^2). This sorting will reduce the order to ( O(n^2 log n (sorting characters) + n^2 log n (sorting strings) + n^2 (n comparisons of n characters each)) = O(n^2 log n) assuming the sorting algorithm is O(n log n)). Make sure you also move the strings from the original array in order to maintain the mapping of the sorted array to the original array.
After sorting the strings, the array would look like {“abc”, “abc”, ”abelrt”, “act”, “act”, “aegt”, “gorw”}.
There is an alternate solution using hashing. You can first sort the characters in the string ( O(n log n) ) and then put each string in a hash table ( O(n) ). If the spot in the hash table is already taken, then that is an anagram. The order for this solution is ( O(n^2 log n (sorting n strings with n characters each) + n^2 (hashing n strings) + n (n insertions into hash table) = O(n^2 log n) assuming hashing each string is O(n) ).
As you can see, both the solutions have the same order of execution. Do you prefer one over the other? Why? Post as a comment.
As usual, if there is something wrong with this solution, feel free to ping me. Any improvements are always welcome. Cheers!



