First non repeating character in string

First non repeating character in string

Given a string, find first non repeating character in a string. For example, string  = “abcbdbdebab”, the first non repeating character would be c. Even though ‘e’ is also non repeating in string, ‘c’ is output as it is first non repeating character.

The first point which should be taken into account here is that there needs to be some kind of counting. Counting of characters. For a non repeating character, this count should be exactly 1. For keeping count against a character, some kind of mapping is required. This mapping will come from a hash. So, let’s have a hash with character as keys and count as value. But wait, for ASCII characters, do we need an hash? Well no. We can do it with an array with characters used as indices. This, however, puts a restriction that input can have only ASCII characters, for UNICODE, this solution won’t work and we map have to go for actual hash.

Now, that we have counted number of occurrence of each character in string, how to use it to find first non repeating character? Let’s go back to our string, for each character, check what is count for that character. If the count is 1, return that character. Simple as that!!

Code to find first non repeating character in a string

Complexity of algorithm is O(n) with O(1) space complexity. There is always some confusion here about space complexity, we think as 256 characters are used, should it not be counted as space complexity. Definitely. But in asymptotic notations, this space is independent of size of input, so space complexity remains O(1).

One more thing, even though time complexity is O(n), input string is scanned twice, first time to get count of characters and second time to find first non repeating character in that string.

And little better approach

In case, string is too long, i mean millions of characters, most of them repeated, very few non repeated, this solution may turn slow in the last step. In other words, we need to find some mechanism so that we don’t need to scan input string again.  In this case, some information needs to be stored with count, so that we can figure out if the character is first non repeating or not.

To achieve above mentioned solution, the extra information to be stored is index of first occurrence of each character. Now, once we have already scanned through string, scan through the array. Check for characters with count as 1. If character has count as 1, see if index of it’s occurrence in string is less than index of previous character with count as 1. If yes, change character to be returned to new character. Once, we are done with scanning count array, we will have first non repeating character.

Find first non repeating character in string : Algorithm

1. Start with 0 index of string till end
2. Count occurrence of each character and store them in array.
   With count,if the char is seen first time, store it's index too.
3. Now, scan the count array again.
4. If a character has count as 1, check if index of this character.
5. If index is less than previous or previously char does not exist, change non repeating character to this character.
6. return non repeating character.

Find first non repeating character in string : Implementation

Complexity of this algorithm remains same, however, second scan of string is now not required. In other words, second scan is now independent of size of input.

Python code
Learning it so bear with some non optimized code or silly things 🙂

Implementation of second method in python

Hope this explains the doubts around this problem. Please share your views in comments if you have some different method or idea. If you liked the post, please share it and spread the knowledge.