Friday, June 14, 2013

How do you sort an array of 100 million values, given that the values are positive and unique. Disregard the space complexity of the solution i.e. assume that you have sufficient memory.

At first glance this question sounds baffling, even though it relaxes one of the criteria of space complexity. For sorting a data store that large, the time complexity is also critical.

Here is an approach which I thought was quite ingenuous. It also takes advantage of the available space. Here goes...

Solution  


Create an array of bits whose length is the length of the maximum value of an unsigned integer. In this array, each element has a default value of 0.

So the array looks like this:

          0 , 1 ,2 , 3, 4, 5,....                                                .....N
B[] = [0][0][0][0][0][0][0][0][0][0][0][0]......[0][0][0][0][0][0]

where N is the maximum value of an unsigned integer.

Now parse the source array of numbers and for each number encountered in the source, index that into the bit array above to set that bit (i.e. change it to 1).

e.g. If S[0] = 2345 then B[2345] = 1, i.e. that corresponding index has its bit flag set to indicate that the value is in the source array.

As each of the values is considered as a lookup in the bit array, the numbers are implicitly sorted since they are indices on a contiguous memory location. The values which do not occur in the array would not have their bit flag set.

Another "side-effect" of this process is that searching for any number can be done in constant time, since it gets converted to an array lookup which is O(1). So if you want to find if a number exists in this array, all you have to do is to index the bit array and check the bit flag at that index location - if it is set then the number exists, otherwise it does not!

e.g. if you want to find if 34987 exists in the source array, check if B[34987] = 1 or 0. If it is 1 then it exists, else it does not.



No comments: