Thursday, March 22, 2012

Efficient Programming?

Hola!

What is an efficient program?

In a layman term - A program which does, what its "supposed to do", in a "fast" and "cheap" way...

A program can be very fast...

  • If we decrease the number of loops inside it
  • By choosing the "right data-structure" which can be used to efficiently manipulate the data, the program uses etc...


A program can be cheap...

  • If it consumes less memory (RAM space while it runs)
  • By using as less "Objects" as we can
  • Or by again using right data-structure


Ok... theory.. its always boring...  :-)

Lets jump on to solving an interesting problem... Take up a role of  Server-side Developer of a reputed website company, where the first task given to you is to validate the user password!

Here is the requirement -

You should reject a password if it contains a "repetitive sequence". A password is said to contain "repetitive sequence" if it has same sequence repeated one after the other (adjacent to each other).

Eg : 123123aaa (sequence "123" is repeated and adjacent), AbcAbczzsdkjksd, hhhsdsaBCDBCD are passwords with "repetitive sequence", where  passwords like 123abc123 (even though "123" is repeated, they are not adjacent), 12345678, abcdfgh are not.

And a sequence can be a word with at least 2 or more  distinct characters. For Eg: if 123abc123 is a password, then "12", "23, "3a" can be sequence of length 2, and "123", "23a", "3ab" can be sequence of 3 characters etc.


Once the requirement is understood.. You tend to develop a vague algorithm of your own!.. that's good...  then you start working on it.

Similarly I made my own vague thought and came up with following algorithm...

First -
      Let me start dividing my password into List of Sequences
Second -
     I use a Set to determine repetitive sequence

Fair enough... So I started coding... and came up with following listing -


public class DoubleSequenceCheck {


       // Method which generates distinct sequence out of my password string       
public List<String> getSequenceCombination(String pwd) {
List<String> list = new ArrayList<String>();
for(int i=0; i<pwd.length(); i++) {
for(int j= i+1; j<=pwd.length(); j++) {
if((j-i) > 1)
list.add(pwd.substring(i, j));
}
}
System.out.println(list);
return list;
}

       //Method which does Double Sequence check
public boolean containsDoubleSeq(String pwd) {
List<String> combi = getSequenceCombination(pwd);
Set<String> set = new HashSet<String>();
for(String s : combi) {
                        /*Here Set.add returns false, if theres already a sequence present inside it; hence, the pwd contains double sequence*/
boolean res = set.add(s);
if(res == false) {
return true;
}
}
return false;
}

public static void main(String [] args) {
DoubleSequenceCheck dc = new DoubleSequenceCheck();
dc.getSequenceCombination("AbcAbc1231234***");
}
}




Now, its time to Test... so I input - "123123aaa" - And my program returned "true" (saying yes the password contains repetitive double sequence)... Wow! nice...
Now I input - "123aaa123" (Expecting my program to return "false") and my program still returns "true".....

Since I am not keeping the track of "Index" of sequences, above program fails to find out "repetitive double sequence"  but it surely finds out "double sequence"...  fine.. let me fine tune it again....

Now when i think of storing "index" of a sequence along with the "sequence" itself - I can think of the data-structure called Map (a key, value pair holder)

Lets see the improvised version of my program "in making" ..  in the below listing...


public class DoubleSequenceCheck {


Map<String, Integer> map = new HashMap<String, Integer>();

public void getSequenceCombination(String pwd) {
map.clear();
for(int i=0; i<pwd.length(); i++) {
for(int j= i+1; j<=pwd.length(); j++) {
if((j-i) > 1) {
                 //Store the starting index of a sequence, along with the sequence
map.put(pwd.substring(i, j),i);
}
}
}
System.out.println(map.keySet());
}

public boolean containsDoubleSeq(String pwd) {
getSequenceCombination(pwd);
Set<String> set = new HashSet<String>();
for(String s : map.keySet()) {
boolean res = set.add(s);
if(res == false) {
return true;
}
}
return false;
}

public static void main(String [] args) {
DoubleSequenceCheck dc = new DoubleSequenceCheck();
System.out.println(dc.containsDoubleSeq("123em123"));
}
}


Now wait a minute, a Map's keys cannot be repeated, is there really a need of using Set again to check duplicates? Yeah! Set is redundant... lets remove it and do all stuffs using Map...

Also, make use of "Index" stored within Map to find-out the presence of repetitive sequence. The logic would be to calculate the difference in position of the 2 sequences.
If the difference  ==  the length of sequence
then they are adjacent to each other and hence are repetitive.

Eg:

If we consider a string array "123123ss"

and its index arranged as                         1  | 2 |  3 |  1 | 2 | 3 |  s |  s                    
                                                              0  | 1 |  2 |  3 | 4 | 5 | 6 |  7

Now the start index of first "123" sequence = 0 and the start index of next "123" sequence is = 3
The difference between indices = (3 - 0 )= 3 which is length of sequence "123".

Implementing all this in one method leads to a even more improved version of the program ....



public class DoubleSequenceCheck {


Map<String, Integer> map = new HashMap<String, Integer>();

public boolean containsDoubleSeq(String pwd) {
map.clear();
for(int i=0; i<pwd.length(); i++) {
for(int j= i+1; j<=pwd.length(); j++) {
if((j-i) > 1) {
String key = pwd.substring(i, j);
if(map.get(key) != null) {
int pos = map.get(key);
int len = i - pos;
if(len < 0) {
len = -(len);
}
if(key.length() == len) {
return true;
}
}
map.put(pwd.substring(i, j),i);
}
}
}
System.out.println(map.keySet());
return false;
}

public static void main(String [] args) {
DoubleSequenceCheck dc = new DoubleSequenceCheck();
System.out.println(dc.containsDoubleSeq("123ssBssB123em"));
}
}


Looks, good.. :-) and works fine too!

So, improvisation... yes.. the key to write efficient program is improvisation...

  1. Given an algorithm, first try to bring desired output. Don't worry about the efficiency, just try to create a solution.
  2. Once you have a solution, re-scan your program to check whether it contains redundant loops? redundant data-structure? unnecessary block of code which is delaying execution? - get rid of them!
  3. After each improvisation, do regression testing and repeat Step 2 until you see no scope for improvisation.


The above listing is not a perfect solution, and can be still improvised. Do leave a comment if you see any scope for improvisation.

Happy programming :-)



No comments:

Post a Comment