Wednesday, 1 November 2017

String split behavior in Python/Scala/Java & AWK/Perl/Ruby

https://chriszetter.com/blog/2017/10/29/splitting-strings/

The crux part of the post is about what happens when we split an empty string with a delimiter.

In Python, "".split(",") returns [""]. The reason behind why this works this way is explained as:

"aa".split(",") returns ["aa"]
"a".split(",") returns ["a"]
so, "".split(",") returns [""]

This behavior is same for Java, Scala and other languages which are mentioned in the post.

But in Ruby, "".split(",") returns []. The reason why this works this way is because:

"aa,aa".split(",") returns ["aa","aa"]
"aa".split(",") returns ["aa"]
"".split(",") returns []

Ruby operates based on field concept which is inherited from AWK.

For more information please refer the link mentioned on the top of the post.

Sunday, 29 October 2017

Computation of x^n

Brute force approach

The naive way of solving this problem is:


result = 1

for(int i = 1; i <= n; i++)
{
    result = result * x;
}

But this involves n multiplications. So, the asymptotic complexity of the above algorithm is O(n). Can we do better?


Optimized approach


Yes. We can. The following approach is taken from "Elements of Discrete Mathematics, by C.L.Liu".


For a given integer n, first we need to write down its binary representation. Then traverse the binary representation from MSD to LSD. If the bit is 1, write down SX and if the bit is 0, write down S.


For example, the binary representation of number 11 is 1011. The sequence we obtain after substitution is SXSSXSX. Ignore the initial SX which corresponds to the MSD bit. Then we obtain the sequence SSXSX. If we substitute S by Square and X by multiply by x, we obtain {square, square, multiply by x, square, multiply by x}. Pseudo code for the above algorithm is:


result = x

//Ignore the MSD bit.
for(current bit = MSD + 1 to LSD)
{
    if(current bit is 1)
    {
        result = result * result * x;
    }
    else
    {
        result = result * result;
    }
}

Analysis of the algorithm


Now, let us analyze the complexity of the algorithm. Any number n can be represented as binary representation which will have log n bits. Let us call log n as b. As we ignore the MSD bit, we have (b-1) bits. In the worst case, all the bits can be 1 which means we have (b-1) {square, multiply by x} sequence. {square, multiply by x} needs 2 multiplications. One for square and one for multiplication. So, the total number of multiplications needed are (b-1) * 2. As b is log(n), we need (log n - 1) * 2 multiplications. Ignoring the constants, we have asymptotic complexity as O(log n).