String decompression challenge solution

I am sharing my solution in Java to one of the former Google interview questions, which is posted by Google itself here:

I am copying the description below just in case the original link gets removed, and without it the solution will not have any meaning.


The Challenge

In this exercise, you’re going to decompress a compressed string.

Your input is a compressed string of the format number[string] and the decompressed output form should be the string written number times. For example:

The input


Would be output as


Other rules

Number can have more than one digit. For example, 10[a] is allowed, and just means aaaaaaaaaa

One repetition can occur inside another. For example, 2[3[a]b] decompresses into aaabaaab

Characters allowed as input include digits, small English letters and brackets [ ].

Digits are only to represent amount of repetitions.

Letters are just letters.

Brackets are only part of syntax of writing repeated substring.

Input is always valid, so no need to check its validity.


My solution uses the approach which I find the most intuitive for this case, the recursive one. The recursive solution can also be converted to iterative by using a stack instead of recursive function calls.

We know that we need to iterate over the input string, and when we notice the encoding pattern (digits followed by opening square bracket), we have to repeat the same logic for the capture group by calling again the same function.

We do not want the called function to write to the output sb buffer directly, because the returned value will need to get repeated and the number of repetitions is known only by the parent call, which captured the digits. For this reason we pass an empty current buffer as output buffer.

On the other hand, as long as we are not capturing digits, we add to the current characters to the current buffer. This buffer will be flushed to the output buffer once we encounter one opening square bracket [.

As any recursive solution, we need to identify the base case; otherwise we will have infinite calls. The base case is that case which will not call itself anymore.

In this challenge the base case is when a string does not contain another encoded nested string, for example abcd is a base case. It does not have any encoded string (starting with number, followed by [, then the string inside, and finally the closing bracket ]).

For all the other cases, once we notice that an encoded string is starting (started capturing digits), when we find the opening bracket (in this case this will be always after the number as per the problem description), we call the recursive function to do the same for the rest of the string. We do this until the closing bracket is found, or until another encoded string is found. If the latter happens then we call the recursive function itself to handle it.

Another important point here is that we need to skip the parts already processed by the recursive calls, and we do this by returning the index of the last processed character as the return value of the function.

The upper function in the call stack sets its i variable value to the returned one, and then proceeds from there. The groups captured by the recursive call, are then repeated by the parent call and a and added to the output buffer sb.

The full solution in Java is given below:

In case of questions, feel free to comment and I will be glad to answer.