ONWUBAWUBWUBDARKWUBWUBDESERTWUBHIGHWAYWUBWUB
but notWUBONAWUBDARKWUBDESERTWUBHIGHWAY
(there's no WUB between "ON" and "A")
The exercise was to extract the original lyric from the DJ's dubstep version.
My first stab at this was very mechanical. Walk the string looking for "WUB" and then throwing them away. It looked like this:
public static String SongDecoderAlpha (String song) {
final StringBuilder builder = new StringBuilder();
while (song.length() > 0) {
if (song.startsWith(WUB)) {
song = song.substring(3);
} else {
int i = song.indexOf(WUB);
if (i >= 0) {
String temp = song.substring(0, i);
appendWord(builder, temp);
song = song.substring(i);
} else {
appendWord(builder, song);
break;
}
}
}
return builder.toString();
}
private static void appendWord(final StringBuilder builder,
final String temp) {
if (builder.length() > 0) {
builder.append(" ");
}
builder.append(temp);
}
I wasn't really excited about this. It did what it needed to do, passed all of the tests, but it is very complex. A while loop with nested booleans. I think it wouldn't be too hard to figure out what it does, but it wouldn't be easy to understand all the permutations. Plus it has the additional appendWord() function, reducing duplicated code and complexity, but also adding another chunk of code to understand. So, I decided to come up with another solution. public static String SongDecoderBeta (String song) {
song = song.replace(WUB, " ");
song = song.trim();
while (song.contains(" ")) {
song = song.replace(" ", " ");
}
return song;
}
This one was better. There are fewer lines of code and no if statements. There is literally almost nothing to it. It also more clearly describes what the solution is:- replace all the WUB tokens with spaces
- clean up all the extra spaces
So, a revised implementation would look like this:
public static String SongDecoder (String song)
{
song = song.replace(WUB, " ");
song = song.trim();
song = song.replaceAll(" +", " ");
return song;
}
Three lines of code and the only thing that is not direct is what the regex is doing, but that's easy enough to look up. I'm not sure which one I'd use in production code.Thinking about how this solution worked, I realized that the code was extracting the original lyrics by parsing out the WUBs. I decided to experiment with this approach to see if it was more understandable. The result was this:
public static String SongDecoder(String song)
{
final String[] words = song.split(WUB);
final StringBuilder stringBuilder = new StringBuilder();
String word;
for (int i = 0; i < words.length; i++) {
word = words[i];
if (word.length() == 0) {
continue;
}
if (stringBuilder.length() > 0) {
stringBuilder.append(" ");
}
stringBuilder.append(word);
}
return stringBuilder.toString();
}
This is better than the original in that it is self-descriptive of the approach. The one thing I hadn't considered was the behavior of String.split() when there consecutive delimiters. For example, the input RWUBWUBWUBLWUB resulted in the following tokens:[R]
[]
[]
[L]
Not sure if this the defined behavior of split() or not; I didn't dig into it. But that meant I had to add another if statement to handle those empty string tokens. I do like this approach because the code to add a space between words only existed in one branch. This eliminated the need to extract it into another method.Looking at the lines of code and number of branches, it's easy enough to see which solution is the simplest. But I wanted to quantify the differences. I found the website lizard.ws which offers online complexity analysis of 8 languages, include Java. I plugged in each version of my solution and got the following results:
Solution | NLoC | Complexity | Tokens |
---|---|---|---|
mechanical | 25 | 6 (4+2) | 160 |
replace | 9 | 2 | 51 |
greedy replace | 7 | 1 | 40 |
split | 17 | 4 | 104 |
NLoC is Noncommented Lines of Code: anything that isn't a blank line or a comment line.
I think that makes it clear: The greedy replace solution has the lowest possible complexity score and one-third the lines of code and tokens of my original solution. By removing all that other code for your brain to parse, I'm comfortable you'd have the overhead to ponder the regex.
No comments:
Post a Comment