Word Loop (#149)

Here's a fun little challenge from the Educational Computing Organization of Ontario.

Given a single word as input try to find a repeated letter inside of it such that you can loop the text around and reuse that letter. For example:

$ ruby word_loop.rb Mississippi
i
p
p
Mis
ss
si

or:

$ ruby word_loop.rb Markham
Ma
ar
hk

or:

$ ruby word_loop.rb yummy
yu
mm

If a loop cannot be made, your code can just print an error message:

$ ruby word_loop.rb Dana
No loop.


Quiz Summary

There were two kinds of solvers this week: hardcore programmers who love a good challenge and cheaters like me. Both approaches are neat, but I want to focus on the under appreciated cheater this time. Contrary to the best practices for gamblers, cheater is a wonderful label for a programmer to have.

Cheaters always find the system and make it work for them. So what's the system this time? Have another look at one of the quiz examples:

i
p
p
Mis
ss
si

Okay, the first priority is to find a possible loop. We need a repeated letter with some letters between the two occurrences, obviously. How many letters between though? Well, the minimum loop is:

*.
..

The next smallest loop is what we see in Mississippi:

*.
..
..

That's three and five and we can already see that each size must add two more letters. So, we need an odd number of letters and at least three of those. Now we can find loops.

Once we see the loop, it becomes clear that the loop divides the word into pieces. Let me call them out:

^ > = before the loop characters
^ * = the repeated letter
^ . = loop characters
>*. ^ = after the loop characters
..
..

Seeing these makes a cheater wonder, can I output each part naturally down the lines? To do that, we would first need to spit out those after the loop characters, in reverse order. That shouldn't be tough since we already know how to find a loop, but notice that each of those is also indented. It turns out that they are just indented by the length of characters before the loop, so that's trivial to deal with.

Now we would need the get those before the loop characters in the output. No problem there, it's a simple print statement.

The loop is the trickiest part. First, we see that the repeated letter and the first loop character can be printed along with the before characters. The rest of the characters have an indent, but it's the same thing we figured out earlier and we can deal with that. Then outputting the remaining characters of the loop can be as simple as printing two columns: one pulling letters off the back of the loop and the other pulling letters off the front.

Wow, cheaters work harder than you think, eh? If we can do all of those pieces, we can print the word loop as we go. The reason I like trying an approach like this is that each step is pretty simple and easy to understand. We had to do a bit of thinking that the computer probably could have done for us, but then you have to be smart enough to tell the computer how to do the thinking.

Let's move the ideas into code at this point. We will examine Ken Bloom's solution. Ken is a cheater and a better one than I am at that! I'll tell you what Ken taught me shortly, but for first let's see the code. Brace yourself because I'm giving you the whole thing in one shot:

ruby
def loopword word
matchinfo=
word.match(/(.*?)(.)(.(?:..)+?)\2(.*)/i)
if matchinfo
_,before,letter,looplets,after=matchinfo.to_a
pad=" "*before.size
after.reverse.split(//).each{|l| puts pad+l}
looplets=looplets.split(//)
puts before+letter+looplets.shift
until looplets.empty?
puts pad+looplets.pop+looplets.shift
end
else
puts "No loop."
end
end

# ...

Isn't it great to be a cheater?

Let's break it down. Probably the hardest part to understand is the very first step. Ken hits the word with a tricky Regexp to figure out if it has a loop in it. That's right, you can count an odd number of at least three characters with a regular expression. Let's examine the pieces:

ruby
/ (.*?) # characters before the loop, if any
(.) # the first appearance of our repeated character
(.(?:..)+?) # an odd number of at least three loop characters
\2 # the repeat of our repeated character
(.*) # characters after the loop, if any
/ix
# make the expression case insensitive

There are two good tricks in there. First, matching at least three odd characters is shown to be just any character followed by one or more groups of two characters. That's handy to remember.

The other trick is using a back-reference to catch the repeated character. What I didn't know about this trick though was that Ruby's regular expression engine is smarter than I gave it credit for. I knew this would match:

ruby
>> "-i---i-"[/(\w).+\1/]
=> "i---i"

However, I didn't know the following would work if you made the expression case insensitive:

ruby
>> "-I---i-"[/(\w).+\1/i]
=> "I---i"

I spent too much effort in my code to get a match to work on the word Markham when I could have just used the /i switch. Thanks for the tip, Ken.

Once we have tried the expression, the if statement checks to see if we found a loop. If we didn't the else clause can print our error message and we're done.

When we did find a match, Ken starts by pulling out each capture of the expression into easy to manage variables. Here's my chance to teach Ken a new trick though. See how he created an unused variable (_) to capture the match as a whole? It's not really needed if you switch the method call on the MatchData object:

ruby
>> md = "Mississippi".match(/(.*?)(.)(.(?:..)+?)\2(.*)/i)
=> #<MatchData:0x58a188>
>> md.to_a
=> ["Mississippi", "M", "i", "ssiss", "ppi"]
>> md.captures
=> ["M", "i", "ssiss", "ppi"]

Let's get back to the code. Here it is again to save scrolling:

ruby
def loopword word
matchinfo=
word.match(/(.*?)(.)(.(?:..)+?)\2(.*)/i)
if matchinfo
_,before,letter,looplets,after=matchinfo.to_a
pad=" "*before.size
after.reverse.split(//).each{|l| puts pad+l}
looplets=looplets.split(//)
puts before+letter+looplets.shift
until looplets.empty?
puts pad+looplets.pop+looplets.shift
end
else
puts "No loop."
end
end

# ...

The rest of the code prints the word as I described earlier. First, a variable is set to that indent we will need in multiple places. The next line reverse()s and split()s the after loop characters, printing each one with the indent. After that, Ken breaks up the loop characters into an Array for easy removal at both ends. The next line prints the before the loop characters, the repeat, and the first loop character. Finally, the until loop handles the remaining loop character two at a time, one from each end. That process prints the entire word and makes a complete solution.

The rest of Ken's code just called the method on a set of sample words:

ruby
# ...

loopword "Mississippi"
puts
loopword "Markham"
puts
loopword "yummy"
puts
loopword "Dana"
puts
loopword "Organization"

The non-cheating solutions are also very interesting. They decided that this problem wasn't hardcore enough for them and they could maximize the fun by trying to create multiple loops and reuse as many characters as possible. It's great code that leads to insane output, so be sure to check those out as well.

My thanks to cheaters and hardcore solvers alike. You always give me more interesting material than I can even talk about.

Tomorrow we will tackle a typical programming task in a completely non-typical manner...