Count and Say (#138)

by Martin DeMello

Conway's "Look and Say" sequence (http://en.wikipedia.org/wiki/Look-and-say_sequence) is a sequence of numbers in which each term "reads aloud" the digits of the previous term. For instance, the canonical L&S sequence starts off 1, 11, 21, 1211, 111221, ..., because:

* 1 is read off as "one 1" or 11.
* 11 is read off as "two 1's" or 21.
* 21 is read off as "one 2, then one 1" or 1211.
* 1211 is read off as "one 1, then one 2, then two 1's" or 111221.
* 111221 is read off as "three 1, then two 2, then one 1" or 312211.

Over on rec.puzzles, Eric A. proposed a variant in which the letters of a sentence are grouped, then "counted aloud", omitting the "s"s for the plural form. Thus, seeding the sequence with "LOOK AND SAY", we get:

0. LOOK AND SAY
1. TWO A ONE D ONE K ONE L ONE N TWO O ONE S ONE Y
2. ONE A ONE D SIX E ONE K ONE L SEVEN N NINE O ONE S TWO T TWO W ONE Y
3. ONE A ONE D TEN E TWO I ONE K ONE L TEN N NINE O THREE S THREE T ONE V
THREE W ONE X ONE Y

and so on. (Note the difference between this and the L&S sequence--the letters are counted rather than read in order). Eric wants to know when the sequence enters a cycle, and how long that cycle is. Well?


Quiz Summary

If you'll permit a brief diversion, this solution to the Look and Say problem from Simon Kröger is worth a look:

ruby
class String
def look_and_say
gsub(/(.)\1*/){|s| "#{s.size}#{s[0,1]}"}
end
end

s = '1'
12.times {p s; s = s.look_and_say}

As you can see, the bulk of the work here is done by a single regular expression. The pattern matches any non-newline character, followed by a run of zero or more of the exact same character. Replacing those matches with a count of the matched run followed by that first unique character will generate a single step in the sequence. Wrap that in an iterator, like the times() call here, and you have yourself a complete solution.

I thought that was pretty slick and just wanted to make sure we all got a good look at it. On to the actual quiz problem.

Solving Count and Say is really just two steps:

1. Translating the counts to English words
2. Iterating over the transformations while watching for a repeated pattern

The first step has come up in the Ruby Quiz a couple of times now. See the English Numerals for the first time this challenge came up.

This occurrence was almost identical to previous showings. Everyone argues about the correct way to translate the numbers and then steals Glenn Parker's solution to Ruby Quiz #25. I'll spare you covering that material again and just recommend you have a look back at that quiz.

Ironically, that turns out to be the harder part of this quiz, with step two being pretty easy to whip up.

Let's examine are the relevant pieces of Gavin Kistner's code. First, we have some extensions to the core classes:

ruby
# ...

module Enumerable
def item_counts
inject( Hash.new(0) ){ |counts, item|
counts[ item ] += 1
counts
}
end
end

class String
def look_and_say
counts = upcase.scan( /[A-Z]/ ).item_counts
counts.keys.sort.map{ |letter|
"#{counts[letter].to_english.upcase} #{letter}"
}.join( ' ' )
end
end

# Code courtesy of Glenn Parker in Ruby Quiz #25
class Integer
# ...

def to_english
# ...
end

# ...
end

# ...

The extension to Enumerable is a simple counter of items. It creates and returns a Hash where the keys are unique items within the Enumerable object and the values are the counts for how many times that item occurs.

The method added to String handles steps in the Count and Say sequence just as we saw Simon do earlier for Look and Say. In this version, scan() is used to locate the letters which are transformed into the count Hash we just examined. The code then walks the letters in sort()ed order adding the count and letter to the result. The count is converted to an English word before it lands in the String using Glen's code.

The rest of the problem is just locating the repeating sequence:

ruby
SEED = "LOOK AND SAY"

# ...

str = SEED
strs_seen = {}
0.upto( 9999 ){ |i|
puts "%4d. %s" % [ i, str ]
if last_seen_on = strs_seen[ str ]
print "Cycle from #{i-1} back to #{last_seen_on}"
puts " (#{i - last_seen_on} lines in cycle)"
break
else
strs_seen[ str ] = i
end
str = str.look_and_say
}

This code runs in a loop with a counter for two reasons. First, we may need to protect ourselves from too much looping, in case the pattern never repeats or doesn't start repeating for a very long time. The other reason is that it allows us to track where the repetition begins and how long each cycle is.

The if statement is what actually detects the duplicate. On all iterations, save the last one, the else branch just adds the current phrase to a seen Hash. This has phrases for the key and the iteration count where they appeared as the value. When a duplicate comes up, the if branch will run, printing the collected statistics and ending the iteration.

It's really just that simple.

My thanks to all who answered this random question form the wild. It's funny that those can be so much fun.

Tomorrow we will tackle a simple problem, but flex our programmer muscles by trying to solve it as efficiently as possible...