Palindromes Puzzle - Solutions
Itzik describes the solutions to the palindromes puzzle posted last week.
January 27, 2007
First of all, thanks to all who posted your solutions (publicly or privately).
I got solutions from: Steve Kass, Adam Machanic, Rob Farley,
Maciej Pilecki and FoxTrot.
I got two types of solutions:
1. Brute Force method - generating all possible sentences, then filtering only
the ones that are palindromes. This logic allows for a very short and simple
solution query, but is extremely inefficient.
2. Constructing sentences from the edges, going inwards, and this way not
pursuing sentences that are known not to have potential to become
palindromes.
The winners would have to be Adam Machanic, Rob Farley and Steve Kass;
all three used the second approach.
Here's an example for a solution based on the Brute Force method:
with c as( select cast(' ' + word + ' ' as varchar(max)) as sentence from words union all select c.sentence + w.word + ' ' from c join words as w on c.sentence not like '% ' + w.word + ' %')select rtrim(ltrim(sentence)) as palindromefrom cwhere replace(sentence, ' ', '') = reverse(replace(sentence, ' ', ''));
The nice thing about the brute force solution is of course that it's short
and simple. But it's extremely slow.
For n words, the brute force solution generates the following number of
sentences (before the outer query filters palindromes):
n + n*(n-1) + n*(n-1)*(n-2) + ... + n!
To illustrate how big those numbers can be, here's a recursive CTE that
calculates the number of sentences produced out of n unique words:
with c as( select n as words, cast(n as bigint) as sentences, cast(n as bigint) as v, n-1 as i from nums -- auxiliary table of numbers where n = 1)select words, sentencesfrom c where i = 0order by words;words sentences----------- --------------------1 12 43 154 645 3256 19567 136998 1096009 98640910 986410011 10850511112 130206134413 1692679748514 23697516480415 355462747207516 5687403955321617 96685867240468918 1740345610328442019 33066566596240399920 6613313319248080000
I think that the brute force solution is a good example to use for teaching
purposes to describe the logic of recursive queries.
As for the more efficient method (constructing sentences from the edges,
going inwards), following are the solutions that I got.
I used the following sample data to test the performance of the solutions:
create table words(word varchar(50) not null primary key);insert into words values('even');insert into words values('never');insert into words values('odd');insert into words values('or');insert into words values('r');insert into words values('e');insert into words values('n');insert into words values('er');insert into words values('madam');insert into words values('adam');
Solution by Steve Kass (165 ms):
create function ends( @lft varchar(200), @rgt varchar(200), @used varchar(8000)) returns table as return with WW(w1word,w1wordX,w2word,w2wordX) as ( select w1.word,@lft+w1.word,w2.word,w2.word+@rgt from words as w1, words as w2 where substring(@lft+w1.word,1,len(w2.word+@rgt)) = substring(reverse(w2.word+@rgt),1,len(@lft+w1.word)) and (w2.word w1.word) and @used not like '%[ *]'+w1.word+'[ *]%' and @used not like '%[ *]'+w2.word+'[ *]%' ), T(w,p,good,side,used) as ( select w1wordX+w2wordX as w, left(w1wordX,len(w2wordX)) + '.' + right(w2wordX,len(w1wordX)) as p, len(substring(w1wordX,1,len(w2wordX))) as good, sign(len(w2wordX)-len(w1wordX)) as side, replace(@used,'*',' '+w1word+'*'+w2word+' ') as used from WW union all select @lft+word+@rgt, @lft+word+@rgt, len(@lft+word+@rgt)/2, 0, replace(@used,'*',' '+word+' ') from words where @lft+word+@rgt = reverse(@lft+word+@rgt) and @used not like '%[ *]'+word+'[ *]%' ) select *, substring(w,good+1,len(w)-2*good) as middle from Tgowith S(w,p,good,side,used,middle) as ( select * from ends('','','*') union all select E.* from S cross apply ends( parsename(S.p,2)+case when side=-1 then middle else '' end, case when side=1 then middle else '' end + parsename(S.p,1), S.used ) as E) select replace(rtrim(ltrim(used)),'*',' ') from S where middle = reverse(middle)go
Steve's description of the solution:
"It first finds single words that are palindromes or pairs of
words that could be the leftmost and rightmost words in a
palindrome, such as "odder do" or "re bracer" For this to
be possible, left(first,minLen) = left(reverse(second),minLen)
where minLen is the minimum length of the two words.
To fill in "odder do" to a palindrome, we need to find words
that fit between odder and do that make a palindrome if
"der" (the unpaired middle) is appended to the left of those
words.
One way to start is by finding all possible single or double
words ending in red and having an appropriate variation of the
property above, so that they still can be extended inward to
a palindrome. For example, inserting "oblate bored" is ok,
because all the new mismatch occurs in oblate. On the other
hand, inserting "cat tarred" does not work, because the
sentence "odder cat tarred do" cannot be made a palindrome
by adding words in the middle. The c and the r will never
match.
The ends() function finds appropriate single or double words
that can take a potential palindrome and make it closer to one
(based on number of characters from each end that match).
I start with nothing, so there is no unmatched @lft or @rgt, nor
@used (which holds used words). That's the base case of the
recursive query. Then for every potential palindrome found,
@lft or @rgt is identified, and @used is, so when ends() is
invoked again, it finds fill-ins for the existing mismatch
in that potential palindrome. Because we need a table of
fill-ins for each potential palindrome, and to do this,
ends() needs information specific to that potential palindrome
(the @lft or @rgt along with the @used string of used words)
a CROSS APPLY is the appropriate way to go.
Once there are no more possible fill-ins, there will be both
incomplete and complete palindromes in the table. The complete
ones are those for which @lft+@rgt is itself a palindrome,
so that ' is a solution for the fill-in, meaning that the
potential palindrome is already a palindrome."
Solution by Adam Machanic (41 ms):
with n as( select convert(varchar(1000), a.word) as parta, convert(varchar(1000), b.word) as partb, convert(varchar(1000), a.word) as repparta, convert(varchar(1000), reverse(b.word)) as reppartb from words a, words b where a.word b.word and ((len(a.word) > len(b.word) and a.word like reverse(b.word) + '%') or (reverse(b.word) like a.word + '%')) union all select convert(varchar(1000), case when len(repparta) > len(reppartb) then parta when len(repparta) len(repparta) then partb when len(repparta) > len(reppartb) then w.word + ' ' + partb when p.n = 1 then partb else w.word + ' ' + partb end) as partb, convert(varchar(1000), replace(case when len(repparta) > len(reppartb) then parta when len(repparta) len(repparta) then partb when len(repparta) > len(reppartb) then w.word + ' ' + partb when p.n = 1 then partb else w.word + ' ' + partb end, ' ', ''))) from n cross join words w cross apply ( select 1 union all select 2 where len(repparta) = len(reppartb) ) p(n) where case when ' ' + parta + ' ' + partb + ' ' not like '% ' + word + ' %' then case when len(repparta) > len(reppartb) then case when len(word) > len(repparta)-len(reppartb) then case when right(word, len(repparta)-len(reppartb)) = right(repparta, len(repparta)-len(reppartb)) then 1 else 0 end else case when word = substring(repparta, len(reppartb)+1, len(word)) then 1 else 0 end end when len(reppartb) > len(repparta) then case when len(word) > len(reppartb) - len(repparta) then case when left(word, len(reppartb)-len(repparta)) = substring(reppartb, len(repparta)+1, len(reppartb)-len(repparta)) then 1 else 0 end else case when word = substring(reppartb, len(repparta)+1, len(word)) then 1 else 0 end end else 1 --equal end else 0 end = 1)select finalfrom( select distinct rtrim(ltrim(parta + ' ' + partb)) final from n) mwhere replace(final, ' ', '') = reverse(replace(final, ' ', ''))union allselect wordfrom wordswhere word = reverse(word)
Adam's description of the solution:
"Much like the other solutions, the goal here is to build the palindrome from
the outside in. The anchor part of the recursive CTE finds all pairs in the
words table, then finds the shorter word and uses it in a LIKE predicate
against the longer word. The solution I discovered is to always reverse the
rightmost word/group of words in the list: If the two words are "abcd" and
"cba", the predicate 'abcd' LIKE REVERSE('cba') + '%' is true. Likewise, if
the words are "abc" and "dcba", the predicate REVERSE('dcba') LIKE 'abc' +
'%' is true. Based on these, the anchor part finds potential palindromes.
The recursive part continues this logic in a slightly different way. It
uses a cross-join to the Words table to get all combinations of words with
the input set. The case expression does the following: First, it makes sure
that the given word has not already been used in the input set. Next, it
determines whether the right part or left part is longer. The same basic
logic is used in both cases, so we'll follow the right part being longer.
Assume that the two parts are: "abc" and "dcba". The difference is one
character, and that character is "d". The logic is to append any possible
words onto the left part, so the CASE expression finds all words in this
case that start with the letter "d". There is also a special case for the
differential being longer than the input. For instance, if the two parts
were "abc" and "gfedcba", and the word being compared were "de", that should
be a match. In that case, the expression looks at only as many characters
in the difference as the number of characters in the input word.
Finally, there is a special case for if the two input words are equal in
length. If so, a new row will be generated using the CROSS APPLY operator
(using the correlation name P). If the input set has the parts "abc" and
"cba", the idea is that any possible word could be a seed for a new
palindrome, appended to either the right or left part. So if the lengths are
equal and P.n = 1, the input word is appended to the left part. Otherwise,
it's appended to the right part.
After all of the recursion is done, the outer query does a final check to
determine whether or not each row is a palindrome. This final set is
unioned with any single words in the table that are, themselves, palindromes
(checked using REVERSE)."
Solution by Rob Farley (22 ms):
with pals as(--Base select cast('' as varchar(max)) as startchars, cast('' as varchar(max)) as endchars, cast(' ' as varchar(max)) as startwords, cast(' ' as varchar(max)) as endwordsunion all--Sides even, so find a suitable word select startchars + w.word, endchars, startwords + w.word + ' ', endwords from pals p cross join words w where len(p.startchars) = len(p.endchars) and exists (select * from words w2 where w.word like reverse(w2.word) + '%' or reverse(w2.word) like w.word + '%') and startwords not like '% ' + w.word + ' %' and endwords not like '% ' + w.word + ' %'union all--Right longer than left, so find a word for there select startchars + w.word, endchars, startwords + w.word + ' ', endwords from pals p cross join words w where len(p.startchars) len(p.endchars) and (w.word like '%' + reverse(right(startchars,len(startchars)-len(endchars))) or reverse(right(startchars,len(startchars)-len(endchars))) like '%' + w.word) and startwords not like '% ' + w.word + ' %' and endwords not like '% ' + w.word + ' %')select rtrim(ltrim(startwords)) + ' ' + rtrim(ltrim(endwords))from palswhere len(startchars) > 0and startchars + endchars = reverse(startchars + endchars)
Rob's description of the solution:
"I figure that the best way of approaching this would be to build it up,
looking for words that would fit as candidates for being added to the start
or end of my palindromes. I wanted to have a space-separated word list as
well as a spaces-removed word list - one is good for both presentation at
the end and checking for unique words, the other is good for actually making
sure that the palindrome is being built correctly.
So taking the 'build it up' approach, a CTE seemed the way to go. I put a
starting scenario in which was to have an empty string at the start and end.
I was originally planning to remove this entry once the CTE was done, but
then later decisions changed my mind for me.
I figured I had three states that I could find myself in. One was when the
string holding the start of the palindrome was longer than the string
holding the end, another was when the string holding the end was longer than
the start, and the third state was when they were the same length. This
could then form a foundation for building the CTE (which I will come to in
the next paragraph). Once the CTE was built, I could filter out the working
lines - which were those ones where the result wasn't a palindrome, and the
base (empty) entry.
So when one string was longer than the other, I wanted to add a word to the
other side. This word had to be a candidate, so that if the difference
between the start-part and end-part was 'never' in the end-part, I could add
any word to the start-part that matched 'reven%', or was one of 'r', 're',
'rev', 'reve' (which translates into: 'reven' like word + '%'). Similar
logic would apply to the end. If the sides were even, I would have to find a
word for which there was candidate for the other side. So I wouldn't pick
'never' for the end-part if there wasn't a word that I would be able to
populate in the next step. I figured I could probably just throw in any word
here and the algorithm would still be valid (because it would get discarded
as a candidate later), but I included it anyway, thinking that it would be
good to get rid of bad candidates earlier.
Regardless of what state I was in, I also had to make sure I hadn't already
used the word, which is why lines like
startwords not like '% ' + w.word + ' %'
--keep appearing everywhere.
The fact that I wanted to cater for when I had the same length string on
both sides influenced me to leave the base case in there as an empty string.
Otherwise I would need to have a base which queried the words table for
candidate words, and that didn't really appeal to me all that much. And I
wasn't really looking for the fastest way of doing it, I was just looking
for a way of doing it.
And that's about it really... I just built the palindromes from the outside
in, populating a CTE, which I then queried to make sure that I only returned
completed palindromes."
And here's the solution that I came up with (267 ms):
with c as( select cast(word as varchar(max)) as st, cast('' as varchar(max)) as ed, 0 as consider from words where word = reverse(word) union all select ' ' + w1.word + ' ', w2.word + ' ', 1 from words as w1 join words as w2 on w1.word w2.word where w1.word like reverse(w2.word) + '%' or reverse(w2.word) like w1.word + '%' union all select st + word + ' ', ed, 0 from c join words on consider = 1 and st + ed not like '% ' + word + ' %' where replace(st + word + ed, ' ', '') = reverse(replace(st + word + ed, ' ', '')) union all select st + w1.word + ' ', w2.word + ' ' + ed, 1 from c join words as w1 on consider = 1 and st + ed not like '% ' + w1.word + ' %' join words as w2 on w1.word w2.word and st + ed not like '% ' + w2.word + ' %' where replace(st, ' ', '') + w1.word like reverse(w2.word + replace(ed, ' ', '')) + '%' or reverse(w2.word + replace(ed, ' ', '')) like replace(st, ' ', '') + w1.word + '%')select ltrim(rtrim(st + ed)) as palindromefrom cwhere replace(st + ed, ' ', '') = reverse(replace(st + ed, ' ', ''));
The logic in my solution is similar to the others who did not use the brute
force method.
The first CTE query (non recursive) simply isolates the single words that are
palindromes.
The second CTE query (also non recursive) generates two parts of sentences
(start part and end part) out of pairs of words that are either palindromes or
have potential to become palindromes. Start and end pairs have potential to
become a palindrome if start begins with the reverse of end, or the reverse of
end begins with the start.
The third CTE query (recursive) adds a word between start and end
only for words that do not yet appear in the sentence, and make the sentence
a palindrome.
The fourth CTE query (recursive) adds a pair of words; one after the start and
one before the end, only in cases that become palindromes or have the
potential to become palindromes.
Cheers,
--
BG
About the Author
You May Also Like