Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 22, 10:17*am, Luciano Paulino da Silva
wrote:
Dear all,
Some time ago (http://groups.google.com.br/group/
microsoft.public.excel.worksheet.functions/browse_thread/thread/
6b068321053a5c90/c6dcff10540e4bc2?q=palindromes+excel+bernie&lnk=ol &)
Bernie Deitrick helped me to solve a problem related to palindromes
and repeats detection on a string of letters. At present, I need
perform some change on that macros in order to detect non-redundant
palindromes and repeats. In this way, for the sequence bellow my
solution it would be:

QGAGAAAAAAAAGGAGQGG

13 Palindromes detected
GAG
AGA
GAAAAAAAAG
AA
AAA
AAAA
AAAAA
AAAAAA
AAAAAAA
AAAAAAAA
AGGA
GG
GQG * * 1 * * * 3

Now the solution it would be:

QGAGAAAAAAAAGGAGQGG

13 Non-redundant Palindromes detected

GAG
GAAAAAAAAG
GG

The big palindromes should be preferred in the occurrences.
Thanks in advance,
Luciano


I'm not quite sure what you mean by "Non-redundant". I take it to mean
something like "maximal non-overlapping"

I didn't modify Bernie Deitrick's code (so it might have a different
input/output) but came up with the following:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''

Function Reverse(S As String) As String
Dim i As Long, n As Long
n = Len(S)
For i = 1 To n
Reverse = Reverse & Mid(S, n - i + 1, 1)
Next i
End Function

Function FindMaxInitPalin(SearchString As String) As String
'Given a string it returns the longest initial segment which is a
palindrome
Dim i As Long, j As Long
Dim init As String
For i = 1 To Len(SearchString)
init = Mid(SearchString, 1, i)
If init = Reverse(init) Then j = i
Next i
FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
'returns a variant array consisting of maximal nonoverlapping
palindromes
Dim Palindromes As Variant
Dim InputString As String
Dim chomp As String
Dim n As Long

InputString = SearchString
Do While Len(InputString) 0
chomp = FindMaxInitPalin(InputString)
InputString = Mid(InputString, Len(chomp) + 1)
If Len(chomp) 1 Then
If IsEmpty(Palindromes) Then
ReDim Palindromes(1 To 1)
Palindromes(1) = chomp
Else
n = UBound(Palindromes)
ReDim Preserve Palindromes(1 To n + 1)
Palindromes(n + 1) = chomp
End If
End If
Loop
FindMaxPalins = Palindromes
End Function

Sub Main()
Dim SearchString As String
Dim Results As Variant
Dim i As Long

SearchString = InputBox("Enter a string")
Results = FindMaxPalins(SearchString)
If IsEmpty(Results) Then
Range("A1").Value = "No palindromes found"
Else
Range("A1").Value = UBound(Results) & " maximal palidromes
found: "
For i = 1 To UBound(Results)
Range("A1").Offset(i).Value = Results(i)
Next i
End If
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''

The function FindMaxPalins is the heart of the code. Main() is a
simple driver program that dumps the results in column A and can of
course be modified to put the output somewhere else. On your sample
input I get:

4 maximal palidromes found:
GAG
AAAAAAAA
GG
GQG

I notice that you didn't include GQG in your desired output. Was that
an oversight on your part? If not, I really don't know what you mean
by "nonredundant."

Hope that helps

-John Coleman
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 23, 8:59*am, John Coleman wrote:
On Oct 22, 10:17*am, Luciano Paulino da Silva





wrote:
Dear all,
Some time ago (http://groups.google.com.br/group/
microsoft.public.excel.worksheet.functions/browse_thread/thread/
6b068321053a5c90/c6dcff10540e4bc2?q=palindromes+excel+bernie&lnk=ol &)
Bernie Deitrick helped me to solve a problem related to palindromes
and repeats detection on a string of letters. At present, I need
perform some change on that macros in order to detect non-redundant
palindromes and repeats. In this way, for the sequence bellow my
solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Palindromes detected
GAG
AGA
GAAAAAAAAG
AA
AAA
AAAA
AAAAA
AAAAAA
AAAAAAA
AAAAAAAA
AGGA
GG
GQG * * 1 * * * 3


Now the solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Non-redundant Palindromes detected


GAG
GAAAAAAAAG
GG


The big palindromes should be preferred in the occurrences.
Thanks in advance,
Luciano


I'm not quite sure what you mean by "Non-redundant". I take it to mean
something like "maximal non-overlapping"

I didn't modify Bernie Deitrick's code (so it might have a different
input/output) but came up with the following:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''*'''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''

Function Reverse(S As String) As String
* * Dim i As Long, n As Long
* * n = Len(S)
* * For i = 1 To n
* * * * Reverse = Reverse & Mid(S, n - i + 1, 1)
* * Next i
End Function

Function FindMaxInitPalin(SearchString As String) As String
* * 'Given a string it returns the longest initial segment which is a
palindrome
* * Dim i As Long, j As Long
* * Dim init As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * If init = Reverse(init) Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
'returns a variant array consisting of maximal nonoverlapping
palindromes
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long

* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long

* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * Range("A1").Value = UBound(Results) & " maximal palidromes
found: "
* * * * For i = 1 To UBound(Results)
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''*'''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '*''''''''''''''

The function FindMaxPalins is the heart of the code. Main() is a
simple driver program that dumps the results in column A and can of
course be modified to put the output somewhere else. On your sample
input I get:

4 maximal palidromes found:
GAG
AAAAAAAA
GG
GQG

I notice that you didn't include GQG in your desired output. Was that
an oversight on your part? If not, I really don't know what you mean
by "nonredundant."

Hope that helps

-John Coleman- Hide quoted text -

- Show quoted text -


I looked at my post and saw that google had introduced a couple of
line breaks that would prevent proper parsing. In case you can't
figure it out, I removed the offending comments and rewrote the code
slightly at the end to get shorter lines:

Function Reverse(S As String) As String
Dim i As Long, n As Long
n = Len(S)
For i = 1 To n
Reverse = Reverse & Mid(S, n - i + 1, 1)
Next i
End Function

Function FindMaxInitPalin(SearchString As String) As String
Dim i As Long, j As Long
Dim init As String
For i = 1 To Len(SearchString)
init = Mid(SearchString, 1, i)
If init = Reverse(init) Then j = i
Next i
FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
Dim Palindromes As Variant
Dim InputString As String
Dim chomp As String
Dim n As Long

InputString = SearchString
Do While Len(InputString) 0
chomp = FindMaxInitPalin(InputString)
InputString = Mid(InputString, Len(chomp) + 1)
If Len(chomp) 1 Then
If IsEmpty(Palindromes) Then
ReDim Palindromes(1 To 1)
Palindromes(1) = chomp
Else
n = UBound(Palindromes)
ReDim Preserve Palindromes(1 To n + 1)
Palindromes(n + 1) = chomp
End If
End If
Loop
FindMaxPalins = Palindromes
End Function

Sub Main()
Dim SearchString As String
Dim Results As Variant
Dim i As Long, n As Long

SearchString = InputBox("Enter a string")
Results = FindMaxPalins(SearchString)
If IsEmpty(Results) Then
Range("A1").Value = "No palindromes found"
Else
n = UBound(Results)
Range("A1").Value = n & " maximal palidromes found: "
For i = 1 To n
Range("A1").Offset(i).Value = Results(i)
Next i
End If
End Sub

-John Coleman
  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 23, 9:07*am, John Coleman wrote:
On Oct 23, 8:59*am, John Coleman wrote:





On Oct 22, 10:17*am, Luciano Paulino da Silva


wrote:
Dear all,
Some time ago (http://groups.google.com.br/group/
microsoft.public.excel.worksheet.functions/browse_thread/thread/
6b068321053a5c90/c6dcff10540e4bc2?q=palindromes+excel+bernie&lnk=ol &)
Bernie Deitrick helped me to solve a problem related to palindromes
and repeats detection on a string of letters. At present, I need
perform some change on that macros in order to detect non-redundant
palindromes and repeats. In this way, for the sequence bellow my
solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Palindromes detected
GAG
AGA
GAAAAAAAAG
AA
AAA
AAAA
AAAAA
AAAAAA
AAAAAAA
AAAAAAAA
AGGA
GG
GQG * * 1 * * * 3


Now the solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Non-redundant Palindromes detected


GAG
GAAAAAAAAG
GG


The big palindromes should be preferred in the occurrences.
Thanks in advance,
Luciano


I'm not quite sure what you mean by "Non-redundant". I take it to mean
something like "maximal non-overlapping"


I didn't modify Bernie Deitrick's code (so it might have a different
input/output) but came up with the following:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''


Function Reverse(S As String) As String
* * Dim i As Long, n As Long
* * n = Len(S)
* * For i = 1 To n
* * * * Reverse = Reverse & Mid(S, n - i + 1, 1)
* * Next i
End Function


Function FindMaxInitPalin(SearchString As String) As String
* * 'Given a string it returns the longest initial segment which is a
palindrome
* * Dim i As Long, j As Long
* * Dim init As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * If init = Reverse(init) Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
'returns a variant array consisting of maximal nonoverlapping
palindromes
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long


* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * Range("A1").Value = UBound(Results) & " maximal palidromes
found: "
* * * * For i = 1 To UBound(Results)
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '*'*''''''''''''''


The function FindMaxPalins is the heart of the code. Main() is a
simple driver program that dumps the results in column A and can of
course be modified to put the output somewhere else. On your sample
input I get:


4 maximal palidromes found:
GAG
AAAAAAAA
GG
GQG


I notice that you didn't include GQG in your desired output. Was that
an oversight on your part? If not, I really don't know what you mean
by "nonredundant."


Hope that helps


-John Coleman- Hide quoted text -


- Show quoted text -


I looked at my post and saw that google had introduced a couple of
line breaks that would prevent proper parsing. In case you can't
figure it out, I removed the offending comments and rewrote the code
slightly at the end to get shorter lines:

Function Reverse(S As String) As String
* * Dim i As Long, n As Long
* * n = Len(S)
* * For i = 1 To n
* * * * Reverse = Reverse & Mid(S, n - i + 1, 1)
* * Next i
End Function

Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * If init = Reverse(init) Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long

* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long

* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * n = UBound(Results)
* * * * Range("A1").Value = n & " maximal palidromes found: "
* * * * For i = 1 To n
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub

-John Coleman- Hide quoted text -

- Show quoted text -


Upon further reflection I realized that my algorithm was inefficient
in that Reverse() was starting from scratch at each function call
whereas in the way that it was actually used in FindMaxInitPalin(),
successive calls to Reverse() simply created strings that were one
character longer. I thus ditched Reverse() and modified FindMaxInit()
to keep a running copy of the reflection of initial segments. The
resulting code proved to be hundreds of times quicker when I tried it
on randomly generated strings of length 5000. If your usage is limited
to strings that you are typing by hand you may not notice the
difference, but if you are looping through large collections of
strings you might. Further optimizations are possible- for example, I
didn't attempt to exploit the fact that you can sometimes recognize
non-palindromes when slightly more than half of a string is scanned.

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''
'Revised code:

Function FindMaxInitPalin(SearchString As String) As String
Dim i As Long, j As Long
Dim init As String
Dim reflection As String
For i = 1 To Len(SearchString)
init = Mid(SearchString, 1, i)
reflection = Mid(init, i) & reflection
If init = reflection Then j = i
Next i
FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
Dim Palindromes As Variant
Dim InputString As String
Dim chomp As String
Dim n As Long

InputString = SearchString
Do While Len(InputString) 0
chomp = FindMaxInitPalin(InputString)
InputString = Mid(InputString, Len(chomp) + 1)
If Len(chomp) 1 Then
If IsEmpty(Palindromes) Then
ReDim Palindromes(1 To 1)
Palindromes(1) = chomp
Else
n = UBound(Palindromes)
ReDim Preserve Palindromes(1 To n + 1)
Palindromes(n + 1) = chomp
End If
End If
Loop
FindMaxPalins = Palindromes
End Function

Sub Main()
Dim SearchString As String
Dim Results As Variant
Dim i As Long, n As Long

SearchString = InputBox("Enter a string")
Results = FindMaxPalins(SearchString)
If IsEmpty(Results) Then
Range("A1").Value = "No palindromes found"
Else
n = UBound(Results)
Range("A1").Value = n & " maximal palidromes found: "
For i = 1 To n
Range("A1").Offset(i).Value = Results(i)
Next i
End If
End Sub
  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 23 out, 12:45, John Coleman wrote:
On Oct 23, 9:07*am, John Coleman wrote:



On Oct 23, 8:59*am, John Coleman wrote:


On Oct 22, 10:17*am,LucianoPaulino da Silva


wrote:
Dear all,
Some time ago (http://groups.google.com.br/group/
microsoft.public.excel.worksheet.functions/browse_thread/thread/
6b068321053a5c90/c6dcff10540e4bc2?q=palindromes+excel+bernie&lnk=ol &)
Bernie Deitrick helped me to solve a problem related to palindromes
and repeats detection on a string of letters. At present, I need
perform some change on that macros in order to detect non-redundant
palindromes and repeats. In this way, for the sequence bellow my
solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Palindromes detected
GAG
AGA
GAAAAAAAAG
AA
AAA
AAAA
AAAAA
AAAAAA
AAAAAAA
AAAAAAAA
AGGA
GG
GQG * * 1 * * * 3


Now the solution it would be:


QGAGAAAAAAAAGGAGQGG


13 Non-redundant Palindromes detected


GAG
GAAAAAAAAG
GG


The big palindromes should be preferred in the occurrences.
Thanks in advance,
Luciano


I'm not quite sure what you mean by "Non-redundant". I take it to mean
something like "maximal non-overlapping"


I didn't modify Bernie Deitrick's code (so it might have a different
input/output) but came up with the following:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''


Function Reverse(S As String) As String
* * Dim i As Long, n As Long
* * n = Len(S)
* * For i = 1 To n
* * * * Reverse = Reverse & Mid(S, n - i + 1, 1)
* * Next i
End Function


Function FindMaxInitPalin(SearchString As String) As String
* * 'Given a string it returns the longest initial segment which is a
palindrome
* * Dim i As Long, j As Long
* * Dim init As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * If init = Reverse(init) Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
'returns a variant array consisting of maximal nonoverlapping
palindromes
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long


* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * Range("A1").Value = UBound(Results) & " maximal palidromes
found: "
* * * * For i = 1 To UBound(Results)
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '*'*''''''''''''''


The function FindMaxPalins is the heart of the code. Main() is a
simple driver program that dumps the results in column A and can of
course be modified to put the output somewhere else. On your sample
input I get:


4 maximal palidromes found:
GAG
AAAAAAAA
GG
GQG


I notice that you didn't include GQG in your desired output. Was that
an oversight on your part? If not, I really don't know what you mean
by "nonredundant."


Hope that helps


-John Coleman- Hide quoted text -


- Show quoted text -


I looked at my post and saw that google had introduced a couple of
line breaks that would prevent proper parsing. In case you can't
figure it out, I removed the offending comments and rewrote the code
slightly at the end to get shorter lines:


Function Reverse(S As String) As String
* * Dim i As Long, n As Long
* * n = Len(S)
* * For i = 1 To n
* * * * Reverse = Reverse & Mid(S, n - i + 1, 1)
* * Next i
End Function


Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * If init = Reverse(init) Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long


* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * n = UBound(Results)
* * * * Range("A1").Value = n & " maximal palidromes found: "
* * * * For i = 1 To n
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub


-John Coleman- Hide quoted text -


- Show quoted text -


Upon further reflection I realized that my algorithm was inefficient
in that Reverse() was starting from scratch at each function call
whereas in the way that it was actually used in FindMaxInitPalin(),
successive calls to Reverse() simply created strings that were one
character longer. I thus ditched Reverse() and modified FindMaxInit()
to keep a running copy of the reflection of initial segments. The
resulting code proved to be hundreds of times quicker when I tried it
on randomly generated strings of length 5000. If your usage is limited
to strings that you are typing by hand you may not notice the
difference, but if you are looping through large collections of
strings you might. Further optimizations are possible- for example, I
didn't attempt to exploit the fact that you can sometimes recognize
non-palindromes when slightly more than half of a string is scanned.

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''
'Revised code:

Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Variant
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long

* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If IsEmpty(Palindromes) Then
* * * * * * * * ReDim Palindromes(1 To 1)
* * * * * * * * Palindromes(1) = chomp
* * * * * * Else
* * * * * * * * n = UBound(Palindromes)
* * * * * * * * ReDim Preserve Palindromes(1 To n + 1)
* * * * * * * * Palindromes(n + 1) = chomp
* * * * * * End If
* * * * End If
* * Loop
* * FindMaxPalins = Palindromes
End Function

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long

* * SearchString = InputBox("Enter a string")
* * Results = FindMaxPalins(SearchString)
* * If IsEmpty(Results) Then
* * * * Range("A1").Value = "No palindromes found"
* * Else
* * * * n = UBound(Results)
* * * * Range("A1").Value = n & " maximal palidromes found: "
* * * * For i = 1 To n
* * * * * * Range("A1").Offset(i).Value = Results(i)
* * * * Next i
* * End If
End Sub


Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano
  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats


Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano



Dear Luciano,

This should work:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''

Function FindMaxInitPalin(SearchString As String) As String
Dim i As Long, j As Long
Dim init As String
Dim reflection As String
For i = 1 To Len(SearchString)
init = Mid(SearchString, 1, i)
reflection = Mid(init, i) & reflection
If init = reflection Then j = i
Next i
FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
Dim Palindromes As Object
Dim InputString As String
Dim chomp As String
Dim n As Long

Set Palindromes = CreateObject("Scripting.Dictionary")
InputString = SearchString
Do While Len(InputString) 0
chomp = FindMaxInitPalin(InputString)
InputString = Mid(InputString, Len(chomp) + 1)
If Len(chomp) 1 Then
If Not Palindromes.Exists(chomp) Then
Palindromes.Add chomp, 1
Else
Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
End If
End If
Loop
If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function

Sub Main()
Dim SearchString As String
Dim Results As Variant
Dim i As Long, n As Long
Dim palindrome As Variant

On Error GoTo no_palindromes
SearchString = InputBox("Enter a string")
Set Results = FindMaxPalins(SearchString)
n = Results.count
Range("A1").Value = n & " distinct palindromes found"
Range("A2").Value = "Palindrome"
Range("B2").Value = "Frequency"
For Each palindrome In Results.Keys
i = i + 1
Range("A2").Offset(i).Value = palindrome
Range("B2").Offset(i).Value = Results.Item(palindrome)
Next palindrome
Exit Sub
no_palindromes:
Range("A1").Value = "No palindromes found"
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell

Out of curiosity, what sort of things are you doing with this code?
Seems interesting.

-John Coleman


  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 02:10, John Coleman wrote:
Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,

This should work:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''

Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long

* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant

* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell

Out of curiosity, what sort of things are you doing with this code?
Seems interesting.

-John Coleman


Dear John Coleman,
Thank you very much for your help. Let me try explain the aim of the
present study. Let me first introduce myself. My name is Luciano
Paulino da Silva, I`m a scientific researcher in the area of
bionanotechnology. In some of our studies we have characterized the
sequences of proteins and peptides that are formed by strings of 20
possible aminoacid residues commonly represented by letters. We always
needs a lot of statistical and mathematical comparisons among strings.
In this way, we have for some situations even thousands of these
strings and we are interested in analyze them from databanks. Some
time ago, I realized that some of these protein sequences have inside
them the presence of palindromes and repeats and such structures are
very important for their biological functions. In this way, this code
and others are important to test some situations in order to in some
days perform these analyses in large scale. In this way, if it would
be your interest I would be pleased if you accept collaborate with us.
Concerning your code, at present we need some additional things like:
1-The string shoud be typed in A1 cell instead use an inputbox;
2-In addition to palindromes, an identitical approach will have to be
performed to repeats in the strings;

Thanks in advance,
Luciano
  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 02:10, John Coleman wrote:
Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,

This should work:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''

Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function

Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long

* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant

* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell

Out of curiosity, what sort of things are you doing with this code?
Seems interesting.

-John Coleman


And another thing that I forgot to say is that in addition to the
frequency, it is necessary a column C containing the number of letters
in the palindrome found listed in each line.
Thank you
Luciano
  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 24, 6:16*am, Luciano Paulino da Silva
wrote:
On 24 out, 02:10, John Coleman wrote:





Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,


This should work:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''


Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes..Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant


* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''*'''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell


Out of curiosity, what sort of things are you doing with this code?
Seems interesting.


-John Coleman


Dear John Coleman,
Thank you very much for your help. Let me try explain the aim of the
present study. Let me first introduce myself. My name is Luciano
Paulino da Silva, I`m a scientific researcher in the area of
bionanotechnology. In some of our studies we have characterized the
sequences of proteins and peptides that are formed by strings of 20
possible aminoacid residues commonly represented by letters. We always
needs a lot of statistical and mathematical comparisons among strings.
In this way, we have for some situations even thousands of these
strings and we are interested in analyze them from databanks. Some
time ago, I realized that some of these protein sequences have inside
them the presence of palindromes and repeats and such structures are
very important for their biological functions. In this way, this code
and others are important to test some situations in order to in some
days perform these analyses in large scale. In this way, if it would
be your interest I would be pleased if you accept collaborate with us.
Concerning your code, at present we need some additional things like:
1-The string shoud be typed in A1 cell instead use an inputbox;
2-In addition to palindromes, an identitical approach will have to be
performed to repeats in the strings;

Thanks in advance,
Luciano- Hide quoted text -

- Show quoted text -


Dear Luciano,

I suspected that your interests were biological. I am a mathematician
who likes to program as a hobby (and sometimes as an aid to research
or teaching). I have a friend in the biology department who sought my
aid a while back in writing a program which generated random protein
strings with certain properties, and I recognized G,A,Q from your
original post as possibly being abbreviations for glycine, etc. I am
on sabbatical this semester, so have some time on my hands and would
be willing to help you out some (but am currently writing a paper - so
I don't have unlimitted time), though for more serious analysis you
should probably recruit a specialist in bioinformatics who understands
string processing and pattern matching much more than I do. Feel free
to email me if you want. My address is jcoleman followed by @ followed
by franciscan dot edu (don't want to make life too easy for robot
spammers).

What do you mean by repeats? Do you mean strings like GAQGAQ (where
the repeated sequence is adjacent) or GAQCGAQ (where the repeated
sequence can be separated by intervening letters)? And how would you
handle GAQGAQGAQ? Also - do you want all repeats, or do you want
repeats which are in some sense maximal (similar to the palindrome
code I wrote)? Please clarify.

I wrote a new main() to take its input from A1 and report lengths in
column C. The other two functions in the code were unchanged , so just
delete main() and replace it with:


Sub Main()
Dim SearchString As String
Dim Results As Variant
Dim i As Long, n As Long
Dim palindrome As Variant
Dim R As Range 'output anchor

On Error GoTo no_palindromes
Set Results = FindMaxPalins(Range("A1").Value)
Set R = Range("A3")
n = Results.count
R.Value = n & " distinct palindromes found"
R.Offset(1).Value = "Palindrome"
R.Offset(1, 1).Value = "Frequency"
R.Offset(1, 2).Value = "Length"
For Each palindrome In Results.Keys
i = i + 1
R.Offset(i + 1).Value = palindrome
R.Offset(i + 1, 1).Value = Results.Item(palindrome)
R.Offset(i + 1, 2).Value = Len(palindrome)
Next palindrome
Exit Sub
no_palindromes:
R.Value = "No palindromes found"
End Sub

Hope that helps

-John Coleman
  #9   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 10:33, John Coleman wrote:
On Oct 24, 6:16*am,LucianoPaulino da Silva



wrote:
On 24 out, 02:10, John Coleman wrote:


Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,


This should work:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''


Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant


* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''*'''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell


Out of curiosity, what sort of things are you doing with this code?
Seems interesting.


-John Coleman


Dear John Coleman,
Thank you very much for your help. Let me try explain the aim of the
present study. Let me first introduce myself. My name isLuciano
Paulino da Silva, I`m a scientific researcher in the area of
bionanotechnology. In some of our studies we have characterized the
sequences of proteins and peptides that are formed by strings of 20
possible aminoacid residues commonly represented by letters. We always
needs a lot of statistical and mathematical comparisons among strings.
In this way, we have for some situations even thousands of these
strings and we are interested in analyze them from databanks. Some
time ago, I realized that some of these protein sequences have inside
them the presence of palindromes and repeats and such structures are
very important for their biological functions. In this way, this code
and others are important to test some situations in order to in some
days perform these analyses in large scale. In this way, if it would
be your interest I would be pleased if you accept collaborate with us.
Concerning your code, at present we need some additional things like:
1-The string shoud be typed in A1 cell instead use an inputbox;
2-In addition to palindromes, an identitical approach will have to be
performed to repeats in the strings;


Thanks in advance,
Luciano- Hide quoted text -


- Show quoted text -


DearLuciano,

I suspected that your interests were biological. I am a mathematician
who likes to program as a hobby (and sometimes as an aid to research
or teaching). I have a friend in the biology department who sought my
aid a while back in writing a program which generated random protein
strings with certain properties, and I recognized G,A,Q from your
original post as possibly being abbreviations for glycine, etc. I am
on sabbatical this semester, so have some time on my hands and would
be willing to help you out some (but am currently writing a paper - so
I don't have unlimitted time), though for more serious analysis you
should probably recruit a specialist in bioinformatics who understands
string processing and pattern matching much more than I do. Feel free
to email me if you want. My address is jcoleman followed by @ followed
by franciscan dot edu (don't want to make life too easy for robot
spammers).

What do you mean by repeats? Do you mean strings like GAQGAQ (where
the repeated sequence is adjacent) or GAQCGAQ (where the repeated
sequence can be separated by intervening letters)? And how would you
handle GAQGAQGAQ? Also - do you want all repeats, or do you want
repeats which are in some sense maximal (similar to the palindrome
code I wrote)? Please clarify.

I wrote a new main() to take its input from A1 and report lengths in
column C. The other two functions in the code were unchanged , so just
delete main() and replace it with:

Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant
* * Dim R As Range 'output anchor

* * On Error GoTo no_palindromes
* * Set Results = FindMaxPalins(Range("A1").Value)
* * Set R = Range("A3")
* * n = Results.count
* * R.Value = n & " distinct palindromes found"
* * R.Offset(1).Value = "Palindrome"
* * R.Offset(1, 1).Value = "Frequency"
* * R.Offset(1, 2).Value = "Length"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * R.Offset(i + 1).Value = palindrome
* * * * R.Offset(i + 1, 1).Value = Results.Item(palindrome)
* * * * R.Offset(i + 1, 2).Value = Len(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * R.Value = "No palindromes found"
End Sub

Hope that helps

-John Coleman


Dear John Coleman,
Thak you very much for your code. It is perfect for this investigation
approach.
That`s nice, I`m a biologist but among my graduate students there are
a mathematician because your area is very important to our studies.
That`s interesting. I have also a smal VBA-based program to generate
random polypeptides and this is one of our interests.
I will prepare a small text telling you our interests in some string
processing and you will be able to tell us if could help us.

Yes, the repeat sequence could be adjacent or separated by intervening
letters. Both situations are possible on protein sequences. In the
case of GAQGAQGAQ it would be 3 GAQ repeats, using the same logic of
the palindromes. Yes, it is similar to the palindromes, and even some
palindromes are repeats but not all repeats are palindromes...
Thanks in advance,
Luciano

  #10   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 24, 2:36*pm, Luciano Paulino da Silva
wrote:
On 24 out, 10:33, John Coleman wrote:





On Oct 24, 6:16*am,LucianoPaulino da Silva


wrote:
On 24 out, 02:10, John Coleman wrote:


Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,


This should work:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''


Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant


* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**'''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell


Out of curiosity, what sort of things are you doing with this code?
Seems interesting.


-John Coleman


Dear John Coleman,
Thank you very much for your help. Let me try explain the aim of the
present study. Let me first introduce myself. My name isLuciano
Paulino da Silva, I`m a scientific researcher in the area of
bionanotechnology. In some of our studies we have characterized the
sequences of proteins and peptides that are formed by strings of 20
possible aminoacid residues commonly represented by letters. We always
needs a lot of statistical and mathematical comparisons among strings..
In this way, we have for some situations even thousands of these
strings and we are interested in analyze them from databanks. Some
time ago, I realized that some of these protein sequences have inside
them the presence of palindromes and repeats and such structures are
very important for their biological functions. In this way, this code
and others are important to test some situations in order to in some
days perform these analyses in large scale. In this way, if it would
be your interest I would be pleased if you accept collaborate with us..
Concerning your code, at present we need some additional things like:
1-The string shoud be typed in A1 cell instead use an inputbox;
2-In addition to palindromes, an identitical approach will have to be
performed to repeats in the strings;


Thanks in advance,
Luciano- Hide quoted text -


- Show quoted text -


DearLuciano,


I suspected that your interests were biological. I am a mathematician
who likes to program as a hobby (and sometimes as an aid to research
or teaching). I have a friend in the biology department who sought my
aid a while back in writing a program which generated random protein
strings with certain properties, and I recognized G,A,Q from your
original post as possibly being abbreviations for glycine, etc. I am
on sabbatical this semester, so have some time on my hands and would
be willing to help you out some (but am currently writing a paper - so
I don't have unlimitted time), though for more serious analysis you
should probably recruit a specialist in bioinformatics who understands
string processing and pattern matching much more than I do. Feel free
to email me if you want. My address is jcoleman followed by @ followed
by franciscan dot edu (don't want to make life too easy for robot
spammers).


What do you mean by repeats? Do you mean strings like GAQGAQ (where
the repeated sequence is adjacent) or GAQCGAQ (where the repeated
sequence can be separated by intervening letters)? And how would you
handle GAQGAQGAQ? Also - do you want all repeats, or do you want
repeats which are in some sense maximal (similar to the palindrome
code I wrote)? Please clarify.


I wrote a new main() to take its input from A1 and report lengths in
column C. The other two functions in the code were unchanged , so just
delete main() and replace it with:


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant
* * Dim R As Range 'output anchor


* * On Error GoTo no_palindromes
* * Set Results = FindMaxPalins(Range("A1").Value)
* * Set R = Range("A3")
* * n = Results.count
* * R.Value = n & " distinct palindromes found"
* * R.Offset(1).Value = "Palindrome"
* * R.Offset(1, 1).Value = "Frequency"
* * R.Offset(1, 2).Value = "Length"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * R.Offset(i + 1).Value = palindrome
* * * * R.Offset(i + 1, 1).Value = Results.Item(palindrome)
* * * * R.Offset(i + 1, 2).Value = Len(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * R.Value = "No palindromes found"
End Sub


Hope that helps


-John Coleman


Dear John Coleman,
Thak you very much for your code. It is perfect for this investigation
approach.
That`s nice, I`m a biologist but among my graduate students there are
a mathematician because your area is very important to our studies.
That`s interesting. I have also a smal VBA-based program to generate
random polypeptides and this is one of our interests.
I will prepare a small text telling you our interests in some string
processing and you will be able to tell us if could help us.

Yes, the repeat sequence could be adjacent or separated by intervening
letters. Both situations are possible on protein sequences. In the
case of GAQGAQGAQ it would be 3 GAQ repeats, using the same logic of
the palindromes. Yes, it is similar to the palindromes, and even some
palindromes are repeats but not all repeats are palindromes...
Thanks in advance,
Luciano- Hide quoted text -

- Show quoted text -


Dear Luciano,

I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?

-John


  #11   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 19:32, John Coleman wrote:
On Oct 24, 2:36*pm,LucianoPaulino da Silva



wrote:
On 24 out, 10:33, John Coleman wrote:


On Oct 24, 6:16*am,LucianoPaulino da Silva


wrote:
On 24 out, 02:10, John Coleman wrote:


Dear John Coleman,
Thank you very much for your fantastic code. It is almost perfect for
my situation.
However, I need that instead of a same detected palindrome is being
repeated all the time it appear in the string, it should be shown once
a time in A column, and that B column should be shown the frequence of
occcurrences of that specific palindrome. Do you know some way to do
that?
Thank you,
Luciano


DearLuciano,


This should work:


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''


Function FindMaxInitPalin(SearchString As String) As String
* * Dim i As Long, j As Long
* * Dim init As String
* * Dim reflection As String
* * For i = 1 To Len(SearchString)
* * * * init = Mid(SearchString, 1, i)
* * * * reflection = Mid(init, i) & reflection
* * * * If init = reflection Then j = i
* * Next i
* * FindMaxInitPalin = Mid(SearchString, 1, j)
End Function


Function FindMaxPalins(SearchString As String) As Variant
* * Dim Palindromes As Object
* * Dim InputString As String
* * Dim chomp As String
* * Dim n As Long


* * Set Palindromes = CreateObject("Scripting.Dictionary")
* * InputString = SearchString
* * Do While Len(InputString) 0
* * * * chomp = FindMaxInitPalin(InputString)
* * * * InputString = Mid(InputString, Len(chomp) + 1)
* * * * If Len(chomp) 1 Then
* * * * * * If Not Palindromes.Exists(chomp) Then
* * * * * * * * Palindromes.Add chomp, 1
* * * * * * Else
* * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1
* * * * * * End If
* * * * End If
* * Loop
* * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes
End Function


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant


* * On Error GoTo no_palindromes
* * SearchString = InputBox("Enter a string")
* * Set Results = FindMaxPalins(SearchString)
* * n = Results.count
* * Range("A1").Value = n & " distinct palindromes found"
* * Range("A2").Value = "Palindrome"
* * Range("B2").Value = "Frequency"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * Range("A2").Offset(i).Value = palindrome
* * * * Range("B2").Offset(i).Value = Results.Item(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * Range("A1").Value = "No palindromes found"
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''**'''''''''''''''''
Also, there is no need to stick to an inputbox for entering the
string. It is easy enough to modify main() so that it gets its input
from a cell


Out of curiosity, what sort of things are you doing with this code?
Seems interesting.


-John Coleman


Dear John Coleman,
Thank you very much for your help. Let me try explain the aim of the
present study. Let me first introduce myself. My name isLuciano
Paulino da Silva, I`m a scientific researcher in the area of
bionanotechnology. In some of our studies we have characterized the
sequences of proteins and peptides that are formed by strings of 20
possible aminoacid residues commonly represented by letters. We always
needs a lot of statistical and mathematical comparisons among strings.
In this way, we have for some situations even thousands of these
strings and we are interested in analyze them from databanks. Some
time ago, I realized that some of these protein sequences have inside
them the presence of palindromes and repeats and such structures are
very important for their biological functions. In this way, this code
and others are important to test some situations in order to in some
days perform these analyses in large scale. In this way, if it would
be your interest I would be pleased if you accept collaborate with us.
Concerning your code, at present we need some additional things like:
1-The string shoud be typed in A1 cell instead use an inputbox;
2-In addition to palindromes, an identitical approach will have to be
performed to repeats in the strings;


Thanks in advance,
Luciano- Hide quoted text -


- Show quoted text -


DearLuciano,


I suspected that your interests were biological. I am a mathematician
who likes to program as a hobby (and sometimes as an aid to research
or teaching). I have a friend in the biology department who sought my
aid a while back in writing a program which generated random protein
strings with certain properties, and I recognized G,A,Q from your
original post as possibly being abbreviations for glycine, etc. I am
on sabbatical this semester, so have some time on my hands and would
be willing to help you out some (but am currently writing a paper - so
I don't have unlimitted time), though for more serious analysis you
should probably recruit a specialist in bioinformatics who understands
string processing and pattern matching much more than I do. Feel free
to email me if you want. My address is jcoleman followed by @ followed
by franciscan dot edu (don't want to make life too easy for robot
spammers).


What do you mean by repeats? Do you mean strings like GAQGAQ (where
the repeated sequence is adjacent) or GAQCGAQ (where the repeated
sequence can be separated by intervening letters)? And how would you
handle GAQGAQGAQ? Also - do you want all repeats, or do you want
repeats which are in some sense maximal (similar to the palindrome
code I wrote)? Please clarify.


I wrote a new main() to take its input from A1 and report lengths in
column C. The other two functions in the code were unchanged , so just
delete main() and replace it with:


Sub Main()
* * Dim SearchString As String
* * Dim Results As Variant
* * Dim i As Long, n As Long
* * Dim palindrome As Variant
* * Dim R As Range 'output anchor


* * On Error GoTo no_palindromes
* * Set Results = FindMaxPalins(Range("A1").Value)
* * Set R = Range("A3")
* * n = Results.count
* * R.Value = n & " distinct palindromes found"
* * R.Offset(1).Value = "Palindrome"
* * R.Offset(1, 1).Value = "Frequency"
* * R.Offset(1, 2).Value = "Length"
* * For Each palindrome In Results.Keys
* * * * i = i + 1
* * * * R.Offset(i + 1).Value = palindrome
* * * * R.Offset(i + 1, 1).Value = Results.Item(palindrome)
* * * * R.Offset(i + 1, 2).Value = Len(palindrome)
* * Next palindrome
* * Exit Sub
no_palindromes:
* * R.Value = "No palindromes found"
End Sub


Hope that helps


-John Coleman


Dear John Coleman,
Thak you very much for your code. It is perfect for this investigation
approach.
That`s nice, I`m a biologist but among my graduate students there are
a mathematician because your area is very important to our studies.
That`s interesting. I have also a smal VBA-based program to generate
random polypeptides and this is one of our interests.
I will prepare a small text telling you our interests in some string
processing and you will be able to tell us if could help us.


Yes, the repeat sequence could be adjacent or separated by intervening
letters. Both situations are possible on protein sequences. In the
case of GAQGAQGAQ it would be 3 GAQ repeats, using the same logic of
the palindromes. Yes, it is similar to the palindromes, and even some
palindromes are repeats but not all repeats are palindromes...
Thanks in advance,
Luciano- Hide quoted text -


- Show quoted text -


DearLuciano,

I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?

-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:

Input: ABBACCABCC

Output:

BACCAB
CC

But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano
  #12   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

[snip]
DearLuciano,


I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?


-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:

Input: ABBACCABCC

Output:

BACCAB
CC

But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano


So you *don't* want ABBA as part of the output?

What would the output be for BAABABBACCABCC?

-John

  #13   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 20:10, John Coleman wrote:
[snip]



DearLuciano,


I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?


-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:


Input: ABBACCABCC


Output:


BACCAB
CC


But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano


So you *don't* want ABBA as part of the output?

What would the output be for BAABABBACCABCC?

-John


Dear John,
Since ABBA woukd be smaller than BACCAB, at present it would not be
detected.
In your example:
Input: BAABABBACCABCC

Output:
BAAB
BACCAB
CC
  #14   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 20:39, Luciano Paulino da Silva
wrote:
On 24 out, 20:10, John Coleman wrote:



[snip]


DearLuciano,


I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?


-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:


Input: ABBACCABCC


Output:


BACCAB
CC


But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano


So you *don't* want ABBA as part of the output?


What would the output be for BAABABBACCABCC?


-John


Dear John,
Since ABBA woukd be smaller than BACCAB, at present it would not be
detected.
In your example:
Input: BAABABBACCABCC

Output:
BAAB
BACCAB
CC


Dear John,
I was thinking about the possibility that we could perform the choice
of a rule before perform palindromes or repeats detection. Since
detection of all palindromes including that inside other palindromes,
and maximal palindromes, and all other rules we could think about. Do
you think that it would be possible?
Thanks in advance,
Luciano
  #15   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 24, 6:39*pm, Luciano Paulino da Silva
wrote:
On 24 out, 20:10, John Coleman wrote:





[snip]


DearLuciano,


I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?


-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:


Input: ABBACCABCC


Output:


BACCAB
CC


But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano


So you *don't* want ABBA as part of the output?


What would the output be for BAABABBACCABCC?


-John


Dear John,
Since ABBA woukd be smaller than BACCAB, at present it would not be
detected.
In your example:
Input: BAABABBACCABCC

Output:
BAAB
BACCAB
CC- Hide quoted text -

- Show quoted text -


Dear Luciano,

Ok - how does the following algorithm sound? First find the longest
palindrome in the string, taking the leftmost if there is a tie
(BACCACB in the above example). Remove it from the string, leaving 0,1
or 2 remaining substrings (BAABAB and CC in the above example). Repeat
this process on these substrings until only substrings of length < 2
remain. Note that for input ABAB this would output ABA but not BAB.

-John


  #16   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 24 out, 21:36, John Coleman wrote:
On Oct 24, 6:39*pm,LucianoPaulino da Silva



wrote:
On 24 out, 20:10, John Coleman wrote:


[snip]


DearLuciano,


I noticed a potential problem with the code that I've written for you.
The problem is it handles overlapping palindromes. For example, if you
input ABBACCABCC, my code first detects the ABBA then starts looking
for palindromes in the remaining string (CCABCC). That is how I
interpreted you in your second post when you said "I forgot to say
that it would be selected the palindromes as big as detected from the
order they appear in the text. So, if a big palindrome is detected the
next one will be the following after it in the string even when there
are a palindrome inside it. " But notice that this way of handling
things misses BACCAB which begins in the third position since it
overlaps with ABBA. Similar questions will arise in repeat detection -
so I want to clarify things before proceeding further. What would you
like the palindrome output to be in the case of ABBACCABCC?


-John


Dear John,
Thank you for your attention. You are correct and I have realized it
some hours ago. We must always detect the biggest palindrome linearly
according to their occurrence on the string following by smallest
palindromes excluding that palindromes that were inside the initially
detected palindrome. In this way, the output for your example it would
be:


Input: ABBACCABCC


Output:


BACCAB
CC


But anycase it is possible in the future that we require test other
situations. Thanks in advance!
Luciano


So you *don't* want ABBA as part of the output?


What would the output be for BAABABBACCABCC?


-John


Dear John,
Since ABBA woukd be smaller than BACCAB, at present it would not be
detected.
In your example:
Input: BAABABBACCABCC


Output:
BAAB
BACCAB
CC- Hide quoted text -


- Show quoted text -


DearLuciano,

Ok - how does the following algorithm sound? First find the longest
palindrome in the string, taking the leftmost if there is a tie
(BACCACB in the above example). Remove it from the string, leaving 0,1
or 2 remaining substrings (BAABAB and CC in the above example). Repeat
this process on these substrings until only substrings of length < 2
remain. Note that for input ABAB this would output ABA but not BAB.

-John


Dear John,
Perfect, it is exactly this! It sounds good. After that we could think
other algorithms (rules).
Thanks in advance,
Luciano
  #17   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

[snip]
Dear John,
Perfect, it is exactly this! It sounds good. After that we could think
other algorithms (rules).
Thanks in advance,
Luciano- Hide quoted text -


Dear Luciano,
See if this works for you. As before, it uses Main() as the driver
sub, so you should delete all of my old code or create a new workbook
and put this in a new code module so as
to avoid possible name conflicts. I've tried it on a number of strings
and it seems to work as intended, but you should give it a full
battery of tests as well in case I overlooked something.

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''

Dim palindromes As Object
Dim centers As Variant
Dim SearchString As String
Dim PalinCount As Long
Dim anchor As Range

Sub Initialize()
'initializes module-level variables
'centers contains, for each possible
'center of a palindrome (which may fall
'between two characters), the indices and
'length of the largest palindrome centered there.
'For arithmetical convience, centers will
'be represented by integers between 3 and
'2n-1. palindromes will be a dictionary
'object to store found palindromes and
'their frequencies

Dim i As Long, j As Long, k As Long, n As Long

Set palindromes = CreateObject("Scripting.Dictionary")
Set anchor = Range("A3")
SearchString = Range("A1").Value
n = Len(SearchString)
PalinCount = 0
ReDim centers(3 To 2 * n - 1, 1 To 3)

For i = 3 To 2 * n - 1
If i Mod 2 = 0 Then
'searching for palindromes like BAB
j = i / 2 - 1
k = j + 2
Else
'searching for palindromes like BAAB
j = Int(i / 2)
k = j + 1
End If
Do While j = 1 And k <= n And _
Mid(SearchString, IIf(j 0, j, 1), 1) = _
Mid(SearchString, k, 1)
'(kludge for lack of short-cicuit evaluation in VBA)
j = j - 1
k = k + 1
Loop
'last pass through the loop caused trouble, so correct:
j = j + 1
k = k - 1
'(this has weird effect of making j k (j = k+1)
'if center cuts between two characters but no
'palindrome was found. Nevertheless, this gives
'correct length of 0 in this case, so ok)
centers(i, 1) = k - j + 1 'length
centers(i, 2) = j
centers(i, 3) = k
Next i
End Sub
Sub FindLargest(j As Long, k As Long)
'finds largest palindrome in range j-k
'after finding leftmost such
'it recursively calls itself on left and right remainders,
'printing the string if needed between calls

Dim i As Long, MinI As Long, MaxI As Long
Dim delta As Long
Dim MaxLen As Long, MaxStart As Long, MaxEnd As Long
Dim NewJ As Long, NewK As Long
Dim palindrome As String

MinI = 2 * j + 1
MaxI = 2 * k - 1

For i = MinI To MaxI
If centers(i, 1) 1 Then
'first contract radius of palindrome
'to make it fit in j-k if needed
If centers(i, 2) < j Or centers(i, 3) k Then
delta = j - centers(i, 2)
If delta < centers(i, 3) - k Then
delta = centers(i, 3) - k
End If
centers(i, 2) = centers(i, 2) + delta
centers(i, 3) = centers(i, 3) - delta
centers(i, 1) = centers(i, 3) - centers(i, 2) + 1
End If
If centers(i, 1) MaxLen Then
MaxLen = centers(i, 1)
MaxStart = centers(i, 2)
MaxEnd = centers(i, 3)
End If
End If
Next i

If MaxLen <= 1 Then Exit Sub
'else
'first process any string remaining to left
NewK = MaxStart - 1
NewJ = j
If NewK - NewJ 0 Then FindLargest NewJ, NewK
'process current node
palindrome = Mid(SearchString, MaxStart, MaxLen)
If Not palindromes.Exists(palindrome) Then
'found new palindrome - display it
PalinCount = PalinCount + 1
anchor.Offset(PalinCount) = palindrome
palindromes.Add palindrome, 1
Else
palindromes.Item(palindrome) = palindromes.Item(palindrome) + 1
End If
'now process any remaining right substring
NewJ = MaxEnd + 1
NewK = k
If NewK - NewJ 0 Then FindLargest NewJ, NewK
End Sub

Sub Main()
Dim i As Long, palindrome As String

If Len(Range("A1").Value) < 2 Then
MsgBox "A string of length 1 must be entered in A1"
Exit Sub
End If
Initialize
anchor.CurrentRegion.ClearContents
FindLargest 1, Len(SearchString)
If palindromes.count = 0 Then
anchor.Value = "No palindromes found"
Else
anchor.Value = "Palindrome"
anchor.Offset(0, 1) = "Frequency"
anchor.Offset(0, 2) = "Length"
For i = 1 To PalinCount
palindrome = anchor.Offset(i).Value
anchor.Offset(i, 1).Value = _
palindromes.Item(palindrome)
anchor.Offset(i, 2).Value = Len(palindrome)
Next i
End If
End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''

Hope this works for you.

-John
  #18   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 25 out, 19:48, John Coleman wrote:
[snip]

Dear John,
Perfect, it is exactly this! It sounds good. After that we could think
other algorithms (rules).
Thanks in advance,
Luciano- Hide quoted text -


DearLuciano,
See if this works for you. As before, it uses Main() as the driver
sub, so you should delete all of my old code or create a new workbook
and put this in a new code module so as
to avoid possible name conflicts. I've tried it on a number of strings
and it seems to work as intended, but you should give it a full
battery of tests as well in case I overlooked something.

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''

Dim palindromes As Object
Dim centers As Variant
Dim SearchString As String
Dim PalinCount As Long
Dim anchor As Range

Sub Initialize()
'initializes module-level variables
'centers contains, for each possible
'center of a palindrome (which may fall
'between two characters), the indices and
'length of the largest palindrome centered there.
'For arithmetical convience, centers will
'be represented by integers between 3 and
'2n-1. palindromes will be a dictionary
'object to store found palindromes and
'their frequencies

Dim i As Long, j As Long, k As Long, n As Long

Set palindromes = CreateObject("Scripting.Dictionary")
Set anchor = Range("A3")
SearchString = Range("A1").Value
n = Len(SearchString)
PalinCount = 0
ReDim centers(3 To 2 * n - 1, 1 To 3)

For i = 3 To 2 * n - 1
* * If i Mod 2 = 0 Then
* * * * 'searching for palindromes like BAB
* * * * j = i / 2 - 1
* * * * k = j + 2
* * Else
* * * * 'searching for palindromes like BAAB
* * * * j = Int(i / 2)
* * * * k = j + 1
* * End If
* * Do While j = 1 And k <= n And _
* * * Mid(SearchString, IIf(j 0, j, 1), 1) = _
* * * Mid(SearchString, k, 1)
* * * '(kludge for lack of short-cicuit evaluation in VBA)
* * * * j = j - 1
* * * * k = k + 1
* * Loop
* * 'last pass through the loop caused trouble, so correct:
* * j = j + 1
* * k = k - 1
* * '(this has weird effect of making j k (j = k+1)
* * 'if center cuts between two characters but no
* * 'palindrome was found. Nevertheless, this gives
* * 'correct length of 0 in this case, so ok)
* * centers(i, 1) = k - j + 1 'length
* * centers(i, 2) = j
* * centers(i, 3) = k
Next i
End Sub
Sub FindLargest(j As Long, k As Long)
'finds largest palindrome in range j-k
'after finding leftmost such
'it recursively calls itself on left and right remainders,
'printing the string if needed between calls

Dim i As Long, MinI As Long, MaxI As Long
Dim delta As Long
Dim MaxLen As Long, MaxStart As Long, MaxEnd As Long
Dim NewJ As Long, NewK As Long
Dim palindrome As String

MinI = 2 * j + 1
MaxI = 2 * k - 1

For i = MinI To MaxI
* * If centers(i, 1) 1 Then
* * * * 'first contract radius of palindrome
* * * * 'to make it fit in j-k if needed
* * * * If centers(i, 2) < j Or centers(i, 3) k Then
* * * * * * delta = j - centers(i, 2)
* * * * * * If delta < centers(i, 3) - k Then
* * * * * * * * delta = centers(i, 3) - k
* * * * * * End If
* * * * * * centers(i, 2) = centers(i, 2) + delta
* * * * * * centers(i, 3) = centers(i, 3) - delta
* * * * * * centers(i, 1) = centers(i, 3) - centers(i, 2) + 1
* * * * End If
* * * * If centers(i, 1) MaxLen Then
* * * * * * MaxLen = centers(i, 1)
* * * * * * MaxStart = centers(i, 2)
* * * * * * MaxEnd = centers(i, 3)
* * * * End If
* * End If
Next i

If MaxLen <= 1 Then Exit Sub
'else
'first process any string remaining to left
NewK = MaxStart - 1
NewJ = j
If NewK - NewJ 0 Then FindLargest NewJ, NewK
'process current node
palindrome = Mid(SearchString, MaxStart, MaxLen)
If Not palindromes.Exists(palindrome) Then
* * 'found new palindrome - display it
* * PalinCount = PalinCount + 1
* * anchor.Offset(PalinCount) = palindrome
* * palindromes.Add palindrome, 1
Else
* * palindromes.Item(palindrome) = palindromes.Item(palindrome) + 1
End If
'now process any remaining right substring
NewJ = MaxEnd + 1
NewK = k
If NewK - NewJ 0 Then FindLargest NewJ, NewK
End Sub

Sub Main()
Dim i As Long, palindrome As String

If Len(Range("A1").Value) < 2 Then
* * MsgBox "A string of length 1 must be entered in A1"
* * Exit Sub
End If
Initialize
anchor.CurrentRegion.ClearContents
FindLargest 1, Len(SearchString)
If palindromes.count = 0 Then
* * anchor.Value = "No palindromes found"
Else
* * anchor.Value = "Palindrome"
* * anchor.Offset(0, 1) = "Frequency"
* * anchor.Offset(0, 2) = "Length"
* * For i = 1 To PalinCount
* * * * palindrome = anchor.Offset(i).Value
* * * * anchor.Offset(i, 1).Value = _
* * * * * * palindromes.Item(palindrome)
* * * * anchor.Offset(i, 2).Value = Len(palindrome)
* * Next i
End If
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''

Hope this works for you.

-John


Dear John,
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano
  #19   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

[snip]
Dear John,
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano- Hide quoted text -


Dear Luciano,

The algorithm I suggested yesterday is naturally recursive, but I
feared that a naive recursive implementation would slow to a crawl for
larger strings since it would be repeatedly rediscovering the same
palindromes again and again. The centers array in the initialization
sub allows for a partial memoization ( http://en.wikipedia.org/wiki/Memoization
) for greater efficiency. I decided to focus on the "center" of a
palindrome since it is easier to first of all find the largest
palindrome centered at a given position (which is either at a
character or midway between two characters) and, more importantly, it
seemed to be easier to adjust for the fact that remaining palindromes
must be shortened as larger palindromes are removed earlier in the
call-sequence. If you really want to get a feel for the information
that centers agai, type something into A1 then run:

Sub Display()
'displays initialized centers
Initialize
Dim i As Long, j As Long, k As Long, l As Long
With anchor
.Value = "i"
.Offset(0, 1).Value = "i/2"
.Offset(0, 2).Value = "Palindrome"
.Offset(0, 3).Value = "Length"
.Offset(0, 4).Value = "j"
.Offset(0, 5).Value = "k"
For i = LBound(centers) To UBound(centers)
l = centers(i, 1)
j = centers(i, 2)
k = centers(i, 3)
.Offset(i - 2).Value = i
.Offset(i - 2, 1).Value = i / 2
If l 0 Then .Offset(i - 2, 2).Value = Mid(SearchString, j,
l)
.Offset(i - 2, 3).Value = l
.Offset(i - 2, 4).Value = j
.Offset(i - 2, 5).Value = k
Next i
End With

End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''
The second column is the crucial one. Note j and k are always
symmetric about that number. If i/2 is an integer then
any palindrome discovered has odd length and if i/2 is like 3.5 then
it is of even length.

hth

-John

p.s. - I'll think about the finding repeats problem a bit tomorrow,
but it might be several days before I get anything since I'm going to
turn my attention back to my paper as well.


  #20   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

p.s.

In general I don't like writing code that makes assumptions about the
layout of a spreadsheet. When I was testing my code it became tedious
always manually clearing the output region before rerunning it, so I
included the line

anchor.CurrentRegion.ClearContents

(in the middle of Main())

But if you have data in column D or in A2 (connecting A3 and A1) that
line could wipe out something that
you want to keep. You might want to replace that line with:

Range(anchor, anchor.End(xlDown).Offset(0, 2)).ClearContents

This has the effect of clearing Main's last output but never anything
larger (unless maybe if for some odd reason you have data further down
in Columns A:C , but such data is liable to be overwritten anyway).




  #21   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 25 out, 21:35, John Coleman wrote:
[snip]

Dear John,
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano- Hide quoted text -


DearLuciano,

The algorithm I suggested yesterday is naturally recursive, but I
feared that a naive recursive implementation would slow to a crawl for
larger strings since it would be repeatedly rediscovering the same
palindromes again and again. The centers array in the initialization
sub allows for a partial memoization (http://en.wikipedia.org/wiki/Memoization
) for greater efficiency. I decided to focus on the "center" of a
palindrome since it is easier to first of all find the largest
palindrome centered at a given position (which is either at a
character or midway between two characters) and, more importantly, it
seemed to be easier to adjust for the fact that remaining palindromes
must be shortened as larger palindromes are removed earlier in the
call-sequence. If you really want to get a feel for the information
that centers agai, type something into A1 then run:

Sub Display()
'displays initialized centers
Initialize
Dim i As Long, j As Long, k As Long, l As Long
With anchor
* * .Value = "i"
* * .Offset(0, 1).Value = "i/2"
* * .Offset(0, 2).Value = "Palindrome"
* * .Offset(0, 3).Value = "Length"
* * .Offset(0, 4).Value = "j"
* * .Offset(0, 5).Value = "k"
* * For i = LBound(centers) To UBound(centers)
* * * * l = centers(i, 1)
* * * * j = centers(i, 2)
* * * * k = centers(i, 3)
* * * * .Offset(i - 2).Value = i
* * * * .Offset(i - 2, 1).Value = i / 2
* * * * If l 0 Then .Offset(i - 2, 2).Value = Mid(SearchString, j,
l)
* * * * .Offset(i - 2, 3).Value = l
* * * * .Offset(i - 2, 4).Value = j
* * * * .Offset(i - 2, 5).Value = k
* * Next i
End With

End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''
The second column is the crucial one. Note j and k are always
symmetric about that number. If i/2 is an integer then
any palindrome discovered has odd length and if i/2 is like 3.5 then
it is of even length.

hth

-John

p.s. - I'll think about the finding repeats problem a bit tomorrow,
but it might be several days before I get anything since I'm going to
turn my attention back to my paper as well.


Dear John,
I have tried use your new optimized code but it fail due to an error.
I`m trying understand what is happening.
Thanks in advance,
Luciano
  #22   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 26, 8:05*pm, Luciano Paulino da Silva
wrote:
On 25 out, 21:35, John Coleman wrote:





[snip]


Dear John,
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano- Hide quoted text -


DearLuciano,


The algorithm I suggested yesterday is naturally recursive, but I
feared that a naive recursive implementation would slow to a crawl for
larger strings since it would be repeatedly rediscovering the same
palindromes again and again. The centers array in the initialization
sub allows for a partial memoization (http://en.wikipedia.org/wiki/Memoization
) for greater efficiency. I decided to focus on the "center" of a
palindrome since it is easier to first of all find the largest
palindrome centered at a given position (which is either at a
character or midway between two characters) and, more importantly, it
seemed to be easier to adjust for the fact that remaining palindromes
must be shortened as larger palindromes are removed earlier in the
call-sequence. If you really want to get a feel for the information
that centers agai, type something into A1 then run:


Sub Display()
'displays initialized centers
Initialize
Dim i As Long, j As Long, k As Long, l As Long
With anchor
* * .Value = "i"
* * .Offset(0, 1).Value = "i/2"
* * .Offset(0, 2).Value = "Palindrome"
* * .Offset(0, 3).Value = "Length"
* * .Offset(0, 4).Value = "j"
* * .Offset(0, 5).Value = "k"
* * For i = LBound(centers) To UBound(centers)
* * * * l = centers(i, 1)
* * * * j = centers(i, 2)
* * * * k = centers(i, 3)
* * * * .Offset(i - 2).Value = i
* * * * .Offset(i - 2, 1).Value = i / 2
* * * * If l 0 Then .Offset(i - 2, 2).Value = Mid(SearchString, j,
l)
* * * * .Offset(i - 2, 3).Value = l
* * * * .Offset(i - 2, 4).Value = j
* * * * .Offset(i - 2, 5).Value = k
* * Next i
End With


End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''
The second column is the crucial one. Note j and k are always
symmetric about that number. If i/2 is an integer then
any palindrome discovered has odd length and if i/2 is like 3.5 then
it is of even length.


hth


-John


p.s. - I'll think about the finding repeats problem a bit tomorrow,
but it might be several days before I get anything since I'm going to
turn my attention back to my paper as well.


Dear John,
I have tried use your new optimized code but it fail due to an error.
I`m trying understand what is happening.
Thanks in advance,
Luciano- Hide quoted text -

- Show quoted text -


Dear Luciano,

What input led to an error, and what output did you expect? By "error"
did you mean incorrect output or did you mean some sort of run-time
error? If the later, what is the error code and what line is
hightlighted if you go into debug mode?

-John
  #23   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 25, 6:05*pm, Luciano Paulino da Silva
wrote:
[snip]
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano- Hide quoted text -

- Show quoted text -


Dear Luciano,

I became interested in the repeat problem. I wrote the following code.
It lists all substrings of length = 2that are repeated at least once
(with the repeat not overlapping itself) but doesn't list repeats
which have the property that each occurence of those repeats is
contained inside of the occurence of a larger repeat since, for
example, in ABCDABC the fact that AB is repeated is implicit in the
fact that ABC is repeated (on the other hand, my criteria imply that
AB isn't listed even in ABCABDABCAB, although it might be of
significance to note that AB is an "internal" repeat inside of the
repeat ABCAB. On the other hand, you wouldn't want to list a whole
bunch of strings like SS, SSS, SSSS in SSSSSSSSSSTSSSSSSSSSS, hence my
current approach). I list the repeat, its length, frequency and
starting position of the first occurence. It is sorted by order of
descending length and (as a tie-breaker) in order of ascending
position:

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' ''''

Sub FindRepeats()

Dim repeatDict As Object
Dim repeatArray As Variant
Dim repeat As Variant
Dim ext As Variant 'extension
Dim i As Long, j As Long, k As Long
Dim n As Long
Dim myLen As Long, maxLen As Long
Dim SearchString As String
Dim occurences As String
Dim anchor As Range, sortRange As Range
Dim start As Double
Dim A As Variant
Dim RepeatFound As Boolean

'start = Timer
Set anchor = Range("A3")
Range(anchor, anchor.End(xlDown).Offset(0, 3)).ClearContents
Application.ScreenUpdating = False
Set repeatDict = CreateObject("Scripting.Dictionary")
SearchString = Range("A1").Value
n = Len(SearchString)
'load dictionary:
'initialize with repeats of length 2 since longer
'repeats are extensions of shorter ones
myLen = 2
With repeatDict 'this block isn't indented since confusion
'is unlikely
For i = 1 To n - myLen + 1
repeat = Mid(SearchString, i, myLen)
If .Exists(repeat) Then
.Item(repeat) = .Item(repeat) & "," & i
Else
.Add repeat, i
'i is first occurence
End If
Next i
'clean up dict - deleting "repeats" that only
'occur once or only repeat by overlapping first occurence
For Each repeat In .Keys
A = Split(.Item(repeat), ",")
If UBound(A) = 0 Then
.Remove repeat
ElseIf CInt(A(UBound(A))) < A(0) + myLen Then
.Remove repeat
Else
RepeatFound = True
End If
Next repeat
myLen = 3
Do While myLen <= Int(n / 2) And RepeatFound
RepeatFound = False ' have to find additional repeats
For Each repeat In .Keys
If Len(repeat) + 1 = myLen Then
A = Split(.Item(repeat), ",")
For i = LBound(A) To UBound(A)
j = A(i)
ext = Mid(SearchString, j, myLen)
If Len(ext) = myLen Then 'might have been cut off
If .Exists(ext) Then
.Item(ext) = .Item(ext) & "," & j
Else
.Add ext, j
End If
End If
Next i
End If
Next repeat
'cleanup dictionary
For Each repeat In .Keys
If Len(repeat) = myLen Then
A = Split(.Item(repeat), ",")
If UBound(A) = 0 Then
.Remove repeat
ElseIf CInt(A(UBound(A))) < A(0) + myLen Then
.Remove repeat
Else
RepeatFound = True
End If
End If
Next repeat
myLen = myLen + 1
Loop

'Now clean up the dictionary another way
'delete from it all repeats with the property that
'each occurences of that repeat is a substring of
'an occurence of a larger repeat
'Note that the maximum size of a
'repeat is myLen - 1, hence the maximum size of
'a repeat which can be contained in another is myLen-2

maxLen = myLen - 2
For myLen = 2 To maxLen
For Each repeat In .Keys
If Len(repeat) = myLen Then
A = Split(.Item(repeat), ",")
RepeatFound = True 'to enter first pass
i = 0
Do While i <= UBound(A) And RepeatFound
j = A(i)
RepeatFound = False 'is the occurence at j
'part of a larger repeat?
'two ways this could be so
If j 1 Then
ext = Mid(SearchString, j - 1, myLen + 1)
If .Exists(ext) Then
occurences = .Item(ext)
If occurences Like (j - 1) & ",*" _
Or occurences Like "*," & (j - 1) & ",*" _
Or occurences Like "*," & (j - 1) Then
RepeatFound = True
End If
End If
End If
If j < n Then
ext = Mid(SearchString, j, myLen + 1)
If .Exists(ext) Then
occurences = .Item(ext)
If occurences Like j & ",*" _
Or occurences Like "*," & j & ",*" _
Or occurences Like "*," & j Then
RepeatFound = True
End If
End If
End If
i = i + 1
Loop
If RepeatFound Then .Remove repeat
End If
Next repeat
Next myLen

'transfer data to spreadsheet
If repeatDict.count = 0 Then
anchor.Value = "No repeats found"
Exit Sub
End If
anchor.Value = "Repeat"
anchor.Offset(0, 1).Value = "Length"
anchor.Offset(0, 2).Value = "Frequency"
anchor.Offset(0, 3).Value = "First Index"
i = 1
For Each repeat In .Keys
anchor.Offset(i, 0) = repeat
anchor.Offset(i, 1) = Len(repeat)
A = Split(.Item(repeat), ",")
anchor.Offset(i, 2) = UBound(A) + 1
anchor.Offset(i, 3) = A(0)
i = i + 1
Next repeat
Set sortRange = Range(anchor.Offset(1), _
anchor.Offset(.count, 3))
sortRange.Sort Key1:=anchor.Offset(1, 1), _
Order1:=xlDescending, _
Key2:=anchor.Offset(1, 3), _
Order2:=xlAscending
End With
Set repeatDict = Nothing
Application.ScreenUpdating = True
'MsgBox Timer - start
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''' ''''

Hope that helps

-John

p.s. Did you clarify what the issue with the palindrome program is?
  #24   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 77
Default Palindromes and repeats

On 26 out, 23:27, John Coleman wrote:
On Oct 26, 8:05*pm,LucianoPaulino da Silva



wrote:
On 25 out, 21:35, John Coleman wrote:


[snip]


Dear John,
I have performed some tests and it sounds good. I just did not
understand well the initialize module. I will perform an additional
battery of tests and tell you. Do you have an idea about how could we
perform similar test for serach repeats?
Thanks in advance,
Luciano- Hide quoted text -


DearLuciano,


The algorithm I suggested yesterday is naturally recursive, but I
feared that a naive recursive implementation would slow to a crawl for
larger strings since it would be repeatedly rediscovering the same
palindromes again and again. The centers array in the initialization
sub allows for a partial memoization (http://en.wikipedia.org/wiki/Memoization
) for greater efficiency. I decided to focus on the "center" of a
palindrome since it is easier to first of all find the largest
palindrome centered at a given position (which is either at a
character or midway between two characters) and, more importantly, it
seemed to be easier to adjust for the fact that remaining palindromes
must be shortened as larger palindromes are removed earlier in the
call-sequence. If you really want to get a feel for the information
that centers agai, type something into A1 then run:


Sub Display()
'displays initialized centers
Initialize
Dim i As Long, j As Long, k As Long, l As Long
With anchor
* * .Value = "i"
* * .Offset(0, 1).Value = "i/2"
* * .Offset(0, 2).Value = "Palindrome"
* * .Offset(0, 3).Value = "Length"
* * .Offset(0, 4).Value = "j"
* * .Offset(0, 5).Value = "k"
* * For i = LBound(centers) To UBound(centers)
* * * * l = centers(i, 1)
* * * * j = centers(i, 2)
* * * * k = centers(i, 3)
* * * * .Offset(i - 2).Value = i
* * * * .Offset(i - 2, 1).Value = i / 2
* * * * If l 0 Then .Offset(i - 2, 2).Value = Mid(SearchString, j,
l)
* * * * .Offset(i - 2, 3).Value = l
* * * * .Offset(i - 2, 4).Value = j
* * * * .Offset(i - 2, 5).Value = k
* * Next i
End With


End Sub


'''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''
The second column is the crucial one. Note j and k are always
symmetric about that number. If i/2 is an integer then
any palindrome discovered has odd length and if i/2 is like 3.5 then
it is of even length.


hth


-John


p.s. - I'll think about the finding repeats problem a bit tomorrow,
but it might be several days before I get anything since I'm going to
turn my attention back to my paper as well.


Dear John,
I have tried use your new optimized code but it fail due to an error.
I`m trying understand what is happening.
Thanks in advance,
Luciano- Hide quoted text -


- Show quoted text -


DearLuciano,

What input led to an error, and what output did you expect? By "error"
did you mean incorrect output or did you mean some sort of run-time
error? If the later, what is the error code and what line is
hightlighted if you go into debug mode?

-John


Dear John,
I performed the folloing input string:

GQGAGQGAGAAAAAAGGAGQGGYGGLGNQGAGRGGQGAAAAAAGGAGQGG YGGLGSQGAGRGGLGGQGAGAAAAAAGGAGQGGYGGLGGQGAGQGAGQGG YGGLGIQGSGRGGLGGQGAGAAAAAAGGAGQGGLGGQGAGQGAGAAAAAA GGVRQGGYGGLGSQGAGRGGQGAGAAAAAAGGAGQGGYGGLGGQGVGRGG LGGQGAGAAAAGGAGQGGYGGVGSGASAASAAASRLSSPQASSRVSSAVS NLVASGPTNSAALSSTISNVVSQIGASNPGLSGCDVLIQALLEVVSALIQ ILGSSSIGQVNYGSAGQATQIVGQSVYQALG

and the output of the sub Display was:

i i/2 Palindrome Length j k
3 1.5 7 0 2 1
4 2 GQG 3 1 3
5 2.5 3 0 3 2
6 3 G 1 3 3
7 3.5 2 0 4 3
8 4 GQGAGQG 7 1 7
9 4.5 13 0 5 4
10 5 G 1 5 5
11 5.5 3 0 6 5
12 6 GAGQGAG 7 3 9
13 6.5 3 0 7 6
14 7 G 1 7 7
15 7.5 6 0 8 7
16 8 GAG 3 7 9
17 8.5 2 0 9 8
18 9 AGA 3 8 10
19 9.5 2 0 10 9
20 10 A 1 10 10
21 10.5 AA 2 10 11
22 11 AAA 3 10 12
23 11.5 AAAA 4 10 13
24 12 AAAAA 5 10 14
25 12.5 GAAAAAAG 8 9 16
26 13 AAAAA 5 11 15
27 13.5 AAAA 4 12 15
28 14 AAA 3 13 15
29 14.5 AA 2 14 15
30 15 A 1 15 15
31 15.5 0 16 15
32 16 G 1 16 16
33 16.5 AGGA 4 15 18
34 17 G 1 17 17
35 17.5 0 18 17
36 18 GAG 3 17 19
37 18.5 0 19 18
38 19 G 1 19 19
39 19.5 0 20 19
40 20 GQG 3 19 21
41 20.5 0 21 20
42 21 G 1 21 21
43 21.5 GG 2 21 22
44 22 G 1 22 22
45 22.5 0 23 22
46 23 GGYGG 5 21 25
47 23.5 0 24 23
48 24 G 1 24 24
49 24.5 GG 2 24 25
50 25 G 1 25 25
51 25.5 0 26 25
52 26 GLG 3 25 27
53 26.5 0 27 26
54 27 G 1 27 27
55 27.5 0 28 27
56 28 N 1 28 28
57 28.5 0 29 28
58 29 Q 1 29 29
59 29.5 0 30 29
60 30 G 1 30 30
61 30.5 0 31 30
62 31 GAG 3 30 32
63 31.5 0 32 31
64 32 G 1 32 32
65 32.5 0 33 32
66 33 GRG 3 32 34
67 33.5 0 34 33
68 34 G 1 34 34
69 34.5 GG 2 34 35
70 35 G 1 35 35
71 35.5 0 36 35
72 36 GQG 3 35 37
73 36.5 0 37 36
74 37 G 1 37 37
75 37.5 0 38 37
76 38 A 1 38 38
77 38.5 AA 2 38 39
78 39 AAA 3 38 40
79 39.5 AAAA 4 38 41
80 40 AAAAA 5 38 42
81 40.5 GAAAAAAG 8 37 44
82 41 AAAAA 5 39 43
83 41.5 AAAA 4 40 43
84 42 AAA 3 41 43
85 42.5 AA 2 42 43
86 43 A 1 43 43
87 43.5 0 44 43
88 44 G 1 44 44
89 44.5 AGGA 4 43 46
90 45 G 1 45 45
91 45.5 0 46 45
92 46 GAG 3 45 47
93 46.5 0 47 46
94 47 G 1 47 47
95 47.5 0 48 47
96 48 GQG 3 47 49
97 48.5 0 49 48
98 49 G 1 49 49
99 49.5 GG 2 49 50
100 50 G 1 50 50
101 50.5 0 51 50
102 51 GGYGG 5 49 53
103 51.5 0 52 51
104 52 G 1 52 52
105 52.5 GG 2 52 53
106 53 G 1 53 53
107 53.5 0 54 53
108 54 GLG 3 53 55
109 54.5 0 55 54
110 55 G 1 55 55
111 55.5 0 56 55
112 56 S 1 56 56
113 56.5 0 57 56
114 57 Q 1 57 57
115 57.5 0 58 57
116 58 G 1 58 58
117 58.5 0 59 58
118 59 GAG 3 58 60
119 59.5 0 60 59
120 60 G 1 60 60
121 60.5 0 61 60
122 61 GRG 3 60 62
123 61.5 0 62 61
124 62 G 1 62 62
125 62.5 GG 2 62 63
126 63 G 1 63 63
127 63.5 0 64 63
128 64 GGLGG 5 62 66
129 64.5 0 65 64
130 65 G 1 65 65
131 65.5 GG 2 65 66
132 66 G 1 66 66
133 66.5 0 67 66
134 67 GQG 3 66 68
135 67.5 0 68 67
136 68 G 1 68 68
137 68.5 0 69 68
138 69 GAG 3 68 70
139 69.5 0 70 69
140 70 AGA 3 69 71
141 70.5 0 71 70
142 71 A 1 71 71
143 71.5 AA 2 71 72
144 72 AAA 3 71 73
145 72.5 AAAA 4 71 74
146 73 AAAAA 5 71 75
147 73.5 GAAAAAAG 8 70 77
148 74 AAAAA 5 72 76
149 74.5 AAAA 4 73 76
150 75 AAA 3 74 76
151 75.5 AA 2 75 76
152 76 A 1 76 76
153 76.5 0 77 76
154 77 G 1 77 77
155 77.5 AGGA 4 76 79
156 78 G 1 78 78
157 78.5 0 79 78
158 79 GAG 3 78 80
159 79.5 0 80 79
160 80 G 1 80 80
161 80.5 0 81 80
162 81 GQG 3 80 82
163 81.5 0 82 81
164 82 G 1 82 82
165 82.5 GG 2 82 83
166 83 G 1 83 83
167 83.5 0 84 83
168 84 GGYGG 5 82 86
169 84.5 0 85 84
170 85 G 1 85 85
171 85.5 GG 2 85 86
172 86 G 1 86 86
173 86.5 0 87 86
174 87 GGLGG 5 85 89
175 87.5 0 88 87
176 88 G 1 88 88
177 88.5 GG 2 88 89
178 89 G 1 89 89
179 89.5 0 90 89
180 90 GQG 3 89 91
181 90.5 0 91 90
182 91 G 1 91 91
183 91.5 0 92 91
184 92 GQGAGQG 7 89 95
185 92.5 0 93 92
186 93 G 1 93 93
187 93.5 0 94 93
188 94 GGQGAGQGAGQGG 13 88 100
189 94.5 0 95 94
190 95 G 1 95 95
191 95.5 0 96 95
192 96 GQGAGQG 7 93 99
193 96.5 0 97 96
194 97 G 1 97 97
195 97.5 0 98 97
196 98 GQG 3 97 99
197 98.5 0 99 98
198 99 G 1 99 99
199 99.5 GG 2 99 100
200 100 G 1 100 100
201 100.5 0 101 100
202 101 GGYGG 5 99 103
203 101.5 0 102 101
204 102 G 1 102 102
205 102.5 GG 2 102 103
206 103 G 1 103 103
207 103.5 0 104 103
208 104 GLG 3 103 105
209 104.5 0 105 104
210 105 G 1 105 105
211 105.5 0 106 105
212 106 I 1 106 106
213 106.5 0 107 106
214 107 Q 1 107 107
215 107.5 0 108 107
216 108 G 1 108 108
217 108.5 0 109 108
218 109 GSG 3 108 110
219 109.5 0 110 109
220 110 G 1 110 110
221 110.5 0 111 110
222 111 GRG 3 110 112
223 111.5 0 112 111
224 112 G 1 112 112
225 112.5 GG 2 112 113
226 113 G 1 113 113
227 113.5 0 114 113
228 114 GGLGG 5 112 116
229 114.5 0 115 114
230 115 G 1 115 115
231 115.5 GG 2 115 116
232 116 G 1 116 116
233 116.5 0 117 116
234 117 GQG 3 116 118
235 117.5 0 118 117
236 118 G 1 118 118
237 118.5 0 119 118
238 119 GAG 3 118 120
239 119.5 0 120 119
240 120 AGA 3 119 121
241 120.5 0 121 120
242 121 A 1 121 121
243 121.5 AA 2 121 122
244 122 AAA 3 121 123
245 122.5 AAAA 4 121 124
246 123 AAAAA 5 121 125
247 123.5 GAAAAAAG 8 120 127
248 124 AAAAA 5 122 126
249 124.5 AAAA 4 123 126
250 125 AAA 3 124 126
251 125.5 AA 2 125 126
252 126 A 1 126 126
253 126.5 0 127 126
254 127 G 1 127 127
255 127.5 AGGA 4 126 129
256 128 G 1 128 128
257 128.5 0 129 128
258 129 GAG 3 128 130
259 129.5 0 130 129
260 130 G 1 130 130
261 130.5 0 131 130
262 131 GQG 3 130 132
263 131.5 0 132 131
264 132 G 1 132 132
265 132.5 GG 2 132 133
266 133 G 1 133 133
267 133.5 0 134 133
268 134 GAGQGGLGGQGAG 13 128 140
269 134.5 0 135 134
270 135 G 1 135 135
271 135.5 GG 2 135 136
272 136 G 1 136 136
273 136.5 0 137 136
274 137 GQG 3 136 138
275 137.5 0 138 137
276 138 G 1 138 138
277 138.5 0 139 138
278 139 GQGAGQG 7 136 142
279 139.5 0 140 139
280 140 G 1 140 140
281 140.5 0 141 140
282 141 GAGQGAG 7 138 144
283 141.5 0 142 141
284 142 G 1 142 142
285 142.5 0 143 142
286 143 GAG 3 142 144
287 143.5 0 144 143
288 144 AGA 3 143 145
289 144.5 0 145 144
290 145 A 1 145 145
291 145.5 AA 2 145 146
292 146 AAA 3 145 147
293 146.5 AAAA 4 145 148
294 147 AAAAA 5 145 149
295 147.5 GAAAAAAG 8 144 151
296 148 AAAAA 5 146 150
297 148.5 AAAA 4 147 150
298 149 AAA 3 148 150
299 149.5 AA 2 149 150
300 150 A 1 150 150
301 150.5 0 151 150
302 151 G 1 151 151
303 151.5 GG 2 151 152
304 152 G 1 152 152
305 152.5 0 153 152
306 153 V 1 153 153
307 153.5 0 154 153
308 154 R 1 154 154
309 154.5 0 155 154
310 155 Q 1 155 155
311 155.5 0 156 155
312 156 G 1 156 156
313 156.5 GG 2 156 157
314 157 G 1 157 157
315 157.5 0 158 157
316 158 GGYGG 5 156 160
317 158.5 0 159 158
318 159 G 1 159 159
319 159.5 GG 2 159 160
320 160 G 1 160 160
321 160.5 0 161 160
322 161 GLG 3 160 162
323 161.5 0 162 161
324 162 G 1 162 162
325 162.5 0 163 162
326 163 S 1 163 163
327 163.5 0 164 163
328 164 Q 1 164 164
329 164.5 0 165 164
330 165 G 1 165 165
331 165.5 0 166 165
332 166 GAG 3 165 167
333 166.5 0 167 166
334 167 G 1 167 167
335 167.5 0 168 167
336 168 GRG 3 167 169
337 168.5 0 169 168
338 169 G 1 169 169
339 169.5 GG 2 169 170
340 170 G 1 170 170
341 170.5 0 171 170
342 171 GQG 3 170 172
343 171.5 0 172 171
344 172 G 1 172 172
345 172.5 0 173 172
346 173 GAG 3 172 174
347 173.5 0 174 173
348 174 AGA 3 173 175
349 174.5 0 175 174
350 175 A 1 175 175
351 175.5 AA 2 175 176
352 176 AAA 3 175 177
353 176.5 AAAA 4 175 178
354 177 AAAAA 5 175 179
355 177.5 GAAAAAAG 8 174 181
356 178 AAAAA 5 176 180
357 178.5 AAAA 4 177 180
358 179 AAA 3 178 180
359 179.5 AA 2 179 180
360 180 A 1 180 180
361 180.5 0 181 180
362 181 G 1 181 181
363 181.5 AGGA 4 180 183
364 182 G 1 182 182
365 182.5 0 183 182
366 183 GAG 3 182 184
367 183.5 0 184 183
368 184 G 1 184 184
369 184.5 0 185 184
370 185 GQG 3 184 186
371 185.5 0 186 185
372 186 G 1 186 186
373 186.5 GG 2 186 187
374 187 G 1 187 187
375 187.5 0 188 187
376 188 GGYGG 5 186 190
377 188.5 0 189 188
378 189 G 1 189 189
379 189.5 GG 2 189 190
380 190 G 1 190 190
381 190.5 0 191 190
382 191 GGLGG 5 189 193
383 191.5 0 192 191
384 192 G 1 192 192
385 192.5 GG 2 192 193
386 193 G 1 193 193
387 193.5 0 194 193
388 194 GQG 3 193 195
389 194.5 0 195 194
390 195 G 1 195 195
391 195.5 0 196 195
392 196 GVG 3 195 197
393 196.5 0 197 196
394 197 G 1 197 197
395 197.5 0 198 197
396 198 GRG 3 197 199
397 198.5 0 199 198
398 199 G 1 199 199
399 199.5 GG 2 199 200
400 200 G 1 200 200
401 200.5 0 201 200
402 201 GGLGG 5 199 203
403 201.5 0 202 201
404 202 G 1 202 202
405 202.5 GG 2 202 203
406 203 G 1 203 203
407 203.5 0 204 203
408 204 GQG 3 203 205
409 204.5 0 205 204
410 205 G 1 205 205
411 205.5 0 206 205
412 206 GAG 3 205 207
413 206.5 0 207 206
414 207 AGA 3 206 208
415 207.5 0 208 207
416 208 A 1 208 208
417 208.5 AA 2 208 209
418 209 AAA 3 208 210
419 209.5 GAAAAG 6 207 212
420 210 AAA 3 209 211
421 210.5 AA 2 210 211
422 211 A 1 211 211
423 211.5 0 212 211
424 212 G 1 212 212
425 212.5 AGGA 4 211 214
426 213 G 1 213 213
427 213.5 0 214 213
428 214 GAG 3 213 215
429 214.5 0 215 214
430 215 G 1 215 215
431 215.5 0 216 215
432 216 GQG 3 215 217
433 216.5 0 217 216
434 217 G 1 217 217
435 217.5 GG 2 217 218
436 218 G 1 218 218
437 218.5 0 219 218
438 219 GGYGG 5 217 221
439 219.5 0 220 219
440 220 G 1 220 220
441 220.5 GG 2 220 221
442 221 G 1 221 221
443 221.5 0 222 221
444 222 GVG 3 221 223
445 222.5 0 223 222
446 223 G 1 223 223
447 223.5 0 224 223
448 224 GSG 3 223 225
449 224.5 0 225 224
450 225 G 1 225 225
451 225.5 0 226 225
452 226 A 1 226 226
453 226.5 0 227 226
454 227 ASA 3 226 228
455 227.5 0 228 227
456 228 A 1 228 228
457 228.5 ASAASA 6 226 231
458 229 A 1 229 229
459 229.5 0 230 229
460 230 AASAA 5 228 232
461 230.5 0 231 230
462 231 A 1 231 231
463 231.5 AA 2 231 232
464 232 SAAAS 5 230 234
465 232.5 AA 2 232 233
466 233 A 1 233 233
467 233.5 0 234 233
468 234 S 1 234 234
469 234.5 0 235 234
470 235 R 1 235 235
471 235.5 0 236 235
472 236 L 1 236 236
473 236.5 0 237 236
474 237 S 1 237 237
475 237.5 SS 2 237 238
476 238 S 1 238 238
477 238.5 0 239 238
478 239 P 1 239 239
479 239.5 0 240 239
480 240 Q 1 240 240
481 240.5 0 241 240
482 241 A 1 241 241
483 241.5 0 242 241
484 242 S 1 242 242
485 242.5 SS 2 242 243
486 243 S 1 243 243
487 243.5 0 244 243
488 244 R 1 244 244
489 244.5 0 245 244
490 245 V 1 245 245
491 245.5 0 246 245
492 246 S 1 246 246
493 246.5 SS 2 246 247
494 247 S 1 247 247
495 247.5 0 248 247
496 248 A 1 248 248
497 248.5 0 249 248
498 249 V 1 249 249
499 249.5 0 250 249
500 250 S 1 250 250
501 250.5 0 251 250
502 251 N 1 251 251
503 251.5 0 252 251
504 252 L 1 252 252
505 252.5 0 253 252
506 253 V 1 253 253
507 253.5 0 254 253
508 254 A 1 254 254
509 254.5 0 255 254
510 255 S 1 255 255
511 255.5 0 256 255
512 256 G 1 256 256
513 256.5 0 257 256
514 257 P 1 257 257
515 257.5 0 258 257
516 258 T 1 258 258
517 258.5 0 259 258
518 259 N 1 259 259
519 259.5 0 260 259
520 260 S 1 260 260
521 260.5 0 261 260
522 261 A 1 261 261
523 261.5 AA 2 261 262
524 262 A 1 262 262
525 262.5 0 263 262
526 263 L 1 263 263
527 263.5 0 264 263
528 264 S 1 264 264
529 264.5 SS 2 264 265
530 265 S 1 265 265
531 265.5 0 266 265
532 266 T 1 266 266
533 266.5 0 267 266
534 267 I 1 267 267
535 267.5 0 268 267
536 268 S 1 268 268
537 268.5 0 269 268
538 269 N 1 269 269
539 269.5 0 270 269
540 270 V 1 270 270
541 270.5 VV 2 270 271
542 271 V 1 271 271
543 271.5 0 272 271
544 272 S 1 272 272
545 272.5 0 273 272
546 273 Q 1 273 273
547 273.5 0 274 273
548 274 I 1 274 274
549 274.5 0 275 274
550 275 G 1 275 275
551 275.5 0 276 275
552 276 A 1 276 276
553 276.5 0 277 276
554 277 S 1 277 277
555 277.5 0 278 277
556 278 N 1 278 278
557 278.5 0 279 278
558 279 P 1 279 279
559 279.5 0 280 279
560 280 G 1 280 280
561 280.5 0 281 280
562 281 L 1 281 281
563 281.5 0 282 281
564 282 S 1 282 282
565 282.5 0 283 282
566 283 G 1 283 283
567 283.5 0 284 283
568 284 C 1 284 284
569 284.5 0 285 284
570 285 D 1 285 285
571 285.5 0 286 285
572 286 V 1 286 286
573 286.5 0 287 286
574 287 L 1 287 287
575 287.5 0 288 287
576 288 I 1 288 288
577 288.5 0 289 288
578 289 Q 1 289 289
579 289.5 0 290 289
580 290 A 1 290 290
581 290.5 0 291 290
582 291 L 1 291 291
583 291.5 LL 2 291 292
584 292 L 1 292 292
585 292.5 0 293 292
586 293 E 1 293 293
587 293.5 0 294 293
588 294 V 1 294 294
589 294.5 VV 2 294 295
590 295 V 1 295 295
591 295.5 0 296 295
592 296 S 1 296 296
593 296.5 0 297 296
594 297 A 1 297 297
595 297.5 0 298 297
596 298 L 1 298 298
597 298.5 0 299 298
598 299 I 1 299 299
599 299.5 0 300 299
600 300 LIQIL 5 298 302
601 300.5 0 301 300
602 301 I 1 301 301
603 301.5 0 302 301
604 302 L 1 302 302
605 302.5 0 303 302
606 303 G 1 303 303
607 303.5 0 304 303
608 304 S 1 304 304
609 304.5 SS 2 304 305
610 305 SSS 3 304 306
611 305.5 SS 2 305 306
612 306 S 1 306 306
613 306.5 0 307 306
614 307 I 1 307 307
615 307.5 0 308 307
616 308 G 1 308 308
617 308.5 0 309 308
618 309 Q 1 309 309
619 309.5 0 310 309
620 310 V 1 310 310
621 310.5 0 311 310
622 311 N 1 311 311
623 311.5 0 312 311
624 312 Y 1 312 312
625 312.5 0 313 312
626 313 G 1 313 313
627 313.5 0 314 313
628 314 S 1 314 314
629 314.5 0 315 314
630 315 A 1 315 315
631 315.5 0 316 315
632 316 G 1 316 316
633 316.5 0 317 316
634 317 Q 1 317 317
635 317.5 0 318 317
636 318 A 1 318 318
637 318.5 0 319 318
638 319 T 1 319 319
639 319.5 0 320 319
640 320 Q 1 320 320
641 320.5 0 321 320
642 321 I 1 321 321
643 321.5 0 322 321
644 322 V 1 322 322
645 322.5 0 323 322
646 323 G 1 323 323
647 323.5 0 324 323
648 324 Q 1 324 324
649 324.5 0 325 324
650 325 S 1 325 325
651 325.5 0 326 325
652 326 V 1 326 326
653 326.5 0 327 326
654 327 Y 1 327 327
655 327.5 0 328 327
656 328 Q 1 328 328
657 328.5 0 329 328
658 329 A 1 329 329
659 329.5 0 330 329
660 330 L 1 330 330
661 330.5 0 331 330

Do you know what is happening?
Thanks in advance,
Luciano
  #25   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 274
Default Palindromes and repeats

On Oct 29, 2:08*pm, Luciano Paulino da Silva
wrote:
Dear John,
I performed the folloing input string:
GQGAGQGAGAAAAAAGGAGQGGYGGLGNQGAGRGGQGAAAAAAGGAGQGG YGGLGSQGAGRGGLGGQGAGAAAAA*AGGAGQGGYGGLGGQGAGQGAGQG GYGGLGIQGSGRGGLGGQGAGAAAAAAGGAGQGGLGGQGAGQGAGAAAAA A*GGVRQGGYGGLGSQGAGRGGQGAGAAAAAAGGAGQGGYGGLGGQGVGR GGLGGQGAGAAAAGGAGQGGYGGVGSG*ASAASAAASRLSSPQASSRVSS AVSNLVASGPTNSAALSSTISNVVSQIGASNPGLSGCDVLIQALLEVVSA LIQ*ILGSSSIGQVNYGSAGQATQIVGQSVYQALG


[snip]

Dear Luciano,

The purpose of sub Display() was not to replace Main() - it was simply
to show what sort of data Initialize() was collecting since you
indicated that you couldn't figure out what that subroutine was doing
(which isn't surprising since I made little effort to comment the code
and the algorithm is somewhat invovled). If you find the output of
Display() confusing then it is maybe more trouble than it is worth. I
wrote it mostly for myself for debugging purposes anyway. You can
delete it from the project with no ill-effect.

If you want to understand the output a bit more, consider the
following bit:

[snip]

18 * * *9 * * * AGA * * 3 * * * 8 * * * 10
19 * * *9.5 * * 2 * * * 0 * * * 10 * * *9
20 * * *10 * * *A * * * 1 * * * 10 * * *10
21 * * *10.5 * *AA * * *2 * * * 10 * * *11
22 * * *11 * * *AAA * * 3 * * * 10 * * *12
23 * * *11.5 * *AAAA * *4 * * * 10 * * *13
24 * * *12 * * *AAAAA * 5 * * * 10 * * *14
25 * * *12.5 * *GAAAAAAG * * * *8 * * * 9 * * * 16
26 * * *13 * * *AAAAA * 5 * * * 11 * * *15
27 * * *13.5 * *AAAA * *4 * * * 12 * * *15
28 * * *14 * * *AAA * * 3 * * * 13 * * *15
29 * * *14.5 * *AA * * *2 * * * 14 * * *15
30 * * *15 * * *A * * * 1 * * * 15 * * *15
31 * * *15.5 * * * * * *0 * * * 16 * * *15


[snip]

AGA is a plaindrome of length 3 which runs from string position 8 to
10. It is thus centered at 9 = 18/2

GAAAAAAG is a palindrome of length 8 which extends from 9 to 16.
Unlike AGA, it has no middle character. Instead, the "center" of the
palindrome falls between GAAA and AAAG, which I refer to as position.

Don't worry about this output too much. It is probably a better use of
time to give Main() a battery of tests. On the other hand, if either
you or someone on your team is going to maintain or modify the code,
then I could try to write up a brief description of the algorithm and
comment the code more.

Sorry for any confusion.

-John
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Palindromes and repeats Luciano Paulino da Silva Excel Programming 0 October 23rd 09 02:22 AM
Macro for detect palindromes and repeats in letters/numbers string Luciano Paulino da Silva Excel Worksheet Functions 33 April 16th 09 04:59 PM
Macro for detect palindromes and repeats in letters/numbers string Luciano Paulino da Silva New Users to Excel 5 April 14th 09 08:44 PM
Macro for detect and list palindromes and repeats in letters/numbersstring Luciano Paulino da Silva Excel Programming 5 April 14th 09 05:54 PM
Macro for detect palindromes and repeats in letters/numbersstring Luciano Paulino da Silva Excel Programming 0 April 14th 09 05:33 PM


All times are GMT +1. The time now is 05:16 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"