Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:

QGAGGAAGGAGQ
4 Repeats detected Number
GA 3
AG 3
GAG 2
GG 2

Somebody could help me?
Thanks in advance,
Luciano
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 293
Default Numbers of repeats of a string

Hi Luciano,

You could use code like:
Sub TestString()
Dim BigStr As String
Dim MyStr As String
Dim TmpStr As String
BigStr = "The quick brown fox jumps over the lazy dog."
MyStr = InputBox("String to Find")
TmpStr = Replace(BigStr, MyStr, "", , , vbBinaryCompare)
MsgBox "Repeats of " & MyStr & " in Test String: " & (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Sub

Using vbBinaryCompare forces a case-sensitive test (eg "the" is not counted in a search for "The"). If you don't want the test to be
case sensitive, use vbTextCompare instead.

--
Cheers
macropod
[MVP - Microsoft Word]


"Luciano Paulino da Silva" wrote in message
...
Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:

QGAGGAAGGAGQ
4 Repeats detected Number
GA 3
AG 3
GAG 2
GG 2

Somebody could help me?
Thanks in advance,
Luciano


  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Dear macropod,
I have tested your code, but it did not worked for my need,
Thank you anyway,
Luciano

On 14 abr, 21:32, "macropod" wrote:
Hi Luciano,

You could use code like:
Sub TestString()
Dim BigStr As String
Dim MyStr As String
Dim TmpStr As String
BigStr = "The quick brown fox jumps over the lazy dog."
MyStr = InputBox("String to Find")
TmpStr = Replace(BigStr, MyStr, "", , , vbBinaryCompare)
MsgBox "Repeats of " & MyStr & " in Test String: " & (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Sub

Using vbBinaryCompare forces a case-sensitive test (eg "the" is not counted in a search for "The"). If you don't want the test to be
case sensitive, use vbTextCompare instead.

--
Cheers
macropod
[MVP - Microsoft Word]

"Luciano Paulino da Silva" wrote in ...

Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:


QGAGGAAGGAGQ
4 Repeats detected * * * * Number
GA * * * * * * * *3
AG * * * * * * * * * * *3
GAG * * * * * * * * * * 2
GG * * * * * * * * * * * * * * *2


Somebody could help me?
Thanks in advance,
Luciano


  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 293
Default Numbers of repeats of a string

Hi Luciano,

You can implement the logic fvrom the code I posted earlier via a User-Defined Function:

Public Function Repeats(BigStr As Variant, MyStr As Variant, Compare As Boolean) As Integer
Dim TmpStr As String
TmpStr = Replace(BigStr, MyStr, "", , , Compare ^ 2)
Repeats = (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Function

If you place the function in a standard vba module, you can use a formula like:
=Repeats(A1,A2,1) or =Repeats(A1,A2,0)
to retrieve the number of times the string in A2 is found in A1. The 1/0 parameter specifies a case-sensitive or case-insensitive
match, respectively.

--
Cheers
macropod
[MVP - Microsoft Word]


"Luciano Paulino da Silva" wrote in message
...
Dear macropod,
I have tested your code, but it did not worked for my need,
Thank you anyway,
Luciano

On 14 abr, 21:32, "macropod" wrote:
Hi Luciano,

You could use code like:
Sub TestString()
Dim BigStr As String
Dim MyStr As String
Dim TmpStr As String
BigStr = "The quick brown fox jumps over the lazy dog."
MyStr = InputBox("String to Find")
TmpStr = Replace(BigStr, MyStr, "", , , vbBinaryCompare)
MsgBox "Repeats of " & MyStr & " in Test String: " & (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Sub

Using vbBinaryCompare forces a case-sensitive test (eg "the" is not counted in a search for "The"). If you don't want the test to
be
case sensitive, use vbTextCompare instead.

--
Cheers
macropod
[MVP - Microsoft Word]

"Luciano Paulino da Silva" wrote in
...

Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:


QGAGGAAGGAGQ
4 Repeats detected Number
GA 3
AG 3
GAG 2
GG 2


Somebody could help me?
Thanks in advance,
Luciano


  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Hi macropod,
In my case I will not looking for a specific repeat. My intention is
list all subsequences that are present in a full string in A1 (e.g.
for the string AADFDGAAV, the string AA appear 2 times and therefore
should be counted. At the right side of the subsequences (repeats)
listed it will be necessary to figure out the number of times that
each one of them appear. And in another cell the number of different
repeats that could be detected.
Did you understand?
Thanks in advance,
Luciano

On 15 abr, 00:23, "macropod" wrote:
Hi Luciano,

You can implement the logic fvrom the code I posted earlier via a User-Defined Function:

Public Function Repeats(BigStr As Variant, MyStr As Variant, Compare As Boolean) As Integer
Dim TmpStr As String
TmpStr = Replace(BigStr, MyStr, "", , , Compare ^ 2)
Repeats = (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Function

If you place the function in a standard vba module, you can use a formula like:
=Repeats(A1,A2,1) or =Repeats(A1,A2,0)
to retrieve the number of times the string in A2 is found in A1. The 1/0 parameter specifies a case-sensitive or case-insensitive
match, respectively.

--
Cheers
macropod
[MVP - Microsoft Word]

"Luciano Paulino da Silva" wrote in ...
Dear macropod,
I have tested your code, but it did not worked for my need,
Thank you anyway,
Luciano

On 14 abr, 21:32, "macropod" wrote:

Hi Luciano,


You could use code like:
Sub TestString()
Dim BigStr As String
Dim MyStr As String
Dim TmpStr As String
BigStr = "The quick brown fox jumps over the lazy dog."
MyStr = InputBox("String to Find")
TmpStr = Replace(BigStr, MyStr, "", , , vbBinaryCompare)
MsgBox "Repeats of " & MyStr & " in Test String: " & (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Sub


Using vbBinaryCompare forces a case-sensitive test (eg "the" is not counted in a search for "The"). If you don't want the test to
be
case sensitive, use vbTextCompare instead.


--
Cheers
macropod
[MVP - Microsoft Word]


"Luciano Paulino da Silva" wrote in
...


Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:


QGAGGAAGGAGQ
4 Repeats detected Number
GA 3
AG 3
GAG 2
GG 2


Somebody could help me?
Thanks in advance,
Luciano




  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 806
Default Numbers of repeats of a string

Hello Luciano,

If I understand you correctly, the correct result for your example is:
AG 3
GA 3
AGG 2
AGGA 2
GAG 2
GG 2
GGA 2

You can get this if you array-enter into A3:B9:
=GSort(pfreq(TRANSPOSE(GenSubStrings(A1))),"DA","N S","21")

GSort you will find he
http://www.sulprobil.com/html/sort_vba.html

Pfreq is he
http://www.sulprobil.com/html/pfreq.html

Function GenSubStrings(s As String) As Variant
ReDim v(1 To 10000) As Variant
Dim i As Long, j As Long, k As Long

'Generate all substrings of s with length 2 to Len(s)-1
For i = 2 To Len(s) - 1
For j = 1 To Len(s) - i + 1
k = k + 1
v(k) = Mid(s, j, i)
Next j
Next i

ReDim Preserve v(1 To k) As Variant
GenSubStrings = v
End Function

Regards,
Bernd
  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Hi Bernd,
Your suggestion is excellent! However, I would like to list only the
subsequences that appear 2 times or more. That ones that were observed
only one time should not be listed.
Thanks in advance,
Luciano

On 15 abr, 07:42, Bernd P wrote:
Hello Luciano,

If I understand you correctly, the correct result for your example is:
AG * * *3
GA * * *3
AGG * * 2
AGGA * *2
GAG * * 2
GG * * *2
GGA * * 2

You can get this if you array-enter into A3:B9:
=GSort(pfreq(TRANSPOSE(GenSubStrings(A1))),"DA","N S","21")

GSort you will find hehttp://www.sulprobil.com/html/sort_vba.html

Pfreq is hehttp://www.sulprobil.com/html/pfreq.html

Function GenSubStrings(s As String) As Variant
ReDim v(1 To 10000) As Variant
Dim i As Long, j As Long, k As Long

'Generate all substrings of s with length 2 to Len(s)-1
For i = 2 To Len(s) - 1
* * For j = 1 To Len(s) - i + 1
* * * * k = k + 1
* * * * v(k) = Mid(s, j, i)
* * Next j
Next i

ReDim Preserve v(1 To k) As Variant
GenSubStrings = v
End Function

Regards,
Bernd


  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 806
Default Numbers of repeats of a string

Hello Luciano,

That was left as homework exercise :-)

Here we a
Function ShowRepetitions(v As Variant) As Variant
Dim i As Long, j As Long

On Error Resume Next
Do While v(i, 2) 1
i = i + 1
Loop
If v(i, 2) = 1 Then i = i - 1
On Error GoTo 0

ReDim vR(1 To Application.Caller.Rows.Count, 1 To 2) As Variant

For j = 1 To Application.Caller.Rows.Count
If j i Then
vR(j, 1) = ""
vR(j, 2) = ""
Else
vR(j, 1) = v(j, 1)
vR(j, 2) = v(j, 2)
End If
Next j

ShowRepetitions = vR

End Function

Array-enter:
=showrepetitions(GSort(pfreq(TRANSPOSE(GenSubStrin gs
(A1))),"DA","NS","21"))

Regards,
Bernd
  #9   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Dear Bernd,
Thank you very much for the code. It is working fine, but for some
situations the strings size is limiting its use since it takes several
minutes, and even hours to perform the calculation. Unfortunately 90%
of the cases I have strings bigger than 200 characters. :(
Thank you again,
Luciano

On 15 abr, 11:38, Bernd P wrote:
Hello Luciano,

That was left as homework exercise :-)

Here we a
Function ShowRepetitions(v As Variant) As Variant
Dim i As Long, j As Long

On Error Resume Next
Do While v(i, 2) 1
* * i = i + 1
Loop
If v(i, 2) = 1 Then i = i - 1
On Error GoTo 0

ReDim vR(1 To Application.Caller.Rows.Count, 1 To 2) As Variant

For j = 1 To Application.Caller.Rows.Count
* * If j i Then
* * * * vR(j, 1) = ""
* * * * vR(j, 2) = ""
* * Else
* * * * vR(j, 1) = v(j, 1)
* * * * vR(j, 2) = v(j, 2)
* * End If
Next j

ShowRepetitions = vR

End Function

Array-enter:
=showrepetitions(GSort(pfreq(TRANSPOSE(GenSubStrin gs
(A1))),"DA","NS","21"))

Regards,
Bernd


  #10   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 77
Default Numbers of repeats of a string

Dear Bernd,
I can not understand why, but I have always to put some string on cell
"A1" to have strings determined if not the code crashes and appear
errors on cells bellow A1.
Have you any idea about what is happening?
Thanks in advance,
Luciano

On 15 abr, 13:32, Luciano Paulino da Silva
wrote:
Dear Bernd,
Thank you very much for the code. It is working fine, but for some
situations the strings size is limiting its use since it takes several
minutes, and even hours to perform the calculation. Unfortunately 90%
of the cases I have strings bigger than 200 characters. :(
Thank you again,
Luciano

On 15 abr, 11:38, Bernd P wrote:

Hello Luciano,


That was left as homework exercise :-)


Here we a
Function ShowRepetitions(v As Variant) As Variant
Dim i As Long, j As Long


On Error Resume Next
Do While v(i, 2) 1
* * i = i + 1
Loop
If v(i, 2) = 1 Then i = i - 1
On Error GoTo 0


ReDim vR(1 To Application.Caller.Rows.Count, 1 To 2) As Variant


For j = 1 To Application.Caller.Rows.Count
* * If j i Then
* * * * vR(j, 1) = ""
* * * * vR(j, 2) = ""
* * Else
* * * * vR(j, 1) = v(j, 1)
* * * * vR(j, 2) = v(j, 2)
* * End If
Next j


ShowRepetitions = vR


End Function


Array-enter:
=showrepetitions(GSort(pfreq(TRANSPOSE(GenSubStrin gs
(A1))),"DA","NS","21"))


Regards,
Bernd




  #11   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 806
Default Numbers of repeats of a string

Hello Luciano,

The runtime depends heavily on the length of your strings and the
length of the substrings. If you can live with limited substrings
(length shorter than a certain threshold) then you can reduce the
runtime substantially.

Change the line in GenSubStrings
For i = 2 To Len(s) - 1
to
For i = 2 To worksheetfunction.min(10,Len(s) - 1)
for example.

The formula I gave you referred to A1. So your input is expected to be
in A1. Do you need this to be changed?

Regards,
Bernd
  #12   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 293
Default Numbers of repeats of a string

Hi Luciano,

Fair enough - I see Bernard P has given you some pointers.

Do note, though, that testing strings of 5000 letters for all possible repeat combinations can be extremely time consuming. After
all, in a string of 5000 letters, duplicate matches of as many as 4999 characters are possible (consider a string of 5000 'A's, for
example).

--
Cheers
macropod
[MVP - Microsoft Word]


"Luciano Paulino da Silva" wrote in message
...
Hi macropod,
In my case I will not looking for a specific repeat. My intention is
list all subsequences that are present in a full string in A1 (e.g.
for the string AADFDGAAV, the string AA appear 2 times and therefore
should be counted. At the right side of the subsequences (repeats)
listed it will be necessary to figure out the number of times that
each one of them appear. And in another cell the number of different
repeats that could be detected.
Did you understand?
Thanks in advance,
Luciano

On 15 abr, 00:23, "macropod" wrote:
Hi Luciano,

You can implement the logic fvrom the code I posted earlier via a User-Defined Function:

Public Function Repeats(BigStr As Variant, MyStr As Variant, Compare As Boolean) As Integer
Dim TmpStr As String
TmpStr = Replace(BigStr, MyStr, "", , , Compare ^ 2)
Repeats = (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Function

If you place the function in a standard vba module, you can use a formula like:
=Repeats(A1,A2,1) or =Repeats(A1,A2,0)
to retrieve the number of times the string in A2 is found in A1. The 1/0 parameter specifies a case-sensitive or case-insensitive
match, respectively.

--
Cheers
macropod
[MVP - Microsoft Word]

"Luciano Paulino da Silva" wrote in
...
Dear macropod,
I have tested your code, but it did not worked for my need,
Thank you anyway,
Luciano

On 14 abr, 21:32, "macropod" wrote:

Hi Luciano,


You could use code like:
Sub TestString()
Dim BigStr As String
Dim MyStr As String
Dim TmpStr As String
BigStr = "The quick brown fox jumps over the lazy dog."
MyStr = InputBox("String to Find")
TmpStr = Replace(BigStr, MyStr, "", , , vbBinaryCompare)
MsgBox "Repeats of " & MyStr & " in Test String: " & (Len(BigStr) - Len(TmpStr)) / Len(MyStr)
End Sub


Using vbBinaryCompare forces a case-sensitive test (eg "the" is not counted in a search for "The"). If you don't want the test
to
be
case sensitive, use vbTextCompare instead.


--
Cheers
macropod
[MVP - Microsoft Word]


"Luciano Paulino da Silva" wrote in
...


Dear all,
I am looking to detect repeats (sequences of letters or numbers which
are repeating
at least twice within a string) in some strings containing between
20-5000 letters. Has somebody any idea how could I perform that using
an Excel macro? I would like that the string to be evaluated could be
on cell "A1" and that the detected repeats could be
listed bellow A3; and that the number of times
that they appear in the sentence could be listed bellow cells B3, as
the following small example:


QGAGGAAGGAGQ
4 Repeats detected Number
GA 3
AG 3
GAG 2
GG 2


Somebody could help me?
Thanks in advance,
Luciano



Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Generate random numbers 1-100 without any repeats? ExcelFan Excel Worksheet Functions 4 May 5th 23 07:46 PM
Macro for detect palindromes and repeats in letters/numbers string Luciano Paulino da Silva Excel Worksheet Functions 33 April 16th 09 04:59 PM
Macro for detect palindromes and repeats in letters/numbers string Luciano Paulino da Silva New Users to Excel 5 April 14th 09 08:44 PM
How can I count the number of times a letter repeats in a string? Wiley Excel Worksheet Functions 3 May 11th 06 06:53 PM
Column of Numbers in xls to String of Numbers in txt Spyder Excel Discussion (Misc queries) 1 March 15th 06 07:39 PM


All times are GMT +1. The time now is 11:43 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"