Home |
Search |
Today's Posts |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On 24 out, 10:33, John Coleman wrote:
On Oct 24, 6:16*am,LucianoPaulino da Silva wrote: On 24 out, 02:10, John Coleman wrote: Dear John Coleman, Thank you very much for your fantastic code. It is almost perfect for my situation. However, I need that instead of a same detected palindrome is being repeated all the time it appear in the string, it should be shown once a time in A column, and that B column should be shown the frequence of occcurrences of that specific palindrome. Do you know some way to do that? Thank you, Luciano DearLuciano, This should work: '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''' Function FindMaxInitPalin(SearchString As String) As String * * Dim i As Long, j As Long * * Dim init As String * * Dim reflection As String * * For i = 1 To Len(SearchString) * * * * init = Mid(SearchString, 1, i) * * * * reflection = Mid(init, i) & reflection * * * * If init = reflection Then j = i * * Next i * * FindMaxInitPalin = Mid(SearchString, 1, j) End Function Function FindMaxPalins(SearchString As String) As Variant * * Dim Palindromes As Object * * Dim InputString As String * * Dim chomp As String * * Dim n As Long * * Set Palindromes = CreateObject("Scripting.Dictionary") * * InputString = SearchString * * Do While Len(InputString) 0 * * * * chomp = FindMaxInitPalin(InputString) * * * * InputString = Mid(InputString, Len(chomp) + 1) * * * * If Len(chomp) 1 Then * * * * * * If Not Palindromes.Exists(chomp) Then * * * * * * * * Palindromes.Add chomp, 1 * * * * * * Else * * * * * * * * Palindromes.Item(chomp) = Palindromes.Item(chomp) + 1 * * * * * * End If * * * * End If * * Loop * * If Palindromes.count 0 Then Set FindMaxPalins = Palindromes End Function Sub Main() * * Dim SearchString As String * * Dim Results As Variant * * Dim i As Long, n As Long * * Dim palindrome As Variant * * On Error GoTo no_palindromes * * SearchString = InputBox("Enter a string") * * Set Results = FindMaxPalins(SearchString) * * n = Results.count * * Range("A1").Value = n & " distinct palindromes found" * * Range("A2").Value = "Palindrome" * * Range("B2").Value = "Frequency" * * For Each palindrome In Results.Keys * * * * i = i + 1 * * * * Range("A2").Offset(i).Value = palindrome * * * * Range("B2").Offset(i).Value = Results.Item(palindrome) * * Next palindrome * * Exit Sub no_palindromes: * * Range("A1").Value = "No palindromes found" End Sub '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''*''''''''''''''''' Also, there is no need to stick to an inputbox for entering the string. It is easy enough to modify main() so that it gets its input from a cell Out of curiosity, what sort of things are you doing with this code? Seems interesting. -John Coleman Dear John Coleman, Thank you very much for your help. Let me try explain the aim of the present study. Let me first introduce myself. My name isLuciano Paulino da Silva, I`m a scientific researcher in the area of bionanotechnology. In some of our studies we have characterized the sequences of proteins and peptides that are formed by strings of 20 possible aminoacid residues commonly represented by letters. We always needs a lot of statistical and mathematical comparisons among strings. In this way, we have for some situations even thousands of these strings and we are interested in analyze them from databanks. Some time ago, I realized that some of these protein sequences have inside them the presence of palindromes and repeats and such structures are very important for their biological functions. In this way, this code and others are important to test some situations in order to in some days perform these analyses in large scale. In this way, if it would be your interest I would be pleased if you accept collaborate with us. Concerning your code, at present we need some additional things like: 1-The string shoud be typed in A1 cell instead use an inputbox; 2-In addition to palindromes, an identitical approach will have to be performed to repeats in the strings; Thanks in advance, Luciano- Hide quoted text - - Show quoted text - DearLuciano, I suspected that your interests were biological. I am a mathematician who likes to program as a hobby (and sometimes as an aid to research or teaching). I have a friend in the biology department who sought my aid a while back in writing a program which generated random protein strings with certain properties, and I recognized G,A,Q from your original post as possibly being abbreviations for glycine, etc. I am on sabbatical this semester, so have some time on my hands and would be willing to help you out some (but am currently writing a paper - so I don't have unlimitted time), though for more serious analysis you should probably recruit a specialist in bioinformatics who understands string processing and pattern matching much more than I do. Feel free to email me if you want. My address is jcoleman followed by @ followed by franciscan dot edu (don't want to make life too easy for robot spammers). What do you mean by repeats? Do you mean strings like GAQGAQ (where the repeated sequence is adjacent) or GAQCGAQ (where the repeated sequence can be separated by intervening letters)? And how would you handle GAQGAQGAQ? Also - do you want all repeats, or do you want repeats which are in some sense maximal (similar to the palindrome code I wrote)? Please clarify. I wrote a new main() to take its input from A1 and report lengths in column C. The other two functions in the code were unchanged , so just delete main() and replace it with: Sub Main() * * Dim SearchString As String * * Dim Results As Variant * * Dim i As Long, n As Long * * Dim palindrome As Variant * * Dim R As Range 'output anchor * * On Error GoTo no_palindromes * * Set Results = FindMaxPalins(Range("A1").Value) * * Set R = Range("A3") * * n = Results.count * * R.Value = n & " distinct palindromes found" * * R.Offset(1).Value = "Palindrome" * * R.Offset(1, 1).Value = "Frequency" * * R.Offset(1, 2).Value = "Length" * * For Each palindrome In Results.Keys * * * * i = i + 1 * * * * R.Offset(i + 1).Value = palindrome * * * * R.Offset(i + 1, 1).Value = Results.Item(palindrome) * * * * R.Offset(i + 1, 2).Value = Len(palindrome) * * Next palindrome * * Exit Sub no_palindromes: * * R.Value = "No palindromes found" End Sub Hope that helps -John Coleman Dear John Coleman, Thak you very much for your code. It is perfect for this investigation approach. That`s nice, I`m a biologist but among my graduate students there are a mathematician because your area is very important to our studies. That`s interesting. I have also a smal VBA-based program to generate random polypeptides and this is one of our interests. I will prepare a small text telling you our interests in some string processing and you will be able to tell us if could help us. Yes, the repeat sequence could be adjacent or separated by intervening letters. Both situations are possible on protein sequences. In the case of GAQGAQGAQ it would be 3 GAQ repeats, using the same logic of the palindromes. Yes, it is similar to the palindromes, and even some palindromes are repeats but not all repeats are palindromes... Thanks in advance, Luciano |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Palindromes and repeats | Excel Programming | |||
Macro for detect palindromes and repeats in letters/numbers string | Excel Worksheet Functions | |||
Macro for detect palindromes and repeats in letters/numbers string | New Users to Excel | |||
Macro for detect and list palindromes and repeats in letters/numbersstring | Excel Programming | |||
Macro for detect palindromes and repeats in letters/numbersstring | Excel Programming |