Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
As an example
Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Perhaps
Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
And I'm not sure what you were after with the IsText, but if you want
ACGU to return 4321: Public Function ReverseComplement(sInput As String, _ Optional bText As Boolean = True) As String Dim sTemp As Variant If bText Then sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) Else sTemp = Replace(Replace(Replace(Replace( _ sInput, "A", "4"), "C", "3"), "G", "2"), "U", "1") End If ReverseComplement = sTemp End Function In article , JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Nice one, JE! I think he also wants the characters in reverse order, so
perhaps this slight modification will take care of that: Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = StrReverse(sTemp) End Function On Mon, 17 Jan 2005 11:43:12 -0700, JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Thanks, Myrna - I missed that!
In article , Myrna Larson wrote: I think he also wants the characters in reverse order, so perhaps this slight modification will take care of that: |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
After looking more closely, (you didn't say what you error/problem was
exactly and I don't know anything about RNA), I see you have just failed to assign the result to the function name. Function ReverseComplement(Rcell As Range) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") ReverseComplement = strSeq4A End Function I don't see any reason for the istext argument as there is no reason to convert to clng. -- Regards, Tom Ogilvy "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
If the objective is to pack everything together, why stop there
Public Function ReverseComplement(sInput As String) ReverseComplement = _ StrReverse(UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a"))) End Function --- Regards, Tom Ogilvy "Myrna Larson" wrote in message ... Nice one, JE! I think he also wants the characters in reverse order, so perhaps this slight modification will take care of that: Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = StrReverse(sTemp) End Function On Mon, 17 Jan 2005 11:43:12 -0700, JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
The objective (in my case) was to avoid a series of explicit transition
steps (e.g, A-1-U, C-2-G,...), reducing 8 Replace's to 4 + a UCase(). Whether StrReverse is applied to an named or unnamed temporary variable is probably irrelevant. For clarity, if nothing else, I generally use temporary variables in the function, making an explicit assignment to the function's returned variable at the very end. In article , "Tom Ogilvy" wrote: If the objective is to pack everything together, why stop there Public Function ReverseComplement(sInput As String) ReverseComplement = _ StrReverse(UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a"))) End Function |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
I guess my objective in this case is more to show the OP what steps are to be
taken. To demonstrate as a learning tool, I would have done the replacements as 4 steps (as the OP had done originally), a 5th line to to the UCase, a 6th for the StrReverse, and a 7th to assign to the function output. I don't know whether in the real world that is slower or faster than combining everything into one statement. On Mon, 17 Jan 2005 14:37:08 -0500, "Tom Ogilvy" wrote: If the objective is to pack everything together, why stop there Public Function ReverseComplement(sInput As String) ReverseComplement = _ StrReverse(UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a"))) End Function --- Regards, Tom Ogilvy "Myrna Larson" wrote in message .. . Nice one, JE! I think he also wants the characters in reverse order, so perhaps this slight modification will take care of that: Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = StrReverse(sTemp) End Function On Mon, 17 Jan 2005 11:43:12 -0700, JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
I was impressed not by JE's use of a single statement, but by replacing with a
lower case letter instead of a number, symbol or different letter altogether. The first approach requires 4 replacements plus the UCase$; the latter requires 8 replacements. On Mon, 17 Jan 2005 14:37:08 -0500, "Tom Ogilvy" wrote: If the objective is to pack everything together, why stop there Public Function ReverseComplement(sInput As String) ReverseComplement = _ StrReverse(UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a"))) End Function --- Regards, Tom Ogilvy "Myrna Larson" wrote in message .. . Nice one, JE! I think he also wants the characters in reverse order, so perhaps this slight modification will take care of that: Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = StrReverse(sTemp) End Function On Mon, 17 Jan 2005 11:43:12 -0700, JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Hi,
sorry did not find the original posting so I answer this one. I see how you reverse the strand . I do not understand how you want to complement the strand. So I can not comment your solution. But have a solution for the problem if you want. Feel free to adapt it to your needs. 'This function converts a DNA string to its 'reverse complement 'Gys de Jongh 16-jun-1999 Function Rev(Forw As String) As String Dim ForwA(1 To 100) As String * 1 Dim RevA(1 To 100) As String * 1 N = Len(Forw) If N 100 Then Rev = "Input too long (max 100)" Exit Function End If For I = 1 To N ForwA(I) = Mid(Forw, I, 1) Next I For I = 1 To N Select Case ForwA(N - I + 1) Case "A" RevA(I) = "T" Case "a" RevA(I) = "t" 'From A to T Case "C" RevA(I) = "G" Case "c" RevA(I) = "g" 'From C to G Case "G" RevA(I) = "C" Case "g" RevA(I) = "c" 'From G to C Case "T" RevA(I) = "A" Case "t" RevA(I) = "a" 'From T to A Case Else RevA(I) = "N" 'Rest is N End Select Next I For I = 1 To N Rev = Rev & RevA(I) Next I End Function Short explanation for Tom : DNA contains the information for cells It is a double helix with 2 strings , molecules The two strings contain the same information They are redundant , one can be used to construct the other In each string A only pairs with T and C only with G But : the Sense strand reads from left to right and the anti Sense strand from right to left so : ACGTCCCTGAAATT <====Sense strand TGCAGGGACTTTAA <====anti Sense strand Finding the reverse complement means finding the other if one is known hth Gys "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") here strSeqA1 etc are just a not declared variables which don't seem to reach the output of the function ???? strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Hi,
sorry did not find the original posting so I answer this one. I see how you reverse the strand . I do not understand how you want to complement the strand. So I can not comment your solution. But have a solution for the problem if you want. Feel free to adapt it to your needs. 'This function converts a DNA string to its 'reverse complement 'Gys de Jongh 16-jun-1999 Function Rev(Forw As String) As String Dim ForwA(1 To 100) As String * 1 Dim RevA(1 To 100) As String * 1 N = Len(Forw) If N 100 Then Rev = "Input too long (max 100)" Exit Function End If For I = 1 To N ForwA(I) = Mid(Forw, I, 1) Next I For I = 1 To N Select Case ForwA(N - I + 1) Case "A" RevA(I) = "T" Case "a" RevA(I) = "t" 'From A to T Case "C" RevA(I) = "G" Case "c" RevA(I) = "g" 'From C to G Case "G" RevA(I) = "C" Case "g" RevA(I) = "c" 'From G to C Case "T" RevA(I) = "A" Case "t" RevA(I) = "a" 'From T to A Case Else RevA(I) = "N" 'Rest is N End Select Next I For I = 1 To N Rev = Rev & RevA(I) Next I End Function Short explanation for Tom : DNA contains the information for cells It is a double helix with 2 strings , molecules The two strings contain the same information They are redundant , one can be used to construct the other In each string A only pairs with T and C only with G But : the Sense strand reads from left to right and the anti Sense strand from right to left so : ACGTCCCTGAAATT <====Sense strand TGCAGGGACTTTAA <====anti Sense strand Finding the reverse complement means finding the other if one is known hth Gys "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#14
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
I believe By Complement, he means interchange A and T and C and G, as you (and
he) are doing. But why do you not want to use Replace, StrReverse and UCase$? The code is much shorter -- it can be reduced to just 7 lines. On Mon, 17 Jan 2005 23:56:04 +0100, "GysdeJongh" wrote: Hi, sorry did not find the original posting so I answer this one. I see how you reverse the strand . I do not understand how you want to complement the strand. So I can not comment your solution. But have a solution for the problem if you want. Feel free to adapt it to your needs. 'This function converts a DNA string to its 'reverse complement 'Gys de Jongh 16-jun-1999 Function Rev(Forw As String) As String Dim ForwA(1 To 100) As String * 1 Dim RevA(1 To 100) As String * 1 N = Len(Forw) If N 100 Then Rev = "Input too long (max 100)" Exit Function End If For I = 1 To N ForwA(I) = Mid(Forw, I, 1) Next I For I = 1 To N Select Case ForwA(N - I + 1) Case "A" RevA(I) = "T" Case "a" RevA(I) = "t" 'From A to T Case "C" RevA(I) = "G" Case "c" RevA(I) = "g" 'From C to G Case "G" RevA(I) = "C" Case "g" RevA(I) = "c" 'From G to C Case "T" RevA(I) = "A" Case "t" RevA(I) = "a" 'From T to A Case Else RevA(I) = "N" 'Rest is N End Select Next I For I = 1 To N Rev = Rev & RevA(I) Next I End Function Short explanation for Tom : DNA contains the information for cells It is a double helix with 2 strings , molecules The two strings contain the same information They are redundant , one can be used to construct the other In each string A only pairs with T and C only with G But : the Sense strand reads from left to right and the anti Sense strand from right to left so : ACGTCCCTGAAATT <====Sense strand TGCAGGGACTTTAA <====anti Sense strand Finding the reverse complement means finding the other if one is known hth Gys "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#15
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Bad choice of words.
-- Regards, Tom Ogilvy "Myrna Larson" wrote in message ... I was impressed not by JE's use of a single statement, but by replacing with a lower case letter instead of a number, symbol or different letter altogether. The first approach requires 4 replacements plus the UCase$; the latter requires 8 replacements. On Mon, 17 Jan 2005 14:37:08 -0500, "Tom Ogilvy" wrote: If the objective is to pack everything together, why stop there Public Function ReverseComplement(sInput As String) ReverseComplement = _ StrReverse(UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a"))) End Function --- Regards, Tom Ogilvy "Myrna Larson" wrote in message .. . Nice one, JE! I think he also wants the characters in reverse order, so perhaps this slight modification will take care of that: Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = StrReverse(sTemp) End Function On Mon, 17 Jan 2005 11:43:12 -0700, JE McGimpsey wrote: Perhaps Public Function ReverseComplement(sInput As String) Dim sTemp As String sTemp = UCase(Replace(Replace(Replace(Replace( _ sInput, "A", "u"), "C", "g"), "G", "c"), "U", "a")) ReverseComplement = sTemp End Function In article , M H wrote: I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#16
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Thanks for the explanation.
-- Regards, Tom Ogilvy "GysdeJongh" wrote in message ... Hi, sorry did not find the original posting so I answer this one. I see how you reverse the strand . I do not understand how you want to complement the strand. So I can not comment your solution. But have a solution for the problem if you want. Feel free to adapt it to your needs. 'This function converts a DNA string to its 'reverse complement 'Gys de Jongh 16-jun-1999 Function Rev(Forw As String) As String Dim ForwA(1 To 100) As String * 1 Dim RevA(1 To 100) As String * 1 N = Len(Forw) If N 100 Then Rev = "Input too long (max 100)" Exit Function End If For I = 1 To N ForwA(I) = Mid(Forw, I, 1) Next I For I = 1 To N Select Case ForwA(N - I + 1) Case "A" RevA(I) = "T" Case "a" RevA(I) = "t" 'From A to T Case "C" RevA(I) = "G" Case "c" RevA(I) = "g" 'From C to G Case "G" RevA(I) = "C" Case "g" RevA(I) = "c" 'From G to C Case "T" RevA(I) = "A" Case "t" RevA(I) = "a" 'From T to A Case Else RevA(I) = "N" 'Rest is N End Select Next I For I = 1 To N Rev = Rev & RevA(I) Next I End Function Short explanation for Tom : DNA contains the information for cells It is a double helix with 2 strings , molecules The two strings contain the same information They are redundant , one can be used to construct the other In each string A only pairs with T and C only with G But : the Sense strand reads from left to right and the anti Sense strand from right to left so : ACGTCCCTGAAATT <====Sense strand TGCAGGGACTTTAA <====anti Sense strand Finding the reverse complement means finding the other if one is known hth Gys "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") here strSeqA1 etc are just a not declared variables which don't seem to reach the output of the function ???? strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#17
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Much thanks to JE's original idea, precise and concise, and Myrna's
essential amendment. Actually I realize my stupidity of missing one line as Tom suggested when I was on the bed, and I'm really amazed to see the how several brains led to the one-line code! This is finally what I've to achieve, with including the degenerate alphabets and distinguishing DNA from RNA: Public Function RevCompl(sInput As String) Dim sTemp As String sUInput = UCase(sInput) sTemp1 = Replace(sUInput, "T", "") 'If it is DNA If Len(sTemp1) < Len(sUInput) Then sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "t"), "C", "g"), "G", "c"), "T", "a") 'if it is RNA Else sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "u"), "C", "g"), "G", "c"), "U", "a") End If RevCompl = StrReverse(UCase(sTemp)) End Function rgds, Maurice *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#18
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Hi,
you are right , thank you very much for this suggestion. The job is to reverse the string and replace A by T etc. It is called : calculating the Reverse Complement in bioinformatics. There is of course dedicated software. Look here if you like to see an example : http://www.mbio.ncsu.edu/BioEdit/bioedit.html I made this little piece of VBA to calculate the reverse complement for a very large number of short sequences. Your solution is more elegant and probably faster. It did not occur to me that it can be done that way because I am in genetics and statistics . I am not a programmer. Thanks again for your time Gys "Myrna Larson" wrote in message ... I believe By Complement, he means interchange A and T and C and G, as you (and he) are doing. But why do you not want to use Replace, StrReverse and UCase$? The code is much shorter -- it can be reduced to just 7 lines. On Mon, 17 Jan 2005 23:56:04 +0100, "GysdeJongh" wrote: Hi, sorry did not find the original posting so I answer this one. I see how you reverse the strand . I do not understand how you want to complement the strand. So I can not comment your solution. But have a solution for the problem if you want. Feel free to adapt it to your needs. 'This function converts a DNA string to its 'reverse complement 'Gys de Jongh 16-jun-1999 Function Rev(Forw As String) As String Dim ForwA(1 To 100) As String * 1 Dim RevA(1 To 100) As String * 1 N = Len(Forw) If N 100 Then Rev = "Input too long (max 100)" Exit Function End If For I = 1 To N ForwA(I) = Mid(Forw, I, 1) Next I For I = 1 To N Select Case ForwA(N - I + 1) Case "A" RevA(I) = "T" Case "a" RevA(I) = "t" 'From A to T Case "C" RevA(I) = "G" Case "c" RevA(I) = "g" 'From C to G Case "G" RevA(I) = "C" Case "g" RevA(I) = "c" 'From G to C Case "T" RevA(I) = "A" Case "t" RevA(I) = "a" 'From T to A Case Else RevA(I) = "N" 'Rest is N End Select Next I For I = 1 To N Rev = Rev & RevA(I) Next I End Function Short explanation for Tom : DNA contains the information for cells It is a double helix with 2 strings , molecules The two strings contain the same information They are redundant , one can be used to construct the other In each string A only pairs with T and C only with G But : the Sense strand reads from left to right and the anti Sense strand from right to left so : ACGTCCCTGAAATT <====Sense strand TGCAGGGACTTTAA <====anti Sense strand Finding the reverse complement means finding the other if one is known hth Gys "Tom Ogilvy" wrote in message ... As an example Assume you had a reversed string like AAA your code would convert it like this AAA = 111 = UUU If that is what you want, then Ok. If the problem is you would want 111 as a result, then I think you need to go to an intermediate stage (two step process) AAA = MMM = 111 Where the middle stage would use unique identifiers so you wouldn't convert any value after it was converted to a final value. A to M, M to 1 C to N, N to 2 G to O, O to 3 U to P, P to 4 1 to Q, Q to U 2 to R, R to G 3 to S, S to C 4 to T, T to A -- Regards, Tom Ogilvy "M H" wrote in message ... I've written an excel function aim to change a RNA sequence to its reverse complement (e.g. from "ACGUUGUA" to "UACAACGU") as: Function ReverseComplement(Rcell As Range, Optional IsText As Boolean) Dim i As Integer Dim strReverseSeq As String Dim strSenseSeq As String strSenseSeq = Trim(Rcell) For i = 1 To Len(strSenseSeq) strReverseSeq = Mid(strSenseSeq, i, 1) & _ strReverseSeq Next i If IsText = False Then ReverseComplement = CLng(strReverseSeq) Else ReverseComplement = strReverseSeq End If strSeqA1 = Replace(strReverseSeq, "A", "1") strSeqC2 = Replace(strSeqA1, "C", "2") strSeqG3 = Replace(strSeqC2, "G", "3") strSeqT4 = Replace(strSeqG3, "U", "4") strSeq1T = Replace(strSeqT4, "1", "U") strSeq2G = Replace(strSeq1T, "2", "G") strSeq3C = Replace(strSeq2G, "3", "C") strSeq4A = Replace(strSeq3C, "4", "A") End Function The first half works fine, which reverse the sequence, but the second part doesn't work, which cannot make the comoplements Please help for debug. Much thanks. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#19
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Hi,
very nice solution. I have one question : should there be a test on the lenght of the sInput srting or is there no 65k restriction ? Gys "M H" wrote in message ... Much thanks to JE's original idea, precise and concise, and Myrna's essential amendment. Actually I realize my stupidity of missing one line as Tom suggested when I was on the bed, and I'm really amazed to see the how several brains led to the one-line code! This is finally what I've to achieve, with including the degenerate alphabets and distinguishing DNA from RNA: Public Function RevCompl(sInput As String) Dim sTemp As String sUInput = UCase(sInput) sTemp1 = Replace(sUInput, "T", "") 'If it is DNA If Len(sTemp1) < Len(sUInput) Then sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "t"), "C", "g"), "G", "c"), "T", "a") 'if it is RNA Else sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "u"), "C", "g"), "G", "c"), "U", "a") End If RevCompl = StrReverse(UCase(sTemp)) End Function rgds, Maurice *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#20
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
It's probably the terminology that's used in the genetics/biology community.
On Mon, 17 Jan 2005 19:18:55 -0500, "Tom Ogilvy" wrote: Bad choice of words. |
#21
Posted to microsoft.public.excel.programming
|
|||
|
|||
A function of reverse complement
Thx ,
nice piece of code to do the check 32k is enough for most genes :) Gys "Peter T" <peter_t@discussions wrote in message ... Hi Gys Someone may advise what the ultimate limit is but try for your self: Sub test() Dim s As String, s2 As String, n As Long, s2 s = "ACGT" For n = 1 To 15 ' 15 len 64k, 17 len 512k s = s & s Next s2 = RevCompl(s) MsgBox Len(s) & " " & Left(s, 12) & vbCr & _ Len(s2) & " " & Left(s2, 12) End Sub But note cells can only store strings up to 32k. Regards, Peter T "GysdeJongh" wrote: Hi, very nice solution. I have one question : should there be a test on the lenght of the sInput srting or is there no 65k restriction ? Gys "M H" wrote in message ... Much thanks to JE's original idea, precise and concise, and Myrna's essential amendment. Actually I realize my stupidity of missing one line as Tom suggested when I was on the bed, and I'm really amazed to see the how several brains led to the one-line code! This is finally what I've to achieve, with including the degenerate alphabets and distinguishing DNA from RNA: Public Function RevCompl(sInput As String) Dim sTemp As String sUInput = UCase(sInput) sTemp1 = Replace(sUInput, "T", "") 'If it is DNA If Len(sTemp1) < Len(sUInput) Then sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "t"), "C", "g"), "G", "c"), "T", "a") 'if it is RNA Else sTemp = Replace(Replace(Replace(Replace(Replace(Replace( _ Replace(Replace(Replace(Replace(Replace(Replace( _ sUInput, "R", "y"), "M", "k"), "Y", "r"), "K", "m"), _ "H", "d"), "D", "h"), "B", "v"), "V", "b"), _ "A", "u"), "C", "g"), "G", "c"), "U", "a") End If RevCompl = StrReverse(UCase(sTemp)) End Function rgds, Maurice *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
typing a complement (i.e. notA) character in Excel | Excel Discussion (Misc queries) | |||
LINKEDRANGE function - a complement to the PULL function (for getting values from a closed workbook) | Excel Worksheet Functions | |||
How do I solve for complement rule of P(X > or = 4.0) in excel? | New Users to Excel | |||
Reverse MATCH Function | Excel Discussion (Misc queries) | |||
Reverse of Chr Function | Excel Programming |