Matches and mismatches in Perl
Posted by sirisha (sirisha), 23 January 2008
We are working on prion proteins. We have extracted pattern signatures for mammals, aves, reptilia, pisces and amphibia.
We are supposed to write a program for a tool that can tell whether a sequence is a prion or not, and which family it belongs to.
We were successful up to that point.
But now we have a small problem.
When we give an input sequence, the tool should match it against the pattern and show the matches and mismatches in the output.
I will give you two small examples. If they are not clear, please let me know
and I will try to give a clearer idea.
$a = "APPLE";   # let it be a pattern
$b = <STDIN>;   # input sequence
chomp $b;       # drop the trailing newline read from STDIN
Suppose my input sequence ($b) is: MYAPPLES
Then the input ($b) matches $a only from the 3rd letter to the 7th letter (i.e. APPLE).
I want to get output as below:
The matching region is - - APPLE - (i.e. mismatches should be shown as hyphens)
If the pattern is "MYAPPLE" and the input is "APPLEMY",
then the output should be:
- - APPLE - -
i.e. the first 2 hyphens represent gaps and the last 2 hyphens represent mismatches.
$a = "agaaaagavvgglggy";   # a pattern signature
Let the input sequence be "ttttttttttagaaaagavtttggyttttttt".
Here the input matches $a only in the regions agaaaagav and ggy of ttttttttttagaaaagavtttggyttttttt.
I want to know how we can produce output showing both the matches and the mismatches in the input (mismatches in the sequence should be shown as hyphens).
That means the output should be like this:
The given sequence matches with the pattern at - - - - - - - - - - agaaaagav - - - ggy - - - - - - -
Thanks in advance,
Posted by admin (Graham Ellis), 23 January 2008
Thanks for that fuller explanation (I'm guessing you've started a new thread that continues your "Help in Perl" question), but I still don't understand the detail of how you decide what a match is. You could have come up with MY as the matching sequence just as easily as APPLE in the first sequence, and in the second case, with the gap in the middle, I don't know what the rules are for the gap. As such, I can't (yet) point at a best solution or algorithmic approach.
Did Kevin's BioPerl suggestion help? That's the way I would go, unless you're researching the science beyond what it can provide.
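To make the "what counts as a match" question concrete, here is a minimal sketch of the simplest rule one could pick, assuming a match means the whole pattern occurring exactly, somewhere in the input. The name mask_exact is hypothetical (it is not part of BioPerl or of the poster's tool), and the output is written without the spaces used in the question. This rule covers the APPLE / MYAPPLES example only; it does not handle the gapped case.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the thread): keep the first
# exact occurrence of $pattern in $seq and replace every other letter of
# $seq with '-'. Returns nothing when the pattern does not occur at all.
sub mask_exact {
    my ($pattern, $seq) = @_;
    my $pos = index($seq, $pattern);          # -1 when not found
    return if $pos < 0;
    my $tail = length($seq) - $pos - length($pattern);
    return ('-' x $pos) . $pattern . ('-' x $tail);
}

my $masked = mask_exact('APPLE', 'MYAPPLES');
print defined $masked ? "$masked\n" : "no match\n";   # prints --APPLE-
```

Under this rule "MYAPPLE" against "APPLEMY" gives no match at all, which is one way of seeing why the rule for partial matches needs to be spelled out first.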
Posted by KevinAD (KevinAD), 23 January 2008
Looks like they have to program it, Graham:
Posted by KevinAD (KevinAD), 23 January 2008
siri,
Please post the code you have written so far for this tool. I am willing to help, but only if I see some effort on your part to write code. I would think using the index() and substr() functions in a recursive loop will do what you want.
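As a rough illustration of what index() and substr() in a recursive routine might look like, here is a sketch under one assumed rule: keep the longest run of letters shared by the pattern and the input, then repeat on the text to the left and to the right of that run. The rule, the helper names longest_common and mask_matches, and the optional minimum run length $min are all assumptions made for illustration; the thread never settles what a match is. The sketch only masks the input, so it does not print the leading "gap" hyphens asked for in the MYAPPLE / APPLEMY example.

```perl
use strict;
use warnings;

# Hypothetical helper: find the longest substring of $pattern that also
# occurs in $seq. Returns (offset in pattern, offset in seq, length),
# or an empty list when the two strings share no letters.
sub longest_common {
    my ($pattern, $seq) = @_;
    for (my $len = length $pattern; $len > 0; $len--) {
        for my $start (0 .. length($pattern) - $len) {
            my $pos = index($seq, substr($pattern, $start, $len));
            return ($start, $pos, $len) if $pos >= 0;
        }
    }
    return;
}

# Hypothetical helper: keep the longest shared run, recurse on what lies
# to its left and right, and turn every unmatched input letter into '-'.
# Runs shorter than $min (default 1, an assumed parameter) are ignored.
sub mask_matches {
    my ($pattern, $seq, $min) = @_;
    $min = 1 unless defined $min;
    my ($pstart, $spos, $len) = longest_common($pattern, $seq);
    return '-' x length($seq) if !defined $len || $len < $min;
    my $left  = mask_matches(substr($pattern, 0, $pstart),
                             substr($seq, 0, $spos), $min);
    my $right = mask_matches(substr($pattern, $pstart + $len),
                             substr($seq, $spos + $len), $min);
    return $left . substr($seq, $spos, $len) . $right;
}

print mask_matches('APPLE', 'MYAPPLES'), "\n";   # prints --APPLE-
print mask_matches('agaaaagavvgglggy',
                   'ttttttttttagaaaagavtttggyttttttt'), "\n";
```

The brute-force search is slow on long sequences, and with real protein data a $min larger than 1 is probably needed to avoid reporting single-letter coincidences; a proper alignment tool would be the sturdier choice.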
Posted by KevinAD (KevinAD), 23 January 2008
I think he may have gotten the answer he is looking for on another forum/blog. I see this same question posted by this same person on some other forums.