How to get just the DNA sequence

Posted by revtopo (revtopo), 19 July 2007
Hi all,

I have been trying to use BioPerl to get DNA sequences from EMBL (a database) with some code. It works for most of the entries, but some accession IDs in the list return the entire genomic DNA sequence. Does anyone have an idea how to deal with this?

The code that extracts the DNA sequence is:

sub parsing {
    # Collects the EMBL cross-references (DR lines) from one record.
    # The record text is expected in $_, as in the original code.
    my ($uni_acc, @embl_xref);

    my $uni_rec = $_;
    my @lines   = split /\n/, $uni_rec;

    foreach my $line (@lines) {
        if ($line =~ /^AC\s+(\w+)/) {
            $uni_acc = $1;    # accession of the record itself
        }
        elsif ($line =~ /^DR\s+EMBL;\s+(\w+)/) {
            my $embl_xref = $1;    # first field after "EMBL;"
            #print "$uni_acc\t$embl_xref\n";
            push @embl_xref, $embl_xref;
        }
    }
    #print "@embl_xref";
    return @embl_xref;
}

sub getdata {
    # $srs is the retrieval object, created elsewhere in the script.
    my @embl_xref = parsing();

    # fetch the records in conveniently sized chunks
    $srs->get_set_chunk_size(20);

    # fetch the records corresponding to the accession numbers
    $srs->get_records_with_accessions(
        -db         => 'embl',
        -AccNumbers => \@embl_xref,
        -file       => 'embl_data.dbi',
    );

    return $srs;
}



Posted by KevinAD (KevinAD), 19 July 2007
I am not sure how you expect meaningful help with this question unless a person is very familiar with the data you are working with. Having said that, I will take a guess: maybe you need to refactor your regexp:

elsif ($line =~ /^DR\s+EMBL;\s+(\w+)/)

It looks like (\w+) is too greedy and is capturing more data than you think it should.
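
One way to check that guess is to run the pattern against a few sample lines and print what each capture group holds. The sketch below is only an illustration: the DR lines are made-up sample data, assumed to follow the usual flat-file layout of "DR   EMBL; accession; protein id; status; molecule type.", and embl_xrefs is a hypothetical helper name that does not appear in the original code.

```perl
use strict;
use warnings;

# Made-up sample cross-reference lines (assumption, not real records).
my @dr_lines = (
    'DR   EMBL; X00001; CAA00001.1; -; mRNA.',
    'DR   EMBL; AE000001; AAC00001.1; -; Genomic_DNA.',
    'DR   PDB; 1ABC; X-ray; 2.00 A; A=1-100.',
);

# Hypothetical helper: returns [accession, protein_id, molecule_type]
# for every line that is an EMBL cross-reference.
sub embl_xrefs {
    my @xrefs;
    for my $line (@_) {
        # \w+ cannot match ';', so each capture stops at the end of its field.
        if ($line =~ /^DR\s+EMBL;\s+(\w+);\s+([\w.-]+);\s+[^;]*;\s+(\w+)\./) {
            push @xrefs, [$1, $2, $3];
        }
    }
    return @xrefs;
}

# Print one tab-separated row per EMBL cross-reference.
for my $xref (embl_xrefs(@dr_lines)) {
    my ($acc, $protein_id, $mol_type) = @$xref;
    print "$acc\t$protein_id\t$mol_type\n";
}
```

If the first capture turns out to be just the accession, the pattern itself is probably not the cause. In that case the molecule type column may be worth a look: an accession whose type is Genomic_DNA can refer to a whole genomic entry, and fetching that entry would return far more sequence than the single gene.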

This page is a thread posted to the opentalk forum and archived here for reference.


© WELL HOUSE CONSULTANTS LTD., 2023: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY