reading multiple lines

Posted by deep (deep), 24 November 2007
Hello Folks,
I want to read one line at a time from File 1, read lines from File 2, do some processing and write to a file.
Let me try using an example.
My File 1 looks like:
881.5372        221.3915        4       3       SIFLFKK A=0     C=0     D=0     E=0     F=2     G=0     H=0     I=1     K=2     L=1     M=0     N=0     P=0     Q=0     R=0     S=1     T=0     V=0     W=0     Y=0
856.4441        286.4886        3       2       TSLFSFR A=0     C=0     D=0     E=0     F=2     G=0     H=0     I=0     K=0     L=1     M=0     N=0     P=0     Q=0     R=1     S=2     T=1     V=0     W=0     Y=0
File 2 looks like:
7      8      881.5372      221.3915      4      3      XP_001418543.1      SIFLFKK      A=0      C=0      D=0      E=0      F=2      G=0      H=0      I=1      K=2      L=1      M=0      N=0      P=0      Q=0      R=0      S=1      T=0      V=0      W=0      Y=0
10      15      856.4441      286.4886      3      3      XP_001418056.1      DIYYRK      A=0      C=0      D=1      E=0      F=0      G=0      H=0      I=1      K=1      L=0      M=0      N=0      P=0      Q=0      R=1      S=0      T=0      V=0      W=0      Y=2
272      282      1285.6006      322.4073      4      3      XP_001422002.1      REHMVEMGLNA      A=1      C=0      D=0      E=2      F=0      G=1      H=1      I=0      K=0      L=1      M=2      N=1      P=0      Q=0      R=1      S=0      T=0      V=1      W=0      Y=0
38      87      5277.4026      1760.1414      3      2      XP_001420476.1      YSAALVDTNGCYASQTLEVEVSWTCETSTNTAVAAAFIAFAAFCAYSFGR      A=11      C=3      D=1      E=3      F=4      G=2      H=0      I=1      K=0      L=2      M=0      N=2      P=0      Q=1      R=1      S=5      T=6      V=4      W=1      Y=3
2449      2462      1496.8094      375.2095      4      3      XP_001417092.1      TPQRPGAPVNVSFK      A=1      C=0      D=0      E=0      F=1      G=1      H=0      I=0      K=1      L=0      M=0      N=1      P=3      Q=1      R=1      S=1      T=1      V=2      W=0      Y=0
584      613      3135.4741      1568.7442      2      3      XP_001421161.1      AANMLSWAVNMAATKIGGPDDAHEPVDLQN      A=6      C=0      D=3      E=1      F=0      G=2      H=1      I=1      K=1      L=2      M=2      N=3      P=2      Q=1      R=0      S=1      T=1      V=2      W=1      Y=0
115      170      5985.1843      2993.5993      2      3      YP_636257.1      ELTWITGVIMAVCTVSFGVTGYSLPWDQVGYWAVKIVTGVPDAIPVVGPAIVELL      A=4      C=1      D=2      E=2      F=1      G=6      H=0      I=5      K=1      L=4      M=1      N=0      P=4      Q=1      R=1      S=2      T=5      V=11      W=3      Y=2

File1 is smaller than File2.
What I need to do is:
Read the first line from File1 and do processing on ALL the lines in File2.
Read the second line and again do some processing on all the lines in File2.

I have been able to take one line from File1 and one line from File2, do processing, then read the second line from File1 and the second from File2 (in this case I made the two files have an equal number of lines).
I have been able to take the first line from File1 and process ALL the lines in File2. However, I am just not able to proceed any further, basically reading all the lines in File1 until eof and processing all the lines in File2 for each one.
Below is the code. This is WORKING CODE that works for one line. The code is huge as I am doing lots of processing. If anyone can point out where I am going wrong or can suggest a few things, it would be of great help.
Thanks Guys


open INFILE1,"<$File1";
open INFILE2,"<$File2";

while ($line1 = <INFILE1>)
{
     @nonplant_fields = split /\s+/,$line1;

     while ($line2 = <INFILE2>)
     {
          @plantfields = split /\s+/,$line2;

          $count = 0;
          $total_count = 0;
          $P_count = 0;
          $total_P_count = 0;
          $NP_count = 0;
          $total_NP_count = 0;

          $nonplant_pep_seq = $nonplant_fields[4];
          $nonplant_pep_seq_length = length ($nonplant_pep_seq);

          $plant_pep_seq = $plantfields[7];
          $plant_pep_seq_length = length ($plant_pep_seq);

          DO: for ($i=5; $i<=24; $i++)
          {
               $NP_numbers = $nonplant_fields[$i];
               #if ($line2 =~ /([ACDEFGHIKLMNPQRSTVWY]\=\d)+/)

               if ($NP_numbers =~ /\d/)
               {
                    $nonplant_AA_count = $&;
               }

               for ($j=$i+3; $j<=27; $j++)
               {
                    if ($plantfields[$j] =~ /\d/)
                    {
                         $plant_AA_count = $&;
And it goes on for a while... it's a big script. Hopefully the snapshot can give you an idea of what I am doing wrong. Thanks.



Posted by admin (Graham Ellis), 24 November 2007
You need to rewind your second file each time you read a line from the first file, or store the second file in a list and keep traversing it from there.   I've put a sample up at

http://www.wellho.net/mouth/1442_Reading-a-file-multiple-times-file-pointers.html

to demonstrate the principle.
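
The two options described above (rewinding the second file, or holding it in a list) might look roughly like the sketch below. It is not the code from the linked page; the helper names `make_file`, `all_pairs_rewind` and `all_pairs_cached` are made up for illustration, and the temporary files hold only peptide sequences rather than full records.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write the given lines to a temp file, return its path.
sub make_file {
    my @lines = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "$_\n" for @lines;
    close $fh or die "close: $!";
    return $path;
}

# Option 1: rewind the inner file before every pass over it.
sub all_pairs_rewind {
    my ($outer_path, $inner_path) = @_;
    open my $outer, '<', $outer_path or die "Cannot open $outer_path: $!";
    open my $inner, '<', $inner_path or die "Cannot open $inner_path: $!";
    my @pairs;
    while (my $line1 = <$outer>) {
        chomp $line1;
        seek $inner, 0, 0 or die "seek: $!";   # back to the start of the inner file
        while (my $line2 = <$inner>) {
            chomp $line2;
            push @pairs, "$line1|$line2";
        }
    }
    close $inner;
    close $outer;
    return @pairs;
}

# Option 2: read the inner file into a list once and traverse the list.
sub all_pairs_cached {
    my ($outer_path, $inner_path) = @_;
    open my $inner, '<', $inner_path or die "Cannot open $inner_path: $!";
    chomp(my @inner_lines = <$inner>);
    close $inner;
    open my $outer, '<', $outer_path or die "Cannot open $outer_path: $!";
    my @pairs;
    while (my $line1 = <$outer>) {
        chomp $line1;
        push @pairs, "$line1|$_" for @inner_lines;
    }
    close $outer;
    return @pairs;
}

my $file1  = make_file('SIFLFKK', 'TSLFSFR');
my $file2  = make_file('SIFLFKK', 'DIYYRK', 'REHMVEMGLNA');
my @rewind = all_pairs_rewind($file1, $file2);
my @cached = all_pairs_cached($file1, $file2);
print scalar(@rewind), " pairs\n";   # 2 outer lines x 3 inner lines = 6 pairs
```

Holding the file in a list is usually the simpler choice when it fits comfortably in memory; rewinding keeps memory use flat at the cost of re-reading the file on every pass.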

Posted by deep (deep), 24 November 2007
Thanks a lot Graham!! Tried everything, just forgot to reopen the files. Thanks again.

Posted by KevinAD (KevinAD), 24 November 2007
You could also use Tie::File I would think, but I am not sure how efficient it would be. Maybe use seek() too.
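
For reference, a minimal sketch of the Tie::File idea: the file is presented as an array whose elements are fetched on demand, so it can be traversed repeatedly without rewinding by hand. The temporary file, the `@outer` list and the equality test are stand-ins for the real data and processing, and no claim is made here about how fast this is on large files.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);
use Tie::File;

# Stand-in for File2: one peptide sequence per line.
my ($tmp, $file2) = tempfile(UNLINK => 1);
print {$tmp} "$_\n" for qw(SIFLFKK DIYYRK REHMVEMGLNA);
close $tmp or die "close: $!";

# Each array element is one line of the file, with the newline removed.
tie my @file2_lines, 'Tie::File', $file2, mode => O_RDONLY
    or die "Cannot tie $file2: $!";

# Stand-in for the lines of File1.
my @outer = qw(SIFLFKK TSLFSFR);

my @matches;
for my $pep (@outer) {
    # The tied array can be walked again for every outer item.
    for my $i (0 .. $#file2_lines) {
        push @matches, "$pep found on line " . ($i + 1)
            if $file2_lines[$i] eq $pep;
    }
}
untie @file2_lines;

print "$_\n" for @matches;
```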

Posted by deep (deep), 26 November 2007
Thanks Kevin, I did try using "seek"; I think I was messing it up somewhere. I am wondering whether it would be possible to use $. to get the current line number instead of "tell". Each time I open the file, will $. be reset, or will it remember the last line read from the same file?



Posted by KevinAD (KevinAD), 26 November 2007
on 11/26/07 at 04:51:04, deep wrote:
Thanks Kevin, I did try using "seek"; I think I was messing it up somewhere. I am wondering whether it would be possible to use $. to get the current line number instead of "tell". Each time I open the file, will $. be reset, or will it remember the last line read from the same file?

I am pretty sure $. is reset when a file is opened (or maybe closed). You could try experimenting and see how it behaves.

Posted by admin (Graham Ellis), 26 November 2007
$. is the number of lines read so far on the file handle you most recently read from (hideously non-OO: it is a single global that follows the latest handle), whereas tell gives the number of bytes into a file, which is what you need for seek.
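
A small experiment along these lines shows the difference. It is only a sketch with a made-up three-line temporary file: `tell` supplies a byte offset that `seek` can return to, while `$.` just counts reads on the handle and is not wound back by `seek`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "alpha\nbeta\ngamma\n";
close $out or die "close: $!";

open my $in, '<', $path or die "Cannot open $path: $!";

my $first            = <$in>;
my $pos_after_first  = tell($in);   # byte offset just past the first line
my $second           = <$in>;
my $lineno_after_second = $.;       # two lines read so far on this handle

# Jump back to the remembered byte offset and read again.
seek $in, $pos_after_first, 0 or die "seek: $!";
my $reread            = <$in>;      # the second line once more
my $lineno_after_seek = $.;         # the counter keeps going; seek does not reset it

close $in;

print "re-read: $reread";
print "line counter after second read: $lineno_after_second\n";
print "line counter after seek and re-read: $lineno_after_seek\n";
```

So `$.` cannot stand in for `tell`: a line number says nothing about where that line starts in the file unless every line has the same length.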

Posted by deep (deep), 26 November 2007
Thanks Kevin and Graham for the suggestions.
I am having a bit of a problem with the script. I keep getting an "Out of memory" error. I think I can narrow it down to the way I am processing the file:

while ($line2 = <INFILE2>)
     {
           @plantfields = split /\s+/,$line2;
           $plant_prot_name = $plantfields[6];

           open INFILE1,"<$File1";
           while ($line1 = <INFILE1>)
           {
                 @nonplant_fields = split /\s+/,$line1;
                 push (@nonplant_pep_seq,$nonplant_fields[4]);


Can you suggest some way I can optimize this part? I am looking into using "tell" and "seek". Hopefully that might do the trick.
I have designed it this way because I am comparing and capturing many sets of data on the fly from both files. I am at the rapid prototyping stage, so at this moment I am not concentrating much on performance or optimization. I am just trying to get the logic right and see the results.

                 

Posted by admin (Graham Ellis), 27 November 2007
@nonplant_pep_seq seems to be cumulative, i.e. building up all the time, but isn't dependent on the first file in any way. It's the only memory hog I can see that you have, and I don't see the point of doing it lots of times ... unless you are adding other things in there in the code you're not showing us.

Basically, you're blowing up a balloon until it bursts. Better to start letting the air out into another file.
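
One way to "let the air out" is to write each result as soon as it is computed and keep every working variable lexical to the loop, so nothing grows between iterations. The sketch below assumes the field positions used earlier in the thread (protein name in field 6 and peptide in field 7 of File2, peptide in field 4 of File1); the sub name `compare_to_file`, the shortened sample rows and the match/differ comparison are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write one output record per (File2 line, File1 line) pair as it is
# produced, instead of pushing results onto an ever-growing array.
sub compare_to_file {
    my ($file1, $file2, $out_path) = @_;
    open my $fh1, '<', $file1    or die "Cannot open $file1: $!";
    open my $fh2, '<', $file2    or die "Cannot open $file2: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";
    my $written = 0;
    while (my $line2 = <$fh2>) {
        my @plant = split ' ', $line2;          # lexical: discarded each pass
        seek $fh1, 0, 0 or die "seek: $!";      # rewind File1 for this File2 line
        while (my $line1 = <$fh1>) {
            my @nonplant = split ' ', $line1;
            my $verdict  = $plant[7] eq $nonplant[4] ? 'match' : 'differ';
            print {$out} join("\t", $plant[6], $plant[7], $nonplant[4], $verdict), "\n";
            $written++;
        }
    }
    close $out or die "close: $!";
    close $fh2;
    close $fh1;
    return $written;
}

# Shortened sample rows (amino acid counts omitted).
my ($t1, $file1) = tempfile(UNLINK => 1);
print {$t1} "881.5372 221.3915 4 3 SIFLFKK\n856.4441 286.4886 3 2 TSLFSFR\n";
close $t1 or die "close: $!";

my ($t2, $file2) = tempfile(UNLINK => 1);
print {$t2} "7 8 881.5372 221.3915 4 3 XP_001418543.1 SIFLFKK\n";
print {$t2} "10 15 856.4441 286.4886 3 3 XP_001418056.1 DIYYRK\n";
close $t2 or die "close: $!";

my ($t3, $out_path) = tempfile(UNLINK => 1);
close $t3 or die "close: $!";

my $written = compare_to_file($file1, $file2, $out_path);
print "$written records written\n";   # 2 File2 lines x 2 File1 lines = 4
```

If an array really is needed within one pass, declaring it with `my` inside the outer loop gives a fresh, empty array on every iteration, which has the same effect as emptying it by hand.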

Posted by deep (deep), 27 November 2007
I see your point Graham, thanks. Later in the code all I am doing is manipulation of the data. I am emptying the array before the next read. Hopefully this will do the job.



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.
