reading multiple lines

Posted by deep (deep), 24 November 2007
Hello Folks,

I want to read one line at a time from File1, read all the lines from File2, do some processing and write the results to a file. Let me try using an example.

My File1 looks like:

    881.5372 221.3915 4 3 SIFLFKK A=0 C=0 D=0 E=0 F=2 G=0 H=0 I=1 K=2 L=1 M=0 N=0 P=0 Q=0 R=0 S=1 T=0 V=0 W=0 Y=0
    856.4441 286.4886 3 2 TSLFSFR A=0 C=0 D=0 E=0 F=2 G=0 H=0 I=0 K=0 L=1 M=0 N=0 P=0 Q=0 R=1 S=2 T=1 V=0 W=0 Y=0

File2 looks like:

    7 8 881.5372 221.3915 4 3 XP_001418543.1 SIFLFKK A=0 C=0 D=0 E=0 F=2 G=0 H=0 I=1 K=2 L=1 M=0 N=0 P=0 Q=0 R=0 S=1 T=0 V=0 W=0 Y=0
    10 15 856.4441 286.4886 3 3 XP_001418056.1 DIYYRK A=0 C=0 D=1 E=0 F=0 G=0 H=0 I=1 K=1 L=0 M=0 N=0 P=0 Q=0 R=1 S=0 T=0 V=0 W=0 Y=2
    272 282 1285.6006 322.4073 4 3 XP_001422002.1 REHMVEMGLNA A=1 C=0 D=0 E=2 F=0 G=1 H=1 I=0 K=0 L=1 M=2 N=1 P=0 Q=0 R=1 S=0 T=0 V=1 W=0 Y=0
    38 87 5277.4026 1760.1414 3 2 XP_001420476.1 YSAALVDTNGCYASQTLEVEVSWTCETSTNTAVAAAFIAFAAFCAYSFGR A=11 C=3 D=1 E=3 F=4 G=2 H=0 I=1 K=0 L=2 M=0 N=2 P=0 Q=1 R=1 S=5 T=6 V=4 W=1 Y=3
    2449 2462 1496.8094 375.2095 4 3 XP_001417092.1 TPQRPGAPVNVSFK A=1 C=0 D=0 E=0 F=1 G=1 H=0 I=0 K=1 L=0 M=0 N=1 P=3 Q=1 R=1 S=1 T=1 V=2 W=0 Y=0
    584 613 3135.4741 1568.7442 2 3 XP_001421161.1 AANMLSWAVNMAATKIGGPDDAHEPVDLQN A=6 C=0 D=3 E=1 F=0 G=2 H=1 I=1 K=1 L=2 M=2 N=3 P=2 Q=1 R=0 S=1 T=1 V=2 W=1 Y=0
    115 170 5985.1843 2993.5993 2 3 YP_636257.1 ELTWITGVIMAVCTVSFGVTGYSLPWDQVGYWAVKIVTGVPDAIPVVGPAIVELL A=4 C=1 D=2 E=2 F=1 G=6 H=0 I=5 K=1 L=4 M=1 N=0 P=4 Q=1 R=1 S=2 T=5 V=1 1 W=3 Y=2

File1 is smaller than File2. What I need to do is: read the first line from File1 and do the processing on ALL the lines in File2, then read the second line from File1 and again do the processing on all the lines in File2.

I have been able to take one line from File1 and one line from File2, do the processing, then read the second line from File1 and the second from File2 (in this case I made the two files have an equal number of lines). I have also been able to take the first line from File1 and process ALL the lines in File2.
However, I am just not able to proceed any further, that is, to keep reading the lines in File1 until EOF and process all the lines in File2 for each of them. Below is the code. This is WORKING CODE that works for one line. The code is huge as I am doing lots of processing. If anyone can point out where I am going wrong or suggest a few things, it would be a great help. Thanks guys.

    open INFILE1,"<$File1";
    open INFILE2,"<$File2";
    while ($line1 = <INFILE1>) {
        @nonplant_fields = split /\s+/,$line1;
        while ($line2 = <INFILE2>) {
            @plantfields = split /\s+/,$line2;
            $count =0; $total_count =0;
            $P_count=0; $total_P_count=0;
            $NP_count =0; $total_NP_count =0;
            $nonplant_pep_seq = $nonplant_fields[4];
            $nonplant_pep_seq_length = length ($nonplant_pep_seq);
            $plant_pep_seq = $plantfields[7];
            $plant_pep_seq_length = length ($plant_pep_seq);
            DO: for ($i=5; $i<=24;$i++) {
                $NP_numbers = $nonplant_fields[$i];
                #if ($line2 =~ /([ACDEFGHIKLMNPQRSTVWY]\=\d)+/)
                if ($NP_numbers =~/\d/) {
                    $nonplant_AA_count = $&;
                }
                for ($j=$i+3; $j<=27; $j++) {
                    if ($plantfields[$j] =~ /\d/) {
                        $plant_AA_count = $&;

and it goes on for a while... it's a big script. Hopefully the snapshot gives you an idea of what I am doing wrong. Thanks.

Posted by admin (Graham Ellis), 24 November 2007
You need to rewind your second file each time you read a line from the first file, or store the second file in a list and keep traversing it from there. I've put a sample up at http://www.wellho.net/mouth/1442_Reading-a-file-multiple-times-file-pointers.html to demonstrate the principle.

Posted by deep (deep), 24 November 2007
Thanks a lot Graham!! Tried everything, just forgot to reopen the files. Thanks again.

Posted by KevinAD (KevinAD), 24 November 2007
You could also use Tie::File I would think, but I am not sure how efficient it would be. Maybe use seek() too.

Posted by deep (deep), 26 November 2007
Thanks Kevin, I did try using "seek"; I think I was messing it up somewhere. I am wondering, would it be possible to use $. to get the current line number instead of "tell"? Each time I open the file, will $. be reset, or will it remember the last line read from the same file?

Posted by KevinAD (KevinAD), 26 November 2007
In reply to deep's post of 11/26/07 at 04:51:04:
I am pretty sure $. is reset when a file is opened (or maybe closed). You could try experimenting and see how it behaves.

Posted by admin (Graham Ellis), 26 November 2007
$. is the number of lines read since a file was last opened (hideously non-OO - latest file opening), whereas tell refers to the number of bytes into a file, which is what you need for seek.

Posted by deep (deep), 26 November 2007
Thanks Kevin, Graham for the suggestion(s). I am having a bit of a problem with the script. I keep getting an "out of Memory" error. I think I can narrow it down to the way I am processing the file:

    while ($line2 = <INFILE2>) {
        @plantfields = split /\s+/,$line2;
        $plant_prot_name = $plantfields[6];
        open INFILE1,"<$File1";
        while ($line1 = <INFILE1>) {
            @nonplant_fields = split /\s+/,$line1;
            push (@nonplant_pep_seq,$nonplant_fields[4]);

Can you suggest some way I can optimize this part? I am looking into using "tell" and "seek". Hopefully it might do the trick. I have designed it this way because I am comparing and capturing many sets of data on the fly from both files. I am at a stage of rapid prototyping, so at this moment I am not concentrating much on performance/optimization. Just trying to get the logic right and see the results.

Posted by admin (Graham Ellis), 27 November 2007
@nonplant_pep_seq seems to be cumulative - i.e. building up all the time - but isn't dependent on the first file in any way. It's the only memory hog I can see that you have, and I don't see the point of doing it lots of times .... unless you are adding other things in there in the code you're not showing us. Basically, you're blowing up a balloon until it bursts. Better to start letting the air out into another file.

Posted by deep (deep), 27 November 2007
I see your point Graham, thanks. Later in the code all I am doing is manipulation of the data. I am emptying the array before the next read. Hopefully this will do the job.

This page is a thread posted to the opentalk forum
at www.opentalk.org.uk and
archived here for reference.
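
For readers of the archive, here is a minimal sketch of the "rewind the second file" approach suggested in the thread. It is not the sample from the linked page. The file contents and the names @pairs and @line_numbers are made up for illustration, and the real per-line processing is replaced by simply recording which lines were paired. As far as the perlvar documentation describes it, seek does not reset $. (an explicit close does), so the sketch resets the counter by hand after each rewind.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build two small throwaway files so the sketch runs on its own.
# The contents are hypothetical stand-ins, not the peptide data from the thread.
my ($fh1, $file1) = tempfile(UNLINK => 1);
my ($fh2, $file2) = tempfile(UNLINK => 1);
print {$fh1} "a1\na2\n";
print {$fh2} "b1\nb2\nb3\n";
close $fh1;
close $fh2;

open my $in1, '<', $file1 or die "Cannot open $file1: $!";
open my $in2, '<', $file2 or die "Cannot open $file2: $!";

my @pairs;         # every (File1 line, File2 line) combination seen
my @line_numbers;  # value of $. after each read from the second file

while (my $line1 = <$in1>) {
    chomp $line1;

    # Rewind the second file to byte 0 before each pass over it.
    seek($in2, 0, 0) or die "Cannot rewind $file2: $!";
    # seek leaves the line counter alone, so reset it by hand.
    # After the seek call, $. refers to the counter of $in2.
    $. = 0;

    while (my $line2 = <$in2>) {
        chomp $line2;
        push @pairs, "$line1-$line2";   # real processing would go here
        push @line_numbers, $.;
    }
}
close $in1;
close $in2;

print "@pairs\n";         # a1-b1 a1-b2 a1-b3 a2-b1 a2-b2 a2-b3
print "@line_numbers\n";  # 1 2 3 1 2 3
```

If the second file fits comfortably in memory, the other option mentioned in the thread, reading it once into an array and looping over that array for each line of the first file, avoids the repeated disk reads altogether.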