| ||||||||||||||||
| ||||||||||||||||
counts ans summary the duplicates Posted by yamaha102 (yamaha102), 20 January 2008 Hi,I would like to ask help from you all. I have a csv as below INfile.txt _____________ Date,Product,Price 20070301,Car,A 20070415,Car,A 20070415,Car,B 20070416,Car,B 20070422,Boat,B 20071201,Boat,B 20071204,Car,A 20080101,Car,A 20080102,Boat,B 20080103,Boat,B 20080104,Car,A ________________ and i wish to get the output as following format: Outfile.txt ________________ 2007,Car,A -3 2007,Car,B -2 2007,Boat,B -2 2008,Car,A -2 2008,Boat,B -2 Please guide. Thanks. Posted by KevinAD (KevinAD), 20 January 2008 Well, what have you tried so far?Posted by yamaha102 (yamaha102), 21 January 2008 thanks for replying and guiding. at the moment i could able to produce such result: CarB 2 CarA 5 BoatB 4 Please cntinue to guide me. thanks $in_file = "INfile.txt"; $out_file = "OutFile.txt"; open (IN, "$in_file") or die "Can't open $in_file: $!\n"; open (OUT, ">$out_file") or die "Can't open $out_file: $!\n"; $header = <IN>; # skip over first line which is a header while ( $line = <IN> ) { chomp($line); ($Date, $Product, $Price) = split(',',$line); $Date= substr($Date,0); $Product= substr($Product,0); $Price=substr($Price,0); ($hostname,$hostname1,@rest)=($Product,$Price,); $counter{$hostname,$hostname1}++; } foreach $host (keys %counter) { printf OUT "%16s %4d\n",$host,$counter{$host}; } close IN; close OUT; Posted by admin (Graham Ellis), 21 January 2008 Quote:
That's a bit odd - the Substrs do nothing, and the list assignment just copies variables. Try giving the substr on $Date and extra parameter to tell it you want just four characters, then make up the compposite key in quotes and you should be close: ($Date, $Product, $Price) = split(',',$line); $Date= substr($Date,0,4); $counter{"$Date,$hostname,$hostname1"}++; Posted by yamaha102 (yamaha102), 21 January 2008 on 01/21/08 at 05:57:44, Graham Ellis wrote:
ok. i tried it..but it give me a silly character " "after the date and after the product as below 200801CarA 5 Thanks How to eliminate it? by the way do you have better and easier solution to solve the question? Posted by yamaha102 (yamaha102), 21 January 2008 besides,i have no idea on how to rearrange the above output file to become the final output as below Date,Car A, Car B, Boat B 2007,3,2,2 2008,2,0,2 Please guide me. Posted by KevinAD (KevinAD), 21 January 2008 yamaha102, Is this some type of school work? Graham, does this site have a policy about students posting school/class assignments? Posted by yamaha102 (yamaha102), 21 January 2008 on 01/21/08 at 08:28:11, KevinAD wrote:
Hi, it's no a school work. I just trying to extract data from a file and then display into a desired format. I think most of the newbie learning programming by example. It does apply to myself too. please continue to guide me on above questions. thanks Posted by admin (Graham Ellis), 21 January 2008 on 01/21/08 at 08:28:11, KevinAD wrote:
We are told that this isn't a class assigment ... and from reading the earlier posts I had alreday wondered about it, and decided it probably wasn't. But nice to have the confirmation. So the following comments do NOT apply to the current question. If a student posts up a complete assignment and says "I have to do this - will you write it for me" then the answer should be no. The student learns nothing from having the work done for them except to take advantage of other people, and will find the next assignment even harder. And it's probably that the question posted here should be posted anyway because it's someone else's copyright. Where students attempt a question and need help with some elements, then I'm very much more inclined to help and give them pointers, but still very much aware their objective is to learn how to write a program rather than to end up with a complete program for production use. I tend to suggest that their first source of assistance should be their tutor or other college support mechanism, as we're liable on this board to suggest a practical answer, but one that goes ahead and uses techniques and structures they haven't yet learned. Posted by yamaha102 (yamaha102), 21 January 2008 Hi Graham,Thanks for your input. i appreciate your intention. I am alone in perl programming,no mentor, nor guider... that's why i use forumers as my mentors. I wish to use perl to extract some useful format of my raw data for the analysis purpose. I do have a question... what for that we have the CPAN module out there? Don't it mean that we are taking advantage of other people by using their module? Sorry and thanks. Posted by KevinAD (KevinAD), 21 January 2008 on 01/21/08 at 12:00:02, Graham Ellis wrote:
I agree with all you said. Some forums just do not allow any student posting at all. I did not think that would be your policy and now I know it's not. Posted by admin (Graham Ellis), 21 January 2008 I am getting very confused as to how the text suddenly changed into numbers; I think you have changed something else other than those items we have suggested. As we have established that you have a requirement to get the code working rather than to learn as part of an assignment, I have put together a sample and tested it. Code:
Posted by yamaha102 (yamaha102), 22 January 2008 thanks to be patient with me. well.. my goal is to produce afinalOutput.txt (which is jz an different view of orientation from the outputfile.txt) (jz assume that there is no comma between product and price car, A===> car A) for the text suddenly changed into numbers because ---> look at the output file which carA in 2007 is 3 and so on Date,Car A, Car B, Boat B 2007,3,2,2 2008,2,0,2 I donno whether my approach is correct or not(from raw ->output ->finalOutput) The second post from me is the code to produce the output.txt. The Substrs I used to eliminate some of the string for my usage purpose. with ur guideline by adding $counter{"$Date,$hostname,$hostname1"}++; i managed to produce the output.txt basically the problem i am facing is \how am i going to "rotate - grouping the first column and put it at the left side and put the second column on the top.." the output to finalOutput Thanks. Posted by yamaha102 (yamaha102), 22 January 2008 it's similar to Data::Xtab http://search.cpan.org/dist/Data-Xtab/Xtab.pm but for the module i have the problem to get the desired output: (suppose i could get the output as below..) JAN FEB MAR APR MAY JUN A 4 0 0 0 0 107 B 3 39 0 0 0 0 C 0 0 0 0 8 0 my output: (missing A B C on the most left) JAN FEB MAR APR MAY JUN 4 0 0 0 0 107 3 39 0 0 0 0 0 0 0 0 8 0 Posted by admin (Graham Ellis), 22 January 2008 This looks like a change in specification. I'm sorry, but I don't understand how "just rotating the data" suddenly brings months in as well ... and it is very frustrating for those of us who are trying to help you to solve the original question as posted just to be told that the goalposts have moved.At the risk of writing what sounds very obvious, can I suggest that you work out a specification (description) of what you need to do as a whole and give an overview of that if you need any further help. By all means ask questions to help you step along the way as well, but - PLEASE - the full picture as well. It may turn out that the steps along the way can be missed out, and knowing the whole picture will certainly help avoid us (the people trying to help) feeling that out time is being wasted. Posted by yamaha102 (yamaha102), 22 January 2008 I am sorry.I will try to do my best in my explanation from now onwards Basically there are two goal in the task... 1st ->raw.txt change to output.txt (This one Completed... thanks for guiding) 2nd ->output.txt to finaloutput.txt (This is the problem now) for the 2nd, i try to use DATA::Xtab to solve it, but i face some difficulties. the output is different than desired . raw.txt Date,Product, Price 20070101,Car, A 20070102, Car, A 20080102, Car, B 20080103, Boat, A 20080104, Boat, B 20080103, Boat, B output.txt 2007,Car, A, 2 2008,Car, B, 1 2008,Boat, A, 1 2008,Boat, B, 2 finaloutput.txt BLANK,Car,Boat 2007,2,0 2008,1,3 TOTAL,3,3 thank you Posted by KevinAD (KevinAD), 22 January 2008 This is a bit contrived to work with your data. The hope is that it helps you in some way, not that it really solves your problem. I added an extra product to the data file, Plane, just so you can sort of get a better feel for how it works.Code:
Note that this approach requires you know all the product names beforehand. That can be changed though to make the @product list dynamic. The input file (data.txt): Date,Product,Price 20070301,Car,A 20070415,Car,A 20070415,Car,B 20070416,Car,B 20070422,Boat,B 20091228,Plane,D 20071201,Boat,B 20071204,Car,A 20080101,Car,A 20080102,Boat,B 20080103,Boat,B 20080104,Car,A Posted by KevinAD (KevinAD), 22 January 2008 When learning data structures like I used, it's very helpful to use the Data:![]() Like in my code, originally the quantities were not printing. That was because the @products array used all lower-case words, but the data file had upper-case characters. When I used Data: ![]() Posted by yamaha102 (yamaha102), 23 January 2008 Thanks.From your code, I somehow learnt on the map and unpack terms. Thank you. With my limitation of knowledge about perl, I tried to modify your code in order to sum all the columns and the ouput file as below: Year,car,plane,boat,TOTAL 2007,5,0,2,0 2008,2,0,2,0 2009,0,1,0,0 ALL,7,1,4,0,12 However the silly code i added to your code was not practical enough as there's alot of "variable" like $count4 += @Raw[3]... I have a few questions: 1. Mind to suggest to me how to simply the steps, please? 2. While i was trying to play around with your code, i do have the difficulty to sum up all the number for row. If you have the time, please suggest to me which step i need to take to counter the problem. 3. Is it more difficult for me to directly convert the output.txt to finaloutput.txt compare to the raw.txt --> finaloutput? use warnings; my %data = (); my @products = qw(car plane boat TOTAL); $products="@products\n"; print $products; open(IN, 'c:/perl/data.txt') or die "$!"; readline IN; #skip header while(<IN>){ chomp; my ($date, $prod, $price) = split(/,/); my ($y, $m, $d) = unpack("a4a2a2",$date); $data{$y}{lc $prod}++; $counter{"$y"}++; } close IN; print join(',','Year',@products), "\n"; foreach my $key (sort keys %data) { print join(',',$key, map{ $data{$key}{lc($_)}||'0'} @products),"\n"; @Raw=map{ $data{$key}{lc($_)}||'0'} @products; #print "@Raw[2]\n"; $count1 += @Raw[0]; $count2 += @Raw[1]; $count3 += @Raw[2]; $count4 += @Raw[3]; } $countAll =($count1 + $count2 + $count3 + $count4); print "ALL,$count1,$count2,$count3,$count4,$countAll\n,"; Thank you. Posted by KevinAD (KevinAD), 23 January 2008 I think what you may need to doin order to better undestand my code is read some tutorials on perls complex data capabilities. The first three tutorial on this page cover references (complex data):http://perldoc.perl.org/index-tutorials.html the seond one is the most indepth: perldsc - data structure complex data structure struct Your attempt to add the total for each row shows a complete lack of understanding of what my code is doing so try and read the tutorials I linked you to. Complex data is a little confusing at first, but once you learn how to use references a whole new world of possibilities opens up. Here is some revised code: use strict; use warnings; use Data: ![]() my %data = (); my @products = qw(car plane boat); open(IN, 'c:/perl_test/data.txt') or die "$!"; readline IN; #skip header while(<IN>){ chomp; my ($date, $prod, $price) = split(/,/); my ($y, $m, $d) = unpack("a4a2a2",$date); $data{$y}{lc $prod}++; } close IN; print join(',','year',@products,'TOTAL'), "\n"; foreach my $key (sort keys %data) { print $key; my $total = 0; foreach my $p (@products) { print ',', $data{$key}{lc($p)} || '0'; $total += $data{$key}{lc($p)} || 0; } print ",$total\n"; } Of course this is not the only way to go about doing this. 10 people will write this 10 differnt ways, using various methods to arrive at the same output. I have a few questions: 1. Mind to suggest to me how to simply the steps, please? I don't understand what you are asking. 2. While i was trying to play around with your code, i do have the difficulty to sum up all the number for row. If you have the time, please suggest to me which step i need to take to counter the problem. See the code I posted for one possible solution 3. Is it more difficult for me to directly convert the output.txt to finaloutput.txt compare to the raw.txt --> finaloutput? You do not need any intermediary output file. From one data set you can produce output.txt and finaloutput.txt. Posted by yamaha102 (yamaha102), 24 January 2008 Thanks.I will try to digest it. Thank you. Posted by yamaha102 (yamaha102), 5 March 2008 Hi, sorry to trouble you guys again.i got 2 files as below: companyA.txt _____________ Date,Product,Price 20060301,Car,A 20070415,Car,A 20070415,Car,B 20070416,Plane,B 20080422,Boat,B companyB.txt _____________ Date,Product,Price 20060301,Car,A 20070415,Car,A 20070415,Car,B _____________ use strict; use warnings; my %data = (); my %data=(); my @products = qw(car plane boat); open(IN, 'c:/perl_test/campanyA.txt') or die "$!"; readline IN; #skip header while(<IN>){ chomp; my ($date, $prod, $price) = split(/,/); my ($y, $m, $d) = unpack("a4a2a2",$date); $data{$y}{lc $prod}++; } close IN; print join(',','year','TOTALA'), "\n"; foreach my $key (sort keys %data) { print $key; my $total = 0; foreach my $p (@products) { $total += $data{$key}{lc($p)} || 0; } print ",$total\n"; } open(IN, 'c:/perl_test/companyB.txt') or die "$!"; readline IN; #skip header while(<IN>){ chomp; my ($date, $prod, $price) = split(/,/); my ($y, $m, $d) = unpack("a4a2a2",$date); $data{$y}{lc $prod}++; } close IN; print join(',','year','TOTALB'), "\n"; foreach my $key (sort keys %data1) { print $key; my $total = 0; foreach my $p (@products) { $total += $data1{$key}{lc($p)} || 0; } print ",$total\n"; } __Output1__ Year,TOTALA 2006,1 2007,3 2008,1 Year,TOTALB 2006,1 2007,2 ______________ emm... Actually i wish to get another output file from the above outputs (comparing the outputs -->TOTALA -TOTALB) where i could get __OUTPUT2___ Year,TOTAL_NEW 2006,0 2007,1 2008,1 please guide me on how to process the desire output(OUTPUT2) without process the OUTPUT1. Thanks. This page is a thread posted to the opentalk forum
at www.opentalk.org.uk and
archived here for reference. To jump to the archive index please
follow this link.
|
| |||||||||||||||
PH: 01144 1225 708225 • FAX: 01144 1225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho |