in reply to concatenation of lines from two different files

I also noticed that you are reading file2 every time you need to look up an aircraft ID. This could take a lot of time if your files are very large. Instead, read all of the data from file2 into a hash once, and look up the aircraft IDs as keys in that hash. That way each file is read only one time through. The code below is an example of how you can do this.

#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use Cwd;

# display current working directory
my $Current_Dir = getcwd;
print STDOUT "the current directory is $Current_Dir\n";

# input files to work on
my $file_1  = "$ARGV[0]";
my $file_2  = "$ARGV[1]";
my $outfile = "outfile_$file_1";

# open files
open INFILE_1, '<', $file_1  or die "Can't open $file_1 : $!";
open INFILE_2, '<', $file_2  or die "Can't open ${file_2} : $!";
open OUTFILE,  '>', $outfile or die "Can't open $outfile : $!";

# print the title of the columns
my $titles_line = <INFILE_1>;
print OUTFILE $titles_line;

# create hash of aircraft specific data from file 2
my %aircraftID;
while (my $line_2 = <INFILE_2>) {
    chomp($line_2);
    my @Elements_2    = split ';', $line_2;
    my $aircraft_id_2 = shift @Elements_2;
    $aircraftID{$aircraft_id_2} = \@Elements_2;
}
close INFILE_2;

# read each line of file1 and look for aircraft
while (my $line_1 = <INFILE_1>) {
    chomp($line_1);
    my @Elements_1    = split ';', $line_1;
    my $aircraft_id_1 = $Elements_1[1];
    my $length_1      = @Elements_1;

    # monitor the process
    print STDOUT "the length is $length_1\n";
    print STDOUT "The Table is @Elements_1\n";

    # print the current line into the output file
    print OUTFILE "$line_1";

    # if the line contains an aircraft ID, search for its data in file 2
    if ($aircraft_id_1 && exists $aircraftID{$aircraft_id_1}) {
        print OUTFILE join(';', @{$aircraftID{$aircraft_id_1}}), "\n";
    }
    else {
        print OUTFILE (";" x (40 - $length_1)), "\n";
    }
}
close INFILE_1;
close OUTFILE;

Re^2: concatenation of lines from two different files
by steph_bow (Pilgrim) on Aug 20, 2007 at 16:12 UTC

    Thanks a lot, dogz007.

    I have understood what you explained, and I think your code is very good. Thanks.

    However, I have a problem.

    The result of running the code on the files posted at the beginning of the thread is this:

    Slot_time;Aircraft_Id;EOBT;ETOT;CTOT;ATOT;ETO;CTO;ATO;Last_DLA;EOBT_DLA;Anticip_DLA_min;Flag_EOBT_DLA;Last_FPL;EOBT_FPL;Anticip_FPL_min;Flag_EOBT_FPL;ETOT_First;Delta_ETOTs_min;ATFM_Delay;;;;;;;;
    08:40:00;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:40:58;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:41:56;BMA2CW;08:00;08:20;08:33;08:33;08:31;;08:41;07:52:00;08:00;-8;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:42:55;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:43:53;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:44:51;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:45:50;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:46:48;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:47:46;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:48:45;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:49:43;DAL11;08:10;08:30;08:30;08:39;08:47;08:47;08:50;;;;;05:03:00;08:10;-187;;08:30;0;0;;;;;;;;
    DAL12;DAL;B772;KATL;EGKK;22:05;1325;05:39;339
    08:50:41;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:51:40;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:52:38;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;;
    08:53:36;ACA879;07:30;07:42;07:42;07:52;08:51;08:51;08:53;;;;;05:06:00;07:30;-144;;07:42;0;0;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    I cannot understand why the added data are not on the same line as the data from the first file.

    DAL12 should have been on the same line as DAL11, and I do not see why it is not, because the code calls chomp on every line.

    Well, I am going to make a separate post on this particular point.

    Thanks a lot for your explanations
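
One possible explanation for the line-break question above, offered as an assumption and not something confirmed in this thread: if the input files have Windows-style CRLF line endings and are read on a system where the line separator is a bare "\n", chomp removes only the "\n" and leaves a "\r" at the end of each line. The minimal sketch below uses a made-up sample line and a hypothetical helper, strip_eol, to show the difference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove any trailing CR and/or LF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

# Made-up sample line with a Windows-style CRLF ending (an assumption about the data).
my $raw = "08:40:00;;;\r\n";

# chomp removes only the input record separator, "\n" by default.
my $chomped = $raw;
chomp $chomped;

# split drops trailing empty fields, but a leftover "\r" is not empty.
my @after_chomp = split /;/, $chomped;
my @after_strip = split /;/, strip_eol($raw);

printf "after chomp:     %d field(s), ends with CR: %s\n",
    scalar @after_chomp, ($chomped =~ /\r\z/ ? 'yes' : 'no');
printf "after strip_eol: %d field(s)\n", scalar @after_strip;
```

With the sample line, chomp leaves four fields because the final field is the stray "\r", while strip_eol leaves one. That pattern would be consistent with the output posted above, where even the empty rows receive only 12 padding semicolons (40 minus 28 fields). Checking the input files with a hex viewer would confirm or rule this out.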