I think you're trying to merge two files that have corresponding records. I've seen this problem several times with legacy systems where different data come from different sources. You have to be sure that all the records line up (e.g. there's not an addition or deletion in one list), but let's assume that's true for now.
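Before merging, a quick sanity check on the record counts can catch the simplest kind of misalignment. This is only a minimal sketch: count_lines is a hypothetical helper name, and in-memory filehandles stand in for the real files. Matching counts do not prove the records correspond (an addition and a deletion would cancel out), and a header line in one file has to be accounted for separately.

```perl
use strict;
use warnings;

# Count the lines available from an open filehandle.
# count_lines is a hypothetical helper name for this sketch.
sub count_lines {
    my( $fh ) = @_;
    my $count = 0;
    $count++ while defined readline( $fh );
    return $count;
}

# In-memory filehandles stand in for the two real files (assumed data)
my $first  = "a 1\nb 2\nc 3\n";
my $second = "x\ny\nz\n";
open my $first_fh,  '<', \$first  or die "Could not open first: $!";
open my $second_fh, '<', \$second or die "Could not open second: $!";

my( $n1, $n2 ) = ( count_lines( $first_fh ), count_lines( $second_fh ) );
print $n1 == $n2 ? "Same number of records\n" : "Mismatch: $n1 vs $n2\n";
```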

This is a simple task if you're used to dealing with line-oriented files (and not everything in the universe is). You read one line from each file, remove the line ending, concatenate the two lines, and output the result to a third file (although I use standard output in this case):

#!/usr/bin/perl
use strict; # a programming aid to keep us honest

# open each file
open my $W, '<', 'W.txt' or die "Could not open W.txt: $!";
open my $Rs, '<', 'Rs.txt' or die "Could not open Rs.txt: $!";

# read the header of W.txt and ignore it
# this syncs the positions in the file
readline( $W );

while( 1 ) { # keep going until something else stops us
    # read a line for each file
    my $W_line = readline( $W );
    my $Rs_line = readline( $Rs );

    # stop if we ran out of lines from one of the files
    last unless( defined $W_line and defined $Rs_line );

    # remove the line ending from the W line
    # leave the line ending on the Rs line because we'll use it
    chomp( $W_line );

    # output the combined line with a space between them
    print $W_line, ' ', $Rs_line;
}

I've added plenty of code comments here. When I'm working with something I'm not sure about, I often sketch out the process I want in comments then fill in the code to perform those bits. This is roughly the process you might take if you were doing this by hand. Remember, programming is automating the boring tasks that would take us a long time to do, so the steps are often the same. Indeed, sometimes I do things by hand first to figure out the problems in the process.

But, the real trick in programming is knowing when you don't need to program at all. You want to merge two files. There's a program for that:

% paste W.txt Rs.txt

There's a problem with that header line in W.txt. The easiest thing might be to simply copy the file and remove that one line. If you don't have to do this again, little manual interventions can save you quite a bit of work:

% paste W-noheader.txt Rs.txt

Alternatively, you could add a dummy line to Rs.txt so it has a header too. You might be able to get the source of that data to add that. It would be much nicer to have the column headings for the new values. Another programming trick is the application of beer. It lubricates many problems.

If you aren't on a machine that has paste (I'm not looking at you, Windows, but seriously I am), there's a fabulous project called the Perl Power Tools that recreates the Unix tools in Perl. That means you can use them anywhere that you have a perl, and you can also look at the source to see how they do it. You can take a tool that is close to what you want and modify it slightly for your local purpose. There's nothing special about Perl here. If you find something close in any language, go with that. The trick is to get work done.

But let's assume that you can neither hand edit the file to remove the header (perhaps because this has to be repeatable) nor change the source to add a header. You need to sync the files starting at different lines. I thought that paste should handle this, but no version I've found does. I also thought a tricky application of tail or head might do it. Maybe a better Unix guru can provide a command line.

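And a Unix guru has provided such a command line using process substitution, which runs each tail as a sub-process and hands its output to paste as if it were a file. It needs a shell that supports the <( ) syntax, such as bash or zsh. This is from Rhombold on Reddit: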

To paste file1 with the contents of file2 minus the first line you can do:

$ paste file1 <(tail -n +2 file2) >output

You can generalize this to any number of inputs:

$ paste <(tail -n +10 file1) <(tail -n +3 file2) <(tail -n +7 file3) >output

I've already given you the answer to get your task done, so now I'm going to go off into the wilds on this problem. I want an improved paste that lets me specify the starting line for each file. First, I need to know how I want to specify that. paste can work with two or more files, so I want to be able to do that too. I need to be able to specify starting line numbers for each file. I could do something like this, where I have a list of starting line numbers in the same order as the files I specify. The comma isn't an argument separator in this case:

% epaste -l 1,2,3 file1 file2 file3

I don't like that because the line numbers are separated from the files; it seems dirty to me. I'd rather keep them together. If I have to build up this command line from another program, I don't want to track the line numbers and wait until the end of each input to know how to output the command. Instead, I'll do something that also feels slightly dirty by allowing the filename to end in "=N" to specify the starting line:

% epaste file1=1 file2=37 file3

This has a problem for files with an = in the name, but life is tough.

Looking at the source for the Perl Power Tools version of paste, I see there's really only one place I need to change. As it opens the files, I need to "fast forward" the files to the right starting line. The current code has this:

for $i (0..$#ARGV) {
    $fh[$i] = "F$i";
    open($fh[$i], $ARGV[$i]) or die "$0: cannot open $ARGV[$i]";
}

But I need to change it to parse the filenames to look for starting line numbers then move to that line number.

for $i (0..$#ARGV) {
    $fh[$i] = "F$i";
    my( $name, $line ) = $ARGV[$i] =~ /(.*?) (?: = ([0-9]+) )? \z/x;
    open($fh[$i], $name) or die "$0: cannot open $name";
    if( defined $line ) {
        tell( $fh[$i] );
        readline( $fh[$i] ) while $. < $line - 1
    }
}

There are a few interesting things to note here. In the line to get the file name, I have this match:

$ARGV[$i] =~ /(.*?) (?: = ([0-9]+) )? \z/x;

I have a non-greedy match for any character except a newline, (.*?), followed by an optional portion that looks for an equal sign followed by a series of decimal digits, (?: = ([0-9]+) )?, but only at the end, \z. The /x lets me spread that out by making whitespace in the pattern insignificant.
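To see what that pattern does with different arguments, here's a small sketch that wraps the same match in a subroutine and runs it over a few sample names. The name parse_arg and the sample arguments are made up for illustration; they aren't part of paste. Notice that a name containing = is only misread when it ends in = followed by digits.

```perl
use strict;
use warnings;

# Split an argument into a filename and an optional starting line
# number, using the same pattern as the modified paste loop.
# parse_arg is a hypothetical helper name for this sketch.
sub parse_arg {
    my( $arg ) = @_;
    my( $name, $line ) = $arg =~ /(.*?) (?: = ([0-9]+) )? \z/x;
    return ( $name, $line );    # $line is undef when there is no =N
}

# Sample arguments (assumed, for illustration only)
for my $arg ( qw( W.txt=2 Rs.txt a=b.txt=10 -=3 ) ) {
    my( $name, $line ) = parse_arg( $arg );
    printf "%-12s name=%-8s line=%s\n", $arg, $name, $line // '(none)';
}
```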

If I match something, $line has a value. If I don't, $line is undef. I only need to fast forward if there's something in $line. I use defined to check for that.

if( defined $line ) { ... }

Inside that if, I need to stop at the right line number. If I want to start at line 37, I need to read and discard 36 lines. That's one less than the number I specified.

To do this, I can look at $., the current line number of the most recently read filehandle (documented in perlvar). Note that "most recently read". I haven't read from the file I'm working with yet, but I can use tell to point $. at my just-opened filehandle without reading any data:

tell( $fh[$i] ); readline( $fh[$i] ) while $. < $line - 1
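Here's that fast-forward step as a standalone sketch that can be run without any files on disk. The name skip_to_line is hypothetical, and an in-memory filehandle stands in for a real file. I've also added an eof check that the loop above doesn't have, on the assumption that a file shorter than the requested starting line shouldn't keep the loop spinning.

```perl
use strict;
use warnings;

# Discard lines from $fh so that the next read returns line $start.
# skip_to_line is a hypothetical helper name for this sketch.
sub skip_to_line {
    my( $fh, $start ) = @_;
    tell( $fh );    # make $. refer to this filehandle's line counter
    # the eof check is an addition: it stops the loop on a short file
    readline( $fh ) while $. < $start - 1 and ! eof( $fh );
    return;
}

# An in-memory filehandle stands in for a real file (assumed data)
my $data = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$data or die "Could not open in-memory file: $!";

skip_to_line( $fh, 3 );
print scalar readline( $fh );    # the third line of the data
```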

James at blogs.perl.org comments that I could have used the filehandle as an object and avoid the special variable altogether:

... $fh->input_line_number < $line - 1

That's my Perl 4 showing through. Note that you might have to include use FileHandle in code for v5.12 and earlier since this wasn't added as a default until v5.14.
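A minimal sketch of that object-style version might look like this. The name skip_to_line_oo is hypothetical, the in-memory filehandle is a stand-in for a real file, and the eof check is my own addition to stop on files that are too short. I load IO::Handle explicitly so the method calls also work on a perl older than v5.14.

```perl
use strict;
use warnings;
use IO::Handle;    # provides input_line_number and eof as methods

# Discard lines from $fh so that the next read returns line $start,
# asking the handle for its own line count instead of using $.
# skip_to_line_oo is a hypothetical helper name for this sketch.
sub skip_to_line_oo {
    my( $fh, $start ) = @_;
    readline( $fh )
        while $fh->input_line_number() < $start - 1 and ! $fh->eof();
    return;
}

# An in-memory filehandle stands in for a real file (assumed data)
my $data = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$data or die "Could not open in-memory file: $!";

skip_to_line_oo( $fh, 4 );
print scalar readline( $fh );    # the fourth line of the data
```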

And that's it, almost. I don't look at the rest of the program and the tricky things it's doing to handle the other features of paste, such as changing the separator.

To keep supporting the special filename - as a name for standard input, I need to adjust the options processing a bit so it doesn't think = is an option (which I don't show here):

% epaste -=3 W.txt

I wish I could specify - with a starting line number multiple times, but those all step on each other because they read from the same stream. I can, however, specify the same file multiple times with different starting lines (if your filesystem allows simultaneous file reads):

% epaste animals.txt=2 animals.txt=6 animals.txt=4
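The reason this works for a named file is that each open gets its own filehandle with its own read position. Here's a rough sketch of that idea, using a temporary file with made-up contents in place of animals.txt and the same starting lines as the command above:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway six-line file to stand in for animals.txt
my( $out, $filename ) = tempfile( UNLINK => 1 );
print {$out} "line $_\n" for 1 .. 6;
close $out or die "Could not close $filename: $!";

# Open the same file three times; each handle keeps its own position
my @fh;
for my $start ( 2, 6, 4 ) {
    open my $fh, '<', $filename or die "Could not open $filename: $!";
    readline( $fh ) for 1 .. $start - 1;    # discard the earlier lines
    push @fh, $fh;
}

# Read one line from each handle and join them as paste would
my @row = map { my $l = readline( $_ ); chomp $l; $l } @fh;
print join( "\t", @row ), "\n";
```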

This means your solution boils down to:

% epaste W.txt=2 Rs.txt

I've made an epaste gist for those who want the file or have corrections to fix mistakes that I've made.

And, that's the final programming trick of the day: Get someone else to write the program. :)