[kwlug-disc] BASH compare items in two files

John Van Ostrand john at netdirect.ca
Thu Nov 4 09:54:53 EDT 2010


----- Original Message -----
> From the top of my head,
> 
> 1. Clean up ID file, by removing all non "IDnumber".
> sed -e '/^#/d' -e 's/^/id="/' -e 's/$/"/'
> So, ID file will contain
> id="1"
> id="12"
> id="1930272"
> 
> 2. Clean up XML file, by reformatting so that
> <foo ... id="IDnumber" ...>
> is on the same line, and only one per line. Say, something like
> tr -d '\n' | sed -e 's/<foo[^>]*>/\n&\n/g'
> So, XML file will contain
> <foo ... id="1" ... >
> <foo ... id="12" ... >
> <foo ... id="1930272" ... >
> 
> 3. Do grep.
> fgrep -f ID_file XML_file
> will give you all the "foo" nodes with IDs listed in the ID file.
> fgrep -v -f ID_file XML_file
> will give you the inverse, or subtract it from the total line count.
> 
> 4. Rinse, and repeat for the "bar" and "baz" nodes.

I know it's from the top of your head, so don't take offence at the nitpicks:

You should change step 1 by adding a space in front of the "id=" so that it won't match things like "sid=".
You should change step 1 by removing any leading and trailing whitespace: sed -r -e '/^#/d' -e 's/^\s*/id="/' -e 's/\s*$/"/'
You should change step 2 by adding a space after "<foo" so it doesn't match things like "<foobar"
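
Putting the three changes together, a rough sketch might look like the script below. The file names and sample data are made up for illustration, and it assumes GNU sed and grep. Two things in it are my own additions rather than part of the original steps: the newlines are replaced with spaces instead of deleted (so an attribute on a continuation line keeps its separator), and a final grep keeps only the lines that start with "<foo " so leftover text between nodes is never searched.

```shell
#!/usr/bin/env bash
# Sketch only. File names and sample data are hypothetical;
# "foo" is the example node name from the thread. Assumes GNU sed and grep.

work=$(mktemp -d)

# Hypothetical inputs: an ID list with a comment and stray whitespace,
# and an XML fragment with a look-alike attribute (sid) and node (foobar).
printf '%s\n' '# comment' '  1  ' '12' '1930272' > "$work/ids.txt"
printf '%s\n' '<root><foo a="x" id="1"' 'b="y"><foo sid="12" id="99">' \
    '<foobar id="12"><foo id="12"></root>' > "$work/data.xml"

# Step 1: drop comment lines, trim whitespace, and write ' id="N"' with a
# leading space so that sid="12" does not match.
sed -E -e '/^#/d' -e 's/^[[:space:]]*/ id="/' -e 's/[[:space:]]*$/"/' \
    "$work/ids.txt" > "$work/patterns.txt"

# Step 2: join the file into one line, put each "<foo ...>" tag on its own
# line (note the space after "<foo"), then keep only those lines.
tr '\n' ' ' < "$work/data.xml" |
    sed -e 's/<foo [^>]*>/\n&\n/g' |
    grep '^<foo ' > "$work/nodes.txt"

# Step 3: count the foo nodes whose id is in the list, and those whose id is not.
matched=$(grep -c -F -f "$work/patterns.txt" "$work/nodes.txt")
unmatched=$(grep -c -v -F -f "$work/patterns.txt" "$work/nodes.txt")

echo "matched=$matched unmatched=$unmatched"

rm -rf "$work"
```

With the sample data above this should report matched=2 unmatched=1: the id="1" and id="12" foo nodes match, the sid="12" node does not, and the foobar node is never considered.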

There may be more; that's off the top of my head too.

-- 
John Van Ostrand 
CTO, co-CEO 
Net Direct Inc. 
564 Weber St. N. Unit 12, Waterloo, ON N2L 5C6 
Ph: 866-883-1172 x5102 
Fx: 519-883-8533 

Linux Solutions / IBM Hardware 




