[kwlug-disc] So why not tar -cf tarball.tar a.xz b.xz c.xz, instead of tar -cJf tarball.tar.xz a b c ?

B.S. bs27975.2 at gmail.com
Sat Nov 5 09:00:38 EDT 2016


On 11/05/2016 04:31 AM, William Park via kwlug-disc wrote:
> On Sat, Nov 05, 2016 at 12:33:19AM -0400, B.S. wrote:
>> The other premise of this conversation, though, is the ability to have
>> confidence in a file within a tar at any point in time - integrity
>> confirmation being inherent to the compress process would be an
>> advantage of tar'ring zips over zipping tars.
>
> You're talking about just packing files into tarball. ...

No, I'm talking about validating that whatever went into a tarball,
files or otherwise, is still exactly as it was when it originally went
in successfully.

> Your proposal won't work in vast majority of use cases for tar.

I don't agree, but in any case, whatever it does offer is better than
the current situation of nothing at all.

>  Tar is wrong tool for wrong job.

Not so. Tar is for what tar does. Most of the time that's files (it is
THE archiving tool, after all), but the contents don't have to be files.
(As I've been saying, I'm talking about post-validation of whatever got
into the tar.)

And, as said, only tar seems to have kept up with all the filesystem
advances; the compression programs document themselves as not having
done so. As a result, there is no other tool for the job but tar.

>  If you want zip behaviour, then use zipfile.

Again, I'm looking for verification.

It just so happens that the compressors checksum their output, and so
provide that validation. As do, as mentioned, md5sum, sha1sum, or
whatever.
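
A rough sketch of that compressor-side check, for tar'ed compressed
files. gzip is used here only because it is installed nearly everywhere;
xz -t tests a .xz member the same way. The scratch directory, the file
names a and b, and the check_members helper are all made up for
illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample files.
work=$(mktemp -d)
cd "$work"
printf 'alpha\n' > a
printf 'bravo\n' > b
gzip a b                          # leaves a.gz and b.gz
tar -cf tarball.tar a.gz b.gz     # tar of compressed files, not a compressed tar

# Test each member in place: -O (--to-stdout) sends the member's bytes to
# stdout instead of to disk, and gzip -t decompresses them, throws the
# output away, and compares against the CRC stored in the gzip trailer.
check_members() {
    tar -tf "$1" | while IFS= read -r member; do
        if tar -xOf "$1" "$member" | gzip -t 2>/dev/null; then
            printf 'OK      %s\n' "$member"
        else
            printf 'FAILED  %s\n' "$member"
        fi
    done
}

check_members tarball.tar
```

Each member is judged on its own, so one damaged member does not stop
the others from being checked.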

Further, although I have yet to test on non-files/dirs, using
--to-command='md5sum -', for example, has tar pipe each object's content
to md5sum. I suspect that, regardless of the object type, md5sum will
produce a value for whatever tar piped out. And I expect tar will always
pipe out that same content, and thus produce the same md5sum, validating
that object within the tar.
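
A minimal sketch of building such a list, assuming GNU tar (which
exports the member name to the command as TAR_FILENAME). The scratch
directory and the file names a, b and manifest.md5 are hypothetical. As
far as the GNU tar manual describes it, the command is run only for
regular files, so other object types may simply be skipped; that part
still wants testing.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample files.
work=$(mktemp -d)
cd "$work"
printf 'alpha\n' > a
printf 'bravo\n' > b
tar -cf tarball.tar a b

# tar runs the command through /bin/sh once per member, with the member's
# data on stdin. Plain "md5sum -" would print "-" as the name, so keep
# only the 32 hex digits and append the real member name instead.
tar -xf tarball.tar \
    --to-command='printf "%s  %s\n" "$(md5sum | cut -c1-32)" "$TAR_FILENAME"' \
    > manifest.md5

cat manifest.md5
```

The lines come out in the same "hash, two spaces, name" layout that
md5sum itself writes, so md5sum -c manifest.md5 can also compare the
list against the original files while they still exist.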

I expect (or suspect) that should a bit change, or a block be lost in
the middle of it, the output will change, and thus the md5sum will be
different, regardless of the object type, alerting the user to a problem
when the before and after md5sum lists are diffed.
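
A sketch of that before/after comparison, again assuming GNU tar. The
damage is simulated by overwriting one byte of the first member's data;
the manifest helper, the file names and the byte offset are all
illustrative. The offset relies on --format=gnu putting a single
512-byte header in front of the first member.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample files.
work=$(mktemp -d)
cd "$work"
printf 'alpha\n' > a
printf 'bravo\n' > b
tar --format=gnu -cf tarball.tar a b

# Print one "md5  name" line per member of the tarball given as $1.
manifest() {
    tar -xf "$1" \
        --to-command='printf "%s  %s\n" "$(md5sum | cut -c1-32)" "$TAR_FILENAME"'
}

manifest tarball.tar > before.md5

# Simulate bit rot: overwrite the first data byte of member "a"
# (offset 512, right after its header) without truncating the archive.
printf 'X' | dd of=tarball.tar bs=1 seek=512 conv=notrunc 2>/dev/null

manifest tarball.tar > after.md5

# diff exits non-zero when the lists differ and prints the changed lines.
diff before.md5 after.md5 || echo "manifest changed: at least one member differs"
```

tar's own header checksum does not cover member data, so the damaged
archive still extracts without complaint; only the changed md5sum line
gives the problem away.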




