[kwlug-disc] Accessing recently-written NFS files

Adam Glauser adamglauser at gmail.com
Tue Jan 10 14:12:48 EST 2023


Thanks all for your suggestions, comments and help. I'd like to tell you
that I have a satisfying resolution, but so far the root cause of my NFS
woes remains a mystery.

The thing that really gets my goat is that, based on everything I've read
following the threads from Paul's very helpful ArchLinux wiki link, NFS
buffering/caching should be transparent within a single client - the
acknowledged concurrency issues that arise from the caching are stated to
be a concern when multiple clients are accessing the same files.

I'll attempt to follow up on some of the ideas. TL;DR: this seems
phase-of-the-moon-ish, and I suspect load or latency to be the culprit at
this point.

Paul said:
> Do you get permission denied errors if you run the
> below loop with 10 byte files, but run it 10000 times?

Yes, with a seemingly random-ish distribution. E.g. in one run it failed
on roughly every 100th file, but then there were gaps of 500+ files
without a failure, then two failures one after the other. The
distribution does not seem consistent between runs.
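
For anyone who wants to see the pattern themselves, here is a sketch of
the 10,000-iteration version with the mv errors captured. This is not the
exact command I ran; the paths are the same placeholder paths as in the
loop quoted below, and /tmp/nfs_mv_errors.log is an arbitrary local file:

 $ for i in {1..10000}; do
     head -c 10 </dev/urandom >"/nfs_share/tmp/test_file${i}";
   done && find /nfs_share/tmp/ -name 'test_file*' \
     -exec mv {} /nfs_share/final_location \; 2>/tmp/nfs_mv_errors.log

Since mv prints the offending file name on stderr, the log shows both how
many moves failed and roughly where in the set the failures fell.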

I said:
> $ for i in {1..100}; do
>     head -c 10 </dev/urandom >"/nfs_share/tmp/test_file${i}";
>   done && find /nfs_share/tmp/ -name 'test_file*' \
>     -exec mv {} /nfs_share/final_location \;

Bob said:
> Right, so in 'command1 && command2' if command1 returns non-zero then
> command2 will not run.
>
> OTOH, [Adam] is getting permission failures on the mv command, implying
> that the part before the && is returning 0.
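
Side note for anyone less shell-fluent - the short-circuit behaviour is
easy to see directly:

 $ false && echo "not printed, because the left-hand command failed"
 $ true && echo "printed, because the left-hand command succeeded"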

You touched on an important point here, Bob. Without realizing it, I was
making some assumptions:
  1. That each command in the for loop runs sequentially
  2. That there will be no failures creating the files
  3. That before each `head` command exits, the redirected output file is
fully written (as far as the filesystem user is concerned) and closed
(see the sketch just after this list).
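
On assumption 3 specifically: NFS close-to-open semantics should flush
the file to the server when the writing process closes it, but to take
that out of the equation entirely I could force the flush myself before
the mv. A sketch - the file name is made up, and `sync FILE` needs
coreutils 8.24 or newer:

 $ # fsync the new file so the server has the data before the move
 $ head -c 10 </dev/urandom >/nfs_share/tmp/test_file_synced &&
     sync /nfs_share/tmp/test_file_synced &&
     mv /nfs_share/tmp/test_file_synced /nfs_share/final_location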

Something like this would have expressed my intentions more correctly:
 $ for i in {1..100}; do
     head -c 10 </dev/urandom >"/nfs_share/tmp/test_file${i}" &&
       mv "/nfs_share/tmp/test_file${i}" /nfs_share/final_location;
   done
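
If I wanted each failure recorded with a timestamp and mv's own error
text (which would be handy for the ticket I mention below), a variant
along these lines would do it - /tmp/nfs_mv_failures.log is just an
arbitrary local path:

 $ for i in {1..100}; do
     head -c 10 </dev/urandom >"/nfs_share/tmp/test_file${i}" || continue;
     # mv's stderr goes to the log; a timestamp is appended on each failure
     if ! mv "/nfs_share/tmp/test_file${i}" /nfs_share/final_location \
            2>>/tmp/nfs_mv_failures.log; then
       date '+%F %T' >>/tmp/nfs_mv_failures.log;
     fi;
   done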

In the end, I think both versions give useful information. If the issue
had only to do with the time between a file being fully written and the
attempt to move it, I'd expect the errors in the `find` version to be
biased toward the end of the set. However, the errors seem to be randomly
distributed throughout the set.

Bob said:
> [Adam], do you get the failures if you create just a single file of
> 10,000 bytes?

Not usually. If I manually run
`head -c 10k </dev/urandom >/nfs_share/tmp/test_file && mv /nfs_share/tmp/test_file /nfs_share/final_location`,
I occasionally get an error. It seems like this is more likely to happen
if I run the loop, interrupt it when an error pops up, then immediately
do the one-off copy. Manually generating 100k files, I got the error a
few times (say trial 4 and trial 7, then trial 15), then couldn't
reproduce after maybe 50 more tries.
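
If I can catch one of these failures under strace, the exact syscall and
errno should show up, which is probably the most useful thing to attach
to the ticket. A sketch, assuming strace is available on the client and
using the same placeholder paths (the trace file location is arbitrary):

 $ head -c 10k </dev/urandom >/nfs_share/tmp/test_file
 $ strace -f -o /tmp/mv_trace.log mv /nfs_share/tmp/test_file /nfs_share/final_location
 $ grep -E 'EACCES|EPERM' /tmp/mv_trace.log   # the failing call and its errno show up here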

I'll update if I find anything new. It may be a while, as I need to
construct a useful ticket for the folks who have access to the logs, and
it's not a top priority.

Thanks again and happy new year,
Adam

On Sat, Dec 17, 2022 at 7:04 PM Khalid Baheyeldin <kb at 2bits.com> wrote:

> Adam,
>
> Is NFS a must in this case, or can you use other file systems?
>
> The reason I suggest this is that historically NFS has had a problematic
> legacy in certain use cases, including caching inconsistencies.
>
> About a decade ago I investigated how to share file systems across two
> web hosts. That was for a web site to share the code and static files on
> two web servers. NFS was the most problematic, and not practical.
>
> There were GFS (not GFS2) and GlusterFS, which showed promise last I
> looked at filesystems for the above use case.
>
> For certain types of use, sshfs may be all you need (e.g. mounting
> server shares to my laptop). I have been using it for years without
> issues. It even transparently reconnects after the laptop wakes up from
> sleep. It does not need a daemon on the server (the openssh server does
> it all), which means great security.
>
> All you need is to install the sshfs package on the client, and run the
> following command:
>
> sudo sshfs -o $OPTS user@host:/remote/file/system /local/mount
>
> The OPTS variable is where all the magic is:
>
> IdentityFile=~myname/.ssh/id_ed25519,port=22,follow_symlinks,allow_other,noatime,
> idmap=file,uidfile=/etc/sshfs-uids,gidfile=/etc/sshfs-gids,nomap=ignore,
> workaround=rename,nonempty,reconnect,ServerAliveInterval=3,
> ServerAliveCountMax=10
>
> Besides your ssh private key, two files are needed to map the uid and
> gid to what they are on the server:
>
> /etc/sshfs-uids:
>
> user1:1001
> user2:1002
>
> /etc/sshfs-gids:
>
> users:100
> user1:1001
> user2:1002
>
> And that is all there is to it.
>
> SSHFS would not work in a high load scenario though.