Hi,
Looks like you have a lot of spare time. Getting down to millisecond latencies on every piece of equipment would be hard enough even in an environment where you had dedicated and isolated equipment every step of the way.
FC latencies are definitely a matter of the "moving parts", as you say: mostly queuing and buffering along the paths, and possibly reordering of commands within switches that have multiple paths or ISLs.
NFS latencies are probably more complex, given congestion algorithms and the possibility of TCP retransmissions (assuming/hoping you use TCP with NFS). NFS also needs to keep track of the status of files (and parts of files) and directories, and has quite a bit of RPC "chatter" depending on the version. Depending on the client, you also have a fair amount of buffering within the OS, not to mention normal TCP window sizing, NFS read/write size options, Ethernet jumbo frames, TCP offloading or other tuning options for the NICs, general TCP stack tuning, and the like. NFS just has a lot more slack built in. That's not always a negative thing, but this resilience has its cost when it actually has to be used, i.e. the latency spikes you see.
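To put a rough number on how a rare retransmission turns into a visible spike, here is a toy model. It is only a sketch: the 0.5 ms base latency, the 0.1% stall rate and the 200 ms timeout (a commonly cited minimum retransmission timeout on Linux, check your own stack) are all assumptions of mine, not measurements from your setup.

```python
# Toy model: a small fraction of NFS operations stall for one TCP
# retransmission timeout (RTO). All numbers are assumptions, not measurements.

def latency_with_retransmits(base_ms, rto_ms, stall_fraction):
    """Return (mean_ms, stalled_ms) when each operation either completes in
    base_ms or, with probability stall_fraction, waits one extra RTO."""
    stalled_ms = base_ms + rto_ms
    mean_ms = (1.0 - stall_fraction) * base_ms + stall_fraction * stalled_ms
    return mean_ms, stalled_ms

mean_ms, stalled_ms = latency_with_retransmits(
    base_ms=0.5, rto_ms=200.0, stall_fraction=0.001
)
print(f"mean {mean_ms:.2f} ms, stalled operation {stalled_ms:.1f} ms")
# prints: mean 0.70 ms, stalled operation 200.5 ms
```

The mean barely moves, but the one operation in a thousand that stalls is about 400 times slower than the rest, which is the kind of spike that shows up in a latency graph.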
FC is pretty rigid, but almost simple in comparison, and generally has lower latencies. FC also fails in spectacular ways in congestion scenarios that TCP handles with ease, unless one uses a much more complex fabric configuration and strict queue depth policies on the servers. It's like a fine sports car, while NFS is like your reliable jeep.
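The reason queue depth matters so much can be sketched with Little's law (outstanding I/Os = throughput x latency). This is a back-of-the-envelope illustration with made-up numbers, not a model of any particular HBA or array; the function names are my own.

```python
# Little's law for a storage queue: outstanding = IOPS * latency.
# The queue depth and IOPS figures below are assumed for illustration.

def max_iops(queue_depth, latency_ms):
    """Highest IOPS one initiator can drive at a given per-I/O latency."""
    return queue_depth / (latency_ms / 1000.0)

def latency_ms_at(queue_depth, iops):
    """Average per-I/O latency when the queue stays full at a given IOPS."""
    return queue_depth / iops * 1000.0

print(f"{max_iops(32, 0.5):.0f} IOPS")       # 32 outstanding at 0.5 ms each
print(f"{latency_ms_at(32, 16000):.1f} ms")  # same queue, target only does 16000 IOPS
# prints: 64000 IOPS
# prints: 2.0 ms
```

In other words, once the target or a congested link cannot keep up, a deep queue does not buy more throughput; it just shows up as latency, which is why strict queue depth policies help on a busy fabric.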
I wish you luck in your endeavor, but I would be slightly surprised if you succeed. Isolating all of the parts and settings would mean working through a very complex matrix.
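Just to show how quickly that matrix grows, here is a small sketch that enumerates a handful of the settings mentioned above. The specific values are hypothetical placeholders I picked for illustration; your real list would be longer.

```python
from itertools import product

# Hypothetical test values for a few of the tunables discussed above.
tunables = {
    "nfs_version": ["3", "4.1"],
    "rsize_wsize_kib": [32, 64, 1024],
    "mtu": [1500, 9000],
    "tcp_offload": [False, True],
    "fc_queue_depth": [8, 32, 64],
}

# Every combination of one value per tunable is one test run.
combos = [dict(zip(tunables, values)) for values in product(*tunables.values())]

minutes_per_run = 10  # assumed duration of a single benchmark run
print(f"{len(combos)} combinations, {len(combos) * minutes_per_run / 60:.0f} hours of testing")
# prints: 72 combinations, 12 hours of testing
```

That is five settings with two or three values each, on one client, one path and one workload, before you repeat any run to check that the result is stable.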
S.