ColinTaylor
As per the title, there have been frequent reports in the forum of file transfers from the router’s USB drive being unreliable. Often this is when people are trying to use the router as a NAS. The problem can manifest in various ways and its intermittent nature has made understanding the cause difficult.
Having come across a simple set of circumstances that appeared to allow me to recreate the problem on demand I thought it would be interesting to diagnose it further. What follows is a summary of my observations, experimentation and one possible solution.
As my test setup is very simple I would be interested to hear from people experiencing this issue whether the solution below was effective for them.
Observations and Packet Capture
Packet capture on router shows cksum is incorrect (because it's offloaded).
When the problem occurs the router sends a single "TCP Out-Of-Order" followed by multiple "TCP Retransmission"s. The client sends lots of "TCP Dup ACK" back to router before failing.
Only seems to affect data that's been read from a USB drive, even if it's already cached, as it doesn't appear to affect iperf.
EDIT: Retested iperf with 10 parallel streams and it does appear to suffer some sort of problem as tx error messages appear in the syslog.
Only specific files seem to trigger the problem.
SCP transfers don’t appear to suffer from this problem but SMB, FTP and NFS do.
Only affects router to LAN transfers. Does not affect WAN to LAN or LAN to LAN traffic.
Doesn't affect Wi-Fi clients as badly, although I still see Dup ACKs. The client seems able to recover using normal TCP methods.
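For anyone wanting to look for the same pattern in their own capture, the symptoms listed above map onto Wireshark's standard TCP analysis display filters. The following is only a rough sketch: it assumes tcpdump is available on the router and tshark on a PC, and the client address and capture path are placeholders I have made up for illustration.

```shell
#!/bin/sh
# Hypothetical values: change these for your own network.
CLIENT_IP="192.168.1.100"
CAPTURE_FILE="/tmp/usb-transfer.pcap"

# Display filter matching the three symptoms described above
# (out-of-order segment, retransmissions, duplicate ACKs).
symptom_filter() {
    printf '%s' 'tcp.analysis.out_of_order || tcp.analysis.retransmission || tcp.analysis.duplicate_ack'
}

# Run on the router: capture full packets to/from the client on the LAN bridge.
capture_on_router() {
    tcpdump -i br0 -s 0 -w "$CAPTURE_FILE" host "$CLIENT_IP"
}

# Run on a PC with tshark: list only the packets flagged by the filter.
# $1 is the capture file copied over from the router.
show_symptoms() {
    tshark -r "$1" -Y "$(symptom_filter)"
}
```

Start capture_on_router, reproduce the failing transfer, stop the capture with Ctrl-C, then run show_symptoms against the saved file.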
Potential Solution – Disable TCP segmentation offload on bridge interface
Disabling TCP segmentation offload also requires disabling checksumming and scatter-gather (which in turn disables generic segmentation offload).
Turning off checksumming alone or GSO alone does not fix the problem.
Use this command on your router to turn off TCP segmentation offload (this change will be undone if the router is rebooted):
Code:
ethtool -K br0 tx off sg off tso off gso off
ethtool must be installed from the Entware repository (opkg install ethtool).
Note: Disabling offloading increases the load on the router’s processor. As such, your router-to-LAN throughput may be restricted, especially on routers with weak CPUs (e.g. RT-AC68U). This may also impact other CPU-intensive processes running on the router, like VPN encryption.
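To confirm the change has taken effect, the current offload state can be read back with ethtool -k (lowercase k). The helper below is a minimal sketch rather than anything tested across firmware versions: the function names are my own, and it assumes your ethtool build prints the usual feature names (tx-checksumming, scatter-gather, tcp-segmentation-offload, generic-segmentation-offload).

```shell
#!/bin/sh
# Reads `ethtool -k` output on stdin and prints any of the four
# relevant top-level features that are still reported as "on".
# Exit status is 0 if at least one is still on, non-zero otherwise.
offloads_still_on() {
    grep -E '^(tx-checksumming|scatter-gather|tcp-segmentation-offload|generic-segmentation-offload): on'
}

# $1 is the interface name, e.g. br0.
check_offloads() {
    if ethtool -k "$1" | offloads_still_on; then
        echo "Some offloads are still enabled on $1"
        return 1
    fi
    echo "All four offloads are off on $1"
}
```

Running check_offloads br0 after the ethtool -K command should report that all four are off. Since the setting is lost on reboot, the same check is a quick way to see whether it needs reapplying.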