When connecting to a new OVM for SPARC (LDOMs) environment, I found a network problem that does not appear to be a known issue.
This post might be of interest to you if you plan to use a similar platform.
We have a primary control domain and an LDOM (Solaris VM) created on a T4-2 server. Solaris 11.1 is used on the primary domain and the LDOM.
Virtualised network devices are created over link aggregates with VLAN tagging on 10Gb Oracle CNA HBAs in the primary domain and are assigned to the LDOM.
The problem occurs with any outbound traffic on the 10Gb HBA.

The first symptoms were the inability to transfer large files from Windows using SFTP, and PuTTY SSH or VNC sessions being disconnected frequently and randomly.

I used Microsoft Network Monitor to capture the network traffic during these problems and found that acknowledgement packets from Solaris that should have been empty had two extra bytes added, so their checksums were incorrect. After a while, these checksum errors and the resulting retransmissions caused the TCP session to be disconnected.
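To illustrate why two stray bytes are enough to invalidate an otherwise empty ACK, here is a minimal Python sketch of the standard TCP checksum (the RFC 1071 ones' complement sum over the pseudo-header and segment). The addresses, ports, sequence numbers and the value of the two extra bytes are all made up for illustration; I did not record the actual bytes seen in the capture, and this is not the code path Solaris or the CNA card uses.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones' complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src: str, dst: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol 6 (TCP), TCP length."""
    return socket.inet_aton(src) + socket.inet_aton(dst) + struct.pack("!BBH", 0, 6, tcp_len)


def build_empty_ack(src: str, dst: str, sport: int, dport: int, seq: int, ack: int) -> bytes:
    """Build a 20-byte TCP header with only the ACK flag set and no payload."""
    header = struct.pack("!HHIIBBHHH", sport, dport, seq, ack, 5 << 4, 0x10, 65535, 0, 0)
    csum = internet_checksum(pseudo_header(src, dst, len(header)) + header)
    # The checksum field occupies bytes 16-17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:]


def checksum_ok(src: str, dst: str, segment: bytes) -> bool:
    """A receiver sums the pseudo-header and segment; a valid segment yields 0."""
    return internet_checksum(pseudo_header(src, dst, len(segment)) + segment) == 0


if __name__ == "__main__":
    # Hypothetical endpoints from the documentation address range.
    SRC, DST = "192.0.2.10", "192.0.2.20"
    good = build_empty_ack(SRC, DST, 22, 50000, 1000, 2000)
    bad = good + b"\x00\x00"  # assumed value for the two extra bytes
    print("clean ACK valid:", checksum_ok(SRC, DST, good))
    print("ACK with two extra bytes valid:", checksum_ok(SRC, DST, bad))
```

Even when the extra bytes are zero, the TCP length in the pseudo-header changes from 20 to 22, so the receiver's sum no longer matches and the segment is dropped, which forces the retransmissions described above.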

The problem affected multiple protocols, but only from the LDOMs, not the primary domain.

The closest issues I could find related to checksum offloading (where the network card performs the checksum processing to reduce the load on the server's CPUs).
Disabling checksum offloading on the LDOMs for the 10Gb CNA cards successfully worked around the problem. (The 1Gb cards were not affected.)
For example, add

set ip:dohwcksum=0x0

to /etc/system in the LDOM and reboot.
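If you want to confirm that the setting is actually present on several LDOMs, a small check like the following might help. It is only a sketch: the function name is my own, it assumes that comment lines in /etc/system start with an asterisk, and it only inspects the file contents, so it cannot tell you whether the LDOM has been rebooted since the change.

```python
import re

# Matches an active "set ip:dohwcksum=<value>" line, tolerating extra spaces.
_SETTING = re.compile(r"^set\s+ip:dohwcksum\s*=\s*(\S+)\s*$")


def hw_checksum_disabled(system_text: str) -> bool:
    """Return True if the last active ip:dohwcksum setting in the text is zero."""
    disabled = False
    for raw in system_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and comment lines (assumed to start with '*')
        match = _SETTING.match(line)
        if match:
            try:
                disabled = int(match.group(1), 0) == 0  # accepts 0 and 0x0
            except ValueError:
                disabled = False  # unparseable value, treat as not disabled
    return disabled


if __name__ == "__main__":
    # Path as used in this post; run inside the LDOM.
    with open("/etc/system", encoding="ascii", errors="replace") as handle:
        print("checksum offload disabled:", hw_checksum_disabled(handle.read()))
```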

