Hi,

I have been pulling my hair out on the following issue for the last 4 days - I am stuck, so I am posting here now.

I am currently running my WAN connection over PIA OVPN on a pfSense VM.

For availability reasons I wanted to move that setup, along with my Sophos UTM VM, my Debian DHCP & DNS VM and my pihole VM, onto a bare metal pfSense box, and also migrate to a new IP range (Class C -> Class A + VLANs) and DNS suffix.

For that I chose a PCEngines APU2C4. I installed the 2.3.2 memstick image (amd64), updated to 2.3.2_1 and set up my interfaces and rules.

In testing I didn't use any VLANs, just physical interfaces to my notebook and desktop plus a WAN link (WAN as in an IP subnet to my router/modem, with no PPPoE and no NAT).

Since everything worked I changed my Netgear GS748TSv1 switch config to give me tagged ports for my VLANs and plugged everything in.

At first I ran this on a LACP LAGG, since I planned on running a 3-interface LAGG with all my VLANs on it, and then on single ports, because after just a few packets all communication would stop.

By now I have traced the issue down to inter-VLAN communication on a single NIC. Traffic from one VLAN to another works fine as long as the other VLAN is on a different NIC.

It also works if I transmit over/to a NIC without VLANs. But the second I run traffic between two VLANs that are hosted on the same NIC, that NIC stops all communication.

Clients get a "Destination Unreachable" in the terminal or an "ERR_ADDRESS_UNREACHABLE" in Chrome etc., and pfSense gives me a "Host down".

I tried starting a packet capture before generating traffic to recreate the issue, but the second it occurs even the packet capture just stops, so it seems the interface it is bound to has just terminated.

I have moved the VLANs across all 3 NICs by now, and all 3 show this issue as soon as VLAN traffic goes to and from the same NIC. The issue is not triggered by mere pinging; it needs somewhat more traffic than that, e.g. opening youtube.com or just browsing around in a vSphere web client.

pfSense does not log anything regarding this, nor does it post any errors to the TTY.

When I was on LACP I got some LACP flapping messages, so I moved to a static LAGG, which of course didn't resolve my issues, so I went on to a single tagged port.

I also encountered a queue length overflow so I added some tunables which made sense to me:

net.inet.tcp.tso = 0
kern.ipc.nmbclusters = 1000000
kern.ipc.soacceptqueue = 4096
hw.igb.max_interrupt_rate = 32000
hw.igb.fc_setting = 0
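For anyone trying to reproduce this, here is a rough sketch of how values like these are commonly split on a FreeBSD-based system such as pfSense: the hw.igb.* entries are driver tunables read at boot, while the TSO and accept-queue settings are ordinary runtime sysctls. The post does not say where each value was actually set, so the file name and the split below are assumptions, and whether every hw.igb.* name is still honored by the igb driver in this release is also an assumption.

```
# /boot/loader.conf.local (assumed location; read at boot, so a reboot is needed)
hw.igb.max_interrupt_rate="32000"
hw.igb.fc_setting="0"
kern.ipc.nmbclusters="1000000"

# Runtime sysctls (assumed to be entered under System > Advanced > System Tunables)
# net.inet.tcp.tso=0
# kern.ipc.soacceptqueue=4096
```

After a reboot, running sysctl with these names on the console should show whether each value was actually picked up.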

I also added a```

if_igb_load="YES"