Friday, March 28, 2014

pfSense and High Availability Part 1 - Network Interface Bonding (LAGG)

One of the problems people face when deploying pfSense is network interface bonding. It's not very straightforward and in some ways counterintuitive.

Let me illustrate the problem:

We've already set up our external (0 for Cisco and WAN for pfSense aficionados) and internal (100 for Cisco and LAN for pfSense aficionados) interfaces. When we try to set up LAGG, these two interfaces are not offered as members, although every other interface is. The reason is that they are already in use. So how do we go about setting up network interface bonding in pfSense?

It's actually pretty simple. Let me illustrate:

First of all, for the sake of clarity my WAN and LAN interfaces are bce0 and bce1:


Go to Interfaces -> Assign -> LAGG and select "+":


Create a WAN LAGG bond from the interfaces you would ideally like it to contain, leaving out the interface currently used for WAN. Ugh, I'm making it sound more complicated than it is.

To make it clearer, let's suppose you wanted to create a WAN bond consisting of bce0 and em3. What we would ideally like to do is choose bce0 and em3. Well, in our case we only select em3 (bce0 is not available to us anyway) and we create a LAGG team consisting solely of that one interface, silly as it may sound initially.


Save and repeat the process for the LAN LAGG team, creating a team using the interfaces we'd like the team to consist of except the currently used LAN interface.


Save and create the rest of your LAGG interfaces as you would usually.


Here's an idea of what we should roughly have when we're done with this process:


Now, go to "Interface Assignments":


Change the interface assignments to their LAGG interface counterparts, save, and add any other interfaces that are needed. Take a peek at mine:


Go to LAGG again:


Edit the WAN LAGG interface:


The previously unavailable WAN interface should be available to form our team now. Select as needed and save:


Repeat the process for the LAN interface:


Everything should be working:
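
The order of the steps above matters because an interface that is already assigned cannot be picked as a LAGG member. As a rough illustration only (plain Python, not pfSense code; the function name `staged_members` is made up for this sketch), the two passes look roughly like this:

```python
def staged_members(desired, assigned):
    """Split a desired LAGG member list into the two passes described above.

    desired  -- interfaces the finished LAGG should contain, e.g. ["bce0", "em3"]
    assigned -- the interface currently assigned to WAN or LAN, e.g. "bce0"
    """
    # Pass 1: create the LAGG from whatever is still free.
    first_pass = [nic for nic in desired if nic != assigned]
    if not first_pass:
        raise ValueError("need at least one unassigned interface to start the LAGG")
    # Pass 2: once WAN/LAN has been reassigned to the LAGG, the old
    # interface is free again and the full member set can be selected.
    second_pass = list(desired)
    return first_pass, second_pass


first, final = staged_members(["bce0", "em3"], assigned="bce0")
print(first)  # ['em3']
print(final)  # ['bce0', 'em3']
```

The sketch only models which interfaces are selectable at each pass. The order in which pfSense actually stores the members may differ, which is what the config.xml edit further down deals with.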


In case your master interface priority is wrong, all you need to do is back up your configuration, open the config.xml file, change the order of the interfaces inside the relevant <members> element so that the one you want as master comes first, and upload it again.

For example:
    <laggs>
      <lagg>
        <members>em3,bce0</members>
        <descr><![CDATA[WAN_TEAM]]></descr>
        <laggif>lagg0</laggif>
        <proto>failover</proto>
      </lagg>
      <lagg>
        <members>em4,bce1</members>
        <descr><![CDATA[LAN_TEAM]]></descr>
        <laggif>lagg1</laggif>
        <proto>failover</proto>
      </lagg>
      <lagg>
        <members>em0,em5</members>
        <descr><![CDATA[CARP_TEAM]]></descr>
        <laggif>lagg2</laggif>
        <proto>failover</proto>
      </lagg>
    </laggs>

Now, I would like for my WAN bond to have bce0 as the master/primary interface, for LAN bce1 and for CARP em0. Therefore I edit like so:
    <laggs>
      <lagg>
        <members>bce0,em3</members>
        <descr><![CDATA[WAN_TEAM]]></descr>
        <laggif>lagg0</laggif>
        <proto>failover</proto>
      </lagg>
      <lagg>
        <members>bce1,em4</members>
        <descr><![CDATA[LAN_TEAM]]></descr>
        <laggif>lagg1</laggif>
        <proto>failover</proto>
      </lagg>
      <lagg>
        <members>em0,em5</members>
        <descr><![CDATA[CARP_TEAM]]></descr>
        <laggif>lagg2</laggif>
        <proto>failover</proto>
      </lagg>
    </laggs>

And re-upload to the server in question. Simple enough process.
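
If you have several boxes to fix, the same edit can be scripted against the downloaded backup. Here is a minimal sketch, assuming the `<laggs>` section looks like the example above. The function names `promote_member` and `reorder_laggs` are my own and not part of pfSense. It uses regular expressions rather than an XML parser so that the CDATA sections and the rest of the file are left untouched. Try it on a copy of the backup first.

```python
import re


def promote_member(members, primary):
    """Return the comma-separated member list with `primary` moved to the front."""
    names = [m.strip() for m in members.split(",") if m.strip()]
    if primary not in names:
        raise ValueError(f"{primary} is not a member of {members!r}")
    return ",".join([primary] + [n for n in names if n != primary])


def reorder_laggs(config_text, primaries):
    """Rewrite <members> in each <lagg> block whose <laggif> appears in `primaries`.

    primaries maps a lagg name to the interface that should come first,
    e.g. {"lagg0": "bce0"}. Everything else in the text is returned unchanged.
    """
    def fix_block(match):
        block = match.group(0)
        laggif = re.search(r"<laggif>(.*?)</laggif>", block)
        if not laggif or laggif.group(1) not in primaries:
            return block
        primary = primaries[laggif.group(1)]
        return re.sub(
            r"<members>(.*?)</members>",
            lambda m: f"<members>{promote_member(m.group(1), primary)}</members>",
            block,
            count=1,
        )

    # "<lagg>" does not match "<laggs>" or "<laggif>", so only single
    # LAGG entries are visited.
    return re.sub(r"<lagg>.*?</lagg>", fix_block, config_text, flags=re.DOTALL)


sample = """<laggs>
  <lagg>
    <members>em3,bce0</members>
    <descr><![CDATA[WAN_TEAM]]></descr>
    <laggif>lagg0</laggif>
    <proto>failover</proto>
  </lagg>
</laggs>"""

print(reorder_laggs(sample, {"lagg0": "bce0"}))
```

For a real backup you would read the downloaded file into a string, pass it through `reorder_laggs`, write the result to a new file, and restore that file through the web interface as described above.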

Note: In pfSense 2.2 and above (FreeBSD 10.0 and newer), a LAGG using LACP defaults to "strict mode" being enabled, which means the lagg does not come up unless your switch is speaking LACP.

This will cause your LAGG to stop functioning after the upgrade if your switch isn't using active mode LACP.
You can retain the lagg behavior of pfSense 2.1.5 and earlier versions by adding a new system tunable under System > Advanced, on the System Tunables tab, for the following:

net.link.lagg.0.lacp.lacp_strict_mode

With value set to 0. You can configure this in 2.1.5 before upgrading to 2.2, to ensure the same behavior on first boot after the upgrade. It will result in a harmless cosmetic error in the logs on 2.1.5 since the value does not exist in that version.
If you have more than one LAGG interface configured, you will need to enter a tunable for each one, since this is a per-interface option. So for lagg1, you would add the following:

net.link.lagg.1.lacp.lacp_strict_mode

Also with the value set to 0.
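
With more than a couple of LACP laggs it is easy to mistype one of these names. As a small convenience, and purely as a sketch (the helper `strict_mode_tunables` is hypothetical, and it only builds the names following the pattern shown above; it does not touch pfSense), the list of tunables to enter can be generated like this:

```python
def strict_mode_tunables(laggifs):
    """Map each lagg name (e.g. "lagg0") to its strict-mode tunable with value "0"."""
    tunables = {}
    for name in laggifs:
        if not name.startswith("lagg") or not name[4:].isdigit():
            raise ValueError(f"unexpected lagg name: {name!r}")
        index = int(name[4:])
        tunables[f"net.link.lagg.{index}.lacp.lacp_strict_mode"] = "0"
    return tunables


# Only list the laggs that actually use LACP.
for tunable, value in strict_mode_tunables(["lagg0", "lagg1"]).items():
    print(tunable, value)
```

Each printed line corresponds to one entry to add on the System Tunables tab.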

1 comment:

  1. thank you for sharing this info. Ill be sure to credit your blogsite when i am done documenting our setup
