BFD as a Service for Neutron

https://bugs.launchpad.net/neutron/+bug/1907089

Add possibility to define BFD monitors in Neutron and associate it with routes.

Problem Description

BFD (Bidirectional Forwarding Detection) can be used to detect link failures between a Neutron router and an arbitrary destination, like an external non-Neutron router.

BFD can be used to provide quick, low-overhead link failure detection between adjacent forwarding engines ([1]) to check for example if an extra route (a nexthop-destination pair) is alive or not, and change routes accordingly.

Note

This spec is not covering other use cases like tunnel monitoring and HA or DVR router monitoring, or BGP or other routing protocol monitoring.

Proposed Change

The goal of this spec is to add APIs to manage bfd_monitor instances, fetch their status, associate a monitor to an extra route. The goal is to cover the use case when Neutron managed bfd_monitor instances take the ACTIVE role, thus send BFD control packets, (see [2]).

This gives possibility to the operators to monitor the routes (destination - nexthop pairs) added for a router.

Monitoring scenarios

  • Two different nexthops for the same destination, like ecmp route legs

dst1

nexthop1

bfd_dst1

bfd1

dst1

nexthop2

bfd_dst1

bfd2

____nexthop1 ____
                 \ ____dst1
____nexthop2____ /
  • monitoring 2 different destinations through the same nexthop, the case when bfd3 and bfd4 dst_ip is the same:

dst3

nexthop3

bfd_dst3

bfd3

dst4

nexthop3

bfd_dst4

bfd4

                   ____dst3
_____nexthop3_____/
                  \____dst4

REST API Impact

New bfd monitor API

  • Create bfd_monitor:

    POST /v2.0/bfd_monitors
    {
        "bfd_monitor": {
            "name": "bfd1",
            "mode": "asynchronous",
            "min_rx": 1000,
            "min_tx": 100,
            "multiplier": 3,
            "dst_ip": "10.0.0.13",
            "auth_type": "",
            "auth_keys": {1: "secret_1", 2: "secret_2"},
        }
    }
    
  • List bfd_monitors:

    GET /v2.0/bfd_monitors
    
  • Show bfd_monitor:

    GET /v2.0/bfd_monitors/{monitor_id}
    {
        "bfd_monitor": {
            "name": "bfd1",
            "project_id": "{project_id}",
            "description": "",
            "mode": "asynchronous",
            "min_rx": 1000,
            "min_tx": 100,
            "multiplier": 3,
            "dst_ip": "10.0.0.13",
            "src_ip": "169.254.112.144",
            "auth_type": "",
            "auth_keys": {1: "secret_1"},
        }
    }
    
  • Update a bfd_monitor:

    PUT /v2.0/bfd_monitors/{monitor_id}
    {
        "bfd_monitor": {
            "name": "bfd_monitor1"
            "description": "foo",
            "min_rx": 2000,
            "min_tx": 200,
            "multiplier": 4
        }
    }
    
  • Delete bfd_monitor:

    DELETE /v2.0/bfd_monitors/{monitor_id}
    

Note

Only an unassociated bfd_monitor can be deleted.

API fields and descriptions for bfd_monitor:

Attribute

Type

Req

CRUD

Description

id

uuid-str

No

R

id of bfd_monitor

name

String

No

CRU

Human readable name for the bfd_monitor (255 characters limit). Does not have to be unique.

description

String

No

CRU

Human readable description for the bfd_monitor (255 characters limit).

project_id

String

No

R

Owner of the bfd_monitor.

mode

String

No

CR

Can be asynchronous (default common echo mode of BFD) or demand (some other mechanism is used to detect link state) and can accept future modes like one-arm-echo see [4] .

dst_ip

String

Yes

CR

The destination IP address to be monitored. In case of singlehop bfd this is the nexthop ip of the route, for the general case (like multihop bfd) this is an arbitrary IP (IPv4 or Ipv6) that can serve as BFD neighbor.

src_ip

String

No

CR

IP address used as source for transmitted BFD packets. An optional field, if not specified, the IP set on the interface, like on qg-xyz. This can be set on the remote end to be configured.

min_rx

Integer

No

CRU

The shortest interval, in millisecs, at which this BFD session offers to receive BFD control messages. At least 1. Defaults is 1000.

min_tx

Integer

No

CRU

The shortest interval, in millisecs, at which this BFD session is willing to transmit BFD control messages. At least 1. Default is 100.

multiplier

Integer

No

CRU

The BFD detection multiplier, An endpoint signals a connectivity fault if the given number of consecutive BFD control messages fail to arrive. Default is 3.

status

String

N/A

R

Shows if the BFD monitor was succesfully created in the backend, but nothing about the session status, for that the session_status API endpoint can be used.

auth_type

String

No

CR

The Authentication Type, which can be password, MD5, MeticulousMD5, SHA1, MeticulousSHA1, if empty no authentication is used.

auth_key

Dict

No

CR

A dictionary of authentication key chain in which key is an integer of Auth Key ID and value is a string of Password or Auth Key.

Note

For using authentication with BFD, please check [5]

API extension proposal for bfd_monitors:

ALIAS = 'bfd-monitor'
IS_SHIM_EXTENSION = False
IS_STANDARD_ATTR_EXTENSION = False
NAME = 'BFD monitors for Neutron'
DESCRIPTION = "Provides support for BFD monitors"
UPDATED_TIMESTAMP = "2021-02-12T11:00:00-00:00"
BFD_MONITOR = 'bfd_monitor'
BFD_MONITORS = 'bfd_monitors'
BFD_SESSION_STATUS = 'bfd_session_status'

BFD_MODE_ASYNC = 'asynchronous'
BFD_MODE_DEMAND = 'demand'
BFD_MODE_ONE_ARM = 'one_arm_echo'

RESOURCE_ATTRIBUTE_MAP = {
    BFD_MONITORS: {
        'id': {'allow_post': False, 'allow_put': False,
               'validate': {'type:uuid': None},
               'is_visible': True,
               'primary_key': True,
               'enforce_policy': True},
        'name': {'allow_post': True, 'allow_put': True,
                 'validate': {'type:string': db_const.NAME_FIELD_SIZE},
                 'default': '', 'is_filter': True, 'is_sort_key': True,
                 'is_visible': True},
        'description': {'allow_post': True, 'allow_put': True,
                        'is_visible': True, 'default': '',
                        'validate': {
                            'type:string': db_const.DESCRIPTION_FIELD_SIZE}},
        'project_id': {'allow_post': True, 'allow_put': False,
                       'validate': {
                           'type:string': db_const.PROJECT_ID_FIELD_SIZE},
                       'required_by_policy': True,
                       'is_visible': True, 'enforce_policy': True},
        'mode': {'allow_post': True, 'allow_put': False,
                 'validate': {'type:string': db_const.STATUS_FIELD_SIZE},
                 'default': BFD_MODE_ASYNC, 'is_filter': True,
                 'is_sort_key': True, 'is_visible': True},
        'dst_ip': {'allow_post': True, 'allow_put': False,
                   'validate': {'type:ip_address': None},
                   'is_sort_key': True, 'is_filter': True,
                   'is_visible': True, 'default': None,
                   'enforce_policy': True},
        'src_ip': {'allow_post': True, 'allow_put': False,
               'validate': {'type:ip_address_or_none': None},
               'is_sort_key': True, 'is_filter': True,
               'is_visible': True, 'default': None,
               'enforce_policy': True},
        'min_rx': {'allow_post': True, 'allow_put': True,
                   'validate': {'type:non_negative': None},
                   'convert_to': converters.convert_to_int,
                   'default': 1000,
                   'is_visible': True, 'enforce_policy': True},
        'min_tx': {'allow_post': True, 'allow_put': True,
                   'validate': {'type:non_negative': None},
                   'convert_to': converters.convert_to_int,
                   'default': 100,
                   'is_visible': True, 'enforce_policy': True},
        'multiplier': {'allow_post': True, 'allow_put': True,
                       'validate': {'type:non_negative': None},
                       'convert_to': converters.convert_to_int,
                       'default': 3,
                       'is_visible': True, 'enforce_policy': True},
        'status': {'allow_post': False, 'allow_put': False,
                   'is_filter': True, 'is_sort_key': True,
                   'is_visible': True},
        'auth_type': {'allow_post': True, 'allow_put': False,
                      'validate': {'type:string_or_none':
                                   db_const.NAME_FIELD_SIZE},
                      'default': constants.ATTR_NOT_SPECIFIED,
                      'is_visible': True},
        'auth_key': {'allow_post': True, 'allow_put': False,
                     'validate': {'type:dict_or_none': None},
                     'default': constants.ATTR_NOT_SPECIFIED,
                     'is_visible': True},
    },
}
SUB_RESOURCE_ATTRIBUTE_MAP = {}
ACTION_MAP = {
    BFD_MONITOR: {
        'bfd_session_status': 'GET',
        'bfd_monitor_associations': 'GET',
    }
}
ACTION_STATUS = {}
REQUIRED_EXTENSIONS = [l3.ALIAS]
OPTIONAL_EXTENSIONS = []
  • Show bfd session status (For details see rfc5880, State Variables section):

    GET /v2.0/bfd_monitors/{monitor_id}/session_status
    {
        "bfd_session_status": {
            "remotes": [{
                "type": "extra_route",
                "router": {router_id},
                "extra_route": {
                    "destination": "10.0.3.0/24",
                    "nexthop": "10.0.0.1"
                },
                "src_ip": "169.254.112.144",
                "status": {
                    "SessionState": "Up",
                    "RemoteSessionState": "Up",
                    "LocalDiagnostic": "",
                    "RemoteDiagnostic": "",
                    "LocalDiscriminator": "",
                    "RemoteDiscriminator": "",
                    "Forwarding": "",
                }
            },
            {
                "type": "extra_route",
                ...
            }]
        }
    }
    
  • Show bfd_monitor associations:

    GET /v2.0/bfd_monitors/{monitor_id}/bfd_monitor_associations
    {
         "bfd_monitor_associations": {
             "remotes": [{
                 "type": "extra_route",
                 "router": {router_id},
                 "extra_route": {
                     "destination": "10.0.3.0/24",
                     "nexthop": "10.0.0.1"
                 },
                 "src_ip": "169.254.112.144",
             },
             {
                 "type": "extra_route",
                 ...
             }]
         }
    }
    

Note

The current proposal is only for “type”: “extra_route”, so the extra_route’s details (nexthop, destination as of now) will appear.

Note

One bfd_monitor instance can be associated to several extra_routes.

Note

src_ip is the IP address used as source for transmitted BFD packets. A read only field that practically shows the IP set on the interface, like on qg-xyz. This can be set on the remote end.

Short description of the status fields:

Attribute

Description

SessionState

The state of the BFD session, it can be admin_down, down, init, or up.

RemoteSessionState

The state of the remote endpoint’s BFD session, it can be admin_down, down, init, or up.

LocalDiagnostic

A diagnostic code specifying the local system’s reason for the last change in session state. The error messages are defined in section 4.1 of the RFC 5880 (see [3]).

RemoteDiagnostic

A diagnostic code specifying the remote system’s reason for the last change in session state. The error messages are defined in section 4.1 of the RFC 5880 (see [3]).

LocalDiscriminator

The local discriminator for this BFD session, used to uniquely identify it.

RemoteDiscriminator

The remote discriminator for this BFD session. This is the discriminator chosen by the remote system.

Forwarding

Reports whether the BFD session believes this Interface may be used to forward traffic. It can be true or false.

Note

The above fields are optional, so based on the backend not all will be filled, and in worst case a cumulative Forwarding result will only present.

Changes to Router extra routes API

  • Changes to add or remove extra routes to router or Update router API:

    PUT /v2.0/routers/{router_id}
    {
        "router": {
            "routes": [
                {
                    "destination": "179.24.1.0/24",
                    "nexthop": "172.24.3.99"
                    "bfd_monitor": {bfd_monitor_uuid}
                },
            ]
        }
    }
    
    PUT /v2.0/routers/{router_id}/add_extraroutes
    {
        "router" : {
            "routes" : [
               { "destination" : "10.0.3.0/24", "nexthop" : "10.0.0.13", "bfd_monitor": {bfd_monitor_uuid} },
               { "destination" : "10.0.4.0/24", "nexthop" : "10.0.0.14" }
            ]
        }
    }
    
    PUT /v2.0/routers/{router_id}/remove_extraroutes
    {
        "router" : {
            "routes" : [
               { "destination" : "10.0.3.0/24", "nexthop" : "10.0.0.13", "bfd_monitor": {bfd_monitor_uuid} }
            ]
        }
    }
    

Note

The API will allow to remove bfd_monitor association from a route without deleting and recreating the whole nexthop-destination tuple.

  • Get routes status:

    GET /v2.0/routers/{router_id}/routes_status
    {
        "router": {
            "routes": [
                { "destination" : "10.0.3.0/24",
                  "nexthop" : "10.0.0.13",
                  "status": "UP",
                  "bfd_monitor": {bfd_monitor_uuid}
                },
            ]
        }
    }
    

API extension proposal for allowing bfd_monitor association to extra_routes:

ALIAS = 'bfd-for-extraroutes'
IS_SHIM_EXTENSION = False
IS_STANDARD_ATTR_EXTENSION = False
NAME = 'BFD monitors for extraroutes'
DESCRIPTION = ('Provides the possibility to associate a bfd_monitor to'
               'extra_routes')
UPDATED_TIMESTAMP = '2021-01-29T00:00:00+00:00'
RESOURCE_NAME = l3.ROUTER
COLLECTION_NAME = l3.ROUTERS
ROUTES = 'routes'
RESOURCE_ATTRIBUTE_MAP = {
    COLLECTION_NAME: {
        ROUTES: {
            'allow_post': False, 'allow_put': True,
            'validate': {'type:hostroutes': ['destination', 'nexthop',
                                             'bfd_monitor_id']},
            'convert_to': converters.convert_none_to_empty_list,
            'is_visible': True,
            'default': constants.ATTR_NOT_SPECIFIED},
    }
}
SUB_RESOURCE_ATTRIBUTE_MAP = None
ACTION_MAP = {}
REQUIRED_EXTENSIONS = [l3.ALIAS, extraroute.ALIAS]
OPTIONAL_EXTENSIONS = []
ACTION_STATUS = {}

Note

[6] proposes the change of the hostroutes validator to accept a list of possible fields to validate, like: destination, nexthop and bfd_monitor_id.

BFD itself and fetching bfd_monitor session status information can be quite resource intensive operation (the information must be fetched from the backend), so new API endpoint is proposed to fetch the status of monitoring to not overload or affect ordinary router API operations.

The routes_status API endpoint gives feedback to the user if the monitored link is up.

If the bfd_monitor’s session_status is DOWN (BFD detects the given link dead) the given route should be removed in the backend (on the command line: ip r delete….), and set back if the monitor is UP again.

The bfd_monitor instance in the backend is created when the bfd_monitor is associated with any routes, the status of the bfd_monitor is set to ACTIVE from DOWN. The bfd_monitor’s status set to DOWN when the extra route updated and bfd_monitor value is set to empty, so the bfd_monitor is deleted in the backend.

As a bfd_monitor can be associated to many routes, if it is associated to at least one route and the creation in the backend is successful the status field of the bfd_monitor is ACTIVE, and it will change to DOWN when the last route association is deleted (the bfd_monitor is deleted from the last extra_route).

Backend

OVS can handle BFD on an interface and check the status of it. The problem with it is that ovs can have 1 BFD session per port, and to manage it ovsdb is the simplest way, but as LIU Yulong stated in [7] touching ovsdb from l3-agent can be weird.

A working solution with OVS is to create a BFD bridge, like br-bfd, and add veth ports to it, and enable BFD on those interfaces.

 --------
| br-int |
 --------
   | veth-xy-br-int
   |
   | veth-xy
 --------
| br-bfd |
 --------

Another possibility is to use os-ken, but during my experiments the BFD implementation in os-ken is not mature enough, and it needs ovsdb connection anyway, so I do not suggest os-ken to be used for BFD.

Data Model Impact

New table is created for bfd_monitors, routers table is changed to make it possible to associate a bfd_monitor to an extra route (a nexthop - destination pair).

Security Impact

None

Performance Impact

BFD and fetching bfd_monitor status information can be quite resource intensive operation as it can be fetched from the backend, it must be carefully documented.

Implementation

Assignee(s)

Primary assignee:

Lajos Katona <katonalala@gmail.com> (IRC: lajoskatona)

Work Items

  • REST API update.

  • DB schema update.

  • L3 agent update to handle BFD:

    • RPC change to send BFD data to l3 agent.

    • RPC change to fetch BFD status information from l3 agent.

  • CLI update.

  • Documentation.

  • Tests and CI related changes.

Testing

  • Unit Test

  • Functional test

  • API test

Perhaps this is something that can be easily tested end-to-end with fullstack tests. Need more investigation.

Documentation Impact

User Documentation

New API and changes to legacy APIs like routers must be documented.

References