How to write a principal
Note: This page is a work in progress.
This document explains how to write a principal in Linux XIA. Although it briefly touches on the theory of what a principal is, it is mostly meant to explain how to take an idea for a principal and translate it into an implementation in the XIA Linux kernel.
There is no widely-accepted definition for what a principal is, and there are not necessarily any restrictions on what a principal should or should not do. However, before trying to implement a principal, it is a good idea to have some plan for:
- How the principal should behave upon instantiation.
- What kind of routing table entries are necessary for the principal (explained more in the section "Defining Routing Table Utility Functions" below).
- How the principal should behave when it is processed in a DAG.
- How the principal may interact with other principals.
This section describes how a principal can be implemented in the Linux kernel by using the AD principal as an example.
Each step is assumed to be executed from within the top-level XIA-for-Linux directory. Each principal is given its own directory, so all of the code related to the AD principal is in the net/xia/ppal_ad directory.
To compile the principal and build its module, Makefile targets are added for both the core XIA module and the new principal. Using the AD principal as an example, these additions are:
- An entry in net/xia/Kconfig of the following form:
source "net/xia/ppal_ad/Kconfig"
- An entry in net/xia/Makefile of the following form:
obj-$(CONFIG_XIA_PPAL_AD) += ppal_ad/
- A Kconfig file for AD in net/xia/ppal_ad/Kconfig:
config XIA_PPAL_AD
	tristate "Autonomous Domain Principal (EXPERIMENTAL)"
	default m
	help
	  Autonomous Domain (AD) Principal is essential to route in XIA.
	  If you have added XIA, you likely want to add this principal
	  as well.
	  If you are unsure how to answer this question, answer M.
- A Makefile for AD in net/xia/ppal_ad/Makefile:
#
# Makefile for AD principal
#
obj-$(CONFIG_XIA_PPAL_AD) += xia_ppal_ad.o
xia_ppal_ad-objs := main.o
Note: The steps taken in this subsection should likely at a minimum be adapted for most principals.
For loading and unloading a module, each principal should have an initialization and exit function. For example, the AD module has xia_ad_init and xia_ad_exit for those respective tasks. Below are descriptions of the functions called within xia_ad_init, which may help you to determine whether they are necessary for your new principal:
- vxt_register_xidty is called to create a virtual XID type, which serves as an index into an array of loaded principals for the main XIA routing mechanism.
- xia_register_pernet_subsys, which is a wrapper for register_pernet_subsys, is called in order to guarantee that XIA principals are initialized after XIA core. This function registers a network namespace subsystem for this principal which has initialization and exit functions that are called when network namespaces are created and destroyed. More information about network namespaces is in the section "Defining Network Namespace Initialization/Exit Functions" below.
- xip_add_router is called to add the given principal's routing process to the overall XIA routing process. In other words, it puts a pointer to this principal's routing function in the array of routing functions to be selected at routing-time by the XIA routing mechanism.
- ppal_add_map is called to map a principal name to a principal type identifier.
The following steps are taken in xia_ad_exit:
- ppal_del_map is called to remove the mapping between a principal name and a principal type identifier.
- xip_del_router is called to delete this principal's routing process from the overall XIA routing process.
- xia_unregister_pernet_subsys is called to unregister the network namespace for this principal.
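The ordering of these calls matters: each step that succeeded in the initialization function has to be undone, in reverse order, if a later step fails. The following is a minimal userspace sketch of that pattern, not the AD module's code. All names in it (sketch_register, sketch_ppal_init, and so on) are hypothetical stand-ins that only record the order of calls, and releasing the virtual XID type on exit is an assumption of the sketch rather than something listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the registration steps described above.
 * They are NOT the Linux XIA functions; each one only records a letter
 * so that the order of calls can be inspected. */
enum sketch_step { STEP_XIDTY, STEP_PERNET, STEP_ROUTER, STEP_MAP };

static char sketch_log[32];	/* one letter per call, NUL-terminated */
static size_t sketch_log_len;
static int sketch_fail_at = -1;	/* step forced to fail, or -1 for none */

static void sketch_record(char tag)
{
	assert(sketch_log_len < sizeof(sketch_log) - 1);
	sketch_log[sketch_log_len++] = tag;
	sketch_log[sketch_log_len] = '\0';
}

/* Clear the call log and choose which step (if any) should fail. */
void sketch_reset(int fail_at)
{
	memset(sketch_log, 0, sizeof(sketch_log));
	sketch_log_len = 0;
	sketch_fail_at = fail_at;
}

const char *sketch_calls(void)
{
	return sketch_log;
}

static int sketch_register(enum sketch_step step, char tag)
{
	sketch_record(tag);
	return (int)step == sketch_fail_at ? -1 : 0;
}

/* Register in order; on failure, undo only what already succeeded. */
int sketch_ppal_init(void)
{
	int rc;

	rc = sketch_register(STEP_XIDTY, 'X');	/* virtual XID type */
	if (rc)
		goto out;
	rc = sketch_register(STEP_PERNET, 'N');	/* per-namespace subsystem */
	if (rc)
		goto undo_xidty;
	rc = sketch_register(STEP_ROUTER, 'R');	/* routing function */
	if (rc)
		goto undo_pernet;
	rc = sketch_register(STEP_MAP, 'M');	/* name-to-type mapping */
	if (rc)
		goto undo_router;
	return 0;

undo_router:
	sketch_record('r');
undo_pernet:
	sketch_record('n');
undo_xidty:
	sketch_record('x');
out:
	return rc;
}

/* Tear down in the reverse order of registration. */
void sketch_ppal_exit(void)
{
	sketch_record('m');
	sketch_record('r');
	sketch_record('n');
	sketch_record('x');
}
```

With the router step forced to fail, the recorded sequence is "XNRnx": the two earlier registrations are rolled back and the name mapping is never attempted.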
Note: The steps taken in this subsection should likely at a minimum be adapted for most principals.
Network namespaces allow an operating system to have separate instances of interfaces and routing tables that operate separately from each other. Since Linux XIA supports this functionality, there are also initialization and exit functions for adding or deleting per-principal information to a network namespace. The AD module has ad_net_init and ad_net_exit for this purpose.
The following steps are taken in ad_net_init:
- create_ad_ctx is called to create a new AD context and initialize it. In general, a principal's context contains information relevant to the principal in each network namespace instance, such as the principal type, routing table, dependency anchors, etc. Some contexts can have additional information; for example, the XDP principal's context has fields for XIA socket information, and the U4ID principal's context has a field for the source of an encapsulation tunnel.
- init_xid_table is called to create a new table in the FIB for AD routing.
- xip_add_ppal_ctx is called to add the AD context to the current struct net. It does this by attaching to struct netns_xia, which has an entry in struct net. This prevents the need for an arbitrary number of new entries in struct net, which would depend on the number of principals loaded.
The following steps are taken in ad_net_exit:
- xip_del_ppal_ctx is called to delete the AD context from the given struct net.
- free_ad_ctx is called to free the resources that this AD context is using, including the routing table.
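The idea of attaching one context per principal to each network namespace can be illustrated with a small userspace sketch. It assumes a fixed-size array of context pointers indexed by virtual XID type; the types and functions below (sk_netns, sk_ppal_ctx, sk_add_ppal_ctx, sk_del_ppal_ctx) are hypothetical simplifications and not the definitions used by Linux XIA.

```c
#include <assert.h>
#include <stddef.h>

enum { SK_MAX_PPALS = 8 };	/* hypothetical bound on loaded principals */

/* Hypothetical per-principal, per-namespace context. */
struct sk_ppal_ctx {
	unsigned int xid_type;	/* principal type this context belongs to */
	unsigned int n_entries;	/* stand-in for the principal's routing table */
};

/* Hypothetical namespace: one slot per virtual XID type, so the
 * namespace struct does not grow with the number of principals. */
struct sk_netns {
	struct sk_ppal_ctx *ctx[SK_MAX_PPALS];
};

/* Attach a context to a namespace; fails if the slot is invalid or taken. */
int sk_add_ppal_ctx(struct sk_netns *ns, int vxt, struct sk_ppal_ctx *ctx)
{
	assert(ns && ctx);
	if (vxt < 0 || vxt >= SK_MAX_PPALS || ns->ctx[vxt])
		return -1;
	ns->ctx[vxt] = ctx;
	return 0;
}

/* Detach and return the context so that the caller can free it. */
struct sk_ppal_ctx *sk_del_ppal_ctx(struct sk_netns *ns, int vxt)
{
	struct sk_ppal_ctx *old;

	assert(ns);
	if (vxt < 0 || vxt >= SK_MAX_PPALS)
		return NULL;
	old = ns->ctx[vxt];
	ns->ctx[vxt] = NULL;
	return old;
}
```

Because each namespace owns its own slot array, the same principal can hold independent routing state in two namespaces at once.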
Note: The steps taken in this subsection can vary for different principals.
For each principal, there are typically two types of routing table entries: local and main. Local entries represent instances of the principal that are hosted on the local host. Main entries represent instances of the principal that are hosted elsewhere in the network. For example, a local AD represents the local autonomous domain, while a main AD represents an autonomous domain somewhere else in the network. This is how most network entities handle routing tables: the notion of local and main entries is used in the IP stack as well as in most principals in XIA. It is recommended that new principals use the local/main dichotomy, but it is not strictly necessary. For example, the U4ID principal only uses local routing table entries, and the United4ID principal doesn't have any distinction between local and main entries (since it redirects to other principals).
The way local and main XIDs are handled varies from principal to principal. Some principals choose not to handle certain entries at all, and instead redirect the routing to the next principal in the DAG. For example, the AD principal keeps track of local ADs and when one is encountered in the DAG, the only action that is required is to simply move the last node pointer. In effect, this acknowledges that the indicated autonomous domain has been reached and progress is indicated in the DAG by advancing to the next node. However, the AD principal also keeps track of main ADs and when one is encountered in the DAG, the AD principal redirects to the next node in the DAG. The routing mechanism for the principal is then invoked to see if it can be routed. This redirecting process is not unique to main ADs, however; for example, main XDPs also redirect. How local and main entries are routed differently is handled in the next subsection.
This subsection is about creating functions for adding a new XID to the routing table, deleting an XID from the routing table, freeing an XID once it has been removed from the routing table, and dumping the routing table for display. These four pieces of functionality are implemented each for both local XIDs and main XIDs, and are grouped together as function pointers inside a type xia_ppal_all_rt_eops_t. For example, the AD version of this structure looks like:
static const xia_ppal_all_rt_eops_t ad_all_rt_eops = {
	[XRTABLE_LOCAL_INDEX] = {
		.newroute = local_newroute,
		.delroute = fib_default_local_delroute,
		.dump_fxid = local_dump_ad,
		.free_fxid = local_free_ad,
	},
	XIP_FIB_REDIRECT_MAIN,
};
The first entry in this structure is a collection containing the four function pointers mentioned above for local ADs. The AD principal defines its own routines for three of them: local_newroute (adding a new XID to the routing table), local_dump_ad (dumping the routing table), and local_free_ad (freeing an XID that has been removed from the routing table). The other function, fib_default_local_delroute, is used for deleting an XID from the routing table. This function can be used because there is no special processing required for local AD routing table entries.
The second entry in this structure is a macro that represents another collection containing the four function pointers mentioned above for main ADs. The function pointers in that macro point to default routines that manage XIDs that redirect, as main ADs do.
The xia_ppal_all_rt_eops_t will vary based on how the local and main entries are handled, but most of them contain very similar core chunks of code. Compare these functions in the various versions of the principals to get an understanding of how they work; the AD principal is again a good place to start.
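The structure above relies on two C features: designated array initializers indexed by table, and a macro that expands to a complete initializer for one slot. The following userspace sketch shows the same pattern with only two operations per table. Every name in it is hypothetical, and the assumption that the real type is an array of per-table operation structs is an inference from the example above, not a statement about the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table indices, in the spirit of the local/main split. */
enum { SK_TABLE_LOCAL, SK_TABLE_MAIN, SK_TABLE_COUNT };

/* Hypothetical per-table operations (only two of the four are modeled). */
struct sk_rt_eops {
	int (*newroute)(int id);
	int (*delroute)(int id);
};

typedef struct sk_rt_eops sk_all_rt_eops_t[SK_TABLE_COUNT];

/* Default handlers shared by principals with no special needs. */
static int sk_default_newroute(int id) { return id; }
static int sk_default_delroute(int id) { return -id; }

/* A principal-specific handler; the offset marks that it ran. */
static int sk_custom_local_newroute(int id) { return 100 + id; }

/* Macro that fills the whole "main" slot with the default handlers. */
#define SK_DEFAULT_MAIN					\
	[SK_TABLE_MAIN] = {				\
		.newroute = sk_default_newroute,	\
		.delroute = sk_default_delroute,	\
	}

static const sk_all_rt_eops_t sk_all_rt_eops = {
	[SK_TABLE_LOCAL] = {
		.newroute = sk_custom_local_newroute,
		.delroute = sk_default_delroute,
	},
	SK_DEFAULT_MAIN,
};

/* Dispatch an "add" request to whichever handler the table registered. */
int sk_newroute(int table, int id)
{
	assert(table >= 0 && table < SK_TABLE_COUNT);
	return sk_all_rt_eops[table].newroute(id);
}

/* Dispatch a "delete" request the same way. */
int sk_delroute(int table, int id)
{
	assert(table >= 0 && table < SK_TABLE_COUNT);
	return sk_all_rt_eops[table].delroute(id);
}
```

A principal that needs custom behavior for only one operation overrides that single pointer and reuses the defaults for the rest, which is what the AD example does for local deletions.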
Note: The steps taken in this subsection can vary for different principals.
Once it is decided how local entries and main entries will behave and functions are implemented to populate the routing tables, we must define how a principal is routed when an instance of the principal is encountered in a DAG. The distinction between local and main entries is again important, since they will likely be routed in different ways.
When the last node pointer in the DAG refers to an AD principal, the following steps are taken in ad_deliver:
- xia_find_xid_rcu is called to find the routing table entry for the given AD XID. If no entry is found, then we need to add a negative dependency to indicate an unknown instance of a known principal.
- Depending on whether the given AD XID is a local AD or main AD, the DST's passthrough and sink actions are defined appropriately in ad_deliver using the following constants:
- XDA_DIG: select this edge, which in effect advances the last node pointer and a new query with the next principal is necessary.
- XDA_ERROR: an error occurred. This is frequently used for cases where a principal is inappropriately placed as a sink in a DAG; for example, an AD cannot be a sink.
- XDA_DROP: silently drop the packet without further processing.
- XDA_METHOD: use the associated DST input/output methods for additional processing.
- XDA_METHOD_AND_SELECT_EDGE: use the associated DST input/output method for additional processing and advance the last node in the DAG.
- Although all of these constants are available, the AD principal does not use all of them. For local ADs, if the AD is a sink then XDA_ERROR is used. For local ADs, if the AD is not a sink (it is a passthrough) then XDA_DIG is used. This indicates that we've reached the autonomous domain and we can continue processing the DAG. XRP_ACT_FORWARD is returned to indicate that the DST entry should now be equipped to forward packets based on the information given.
- For main ADs, redirects are used. Therefore, fib_mrd_redirect is called to find the next XID entry and XRP_ACT_REDIRECT is returned to indicate that next XID entry will be consulted to see how to continue forwarding.
- If further packet processing is necessary, then either XDA_METHOD or XDA_METHOD_AND_SELECT_EDGE should be used and DST input/output functions must be defined. See the subsection below for information about defining DST input/output functions.
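The decision logic described in the steps above can be summarized in a userspace sketch. It models only the control flow: look up the XID, report an unknown instance, set the passthrough and sink actions for a local entry, or hand back the next XID for a main entry. The table contents, type names, and return values are all hypothetical; the real ad_deliver works on kernel structures that are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sk_kind   { SK_LOCAL, SK_MAIN };
enum sk_action { SK_ACT_UNSET, SK_ACT_DIG, SK_ACT_ERROR };
enum sk_result { SK_NEG_DEPENDENCY, SK_FORWARD, SK_REDIRECT };

/* Hypothetical routing table entry. */
struct sk_entry {
	const char *xid;
	enum sk_kind kind;
	const char *gw;		/* next XID to consult; main entries only */
};

/* Hypothetical stand-in for the actions stored in a DST entry. */
struct sk_dst {
	enum sk_action passthrough_action;
	enum sk_action sink_action;
};

static const struct sk_entry sk_table[] = {
	{ "ad-local",  SK_LOCAL, NULL },
	{ "ad-remote", SK_MAIN,  "hid-gateway" },
};

static const struct sk_entry *sk_find(const char *xid)
{
	size_t i;

	for (i = 0; i < sizeof(sk_table) / sizeof(sk_table[0]); i++)
		if (strcmp(sk_table[i].xid, xid) == 0)
			return &sk_table[i];
	return NULL;
}

enum sk_result sk_deliver(const char *xid, struct sk_dst *dst,
			  const char **next_xid)
{
	const struct sk_entry *e;

	assert(xid && dst && next_xid);
	e = sk_find(xid);
	if (!e)
		return SK_NEG_DEPENDENCY;	/* unknown instance */

	if (e->kind == SK_LOCAL) {
		/* Reached the domain: keep walking the DAG, but an AD
		 * may not be the final node. */
		dst->passthrough_action = SK_ACT_DIG;
		dst->sink_action = SK_ACT_ERROR;
		return SK_FORWARD;
	}

	/* Main entry: let the caller consult the next XID instead. */
	*next_xid = e->gw;
	return SK_REDIRECT;
}
```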
The AD principal does not use any DST input/output functions because no special packet processing is needed when handling ADs. However, an example of this functionality can be found in XDP's deliver function:
if (xdst->input) {
	xdst->dst.input = local_input_input;
	xdst->dst.output = local_input_output;
} else {
	xdst->dst.input = local_output_input;
	xdst->dst.output = local_output_output;
}
First, the function checks the value of xdst->input. If that value is true, then the packet in question came from a network device. If it is false, then it came from a local application. Therefore, if a packet came from a network device, then the local_input_* functions are used. If a packet came from an application, then the local_output_* functions are used. Note that "local" in the function names is used because this is the part of the code that handles local XDPs. If it were code that handled main XDPs, then the function names would be main_input_* and main_output_*.
Once the origin of the packet is decided, we assign separate DST input and output functions. The input functions are called when a packet arrives at a network interface, and the output functions are called when a packet is being written to a network interface.
For example, local_input_input is called when a packet that is destined for a local XDP arrives from an external source. Since local XDPs represent a socket, local_input_input simply delivers the packet to the socket. However, local_input_output is called when a packet that arrived from an external source is to be written to a network device. Since this case should not happen, local_input_output reports an error.
xiaconf is a collection of utilities for controlling an XIA stack. You should again look at the xip code for principals that have already been implemented to get an idea for how this code should work. A good place to start would be copying the code in xip/xipad.c to a file for your new principal (such as in xip/xipXXX.c) and to work from there. In this guide, we will use the AD principal as an example.
In the section above about defining routing table utility functions, we implemented the kernel-side routing table functionality in the principal's module. Now we are going to define the userspace functionality that invokes those routines in the kernel to allow users to interact with the routing table.
Note: The steps taken in this subsection should likely at a minimum be adapted for most principals.
To add an additional command to xip, you need to add references to your new principal in a Makefile and in some header files. These references are:
- A reference in xip/Makefile on the line XIP_OBJ_INCLUDE of the following form (in alphabetical order):
XIP_OBJ_INCLUDE = xip.o xiart.o xipad.o xipdst.o xipxdp.o xipserval.o xipXXX.o
- A reference in xip/xip.c:usage of the following form (in alphabetical order):
"where OBJECT := { ad | dst | hid | serval | xdp | XXX }\n"
- A reference in xip/xip.c struct cmd cmds[] of the following form (in alphabetical order):
{ "XXX", do_XXX },
- A reference in xip/xip_common.h of the following form (in alphabetical order):
/* From xipXXX.c */
int do_XXX(int argc, char **argv);
- An entry in etc-production/xia/principals containing the name and number of your principal (in numerical order by XID type).
Note: The steps taken in this subsection should likely at a minimum be adapted for most principals.
The first thing to do is to design and implement a usage function. You need to decide how you are going to let users interact with instances of your principal in the routing table. Typically, users should be allowed to add and delete local entries, add and delete main entries (routes), and show local and main entries.
For example, the usage function for AD is defined as:
static int usage(void)
{
	fprintf(stderr,
"Usage: xip ad { addlocal | dellocal } ID\n"
"       xip ad addroute ID gw XID\n"
"       xip ad delroute ID\n"
"       xip ad show { locals | routes }\n"
"where ID := HEXDIGIT{20}\n"
"      XID := PRINCIPAL '-' ID\n"
"      PRINCIPAL := '0x' NUMBER | STRING\n");
	return -1;
}
This shows how the user can use the various ad commands with xip.
The next thing to do is to define a do_help function, which essentially invokes the usage method and exits. This function will likely look the same for most principals. For the AD principal, it looks like:
static int do_help(int argc, char **argv)
{
	UNUSED(argc);
	UNUSED(argv);
	usage();
	exit(1);
}
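Handlers such as do_help are typically reached through a table that maps a command name to a function, in the same shape as the { "XXX", do_XXX } entry added to xip.c earlier. The sketch below shows one way such a dispatch table can work in plain C. The struct, the handler names, and the prefix-matching helper are hypothetical; note in particular that sk_is_prefix returns nonzero on a match, which is the opposite convention from a strcmp-style helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command table entry: a name and its handler. */
struct sk_cmd {
	const char *name;
	int (*func)(int argc, char **argv);
};

/* Dummy handlers; the return value encodes which one ran and how
 * many arguments it received. */
static int sk_addlocal(int argc, char **argv) { (void)argv; return 10 + argc; }
static int sk_dellocal(int argc, char **argv) { (void)argv; return 20 + argc; }
static int sk_show(int argc, char **argv)     { (void)argv; return 30 + argc; }

static const struct sk_cmd sk_cmds[] = {
	{ "addlocal", sk_addlocal },
	{ "dellocal", sk_dellocal },
	{ "show",     sk_show },
	{ NULL,       NULL }		/* sentinel */
};

/* Return nonzero if arg is a non-empty prefix of name. */
static int sk_is_prefix(const char *arg, const char *name)
{
	size_t len = strlen(arg);

	return len > 0 && len <= strlen(name) &&
	       strncmp(arg, name, len) == 0;
}

/* Run the first command whose name starts with argv[0], passing it
 * the remaining arguments. Returns -1 if nothing matches. */
int sk_dispatch(int argc, char **argv)
{
	const struct sk_cmd *c;

	assert(argv);
	if (argc < 1)
		return -1;
	for (c = sk_cmds; c->name; c++)
		if (sk_is_prefix(argv[0], c->name))
			return c->func(argc - 1, argv + 1);
	return -1;
}
```

Prefix matching lets users abbreviate commands (for example "sh" for "show"), at the cost that the order of entries decides which command wins when an abbreviation is ambiguous.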
Note: The steps taken in this subsection can vary for different principals. You may want to review the section Defining Functions for Adding and Deleting Main Entries (Routes) in conjunction with this section to get a feel for what is appropriate for your principal.
The difference between adding and deleting a routing table entry can be very minimal. For the case of the AD principal, the difference is essentially two variables. When a new local entry is added, do_addlocal is called. When a local entry is deleted, do_dellocal is called:
static int do_addlocal(int argc, char **argv)
{
	return do_local(argc, argv, 1);
}
static int do_dellocal(int argc, char **argv)
{
	return do_local(argc, argv, 0);
}
Beyond these functions, the code for these two cases is exactly the same. Both functions call do_local to do some processing and to fetch a DST entry for the XID in question, but they pass in as a third parameter whether or not the request is for a new entry (1 is to add an entry, 0 is to delete an entry). do_local then calls modify_local, which contains some rtnetlink code. The rtnetlink environment allows userspace applications to communicate with the network stack through request messages.
Most of this code can be copied from principal to principal without changing it, if all you need to do is keep track of local entries without any additional processing. If you do have additional processing to do, then your mileage may vary.
Recall that the only difference between adding and deleting a local entry so far is the third parameter passed to do_local. do_local actually further passes that value to modify_local as int to_add and then uses it in the following way:
static int modify_local(const struct xia_xid *dst, int to_add)
{
	...
	if (to_add) {
		req.n.nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE|NLM_F_EXCL;
		req.n.nlmsg_type = RTM_NEWROUTE;
	} else {
		req.n.nlmsg_flags = NLM_F_REQUEST;
		req.n.nlmsg_type = RTM_DELROUTE;
	}
	...
}
The fields nlmsg_flags and nlmsg_type are the only ones that differentiate adding a local entry from deleting a local entry. The rest of the code in modify_local is exactly the same.
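To make the effect of those two fields concrete, here is a small self-contained sketch that fills in only the netlink header for an add or a delete. It uses the standard Linux userspace headers for the constants; the function name sk_fill_header is hypothetical, and the rest of the request (the route message and its attributes) is deliberately left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Fill a netlink header for adding (to_add != 0) or deleting a route.
 * Only the header is prepared; no message is sent. */
void sk_fill_header(struct nlmsghdr *n, int to_add)
{
	assert(n);
	memset(n, 0, sizeof(*n));
	/* Header plus the fixed-size route message that follows it. */
	n->nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));

	if (to_add) {
		/* CREATE: make the entry if it is missing.
		 * EXCL: fail instead of replacing an existing entry. */
		n->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
		n->nlmsg_type = RTM_NEWROUTE;
	} else {
		n->nlmsg_flags = NLM_F_REQUEST;
		n->nlmsg_type = RTM_DELROUTE;
	}
}
```

Combining NLM_F_CREATE with NLM_F_EXCL means that adding an entry that already exists is reported as an error rather than silently overwriting it.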
Note: The steps taken in this subsection can vary for different principals. You may want to review the section Defining Functions for Adding and Deleting Local Entries in conjunction with this section to get a feel for what is appropriate for your principal.
Main routing table entries (routes) can be different from local routing table entries. Typically, a route is a combination of an XID and a gateway through which we can route to that XID. Therefore, we are usually no longer just keeping track of the XID entry, but we are also concerned about the gateway for that XID, such as in the case of the AD principal (which we will look at as an example).
However, just as in the case of the local entry, the difference between adding and deleting a route can be very minimal. For the case of the AD principal, the difference is essentially three statements. When a new route is added, do_addroute is called. When a route is deleted, do_delroute is called:
static int do_addroute(int argc, char **argv)
{
	struct xia_xid dst, gw;
	if (argc != 3) {
		fprintf(stderr, "Wrong number of parameters\n");
		return usage();
	}
	if (strcmp(argv[1], "gw")) {
		fprintf(stderr, "Wrong parameters\n");
		return usage();
	}
	xrt_get_ppal_id("ad", usage, &dst, argv[0]);
	xrt_get_xid(usage, &gw, argv[2]);
	return xrt_modify_route(&dst, &gw);
}
static int do_delroute(int argc, char **argv)
{
	struct xia_xid dst;
	if (argc != 1) {
		fprintf(stderr, "Wrong number of parameters\n");
		return usage();
	}
	xrt_get_ppal_id("ad", usage, &dst, argv[0]);
	return xrt_modify_route(&dst, NULL);
}
Beyond these functions, the code for these two cases is exactly the same. Both functions call xrt_modify_route, which contains some rtnetlink code. However, notice that they pass in as the second parameter a gateway for the new route (or NULL to indicate that the route should be deleted).
Again, the difference between adding and deleting an entry boils down to just a few statements inside xrt_modify_route. This is just like modify_local, but with a slight change. Moreover, notice that when we are adding an entry, we need to keep track of the gateway:
if (to_add)
	addattr_l(&req.n, sizeof(req), RTA_GATEWAY, gw, sizeof(*gw));
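What a helper like addattr_l does is append one type-length-value attribute to the end of the netlink message and grow nlmsg_len accordingly. The sketch below is a hypothetical re-implementation for illustration only (sk_addattr and struct sk_req are not names from xiaconf); it uses the standard RTA_* and NLMSG_* macros from the Linux userspace headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request buffer: header, route message, attribute space. */
struct sk_req {
	struct nlmsghdr n;
	struct rtmsg r;
	char buf[256];
};

/* Append one attribute to the message that starts at n. maxlen is the
 * total size of the buffer holding the message. Returns 0 on success,
 * or -1 if the attribute does not fit. */
int sk_addattr(struct nlmsghdr *n, size_t maxlen, unsigned short type,
	       const void *data, size_t alen)
{
	size_t len, off;
	struct rtattr *rta;

	assert(n && data);
	len = RTA_LENGTH(alen);			/* attribute header + payload */
	off = NLMSG_ALIGN(n->nlmsg_len);	/* aligned end of the message */

	if (len > 0xffff || off + RTA_ALIGN(len) > maxlen)
		return -1;

	rta = (struct rtattr *)((char *)n + off);
	rta->rta_type = type;
	rta->rta_len = (unsigned short)len;
	memcpy(RTA_DATA(rta), data, alen);

	n->nlmsg_len = (__u32)(off + RTA_ALIGN(len));
	return 0;
}
```

In the usage exercised by the accompanying checks, a dummy 24-byte payload stands in for the gateway XID; its size is an assumption made for the example.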
Note: The steps taken in this subsection can vary for different principals.
This section describes how to define functions that dump, or display, the routing table entries for a principal. The difference between the local and main entries is minimal. When defining these functions for your own principal, it's possible that the code will be almost exactly the same as the code already existing in the principals, especially since much of the code is specific to the rtnetlink mechanism and not to the principals themselves.
The first thing to do is to define a filter for the routing table dump. This ensures that we get the correct version (local vs. main) when displaying the table. The filter is a tuple containing the 32-bit table identifier and a 32-bit XID type:
static struct
{
	__u32 tb;
	xid_type_t xid_type;
} filter;
In the AD principal, when the user tries to dump the routing table the do_show function is called in xip/xipad.c. After checking for the correct number of parameters, this function then determines whether the user is requesting the local table or the main table:
name = argv[0];
if (!matches(name, "locals")) {
	return dump(XRTABLE_LOCAL_INDEX);
} else if (!matches(name, "routes")) {
	xid_type_t ty;
	assert(!ppal_name_to_type("ad", &ty));
	return xrt_list_rt_redirects(XRTABLE_MAIN_INDEX, ty);
} else {
	fprintf(stderr, "Unknown routing table '%s', it must be either 'locals', or 'routes'\n",
		name);
	return usage();
}
For the local case, the dump function is called. This initiates a dump request message through rtnetlink, and indicates that print_route will be the callback function. When it is invoked, print_route does some error checking (including using the filter to make sure the correct table is being shown) and prints a routing table entry. The print_route function is called once for every entry in the table. For most principals, this is how local entries will be printed out. It's possible that there could be some more information associated with the entry that needs to be shown, however. For that, see the section titled "Dumping Additional Information with Each Entry" below.
For the main case, the do_show function calls xrt_list_rt_redirects. This function can be used to dump any routing table entries that redirect in the forwarding mechanism, as main AD entries do. xrt_list_rt_redirects, which is in xip/xiart.c, prints out the exact same thing that print_route displays for the local case.
Sometimes it is necessary to display more information about a routing table entry than just its XID. For example, when showing neighbors in the HID principal, the XID, hardware address, and interface are displayed for each routing table entry:
# xip hid showneighs
to hid-7ac27f90663ef36da12cfcc37c9a6bb6b85dec96 lladdr: 00:90:f5:ba:71:5f dev: eth0 flags []
This can be done by adding nested attributes to the routing table entry via rtnetlink. First, this has to be done on the kernel side in the principal's module. For example, in the HID code, the following struct is declared:
struct rtnl_xia_hid_hdw_addrs {
	__u16 hha_len;
	__u8 hha_addr_len;
	__u8 hha_ha[MAX_ADDR_LEN];
	int hha_ifindex;
};
This is the data structure that will hold all of the additional information. Also in the HID code, the function main_dump_hid uses the above struct:
ha_attr = nla_nest_start(skb, RTA_MULTIPATH);
...
list_for_each_entry(pos_ha, &mhid->xhm_haddrs, ha_list) {
	struct rtnl_xia_hid_hdw_addrs *rtha =
		nla_reserve_nohdr(skb, sizeof(*rtha));
	...
	rtha->hha_addr_len = pos_ha->dev->addr_len;
	memmove(rtha->hha_ha, pos_ha->ha, rtha->hha_addr_len);
	rtha->hha_ifindex = pos_ha->dev->ifindex;
	...
}
nla_nest_end(skb, ha_attr);
Notice that this information is stored in the RTA_MULTIPATH attribute of the table. This is important, because back in the xip program, we are going to fetch this information through that same attribute (there are other RTA_* attributes available for different purposes -- net/xia/fib_frontend.c in the kernel tree is a good starting point to see how they are used). The print_neigh function in xip/xiphid.c is very similar to the print_route function described above in xip/xipad.c, but print_neigh also takes into account the extra information:
if (tb[RTA_MULTIPATH]) {
	struct rtnl_xia_hid_hdw_addrs *rtha =
		RTA_DATA(tb[RTA_MULTIPATH]);
	...
}
Once the data structure is fetched, each entry can be printed out one at a time using a loop and the functions RTHA_OK and RTHA_NEXT from the HID code.
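The loop over those records follows a common pattern for length-prefixed data: check that a complete record fits in the remaining bytes, process it, then advance by the record's own length. The sketch below illustrates that pattern in userspace. It does not use RTHA_OK or RTHA_NEXT, whose exact definitions are not shown here; the record layout and every name in it (sk_hdw_addr, sk_count_records) are hypothetical, loosely modeled on the struct above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_MAX_ADDR_LEN 32	/* hypothetical bound on address length */

/* Hypothetical record: its first field holds its own total length. */
struct sk_hdw_addr {
	uint16_t len;
	uint8_t addr_len;
	uint8_t addr[SK_MAX_ADDR_LEN];
	int ifindex;
};

/* Walk a buffer of back-to-back records. Returns how many complete
 * records were found and stores the sum of their interface indexes. */
size_t sk_count_records(const void *buf, size_t buflen, int *ifindex_sum)
{
	const char *p = buf;
	size_t remaining = buflen;
	size_t count = 0;
	struct sk_hdw_addr rec;

	assert(buf && ifindex_sum);
	*ifindex_sum = 0;

	while (remaining >= sizeof(rec)) {
		/* Copy out to avoid alignment assumptions on buf. */
		memcpy(&rec, p, sizeof(rec));

		/* Stop on a record that is malformed or truncated. */
		if (rec.len < sizeof(rec) || rec.len > remaining)
			break;

		count++;
		*ifindex_sum += rec.ifindex;

		/* Advance by the length the record itself declares. */
		p += rec.len;
		remaining -= rec.len;
	}
	return count;
}
```

Validating the declared length against the remaining bytes before advancing is what keeps a truncated or corrupt attribute from sending the loop past the end of the buffer.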
All grants that have generously supported the development of Linux XIA are listed on our Funding page.