The x4c Book

This book provides an introduction to the P4 language using the x4c compiler. The presentation is by example. Each concept is introduced through example P4 code and programs - then demonstrated using simple harnesses that pass packets through compiled P4 pipelines.

A basic knowledge of programming and networking is assumed. The x4c Rust compilation target will be used in this book, so a working knowledge of Rust is also good to have.

The Basics

This chapter will take you from zero to a simple hello-world program in P4. This will include:

  • Getting a Rust toolchain set up and installing the x4c compiler.
  • Compiling a P4 hello world program into a Rust program.
  • Writing a bit of Rust code to push packets through our compiled P4 pipelines.

Installation

Rust

The first thing we'll need to do is install Rust. We'll be using a tool called rustup. On Unix/Linux like platforms, simply run the following from your terminal. For other platforms see the rustup docs.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

It may be necessary to restart your shell session after installing Rust.

x4c

Now we will install the x4c compiler using Rust's cargo tool.

cargo install --git https://github.com/oxidecomputer/p4 x4c

You should now be able to run x4c.

x4c --help
x4c 0.1

USAGE:
    x4c [OPTIONS] <FILENAME> [TARGET]

ARGS:
    <FILENAME>    File to compile
    <TARGET>      What target to generate code for [default: rust] [possible values: rust,
                  red-hawk, docs]

OPTIONS:
        --check          Just check code, do not compile
    -h, --help           Print help information
    -o, --out <OUT>      Filename to write generated code to [default: out.rs]
        --show-ast       Show parsed abstract syntax tree
        --show-hlir      Show high-level intermediate representation info
        --show-pre       Show parsed preprocessor info
        --show-tokens    Show parsed lexical tokens
    -V, --version        Print version information

That's it! We're now ready to dive into P4 code.

Hello World

Let's start out our introduction of P4 with the obligatory hello world program.

Parsing

The first bit of programmable logic packets hit in a P4 program is a parser. Parsers do the following.

  1. Describe a state machine for packet parsing.
  2. Extract raw data into headers with typed fields.
  3. Decide if a packet should be accepted or rejected based on parsed structure.

In the code below, we can see that parsers are defined somewhat like functions in general-purpose programming languages. They take a set of parameters and have a curly-brace delimited block of code that acts over those parameters.

Each of the parameters has an optional direction that we see here as out or inout, a type, and a name. We'll get to data types in the next section. For now, let's focus on what this parser code is doing.

The parameters shown here are typical of what you will see for P4 parsers. The exact set of parameters varies depending on the ASIC the P4 code is being compiled for. In general, though, there will always need to be a packet, a set of headers to extract packet data into, and a bit of metadata about the packet that the ASIC has collected.

parser parse (
    packet_in pkt,
    out headers_t headers,
    inout ingress_metadata_t ingress,
){
    state start {
        pkt.extract(headers.ethernet);
        transition finish;
    }

    state finish {
        transition accept;
    }
}

Parsers are made up of a set of states and transitions between those states. Parsers must always include a start state. Our start state extracts an Ethernet header from the incoming packet and places it into the headers_t parameter passed to the parser. We then transition to the finish state, where we simply transition to the implicit accept state. We could have just transitioned to accept from start, but wanted to show transitions between user-defined states in this example.

Transitioning to the accept state means that the packet will be passed to a control block for further processing. Control blocks will be covered a few sections from now. Parsers can also transition to the implicit reject state. This means that the packet will be dropped and not go on to any further processing.
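
To make the state-machine idea concrete, here is a minimal Rust sketch of the same walk through start, finish, and the implicit terminal states. It is a hand-written model for illustration, not the code x4c generates; the names State, ETHERNET_LEN, and parse are hypothetical, and rejecting packets that are too short to extract from is an assumption of this sketch.

```rust
/// Parser states, mirroring the user-defined `start` and `finish` states of
/// the P4 parser plus the implicit `accept` and `reject` states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Start,
    Finish,
    Accept,
    Reject,
}

/// Length of an Ethernet header: 6 + 6 + 2 octets.
pub const ETHERNET_LEN: usize = 14;

/// Walk the state machine over a raw packet. Returns the terminal state and
/// the number of bytes consumed by extraction.
pub fn parse(pkt: &[u8]) -> (State, usize) {
    let mut state = State::Start;
    let mut offset = 0;
    loop {
        state = match state {
            State::Start => {
                // Stands in for `pkt.extract(headers.ethernet)`. In this
                // sketch a packet that is too short is rejected.
                if pkt.len() < ETHERNET_LEN {
                    State::Reject
                } else {
                    offset += ETHERNET_LEN;
                    State::Finish
                }
            }
            // `transition accept;`
            State::Finish => State::Accept,
            // Terminal states end the walk.
            State::Accept | State::Reject => return (state, offset),
        };
    }
}
```

Calling parse on a 60-byte buffer ends in State::Accept with 14 bytes consumed, while a 13-byte buffer ends in State::Reject.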

Data Types

There are two primary data types in P4, struct and header types.

Structs

Structs in P4 are similar to structs you'll find in general purpose programming languages such as C, Rust, and Go. They are containers for data with typed data fields. They can contain basic data types, headers as well as other structs.

Let's take a look at the structs in use by our hello world program.

The first is a structure containing headers for our program to extract packet data into. This headers_t structure is simple and only contains one header. However, there may be an arbitrary number of headers in the struct. We'll discuss headers in the next section.

struct headers_t {
    ethernet_h ethernet;
}

The next structure is a bit of metadata provided to our parser by the ASIC. In our simple example this just includes the port that the packet came from. So if our code is running on a four port switch, the port field would take on a value between 0 and 3 depending on which port the packet came in on.

struct ingress_metadata_t {
    bit<16> port;
}

As the name suggests bit<16> is a 16-bit value. In P4 the bit<N> type commonly represents unsigned integer values. We'll get more into the primitive data types of P4 later.

Headers

Headers are the result of parsing packets. They are similar in nature to structures with the following differences.

  1. Headers may not contain headers.
  2. Headers have a set of methods isValid(), setValid(), and setInvalid() that provide a means for parsers and control blocks to coordinate on the parsed structure of packets as they move through pipelines.

Let's take a look at the ethernet_h header in our hello world example.

header ethernet_h {
    bit<48> dst;
    bit<48> src;
    bit<16> ether_type;
}

This header represents a layer-2 Ethernet frame. The preamble that precedes the frame on the wire is not present, as it is removed by most ASICs. What remains is the MAC destination and source fields, which are each 6 octets (48 bits), and the ethertype, which is 2 octets.

Note also that the payload is not included here. This is important. P4 programs typically operate on packet headers and not packet payloads. In upcoming examples we'll go over header stacks that include headers at higher layers like IP, ICMP and TCP.

In the parsing code above, when pkt.extract(headers.ethernet) is called, the values dst, src and ether_type are populated from packet data and the method setValid() is implicitly called on the headers.ethernet header.
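
The relationship between extraction and validity can be modeled in a few lines of Rust. This is a rough sketch rather than x4c's generated representation; the EthernetH type and its snake_case method names are hypothetical stand-ins for the P4 header and its isValid(), setValid(), and setInvalid() methods.

```rust
/// A Rust model of the `ethernet_h` header with an explicit validity flag.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct EthernetH {
    pub dst: [u8; 6],
    pub src: [u8; 6],
    pub ether_type: u16,
    valid: bool,
}

impl EthernetH {
    pub fn is_valid(&self) -> bool {
        self.valid
    }

    pub fn set_valid(&mut self) {
        self.valid = true;
    }

    pub fn set_invalid(&mut self) {
        self.valid = false;
    }

    /// Stands in for `pkt.extract(headers.ethernet)`: populate the fields
    /// from the first 14 bytes and mark the header valid. Returns false and
    /// leaves the header untouched if the packet is too short.
    pub fn extract(&mut self, pkt: &[u8]) -> bool {
        if pkt.len() < 14 {
            return false;
        }
        self.dst.copy_from_slice(&pkt[0..6]);
        self.src.copy_from_slice(&pkt[6..12]);
        // Multi-byte fields arrive in network (big-endian) byte order.
        self.ether_type = u16::from_be_bytes([pkt[12], pkt[13]]);
        self.set_valid();
        true
    }
}
```

A freshly constructed EthernetH::default() reports is_valid() as false. It only becomes valid after a successful extract, which is the property control blocks rely on when they check isValid().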

Control Blocks

Control blocks are where logic goes that decides what will happen to packets that are parsed successfully. Similar to a parser block, control blocks look a lot like functions from general purpose programming languages. The signature (the number and type of arguments) for this control block is a bit different than the parser block above.

The first argument, hdr, is the output of the parser block. Note that in the parser signature there is an out headers_t headers parameter, and in this control block there is an inout headers_t hdr parameter. The out direction in the parser means that the parser writes to this parameter. The inout direction in the control block means that the control both reads from and writes to this parameter.

The ingress_metadata_t parameter is the same parameter we saw in the parser block. The egress_metadata_t is similar to ingress_metadata_t. However, our code uses this parameter to inform the ASIC about how it should treat packets on egress. This is in contrast to the ingress_metadata_t parameter that is used by the ASIC to inform our program about details of the packet's ingress.

control ingress(
    inout headers_t hdr,
    inout ingress_metadata_t ingress,
    inout egress_metadata_t egress,
) {

    action drop() {
        ingress.drop = true;
    }

    action forward(bit<16> port) {
        egress.port = port;
    }

    table tbl {
        key = {
            ingress.port: exact;
        }
        actions = {
            drop;
            forward;
        }
        default_action = drop;
        const entries = {
            16w0 : forward(16w1);
            16w1 : forward(16w0);
        }
    }

    apply {
        tbl.apply();
    }

}

Control blocks are made up of tables, actions and apply blocks. When packet headers enter a control block, the apply block decides what tables to run the parameter data through. Tables are described in terms of keys and actions. A key is an ordered sequence of fields that can be extracted from any of the control parameters. In the example above we are using the port field from the ingress parameter to decide what to do with a packet. We are not even investigating the packet headers at all! We can indeed use header fields in keys, and an example of doing so will come later.

When a table is applied, and there is an entry in the table that matches the key data, the action corresponding to that key is executed. In our example we have pre-populated our table with two entries. The first entry says, if the ingress port is 0, forward the packet to port 1. The second entry says if the ingress port is 1, forward the packet to port 0. These odd looking prefixes on our numbers are width specifiers. So 16w0 reads: the value 0 with a width of 16 bits.

Every action that is specified in a table entry must be defined within the control. In our example, the forward action is defined above. All this action does is set the port field on the egress metadata to the provided value.

The example table also has a default action of drop. This action fires for all invocations of the table over key data that has no matching entry. So for our program, any packet coming from a port that is not 0 or 1 will be dropped.
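
The match-or-default behavior of this table can be sketched in Rust with a hash map. This is only a model of the semantics described above, under the assumption that an exact-match table behaves like a map lookup with a fallback; Action, Table, and hello_world are hypothetical names and are not part of x4c.

```rust
use std::collections::HashMap;

/// The two actions available to the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Drop,
    Forward(u16),
}

/// An exact-match table keyed on the 16-bit ingress port.
pub struct Table {
    entries: HashMap<u16, Action>,
    default_action: Action,
}

impl Table {
    /// Build the table with the two constant entries from the P4 program.
    pub fn hello_world() -> Self {
        let mut entries = HashMap::new();
        entries.insert(0u16, Action::Forward(1)); // 16w0 : forward(16w1)
        entries.insert(1u16, Action::Forward(0)); // 16w1 : forward(16w0)
        Table {
            entries,
            default_action: Action::Drop,
        }
    }

    /// Look up the key; on a miss, fall back to the default action.
    pub fn apply(&self, ingress_port: u16) -> Action {
        self.entries
            .get(&ingress_port)
            .copied()
            .unwrap_or(self.default_action)
    }
}
```

With this model, apply(0) yields Forward(1), apply(1) yields Forward(0), and any other port yields Drop.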

The apply block is home to generic procedural code. In our example it's very simple and only has an apply invocation for our table. However, arbitrary logic can go in this block. We could even implement the logic of this control without a table!

apply {
    if (ingress.port == 16w0) {
        egress.port = 16w1;
    }
    if (ingress.port == 16w1) {
        egress.port = 16w0;
    }
}

Which then raises the question: why have a special table construct at all? Why not just program everything using logical primitives? Or let programmers define their own data structures like general purpose programming languages do?

Setting the performance arguments aside for the moment, there is something mechanically special about tables. They can be updated from outside the P4 program. In the program above we have what are called constant entries defined directly in the P4 program. This makes presenting a simple program like this very straightforward, but it is not the way tables are typically populated. The focus of P4 is on data plane programming, i.e., given a packet from the wire, what do we do with it? A prime example of this is packet routing and forwarding.

Both routing and forwarding are typically implemented in terms of lookup tables. Routing is commonly implemented by longest prefix matching on the destination address of an IP packet, and forwarding is commonly implemented by exact table lookups on layer-2 MAC addresses. How are those lookup tables populated, though? There are various answers. Some common ones include routing protocols like OSPF or BGP, and address resolution protocols like ARP and NDP. There are even simpler answers, like an administrator statically adding a route to the system.
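
For readers who have not seen longest prefix matching before, the following Rust sketch shows the idea with a linear scan over IPv4 routes. It is deliberately naive (real implementations use tries or hardware lookup structures), and RoutingTable, add_route, and lookup are hypothetical names chosen for this illustration rather than any x4c or P4 runtime API.

```rust
use std::net::Ipv4Addr;

/// One route: a network prefix, its length in bits, and an egress port.
struct Route {
    prefix: u32,
    len: u8,
    port: u16,
}

/// A routing table that a control plane fills in at runtime.
pub struct RoutingTable {
    routes: Vec<Route>,
}

/// Network mask with the top `len` bits set.
fn mask(len: u8) -> u32 {
    if len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(len))
    }
}

impl RoutingTable {
    pub fn new() -> Self {
        RoutingTable { routes: Vec::new() }
    }

    /// What a routing protocol or an administrator would call.
    pub fn add_route(&mut self, prefix: Ipv4Addr, len: u8, port: u16) {
        assert!(len <= 32, "IPv4 prefix length must be at most 32");
        let prefix = u32::from(prefix) & mask(len);
        self.routes.push(Route { prefix, len, port });
    }

    /// Among all routes containing `dst`, pick the most specific one.
    pub fn lookup(&self, dst: Ipv4Addr) -> Option<u16> {
        let dst = u32::from(dst);
        self.routes
            .iter()
            .filter(|r| dst & mask(r.len) == r.prefix)
            .max_by_key(|r| r.len)
            .map(|r| r.port)
    }
}
```

If 10.0.0.0/8 points at port 1 and 10.1.0.0/16 points at port 2, a lookup for 10.1.2.3 returns port 2 because the /16 is the longer match, while 10.2.0.1 falls back to the /8 and returns port 1.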

All of these activities involve either a stateful protocol of some sort or direct interaction with a user. Neither of those things is possible in the P4 programming language. It's just about processing packets on the wire, and the mechanisms for keeping state between packets are extremely limited.

What P4 implementations do provide is a way for programs written in general purpose programming languages, which are capable of stateful protocol implementation and user interaction, to modify the tables of a running P4 program through a runtime API. We'll come back to runtime APIs soon. For now the point is that the table abstraction allows P4 programs to remain focused on simple, mostly-stateless packet processing tasks that can be implemented at high packet rates and leave the problem of table management to the general purpose programming languages that interact with P4 programs through shared table manipulation.

Package

The final bit to show for our hello world program is a package instantiation. A package is like a constructor function that takes a parser and a set of control blocks. Packages are typically tied to the ASIC your P4 code will be executing on. In the example below, we are passing our parser and single control block to the SoftNPU package. Packages for more complex ASICs may take many control blocks as arguments.

SoftNPU(
    parse(),
    ingress()
) main;

Full Program

Putting it all together, we have a complete P4 hello world program as follows.

struct headers_t {
    ethernet_h ethernet;
}

struct ingress_metadata_t {
    bit<16> port;
    bool drop;
}

struct egress_metadata_t {
    bit<16> port;
    bool drop;
    bool broadcast;
}

header ethernet_h {
    bit<48> dst;
    bit<48> src;
    bit<16> ether_type;
}

parser parse (
    packet_in pkt,
    out headers_t headers,
    inout ingress_metadata_t ingress,
){
    state start {
        pkt.extract(headers.ethernet);
        transition finish;
    }

    state finish {
        transition accept;
    }
}

control ingress(
    inout headers_t hdr,
    inout ingress_metadata_t ingress,
    inout egress_metadata_t egress,
) {

    action drop() { }

    action forward(bit<16> port) {
        egress.port = port;
    }

    table tbl {
        key = {
            ingress.port: exact;
        }
        actions = {
            drop;
            forward;
        }
        default_action = drop;
        const entries = {
            16w0 : forward(16w1);
            16w1 : forward(16w0);
        }
    }

    apply {
        tbl.apply();
    }

}

// We do not use an egress controller in this example, but one is required for
// SoftNPU so just use an empty controller here.
control egress(
    inout headers_t hdr,
    inout ingress_metadata_t ingress,
    inout egress_metadata_t egress,
) {

}

SoftNPU(
    parse(),
    ingress(),
    egress(),
) main;

This program will take any packet that has an Ethernet header showing up on port 0, send it out port 1, and vice versa. All other packets will be dropped.

In the next section we'll compile this program and run some packets through it!

Compile and Run

In the previous section we put together a hello world P4 program. In this section we run that program over a software ASIC called SoftNpu. One of the capabilities of the x4c compiler is using P4 code directly from Rust code and we'll be doing that in this example.

Below is a Rust program that imports the P4 code developed in the last section, loads it onto a SoftNpu ASIC instance, and sends some packets through it. We'll be looking at this program piece-by-piece in the remainder of this section.

All of the programs in this book are available as buildable programs in the oxidecomputer/p4 repository in the book/code directory.

use tests::softnpu::{RxFrame, SoftNpu, TxFrame};
use tests::{expect_frames};

const NUM_PORTS: u16 = 3;

p4_macro::use_p4!(p4 = "book/code/src/bin/hello-world.p4", pipeline_name = "hello");

fn main() -> Result<(), anyhow::Error> {
    let pipeline = main_pipeline::new(NUM_PORTS);
    let mut npu = SoftNpu::new(NUM_PORTS, pipeline, false);
    let phy1 = npu.phy(0);
    let phy2 = npu.phy(1);
    let phy3 = npu.phy(2);

    npu.run();

    // Expect this packet to be dropped
    phy3.send(&[TxFrame::new(phy3.mac, 0, b"to the bit bucket with you!")])?;

    phy1.send(&[TxFrame::new(phy2.mac, 0, b"hello")])?;
    expect_frames!(phy2, &[RxFrame::new(phy1.mac, 0, b"hello")]);

    phy2.send(&[TxFrame::new(phy1.mac, 0, b"world")])?;
    expect_frames!(phy1, &[RxFrame::new(phy2.mac, 0, b"world")]);

    Ok(())
}

The program starts with a few Rust imports.

use tests::softnpu::{RxFrame, SoftNpu, TxFrame};
use tests::{expect_frames};

The first line imports the SoftNpu implementation that lives in the tests crate of the oxidecomputer/p4 repository. The second imports a helper macro that allows us to make assertions about frames coming from a SoftNpu "physical" port (referred to as a phy).

The next line is using the x4c compiler to translate P4 code into Rust code and dumping that Rust code into our program. The macro literally expands into the Rust code emitted by the compiler for the specified P4 source file.

p4_macro::use_p4!(p4 = "book/code/src/bin/hello-world.p4", pipeline_name = "hello");

The main artifact this produces is a Rust struct called main_pipeline which is used in the code that comes next.

let pipeline = main_pipeline::new(NUM_PORTS);
let mut npu = SoftNpu::new(NUM_PORTS, pipeline, false);
let phy1 = npu.phy(0);
let phy2 = npu.phy(1);
let phy3 = npu.phy(2);

This code is instantiating a pipeline object that encapsulates the logic of our P4 program. Then a SoftNpu ASIC is constructed with three ports and our pipeline program. SoftNpu objects provide a phy method that takes a port index to get a reference to a port that is attached to the ASIC. These port objects are used to send and receive packets through the ASIC, which uses our compiled P4 code to process those packets.

Next we run our program on the SoftNpu ASIC.

npu.run();

However, this does not actually do anything until we pass some packets through it, so let's do that.

// Expect this packet to be dropped
phy3.send(&[TxFrame::new(phy3.mac, 0, b"to the bit bucket with you!")])?;

This code transmits an Ethernet frame through the third port of the ASIC with a payload value of "to the bit bucket with you!". The phy3.mac parameter of the TxFrame sets the destination MAC address and the 0 for the second parameter is the ethertype used in the outgoing Ethernet frame.

Based on the logic in our P4 program, we would expect this packet to be dropped by the switch, i.e. it will not be sent out of any port at all. This is because the table lookup on the ingress port value of 2 would get a miss, and the table would execute the default action drop. Thus we do not call expect_frames! here, as we do for the test packets below.
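
It can help to picture the bytes that such a frame puts on the wire. The sketch below assembles an Ethernet frame from a destination MAC, a source MAC, an ethertype, and a payload. It is not how TxFrame or SoftNpu is implemented; build_frame is a hypothetical helper, and the assumption that the source address is the sending port's MAC is inferred from the assertions made on received frames later in this section.

```rust
/// Assemble a raw Ethernet frame: 6-byte destination, 6-byte source,
/// 2-byte big-endian ethertype, then the payload.
pub fn build_frame(
    dst: [u8; 6],
    src: [u8; 6],
    ether_type: u16,
    payload: &[u8],
) -> Vec<u8> {
    let mut frame = Vec::with_capacity(14 + payload.len());
    frame.extend_from_slice(&dst);
    frame.extend_from_slice(&src);
    frame.extend_from_slice(&ether_type.to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}
```

The first 14 bytes of the result are exactly what the hello world parser extracts into headers.ethernet. Everything after that is payload the P4 program never looks at.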

phy1.send(&[TxFrame::new(phy2.mac, 0, b"hello")])?;

This code transmits an Ethernet frame through the first port of the ASIC with a payload value of "hello".

Based on the logic in our P4 program, we would expect this packet to come out the second port. Let's test that.

expect_frames!(phy2, &[RxFrame::new(phy1.mac, 0, b"hello")]);

This code reads a packet from the second ASIC port phy2 (blocking until there is a packet available) and asserts the following.

  • The Ethernet payload is the byte string "hello".
  • The source MAC address is that of phy1.
  • The ethertype is 0.

To complete the hello world program, we do the same thing in the opposite direction, sending the byte string "world" as an Ethernet payload into port 2 and asserting that it comes out port 1.

phy2.send(&[TxFrame::new(phy1.mac, 0, b"world")])?;
expect_frames!(phy1, &[RxFrame::new(phy2.mac, 0, b"world")]);

The expect_frames macro will also print payloads and the port they came from.

When we run this program we see the following.

$ cargo run --bin hello-world
   Compiling x4c-book v0.1.0 (/home/ry/src/p4/book/code)
    Finished dev [unoptimized + debuginfo] target(s) in 2.05s
     Running `target/debug/hello-world`
[phy2] hello
[phy1] world

SoftNpu and Target x4c Use Cases

The example above shows x4c compiled code in a setting that is only really useful for testing the logic of compiled pipelines and demonstrating how P4 and x4c compiled pipelines work. This raises the question of what the target use cases for x4c actually are. It also raises the question: why build x4c in the first place? Why not use the established reference compiler p4c and its associated reference behavioral model bmv2?

A key difference between x4c and the p4c ecosystem is how compilation and execution concerns are separated. x4c generates free-standing pipelines that can be used by other code, whereas p4c generates JSON that is interpreted and run by bmv2.

The example above shows how the generation of free-standing runnable pipelines can be used to test the logic of P4 programs in a lightweight way. We went from P4 program source to actual packet processing using nothing but the Rust compiler and package manager. The program is executable in an operating system independent way and is a great way to get CI going for P4 programs.

The free-standing pipeline approach is not limited to self-contained use cases with packets that are generated and consumed in-program. x4c generated code conforms to a well defined Pipeline interface that can be used to run pipelines anywhere rustc compiled code can run. Pipelines are even dynamically loadable through dlopen and the like.

The x4c authors have used x4c generated pipelines to create virtual ASICs inside hypervisors that transit real traffic between virtual machines, as well as P4 programs running inside zones/containers that implement NAT and tunnel encap/decap capabilities. The mechanics of I/O are deliberately outside the scope of x4c generated code. Whether you want to use DLPI, XDP, libpcap, PF_RING, DPDK, etc., is up to you and the harness code you write around your pipelines!

The win with x4c is flexibility. You can compile a free-standing P4 pipeline and use that pipeline wherever you see fit. The near-term use for x4c focuses on development and evaluation environments. If you are building a system around P4 programmable components, but it's not realistic to buy all the switches/routers/ASICs at the scale you need for testing/development, x4c is an option. x4c is also a good option for running packets through your pipelines in a lightweight way in CI.

By Example

This chapter presents the use of the P4 language and x4c through a series of examples. This is a living set that will grow over time.

VLAN Switch

This example presents a simple VLAN switch program. This program allows a single VLAN id (vid) to be set per port. Any packet arriving at a port with a vid set must carry that vid in its Ethernet header or it will be dropped. We'll refer to this as VLAN filtering. If a packet makes it past ingress filtering, then the forwarding table of the switch is consulted to see what port to send the packet out. We limit ourselves to a very simple switch here with a static forwarding table. A MAC learning switch will be presented in a later example. This switch also does not do flooding for unknown packets; it simply operates on the lookup table it has. If an egress port is identified via a forwarding table lookup, then egress VLAN filtering is applied. If the vid on the packet matches the vid set on the egress port, or the egress port has no vid set, then the packet is forwarded out that port.

This example is composed of two programs: a P4 data-plane program and a Rust control-plane program.

P4 Data-Plane Program

Let's start by taking a look at the headers for the P4 program.

header ethernet_h {
    bit<48> dst;
    bit<48> src;
    bit<16> ether_type;
}

header vlan_h {
    bit<3> pcp;
    bit<1> dei;
    bit<12> vid;
    bit<16> ether_type;
}

struct headers_t {
    ethernet_h eth;
    vlan_h vlan;
}

An Ethernet header is normally just 14 bytes: a 6-byte destination and source address plus a two-byte ethertype. However, when VLAN tags are present the ethertype is set to 0x8100 and a VLAN header follows. This header contains a 12-bit vid as well as an ethertype for the header that follows.

A byte-oriented packet diagram shows how these two Ethernet frame variants line up.

                     1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
+---------------------------+
|    dst    |    src    |et |
+---------------------------+
+-----------------------------------+
|    dst    |    src    |et |pdv|et |
+-----------------------------------+

The structure is always the same for the first 14 bytes. So we can take advantage of that when parsing any type of Ethernet frame. Then we can use the ethertype field to determine if we are looking at a regular Ethernet frame or a VLAN-tagged Ethernet frame.

parser parse (
    packet_in pkt,
    out headers_t h,
    inout ingress_metadata_t ingress,
) {
    state start {
        pkt.extract(h.eth);
        if (h.eth.ether_type == 16w0x8100) { transition vlan; } 
        transition accept;
    }
    state vlan {
        pkt.extract(h.vlan);
        transition accept;
    }
}

This parser does exactly what we described above. First parse the first 14 bytes of the packet as an Ethernet frame. Then conditionally parse the VLAN portion of the Ethernet frame if the ethertype indicates we should do so. In a sense we can think of the VLAN portion of the Ethernet frame as being its own independent header. We are keying our decisions based on the ethertype, just as we would for layer 3 protocol headers.
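
The same two-step parse, including how the pcp, dei, and vid fields are packed into two bytes, can be written out by hand in Rust. This is an illustrative sketch and not x4c output; VlanTag, ETHERTYPE_VLAN, and parse_l2 are hypothetical names, and returning None for truncated packets is an assumption of the sketch.

```rust
/// Fields of the `vlan_h` header.
#[derive(Debug, PartialEq, Eq)]
pub struct VlanTag {
    pub pcp: u8,
    pub dei: bool,
    pub vid: u16,
    pub ether_type: u16,
}

/// Ethertype value that signals a VLAN header follows.
pub const ETHERTYPE_VLAN: u16 = 0x8100;

/// Parse the Ethernet header and, if present, the VLAN header. Returns the
/// ethertype found in the first 14 bytes together with the optional tag, or
/// None if the packet is too short for the headers it claims to carry.
pub fn parse_l2(pkt: &[u8]) -> Option<(u16, Option<VlanTag>)> {
    if pkt.len() < 14 {
        return None;
    }
    let ether_type = u16::from_be_bytes([pkt[12], pkt[13]]);
    if ether_type != ETHERTYPE_VLAN {
        // Untagged frame: the `start` state transitions straight to accept.
        return Some((ether_type, None));
    }
    if pkt.len() < 18 {
        return None;
    }
    // The 3-bit pcp, 1-bit dei and 12-bit vid share one 16-bit word.
    let tci = u16::from_be_bytes([pkt[14], pkt[15]]);
    let tag = VlanTag {
        pcp: (tci >> 13) as u8,
        dei: ((tci >> 12) & 1) == 1,
        vid: tci & 0x0fff,
        ether_type: u16::from_be_bytes([pkt[16], pkt[17]]),
    };
    Some((ether_type, Some(tag)))
}
```

For example, the two tag bytes 0xa0 0x2f decode to a pcp of 5, a clear dei bit, and a vid of 47.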

Our VLAN switch P4 program is broken up into multiple control blocks. We'll start with the top level control block and then dive into the control blocks it calls into to implement the switch.

control ingress(
    inout headers_t hdr,
    inout ingress_metadata_t ingress,
    inout egress_metadata_t egress,
) {
    vlan() vlan;
    forward() fwd;
    
    apply {
        bit<12> vid = 12w0;
        if (hdr.vlan.isValid()) {
            vid = hdr.vlan.vid;
        }

        // check vlan on ingress
        bool vlan_ok = false;
        vlan.apply(ingress.port, vid, vlan_ok);
        if (vlan_ok == false) {
            egress.drop = true;
            return;
        }

        // apply switch forwarding logic
        fwd.apply(hdr, ingress, egress);

        // check vlan on egress
        vlan.apply(egress.port, vid, vlan_ok);
        if (vlan_ok == false) {
            egress.drop = true;
            return;
        }
    }
}

The first thing that is happening in this program is the instantiation of a few other control blocks.

vlan() vlan;
forward() fwd;

We'll be using these control blocks to implement the VLAN filtering and switch forwarding logic. For now let's take a look at the higher level packet processing logic of the program in the apply block.

The first thing we do is start by assuming there is no vid by setting it to zero. Then if the VLAN header is valid we assign the vid from the packet header to our local vid variable. The isValid header method returns true if extract was called on that header. Recall from the parser code above that extract is only called on hdr.vlan if the ethertype on the Ethernet frame is 0x8100.

bit<12> vid = 12w0;
if (hdr.vlan.isValid()) {
    vid = hdr.vlan.vid;
}

Next we apply VLAN filtering logic. First an indicator variable vlan_ok is initialized to false. Then we pass that indicator variable along with the port the packet came in on and the vid we determined above to the VLAN control block.

bool vlan_ok = false;
vlan.apply(ingress.port, vid, vlan_ok);
if (vlan_ok == false) {
    egress.drop = true;
    return;
}

Let's take a look at the VLAN control block. The first thing to note here is the direction of parameters. The port and vid parameters are in parameters, meaning that the control block can only read from them. The match parameter is an out parameter meaning the control block can only write to it. Consider this in the context of the code above. There we are passing in the vlan_ok to the control block with the expectation that the control block will modify the value of the variable. The out direction of this control block parameter is what makes that possible.

control vlan(
    in bit<16> port,
    in bit<12> vid,
    out bool match,
) {
    action no_vid_for_port() {
        match = true;
    }

    action filter(bit<12> port_vid) { 
        if (port_vid == vid) { match = true; } 
    }
    
    table port_vlan {
        key             = { port: exact; }
        actions         = { no_vid_for_port; filter; }
        default_action  = no_vid_for_port;
    }

    apply { port_vlan.apply(); }
}

Let's look at this control block starting from the table declaration. The port_vlan table has the port id as the single key element. There are two possible actions, no_vid_for_port and filter. The no_vid_for_port action fires when there is no match for the port id. That action unconditionally sets match to true. The logic here is that if there is no VLAN configured for a port, i.e., the port is not in the table, then there is no need to do any VLAN filtering and we just pass the packet along.

The filter action takes a single parameter port_vid. This value is populated by the table value entry corresponding to the port key. There are no static table entries in this P4 program, they are provided by a control plane program which we'll get to in a bit. The filter logic tests if the port_vid that has been configured by the control plane matches the vid on the packet. If the test passes then match is set to true meaning the packet can continue processing.

Popping back up to the top level control block. If vlan_ok was not set to true in the vlan control block, then we drop the packet. Otherwise we continue on to further processing - forwarding.
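
The decision made by the vlan control block boils down to a small function. The Rust sketch below models the port_vlan table as a map from port id to configured vid. It is a simplified stand-in for the P4 semantics described above, and the function name vlan_ok is borrowed from the indicator variable purely for illustration.

```rust
use std::collections::HashMap;

/// Decide whether a packet carrying `vid` may pass through `port`.
/// `port_vlan` models the table of the same name: port id -> configured vid.
pub fn vlan_ok(port_vlan: &HashMap<u16, u16>, port: u16, vid: u16) -> bool {
    match port_vlan.get(&port) {
        // Table miss: the default action `no_vid_for_port` lets the packet
        // through without any filtering.
        None => true,
        // Table hit: the `filter(port_vid)` action compares the configured
        // vid with the vid on the packet.
        Some(&port_vid) => port_vid == vid,
    }
}
```

With port 0 configured for vid 47, a packet on port 0 passes only if it carries vid 47, while any packet on an unconfigured port passes regardless of its vid.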

Here we are passing the entire header, and ingress and egress metadata structures into the fwd control block which is an instantiation of the forward control block type.

fwd.apply(hdr, ingress, egress);

Let's take a look at the forward control block.

control forward(
    inout headers_t hdr,
    inout ingress_metadata_t ingress,
    inout egress_metadata_t egress,
) {
    action drop() { ingress.drop = true; }
    action forward(bit<16> port) { egress.port = port; }

    table fib {
        key             = { hdr.eth.dst: exact; }
        actions         = { drop; forward; }
        default_action  = drop;
    }

    apply { fib.apply(); }
}

This simple control block contains a table that maps Ethernet addresses to ports. The single element key contains an Ethernet destination and the matching action forward contains a single 16-bit port value. When the Ethernet destination matches an entry in the table, the egress metadata destination for the packet is set to the port id that has been set for that table entry.

Note that in this control block all parameters have an inout direction, meaning the control block can both read from and write to these parameters. Like the vlan control block above, there are no static entries here. Entries for the table in this control block are filled in by a control-plane program.

Popping back up the stack to our top level control block, the remaining code we have is the following.

vlan.apply(egress.port, vid, vlan_ok);
if (vlan_ok == false) {
    egress.drop = true;
    return;
}

This is pretty much the same as what we did at the beginning of the apply block. Except this time, we are passing in the egress port instead of the ingress port. We are checking the VLAN tags not only for the ingress port, but also for the egress port.
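
Putting the three steps together, the overall ingress decision can be summarized as: filter on the ingress port, look up the egress port, then filter on the egress port. The Rust sketch below models that flow with two maps. It simplifies the P4 program by treating a forwarding miss as an immediate drop, and the VlanSwitch type with its ingress method is a hypothetical construction for illustration, not the pipeline x4c generates.

```rust
use std::collections::HashMap;

/// A model of the switch state that the control plane would populate.
pub struct VlanSwitch {
    /// port id -> configured vid (the `port_vlan` table)
    pub port_vlan: HashMap<u16, u16>,
    /// destination MAC -> egress port (the `fib` table)
    pub fib: HashMap<[u8; 6], u16>,
}

impl VlanSwitch {
    /// A port with no configured vid accepts any packet.
    fn vlan_ok(&self, port: u16, vid: u16) -> bool {
        self.port_vlan.get(&port).map_or(true, |&pv| pv == vid)
    }

    /// Returns the egress port, or None if the packet is dropped.
    pub fn ingress(&self, ingress_port: u16, dst: [u8; 6], vid: u16) -> Option<u16> {
        // check vlan on ingress
        if !self.vlan_ok(ingress_port, vid) {
            return None;
        }
        // apply switch forwarding logic (a miss is a drop in this sketch)
        let egress_port = *self.fib.get(&dst)?;
        // check vlan on egress
        if !self.vlan_ok(egress_port, vid) {
            return None;
        }
        Some(egress_port)
    }
}
```

A packet is therefore forwarded only when its vid is acceptable on both the port it arrived on and the port it is about to leave from.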

You can find this program in its entirety here.

Rust Control-Plane Program

The main purpose of the Rust control plane program is to manage table entries in the P4 program. In addition to table management, the program we'll be showing here also instantiates and runs the P4 code over a virtual ASIC to demonstrate the complete system working.

We'll start top down again. Here is the beginning of our Rust program.

use tests::expect_frames;
use tests::softnpu::{RxFrame, SoftNpu, TxFrame};

const NUM_PORTS: u16 = 2;

p4_macro::use_p4!(
    p4 = "book/code/src/bin/vlan-switch.p4",
    pipeline_name = "vlan_switch"
);

fn main() -> Result<(), anyhow::Error> {
    let mut pipeline = main_pipeline::new(NUM_PORTS);

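    let m1 = [0x33, 0x33, 0x33, 0x33, 0x33, 0x33];
    let m2 = [0x44, 0x44, 0x44, 0x44, 0x44, 0x44];
    // m3 gets no forwarding entry; any address other than m1 and m2 works here
    let m3 = [0x55, 0x55, 0x55, 0x55, 0x55, 0x55];

    init_tables(&mut pipeline, m1, m2);
    run_test(pipeline, m2, m3)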
}

After imports, the first thing we are doing is calling the use_p4! macro. This translates our P4 program into Rust and expands the use_p4! macro in place to the generated Rust code. This results in the main_pipeline type that we see instantiated in the first line of the main program. Then we define a few MAC addresses that we'll get back to later. The remainder of the main code performs the two functions described above. The init_tables function acts as a control plane for our P4 code, setting up the VLAN and forwarding tables. The run_test code executes our instantiated pipeline over a virtual ASIC, sends some packets through it, and makes assertions about the results.

Control Plane Code

Let's jump into the control plane code.

fn init_tables(pipeline: &mut main_pipeline, m1: [u8;6], m2: [u8;6]) {
    // add static forwarding entries
    pipeline.add_ingress_fwd_fib_entry("forward", &m1, &0u16.to_be_bytes());
    pipeline.add_ingress_fwd_fib_entry("forward", &m2, &1u16.to_be_bytes());

    // port 0 vlan 47
    pipeline.add_ingress_vlan_port_vlan_entry(
        "filter",
        0u16.to_be_bytes().as_ref(),
        47u16.to_be_bytes().as_ref(),
    );

    // sanity check the table
    let x = pipeline.get_ingress_vlan_port_vlan_entries();
    println!("{:#?}", x);

    // port 1 vlan 47
    pipeline.add_ingress_vlan_port_vlan_entry(
        "filter",
        1u16.to_be_bytes().as_ref(),
        47u16.to_be_bytes().as_ref(),
    );

}

The first thing that happens here is that the forwarding table is set up. We add two entries, one for each MAC address. The first MAC address maps to the first port and the second MAC address maps to the second port.

We are using table modification methods from the Rust code that was generated from our P4 code. A valid question is, how do I know what these are? There are two ways.

Determine Based on P4 Code Structure

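The naming is deterministic based on the structure of the P4 program. Table modification functions follow the pattern <operation>_<control_path>_<table_name>_entry, where operation is one of the following.

  • add
  • remove
  • get

Note that the get method used above, get_ingress_vlan_port_vlan_entries, ends in _entries rather than _entry and returns every entry in the table.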

The control_path is built from the names of control instances, starting from the top-level ingress control. In our P4 program, the forwarding control instance is named fwd and sits under ingress, so the path is ingress_fwd, which is what we see in the function above. If there is a longer chain of control instances, the instance names are separated by underscores. Finally, the table_name is the name of the table in the control block, fib in this case. This is how we arrive at the method name above.

pipeline.add_ingress_fwd_fib_entry(...)
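The naming pattern can be sketched as a small function. This is only an illustration of the rule described above; `table_method_name` is a hypothetical helper and is not part of x4c or the generated code.

```rust
/// Hypothetical helper: joins the operation, the control instance path, and
/// the table name with underscores, then appends the `entry` suffix.
fn table_method_name(operation: &str, control_path: &[&str], table_name: &str) -> String {
    let mut parts = vec![operation];
    parts.extend_from_slice(control_path);
    parts.push(table_name);
    parts.push("entry");
    parts.join("_")
}
```

For example, `table_method_name("add", &["ingress", "fwd"], "fib")` produces `add_ingress_fwd_fib_entry`, and `table_method_name("add", &["ingress", "vlan"], "port_vlan")` produces `add_ingress_vlan_port_vlan_entry`, matching the two methods called in init_tables.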

Use cargo doc

Alternatively you can just run cargo doc to have Cargo generate documentation for your crate that contains the P4-generated Rust code. This will emit Rust documentation that includes documentation for the generated code.

For example, in the main p4 repository that contains the VLAN switch example code, running cargo doc produces output like this.

$ cargo doc
[snip]
 Documenting x4c_error_codes v0.1.0 (/Users/ry/src/p4/x4c_error_codes)
 Documenting clap v3.2.23
 Documenting tests v0.1.0 (/Users/ry/src/p4/test)
 Documenting sidecar-lite v0.1.0 (/Users/ry/src/p4/lang/prog/sidecar-lite)
 Documenting p4-macro-test v0.1.0 (/Users/ry/src/p4/lang/p4-macro-test)
 Documenting x4c-book v0.1.0 (/Users/ry/src/p4/book/code)
 Documenting x4c v0.1.0 (/Users/ry/src/p4/x4c)
    Finished dev [unoptimized + debuginfo] target(s) in 15.87s
   Generated /Users/ry/src/p4/target/doc/p4_macro/index.html
   Generated /Users/ry/src/p4/target/doc/p4_macro_test/index.html
   Generated /Users/ry/src/p4/target/doc/p4_rust/index.html
   Generated /Users/ry/src/p4/target/doc/p4rs/index.html
   Generated /Users/ry/src/p4/target/doc/sidecar_lite/index.html
   Generated /Users/ry/src/p4/target/doc/tests/index.html
   Generated /Users/ry/src/p4/target/doc/x4c/index.html
   Generated /Users/ry/src/p4/target/doc/hello_world/index.html
   Generated /Users/ry/src/p4/target/doc/vlan_switch/index.html
   Generated /Users/ry/src/p4/target/doc/x4c_error_codes/index.html

If you open the file target/doc/vlan_switch/index.html, you'll see several struct and function definitions. In particular, if you click on the main_pipeline struct, you'll see methods associated with the main pipeline, such as add_ingress_fwd_fib_entry, that allow you to modify pipeline table state.

Now back to the control plane code above. You'll also notice that we are adding key values and parameter values to the P4 tables as byte slices. At the time of writing, x4c is not generating high-level table manipulation APIs so we have to pass everything in as binary serialized data.

The semantics of these data buffers are the following.

  1. Both key data and match action data (parameters) are passed in order.
  2. Numeric types are serialized in big-endian byte order.
  3. If a set of keys or a set of parameters results in a size that does not land on a byte boundary, e.g. the 12 bits of a vid like we have in this example, the length of the buffer is rounded up to the nearest byte boundary.
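The rules above can be sketched in plain Rust as follows. The helper names are hypothetical, and the 12-bit width for the vid is an assumption based on the standard VLAN id size; the resulting buffers match the keyset_data and parameter_data shapes that the sanity check prints.

```rust
/// Number of bytes needed to hold `bits` bits, rounded up to a whole byte.
fn bytes_for_bits(bits: usize) -> usize {
    (bits + 7) / 8
}

/// Serialize a 16-bit port as big-endian key data.
fn port_key(port: u16) -> Vec<u8> {
    port.to_be_bytes().to_vec()
}

/// Serialize a vid (assumed to be 12 bits wide) as big-endian parameter data.
/// Twelve bits do not fill whole bytes, so the buffer is rounded up to 2 bytes.
fn vid_param(vid: u16) -> Vec<u8> {
    debug_assert!(vid < (1 << 12), "vid must fit in 12 bits");
    let bytes = vid.to_be_bytes();
    bytes[bytes.len() - bytes_for_bits(12)..].to_vec()
}
```

Here `port_key(0)` gives `[0, 0]` and `vid_param(47)` gives `[0, 47]`, the same bytes produced by `0u16.to_be_bytes()` and `47u16.to_be_bytes()` in init_tables.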

After adding the forwarding entries, VLAN table entries are added in the same manner. A VLAN with the vid of 47 is added to the first and second ports. Note that we also use a table access method to get all the entries of a table and print them out to convince ourselves our code is doing what we intend.

Test Code

Now let's take a look at the test portion of our code.

fn run_test(
    pipeline: main_pipeline,
    m2: [u8; 6],
    m3: [u8; 6],
) -> Result<(), anyhow::Error> {
    // create and run the softnpu instance
    let mut npu = SoftNpu::new(NUM_PORTS.into(), pipeline, false);
    let phy1 = npu.phy(0);
    let phy2 = npu.phy(1);
    npu.run();

    // send a packet we expect to make it through
    phy1.send(&[TxFrame::newv(m2, 0, b"blueberry", 47)])?;
    expect_frames!(phy2, &[RxFrame::newv(phy1.mac, 0x8100, b"blueberry", 47)]);

    // send 4 packets, we expect only the third (muffin) to make it through
    phy1.send(&[TxFrame::newv(m2, 0, b"poppyseed", 74)])?; // 74 != 47
    phy1.send(&[TxFrame::new(m2, 0, b"banana")])?; // no tag
    phy1.send(&[TxFrame::newv(m2, 0, b"muffin", 47)])?;
    phy1.send(&[TxFrame::newv(m3, 0, b"nut", 47)])?; // no forwarding entry
    expect_frames!(phy2, &[RxFrame::newv(phy1.mac, 0x8100, b"muffin", 47)]);

    Ok(())
}

The first thing we do here is create a SoftNpu virtual ASIC instance with 2 ports that will execute the pipeline we configured with entries in the previous section. We get references to each ASIC port and run the ASIC.

Next we send a few packets through the ASIC to validate that our P4 program is doing what we expect given how we have configured the tables.

The first test sends a packet that we expect to make it through the VLAN filtering. The next test sends 4 packets into the ASIC, but we expect our P4 program to filter 3 of them out.

  • The first packet has the wrong vid.
  • The second packet has no vid.
  • The third packet should make it through.
  • The fourth packet has no forwarding entry.
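The expected outcomes listed above can be reproduced with a small plain-Rust model of the switch's decision logic. This is a sketch under stated assumptions, not the generated pipeline: the `Model` type and its methods are hypothetical, and the rule that untagged frames always fail the VLAN check is inferred from the test expectations rather than taken from the P4 source.

```rust
use std::collections::HashMap;

/// Hypothetical plain-Rust model of the switch's decision logic.
struct Model {
    /// port -> allowed vid (stands in for the `port_vlan` table)
    port_vlan: HashMap<u16, u16>,
    /// destination MAC -> egress port (stands in for the `fib` table)
    fib: HashMap<[u8; 6], u16>,
}

impl Model {
    /// True when `port` has an entry whose vid equals the frame's tag.
    /// Untagged frames (`None`) never pass; this is an assumption.
    fn vlan_ok(&self, port: u16, vid: Option<u16>) -> bool {
        match (self.port_vlan.get(&port), vid) {
            (Some(allowed), Some(vid)) => *allowed == vid,
            _ => false,
        }
    }

    /// Returns the egress port, or `None` when the frame is dropped.
    fn process(&self, ingress: u16, dst: [u8; 6], vid: Option<u16>) -> Option<u16> {
        if !self.vlan_ok(ingress, vid) {
            return None; // ingress VLAN check failed
        }
        let egress = *self.fib.get(&dst)?; // miss: no forwarding entry
        if !self.vlan_ok(egress, vid) {
            return None; // egress VLAN check failed
        }
        Some(egress)
    }
}

/// Builds the table state that init_tables installs.
fn example_model() -> Model {
    Model {
        port_vlan: HashMap::from([(0, 47), (1, 47)]),
        fib: HashMap::from([([0x33; 6], 0), ([0x44; 6], 1)]),
    }
}
```

With `let m = example_model();`, a frame for `[0x44; 6]` arriving on port 0 with vid 47 gives `m.process(0, [0x44; 6], Some(47)) == Some(1)`. The same frame with vid 74 or with no tag returns `None`, as does a tagged frame whose destination has no forwarding entry.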

Running the test

When we run this program, we see the following.

$ cargo run --bin vlan-switch
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s
     Running `target/debug/vlan-switch`
[
    TableEntry {
        action_id: "filter",
        keyset_data: [
            0,
            0,
        ],
        parameter_data: [
            0,
            47,
        ],
    },
]
[phy2] blueberry
drop
drop
drop
[phy2] muffin

The first thing we see is our little sanity check dumping out the VLAN table after adding a single entry. This has what we expect: an entry mapping port 0 to vid 47.

Next we start sending packets through the ASIC. There are two frame constructors in play here. TxFrame::newv creates an Ethernet frame with a VLAN header and TxFrame::new creates just a plain old Ethernet frame. The first argument to each frame constructor is the destination MAC address. The second argument is the ethertype to use and the third argument is the Ethernet payload. TxFrame::newv takes the vid as a fourth argument.

Next we see that our blueberry packet made it through as expected. Then we see three packets getting dropped as we expect. And finally we see the muffin packet coming through as expected.

You can find this program in its entirety here.

Guidelines

This chapter provides guidelines on various aspects of the x4c compiler.

Endianness

The basic rules for endianness follow. Generally speaking numeric fields are in big endian when they come in off the wire, little endian while in the program, and transformed back to big endian on the way back out onto the wire. We refer to this as confused endian.

  1. All numeric packet field data is big endian when it enters and leaves a P4 program.
  2. All numeric data, including packet fields, is little endian inside a P4 program.
  3. Table keys with the exact and range type defined over bit types are in little endian.
  4. Table keys with the lpm type are in the byte order they appear on the wire.
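The first two rules can be illustrated with Rust's standard integer conversions. This is only a sketch of the byte orders involved; the function names are hypothetical and the code does not use any x4c-generated types.

```rust
/// Parse a 16-bit field from wire bytes (big endian) into an integer.
fn field_from_wire(wire: [u8; 2]) -> u16 {
    u16::from_be_bytes(wire)
}

/// Byte layout of a value inside the program, per rule 2 (little endian).
fn in_program_bytes(value: u16) -> [u8; 2] {
    value.to_le_bytes()
}

/// Serialize the field back to wire order (big endian) on the way out.
fn field_to_wire(value: u16) -> [u8; 2] {
    value.to_be_bytes()
}
```

For the VLAN ethertype, `field_from_wire([0x81, 0x00])` yields `0x8100`, `in_program_bytes(0x8100)` yields `[0x00, 0x81]`, and `field_to_wire(0x8100)` restores `[0x81, 0x00]`.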