Bit Fields, Byte Order and Serialization -- Wu Yongwei

Network packets can be represented as bit fields. Wu Yongwei explores some issues to be aware of and offers solutions.

From the article:

In order to store data most efficiently, the C language has supported bit fields since its early days. While saving a few bytes of memory isn’t as critical today, bit fields remain widely used in scenarios like network packets. Endianness adds complexity to bit field handling – especially since network packets are typically big-endian, while most modern architectures are little-endian. This article explores these problems and their solutions, including my reflection-based serialization project.
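As a minimal sketch of the byte-order issue (not code from the article; the helper names `store_be32` and `load_be32` are hypothetical), a 32-bit value can be written in big-endian order using shifts, which should produce the same bytes on little-endian and big-endian hosts:

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: write a 32-bit value in big-endian (network) order.
// Shifts operate on the value, not on memory, so the result does not
// depend on the host's byte order.
inline std::array<std::uint8_t, 4> store_be32(std::uint32_t value)
{
    return {{static_cast<std::uint8_t>(value >> 24),
             static_cast<std::uint8_t>(value >> 16),
             static_cast<std::uint8_t>(value >> 8),
             static_cast<std::uint8_t>(value)}};
}

// Hypothetical helper: read the value back from big-endian bytes.
inline std::uint32_t load_be32(const std::array<std::uint8_t, 4>& bytes)
{
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

With these helpers, `store_be32(0x12345678)` yields the bytes 12 34 56 78 in that order, and `load_be32` reverses the step. Bit fields complicate this picture because their placement within those bytes is also up to the implementation.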

Memory layout of bit fields

The memory layout of bit fields is implementation-defined. In a typical little-endian environment, bit fields start from the lower bits of the lower byte and extend toward higher bits and bytes. In a typical big-endian environment, bit fields start from the higher bits of the lower byte and extend toward lower bits and higher bytes.
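One way to observe the layout on a particular compiler is to set one field at a time and dump the object's bytes. The sketch below is illustrative only: the struct `Nibbles` and the helper `dump_bytes` are hypothetical names, and because the result is implementation-defined, no output is claimed as definitive.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical example type: two 4-bit fields that share one byte on
// typical implementations (the layout is implementation-defined).
struct Nibbles {
    unsigned char first  : 4;
    unsigned char second : 4;
};

// Hypothetical helper: return the object representation as hex digits,
// lowest address first.
template <typename T>
std::string dump_bytes(const T& obj)
{
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &obj, sizeof(T));
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(raw[i]));
        out += buf;
    }
    return out;
}
```

Following the description above, setting only `first` to `0xF` would be expected to give `0f` in a typical little-endian environment and `f0` in a typical big-endian one, but it is worth checking on the actual target.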

Let’s consider a practical scenario. Suppose we want to use a 32-bit integer to store a date. How should we achieve this? A simple approach is to store the number of days since a fixed point in time (e.g. 1 January 1900). We can calculate the number of years that can be expressed as follows:
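A rough worked estimate, assuming an unsigned 32-bit day count and the mean Gregorian year length of 365.2425 days (both are assumptions made here for illustration):

```latex
\frac{2^{32}}{365.2425} = \frac{4\,294\,967\,296}{365.2425} \approx 1.176 \times 10^{7}\ \text{years}
```

That is, on the order of 11.76 million years.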

However, with this approach, extracting specific year, month, and day information becomes very cumbersome. A simpler way is to store the year, month, and day as bit fields. We can define the following struct, using only 32 bits:

  struct Date {
    int      year  : 23;
    unsigned month : 4;
    unsigned day   : 5;
  };

Our intention is to use a 23-bit signed integer for the year (ranging from -4,194,304 to 4,194,303), a 4-bit unsigned integer for the month (0–15, covering legal values 1–12), and a 5-bit unsigned integer for the day (0–31, covering legal values 1–31). This representation is similarly compact, with a slightly narrower range, but it’s quite sufficient and much more convenient for many common uses (interval calculation excepted).
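A brief usage sketch of this struct follows. The `static_assert` and the helper `is_plausible` are illustrative assumptions rather than part of the article; the size check holds on common compilers with a 32-bit `int` but is not guaranteed, since bit-field layout is implementation-defined.

```cpp
// Same definition as in the text, repeated so the sketch is self-contained.
struct Date {
    int      year  : 23;
    unsigned month : 4;
    unsigned day   : 5;
};

// Holds on common implementations with 32-bit int; not guaranteed.
static_assert(sizeof(Date) == 4, "Date is expected to occupy 32 bits");

// Hypothetical helper: the fields can hold out-of-range values such as
// month 0 or 13-15 and day 0, so a range check is still needed.
// It does not validate days per month (e.g. 30 February passes).
constexpr bool is_plausible(const Date& d)
{
    return d.month >= 1 && d.month <= 12 && d.day >= 1 && d.day <= 31;
}
```

After `Date d{}; d.year = 2024; d.month = 2; d.day = 29;`, each field reads back directly with no calendar arithmetic, which is the convenience the bit-field representation buys over a plain day count.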
