Creating and Accessing ELF sections, in userland and beyond

Posted on Oct 27, 2025
tl;dr: There are many reasons one might need to create or access an ELF section. This article goes through doing so in simple userland programs, and then in Linux kernel modules.

Userland: Land of the Free

Everything starts with a simple a.out. We will now demonstrate how to create and place variables into custom sections using GCC and Clang attributes, as well as accessing these sections using obscure linker features.

Creation

One very useful but not widely known GCC and Clang attribute is section, which lets the programmer choose the section in which a variable is placed. Normally, the compiler decides which variables go into which section of an object file (.o), and the linker combines all the object files you give it, merging their sections into the final ELF.

If you have not already, I would recommend checking out GCC and Clang attributes, as they provide lots of useful features.

The following example places the integer cat into a custom section called bisi:

// Places the static int cat into the ELF section bisi.
__attribute__((section("bisi"))) static int cat = 67;

Note that the variable must have static storage duration: global variables and variables declared static in a local scope are fine, but an integer that lives solely on the stack will not work.
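To illustrate the storage-duration rule, here is a minimal sketch with a function-local static placed in the same bisi section. The function name next_ticket is made up for this example, and the used attribute is added only so that the variable is kept even if the compiler considers it unreferenced.

```c
// Hypothetical example: a function-local static stored in the custom
// section "bisi". It has static storage duration, so the attribute applies.
int next_ticket(void) {
    __attribute__((section("bisi"), used)) static int counter = 67;
    return counter++; // The value persists across calls, as for any static
}
```

An automatic (stack) variable in the same position would be rejected by the compiler, since it has no fixed place in the object file.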

In the final ELF, you can view the sections using objdump -h, which will provide an output similar to this:

 ...
 21 .data         00000010  0000000000004000  0000000000004000  00003000  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 22 bisi          00000004  0000000000004010  0000000000004010  00003010  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 23 .bss          00000004  0000000000004014  0000000000004014  00003014  2**0
                  ALLOC
 ...

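As you can see, the custom section named bisi is there, along with the numerous other sections that are created. One downside of using this attribute is the lack of flexibility in how the section is defined. The attribute gives no direct control over the section flags: the compiler derives them from the variable itself, so the only way to get a read-only section, for example, is to place const variables in it.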

Of course, if you do not feel like using bloated attributes (!) or you are using some other compiler that does not provide this feature, you can replicate it using inline assembly, which, while it looks horrendous, provides some extra flexibility. Here is an example, which does the same thing as the example above:

asm(".section bisi,\"aw\",@progbits\n"
    ".global cat\n"
    ".type cat, @object\n"
    ".size cat, 4\n"
    "cat:\n"
    ".long 0x43\n");

extern int cat; // Holds the value 67

Very verbose! If you are proficient in assembly, you probably noticed a problem with this approach. Imagine the following scenario: you want to declare a variable in a custom section, inside a function. This will work no sweat with the section attribute, but if you use the inline assembly above, it will not work.

Why? Because the instructions of the function are stored in the .text section in the assembly file, so we cannot simply start a new section in the middle of them. We need a way to switch to a new section from within a section, do the necessary operations, and return to the original section. This functionality is provided by the .pushsection and .popsection directives, which allow you to place a section anywhere in the assembly file.

asm(".pushsection bisi,\"aw\",@progbits\n"
    ".global cat\n"
    ".type cat, @object\n"
    ".size cat, 4\n"
    "cat:\n"
    ".long 0x43\n"
    ".popsection\n");

extern int cat;

Give me [] pointers

One can access these variables as any other variable, but there also exists a way to access the sections directly at runtime. With this you could for example use variables that were aggregated from multiple object files.

This works because the linker provides two symbols that mark the start and the end of a section. Although this is an obscure feature, you can use it through the following ld naming convention:

extern int __start_bisi[];
extern int __stop_bisi[];

Here, __start_ and __stop_ are special prefixes recognized by the ld linker. The linker provides these symbols as long as the section name is a valid C identifier. This example also assumes that we only place integers into this section, hence the int[] declaration. If you store a struct in the section, for example, you can declare the section start as an array of that struct instead.
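The following is a minimal sketch of walking the section through these two symbols, assuming an ELF target linked with GNU ld or a compatible linker. The extra variables dog and owl and the helper names bisi_count and bisi_sum are made up for this example. The used attribute keeps the otherwise unreferenced variables from being discarded.

```c
#include <stddef.h>

// Three integers placed into the custom section "bisi".
__attribute__((section("bisi"), used)) static int cat = 67;
__attribute__((section("bisi"), used)) static int dog = 1;
__attribute__((section("bisi"), used)) static int owl = 2;

// Provided by the linker because "bisi" is a valid C identifier.
extern int __start_bisi[];
extern int __stop_bisi[];

// Number of integers stored in the section.
size_t bisi_count(void) {
    return (size_t)(__stop_bisi - __start_bisi);
}

// Sum of every integer in the section; the order inside the section is
// up to the compiler and linker, so we do not rely on it.
long bisi_sum(void) {
    long sum = 0;
    for (const int *p = __start_bisi; p < __stop_bisi; p++)
        sum += *p;
    return sum;
}
```

The same loop keeps working when more integers are added to bisi from other object files, which is what makes this approach useful for aggregation.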

You might have noticed a problem here, namely that section names usually begin with a dot, so you cannot use this feature with section names starting with a dot (unless I am mistaken!).

In that case, one must use a linker script to provide this functionality. Here is an example script that places __start_bisi before the section, and __stop_bisi after the section:

SECTIONS {
  . = ALIGN(8);
  .bisi : {
    __start_bisi = .;
    *(.bisi)
    __stop_bisi = .;
  }
}

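Since we are using a custom linker script, you can name the pointers whatever you want. You will have to link your ELF with -Wl,-T,/path/to/linker/script.ld (or simply -T /path/to/linker/script.ld, which GCC and Clang pass on to the linker) in order to use the linker script. Keep in mind that a script passed with -T replaces ld's default linker script, unless the script contains an INSERT command.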

Callsite aggregation

Here I will try to provide an example that uses both the section creation and access methods. Imagine we have a project in which we would like to locate some of the places where a specific function is called. We can locate the callsite of a function by placing a label directly after the function call and using the GCC and Clang extension “labels as values”, which allows one to store the address of a C label.

Preferably, we would like to do this at compile time, as this information is already available then. To store the callsite addresses, we need a structure that is able to hold them. This is not easy, because the number of calls can range from zero to thousands, so a fixed-size static array is not the way to go. I guess this part could be done with a really complicated preprocessor macro that increments __COUNTER__ every time we encounter a callsite and somehow allocates the structure that way.

This gets even more complicated once you have multiple files and calls spread across them. An easy solution to this problem is storing the label addresses in a custom section and accessing it at runtime as a void*[]. Here is an example implementation:

#define LBL    __call_label_
#define CALLSITE __call_site_

#define CONCAT2(a, b) a##b
#define CONCAT(a, b)  CONCAT2(a, b)
#define LABEL(label)  CONCAT(label, __LINE__)

#define __callsite __attribute__((__section__("callsite"), used))

// Memory barrier to prevent some optimizations
#define MEM_BARRIER asm volatile("" ::: "memory")

// Place label right after function call, and store label address
#define EMIT_LABEL(func)                                        \
    do {                                                        \
        static __callsite void *LABEL(CALLSITE) = &&LABEL(LBL); \
                                                                \
        asm volatile("nop");                                    \
        MEM_BARRIER;                                            \
        func;                                                   \
        LABEL(LBL)                                              \
            :;                                                  \
        MEM_BARRIER;                                            \
                                                                \
    } while (0)

// ... Inside some function
// Imagine we would like to know the location of the call here
EMIT_LABEL(cat_call());

In this example, we discard the function return value, but this can be handled by changing the macro slightly. We use the method we discussed earlier, namely placing a label after the function call, and storing the address of that label by using the compiler extension. We need a few restrictions on the function we call, namely that it must be marked noclone and noinline, since cloning and inlining will break the label addresses.

Furthermore, compiler optimizations tend to play around a lot with labels. GCC with optimizations enabled will move the labels to the top of the function in the generated assembly, so the stored addresses are garbage. Clang, on the other hand, seems to realize that using the labels as values extension means that the label locations are important, and does not reorder them (it even emits a comment in the assembly file acknowledging this!). Unfortunately, it is not smart enough to consider that they might get destroyed when a tail call optimization is applied.

That is the reason for the MEM_BARRIER macros, and the reason why the code will not work on GCC with optimizations. To work around this issue, one might instead use the following inline assembly:

asm volatile("1:\n"
    ".pushsection callsite,\"aw\"\n"
    ".quad 1b\n"
    ".popsection\n" ::
        : "memory");

When placed after a function call, it creates a numeric local label, switches to the callsite section, and stores there the address of 1b, which refers to the most recent definition of the label 1 before that point. GCC will not reorder the labels in this case.

This is all fair and straightforward to implement in userland, as shown above. However, things get a bit more complicated when one wants to do the same inside a Linux kernel module.

A step into the kernel

With the Linux kernel module approach, the linker script is not going to be sufficient, since the sections are relocated into the kernel address space when the module is loaded. Fortunately, a mechanism for this already exists in the kernel: the start addresses of a module's sections are exposed to userspace in /sys/module/$module_name/sections/$section_name, readable with the appropriate permissions. However, there is no clear way to get this information from within the kernel. (Also note that your kernel must be compiled with CONFIG_KALLSYMS!)
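For reference, reading such a sysfs entry from userland could look like the sketch below. The helper names section_sysfs_path and read_section_addr are made up for this example. It assumes the file contains the address as a hexadecimal string, and it simply fails when the file cannot be opened (missing module or insufficient permissions).

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

// Build "/sys/module/<module>/sections/<section>" into buf.
// Returns 0 on success, -1 if the path does not fit.
int section_sysfs_path(char *buf, size_t len,
                       const char *module, const char *section) {
    int n = snprintf(buf, len, "/sys/module/%s/sections/%s", module, section);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

// Read the section start address exposed by sysfs.
// Returns 0 on success, -1 if the file cannot be opened or read.
int read_section_addr(const char *module, const char *section,
                      unsigned long *addr) {
    char path[256];
    char line[64];

    if (section_sysfs_path(path, sizeof path, module, section) != 0)
        return -1;

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (!ok)
        return -1;

    *addr = strtoul(line, NULL, 16); // Base 16 also accepts a "0x" prefix
    return 0;
}
```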

Now one approach would be to interact with the userland in order to extract this information, but we do not do that here. We know that there exists a sysfs mechanism that provides this information to the userland, so there must be a way to access this directly. That is indeed the case, but this feature is not provided as a kernel API, so the solutions are hacky, and more importantly version dependent.

Kernel versions v2.6-v5.18

This feature of providing section addresses to the userland was first introduced in version 2.6. The kernel needs to be compiled with the CONFIG_KALLSYMS option enabled, otherwise this feature is not available, although the author (Jonathan Corbet) mentions that CONFIG_DEBUG_INFO might be a better option.

In fact, this exact problem and its solution were discussed in a Stack Overflow post.

struct module_sect_attr {
    struct module_attribute mattr;
    char *name;
    unsigned long address;
};

struct module_sect_attrs {
    struct attribute_group grp;
    unsigned int nsections;
    struct module_sect_attr attrs[0];
};

static unsigned long get_section_addr(const char *section_name) {
    unsigned long section_addr = 0;
    unsigned int nsections = THIS_MODULE->sect_attrs->nsections;
    struct module_sect_attr* sect_attr = THIS_MODULE->sect_attrs->attrs;

    unsigned int i;
    for (i = 0; i < nsections; i++) {
        if (strcmp((sect_attr + i)->name, section_name) == 0) {
            section_addr = (sect_attr + i)->address;
        }
    }

    return section_addr;
}

Code by peachykeen. The code is pretty straightforward: the data structures are not exported by the kernel, so we have to define them manually. We iterate over the section attributes of the current module, compare each section name with the requested one, and return the matching section address. If no match is found, we return 0.

Kernel versions v5.19-v6.13

Some changes to the structures in the kernel were introduced in version 5.19, so some corrections are needed. In order to find what I had to change in the code, I used the Elixir cross referencer, which is a powerful tool for navigating the kernel source and its data structures. Bootlin hosts a version on the web for the Linux kernel, along with multiple other big projects.

So I investigated a bit with elixir and found the changes made to the code, where the module_attribute structure was removed in favor of bin_attribute. The section name is now embedded in bin_attribute.

struct module_sect_attr {
    struct bin_attribute battr;
    unsigned long address;
};

struct module_sect_attrs {
    struct attribute_group grp;
    unsigned int nsections;
    struct module_sect_attr attrs[];
};

static unsigned long get_section_addr(const char *section_name) {
    unsigned long section_addr = 0;
    unsigned int nsections = THIS_MODULE->sect_attrs->nsections;
    struct module_sect_attr *sect_attr = THIS_MODULE->sect_attrs->attrs;

    unsigned int i;
    for (i = 0; i < nsections; i++) {
        if (strcmp((sect_attr + i)->battr.attr.name, section_name) == 0) {
            section_addr = (sect_attr + i)->address;
            break;
        }
    }

    return section_addr;
}

With some minimal changes, the code still works.

Kernel versions v6.14-v6.17

In version 6.14 the structure was simplified further, with the bin_attribute struct now being used directly. The attribute group should now be used to iterate over the sections, and the private field of each bin_attribute holds the section address.

struct module_sect_attrs {
    struct attribute_group grp;
    struct bin_attribute attrs[];
};

static void *get_section_addr(const char *section_name) {
    void *section_addr = NULL;
    struct bin_attribute **bin_attr;

    for (bin_attr = THIS_MODULE->sect_attrs->grp.bin_attrs; *bin_attr; bin_attr++) {
        if (strcmp((*bin_attr)->attr.name, section_name) == 0) {
            section_addr = (*bin_attr)->private;
            break;
        }
    }

    return section_addr;
}
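To try out the traversal pattern without loading a module, one can mimic it in userland. The sketch below uses simplified stand-in structures (all names prefixed with mock_ are made up and are not the kernel definitions) and walks a NULL-terminated array of attribute pointers in the same way the v6.14 code walks grp.bin_attrs.

```c
#include <stddef.h>
#include <string.h>

// Simplified userland stand-ins, NOT the kernel definitions: only the
// fields used by the lookup are kept.
struct mock_attribute {
    const char *name;
};

struct mock_bin_attribute {
    struct mock_attribute attr;
    void *private; // In the v6.14 layout this holds the section address
};

// Walk a NULL-terminated array of attribute pointers and return the
// address stored for the matching section name, or NULL if none matches.
void *mock_get_section_addr(struct mock_bin_attribute **bin_attrs,
                            const char *section_name) {
    for (struct mock_bin_attribute **it = bin_attrs; *it; it++) {
        if (strcmp((*it)->attr.name, section_name) == 0)
            return (*it)->private;
    }
    return NULL;
}

// Sample data standing in for a module's section attributes.
int mock_text_stub, mock_bisi_stub;
struct mock_bin_attribute mock_text = { { ".text" }, &mock_text_stub };
struct mock_bin_attribute mock_bisi = { { "bisi" }, &mock_bisi_stub };
struct mock_bin_attribute *mock_attrs[] = { &mock_text, &mock_bisi, NULL };
```

Looking up "bisi" in mock_attrs returns the address of mock_bisi_stub, while an unknown name yields NULL, mirroring the behavior of the kernel-side function.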

As one can see, Linux development can bring sudden and numerous changes. The change in v6.14 was minimal, but the removal of the section address field stumped me a bit. I had to read through the commit messages in order to understand why it was removed. The reason was explained clearly in the commit message, and the commits were atomic, so it was easy to follow.

So with this we went over how to access section addresses of a module in the Linux kernel, which was a fun exercise to do. Of course, no one knows what the future holds, so this article might be outdated at any time.

Links