Address Binding

When a binary is executed, it must be loaded into main memory within the context of a process so that the CPU can begin executing its instructions.

When a program's source code is compiled, the compiler can emit absolute or relocatable addresses. Absolute addresses can only be used if the location where the program will run in memory is known at compile time. If it is not known, relocatable addresses must be used.

  • In most cases, relocatable addresses are used. Reserving a fixed location for a binary precludes other binaries from using that address range. If, for example, two binaries are both set to run at address 0x8DEF0000 and one is already loaded, the other cannot be loaded while the first is running.

  • Modern systems generally build position-independent binaries, which can run at any location in main memory.
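To make the conflict concrete, here is a minimal sketch of a toy loader. It is not how a real OS loader works: `ToyLoader`, its methods, the 4 GiB memory size, and the 0x10000 binary size are all hypothetical names and values chosen for illustration. Only the address 0x8DEF0000 comes from the example above.

```python
class ToyLoader:
    """Toy model of main memory as a list of occupied address ranges."""

    def __init__(self, memory_size):
        self.memory_size = memory_size
        self.used = []  # half-open (start, end) ranges already occupied

    def _is_free(self, start, size):
        end = start + size
        if start < 0 or end > self.memory_size:
            return False
        # Free only if the range overlaps none of the occupied ranges.
        return all(end <= s or start >= e for s, e in self.used)

    def load_absolute(self, base, size):
        """Load at the address fixed at compile time, or fail."""
        if not self._is_free(base, size):
            raise MemoryError(f"address {base:#x} is already in use")
        self.used.append((base, base + size))
        return base

    def load_relocatable(self, size, align=0x1000):
        """Load at the first free, aligned base address that fits."""
        for base in range(0, self.memory_size - size + 1, align):
            if self._is_free(base, size):
                self.used.append((base, base + size))
                return base
        raise MemoryError("no free region large enough")


loader = ToyLoader(memory_size=0x1_0000_0000)  # assumed 4 GiB of memory

# The first binary with a fixed address loads fine.
first_base = loader.load_absolute(0x8DEF0000, size=0x10000)

# A second binary compiled for the same address cannot be loaded.
try:
    loader.load_absolute(0x8DEF0000, size=0x10000)
    second_loaded = True
except MemoryError:
    second_loaded = False

# A relocatable binary simply goes wherever there is room.
relocated_base = loader.load_relocatable(size=0x10000)
```

The relocatable binary does not care which base it receives, which is why the loader is free to pick one.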

ELF-formatted binaries use a PLT (Procedure Linkage Table) and GOT (Global Offset Table) to facilitate dynamic linking and loading. Dynamically linked ELF binaries typically use lazy binding: at link time, the linker adds an entry to the relocation table, confirms that a shared library provides the symbol's definition, and creates an incomplete entry in the GOT. Initially, the GOT entry is unresolved. The relocation entry records the symbol being referenced and the relocation type (as defined by the ABI) used for dynamic linking (e.g. JUMP_SLOT); the shared libraries the binary depends on are listed in its dynamic section.

To read more about dynamic linking, check out my article on PLT and GOT.

The linker will link object files together and organize sections in the binary; the final result is an executable.

At load time, the loader must bind each relocatable address to an absolute address. This binding is essentially a mapping from a relocatable address to an absolute one.
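As a rough sketch of load-time binding, assume each relocatable address is simply an offset from the start of the program, and the loader adds the chosen load base to it. The symbol names, offsets, and the helper `bind_at_load_time` are hypothetical.

```python
def bind_at_load_time(relocatable_symbols, load_base):
    """Bind each relocatable offset to an absolute address."""
    return {
        name: load_base + offset
        for name, offset in relocatable_symbols.items()
    }


# Hypothetical offsets, relative to the start of the program image.
relocatable_symbols = {"main": 0x0000, "helper": 0x0040, "counter": 0x2000}

# The same binary can be bound to different bases on different runs.
bound_low = bind_at_load_time(relocatable_symbols, load_base=0x00400000)
bound_high = bind_at_load_time(relocatable_symbols, load_base=0x8DEF0000)
```

The offsets never change; only the base chosen by the loader does.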

Note: When code is linked dynamically, the GOT holds the absolute memory address at which the code is loaded. Initially, this binding does not exist; the first call through the PLT resolves it, and subsequent calls find a resolved entry in the GOT.
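The lazy behaviour described in the note can be sketched in a few lines. This is a simplified model, not the real mechanism: `ToyProcess`, `call_via_plt`, and the address given to `puts` are hypothetical, and an unresolved GOT entry is modelled as a missing dictionary key, whereas a real unresolved entry points back into the PLT stub.

```python
class ToyProcess:
    """Toy model of lazy binding through a PLT and GOT."""

    def __init__(self, shared_library):
        self.shared_library = shared_library  # symbol -> loaded address
        self.got = {}                         # symbol -> resolved address
        self.resolver_calls = 0

    def _resolve(self, symbol):
        # Stand-in for the dynamic linker's resolver routine.
        self.resolver_calls += 1
        address = self.shared_library[symbol]
        self.got[symbol] = address            # patch the GOT entry
        return address

    def call_via_plt(self, symbol):
        # Resolved entry: jump straight to the address in the GOT.
        if symbol in self.got:
            return self.got[symbol]
        # Unresolved entry: go through the resolver once.
        return self._resolve(symbol)


libc = {"puts": 0x7F0000001000}               # hypothetical load address
process = ToyProcess(libc)

first_call = process.call_via_plt("puts")     # resolves and fills the GOT
second_call = process.call_via_plt("puts")    # served from the GOT
```

After the two calls, `process.resolver_calls` is 1: only the first call paid the cost of resolution.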

If a process can be moved during execution from one memory segment to another, binding must be delayed until run time. Special hardware is required to facilitate this.

Logical and Physical Address

A logical address is an address generated by the CPU. A physical address is the address seen by the memory unit, that is, the one loaded into the memory-address register. The MMU (memory-management unit) translates logical addresses into physical addresses.

When address binding occurs at compile time or load time, the logical and physical addresses are identical.

The execution-time binding scheme results in differing logical and physical addresses; in this case, the logical address is referred to as a virtual address.

When relocatable-to-absolute binding occurs at compile or load time: Relocatable Address -> Absolute Address, where Absolute Address = Logical Address = Physical Address.

When relocatable-to-absolute binding is postponed until run time (because the program may move between memory segments): Logical Address = Virtual Address, and each Virtual Address is bound to an absolute address at run time (Virtual Address -> Absolute Address = Physical Address).

Logical Address Space = set of all logical addresses generated by a program.

Putting it Together

The base register, which holds the lowest legal address, is also referred to as the relocation register. The value in the relocation register is added to every address generated by a user process when that address is sent to memory.

If the base value is 14000 and a program accesses address 346, the access is mapped to 14346. The program can create a pointer to 346, store it in memory, manipulate it, and compare it with other addresses, all using the number 346. Only when it is used as a memory address (i.e. in an indirect load/store) is it relocated relative to the base address, becoming 14346.
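The 14000 + 346 example can be sketched as follows. The class `RelocationRegister` and its method name are hypothetical, and the model deliberately ignores everything a real MMU does beyond adding the base.

```python
class RelocationRegister:
    """Toy MMU: adds the base to every logical address sent to memory."""

    def __init__(self, base):
        self.base = base

    def to_physical(self, logical_address):
        return self.base + logical_address


mmu = RelocationRegister(base=14000)

# The program only ever handles the logical value 346.
pointer = 346
next_pointer = pointer + 4          # pointer arithmetic stays logical
is_same = pointer == 346            # comparisons use logical values too

# Relocation happens only when the address is actually used for access.
physical = mmu.to_physical(pointer)
```

Here `physical` is 14346, while `pointer` is still 346 from the program's point of view.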

User programs only use logical or virtual addresses.

Logical/Virtual Addresses (range) -> 0 to MAX

Physical Addresses (range) -> R + 0 to R + MAX, where R is the value in the relocation register

MAX is 2^n - 1, where n is the number of address bits, so the address space holds 2^n addresses. On a 32-bit system, the address space holds 2^32 addresses; on a 64-bit system, it holds 2^64.
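As a quick arithmetic check, an n-bit address space holds 2^n addresses, numbered 0 through 2^n - 1. The helper names below are hypothetical, and the base of 14000 is reused from the relocation example purely for illustration.

```python
def address_space_size(bits):
    """Number of distinct addresses in an n-bit address space."""
    return 2 ** bits


def highest_address(bits):
    """Largest address, since numbering starts at 0."""
    return 2 ** bits - 1


def physical_range(relocation_base, max_logical):
    """Physical range R + 0 to R + MAX for relocation value R."""
    return relocation_base, relocation_base + max_logical


size_32 = address_space_size(32)    # 4294967296 addresses (4 GiB)
size_64 = address_space_size(64)    # 18446744073709551616 addresses
top_32 = highest_address(32)        # 0xFFFFFFFF

# With R = 14000 and a small logical space of 0..999:
low, high = physical_range(14000, 999)
```

In that last case the physical range is 14000 to 14999.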
