EVM Internals Every Smart Contract Auditor Should Know
Long time no see, I'm back again! Been deep in the EVM trenches lately. Today, we're unpacking one of the most slept-on but crucial parts of smart contract dev: the EVM memory layout. If you've ever been confused by stack vs memory vs storage, you're not alone. Let's break it down and make it click.
What Is the EVM?
The Ethereum Virtual Machine (EVM) is the runtime environment for smart contracts on Ethereum. It's stack-based and operates with:
- Memory (temporary, fresh for every call and discarded when the call ends)
- Storage (persistent, costly)
- Calldata (read-only input data)
- Stack (limited to 1024 slots of 32 bytes each)
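To make the four data areas concrete, here is a minimal sketch of one function that touches all of them. The contract and function names (`DataLocations`, `sumAndStore`, `storedCount`) are made up for illustration and are not from any library.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract; all names are illustrative assumptions.
contract DataLocations {
    uint256[] private stored; // storage: persists between transactions

    function sumAndStore(uint256[] calldata input) external returns (uint256 total) {
        // calldata: read-only view of the bytes sent with the call
        uint256[] memory copy = input; // memory: a fresh copy, gone when the call returns
        for (uint256 i = 0; i < copy.length; i++) {
            total += copy[i]; // `total` and `i` are value types and live on the stack
        }
        stored.push(total); // storage write: the persistent (and expensive) part
    }

    function storedCount() external view returns (uint256) {
        return stored.length;
    }
}
```

Only the `stored.push` line survives the transaction; everything else is scratch work for the duration of the call.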
EVM Memory Layout
Despite its massive theoretical size, EVM memory is a plain byte-addressable array that is read, charged for, and expanded in 32-byte words (256 bits), and it only grows as needed. By convention, Solidity reserves the first 128 bytes of memory for special purposes. The layout is as follows:
- 0x00–0x3F (64 bytes): Scratch space. A temporary workspace used by compiler-generated code for short-lived values (for example, staging the input to the KECCAK256/SHA3 opcode when computing a mapping key). Contracts and inline assembly can use this area for ad-hoc computation, but its contents are not guaranteed to be zero and will be overwritten frequently.
- 0x40–0x5F (32 bytes): Free memory pointer. This word stores the current “allocated memory size”, i.e., the lowest unused memory address. Solidity initializes this pointer to 0x80 at the start of execution. All dynamic allocations (arrays, structs, etc.) are taken from this pointer, which must be updated (by writing the new value to 0x40 with MSTORE) as memory is consumed.
- 0x60–0x7F (32 bytes): Zero slot. By convention, this slot is always zero. Solidity uses it as the initial value for dynamic memory arrays (e.g., an empty array or string starts with length 0). Compiler-generated code assumes this slot remains zero and never writes to it. The EVM itself does not enforce this, so inline assembly that writes here can break the assumption.
- 0x80 and above: Dynamic memory area. All other memory (from 0x80 onwards) is available for actual data. New objects are placed starting at the current free pointer. For example, a dynamic array in memory is laid out as [ length (32 bytes) | element0 | element1 | … ], with each element taking a full 32-byte word. No packing: every array element occupies its own 32-byte slot (bytes and string are the exception).
These conventions can be summarized as:
- 0x00–0x3F: Scratch space (ephemeral; used for hashing, etc.)
- 0x40–0x5F: Free memory pointer (points to next free byte; initially 0x80)
- 0x60–0x7F: Zero slot (always zero; initial value for dynamic data)
- 0x80+: Heap (dynamic memory for variables, arrays, structs, strings, etc.)
These ranges are a convention established by the Solidity compiler, not a feature of the EVM itself. For example, at startup, compiler-generated code runs the equivalent of mstore(0x40, 0x80) (in bytecode, typically PUSH1 0x80 PUSH1 0x40 MSTORE) to initialize the pointer. From then on, each time memory is allocated, the pointer is read, advanced, and written back.
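One way to check the reserved slots yourself is a tiny probe function. This is a sketch under the assumption of a 0.8.x compiler and a function with no memory arguments; `MemoryLayoutProbe` and `reservedSlots` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical probe contract; names are illustrative assumptions.
contract MemoryLayoutProbe {
    /// Reads the free memory pointer and the zero slot before anything is allocated.
    function reservedSlots() external pure returns (uint256 freePtr, uint256 zeroSlot) {
        assembly ("memory-safe") {
            freePtr := mload(0x40)  // free memory pointer (word at 0x40-0x5f)
            zeroSlot := mload(0x60) // zero slot (word at 0x60-0x7f)
        }
    }
}
```

Because nothing has been allocated when the assembly block runs, the call should return 0x80 for the pointer and 0 for the zero slot.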
Scratch Space (0x00–0x3F)
The first 64 bytes (addresses 0x00 to 0x3F) serve as a scratch pad. Solidity-generated code uses it for short-lived values. A common case is hashing: when computing a mapping's storage slot, the compiler writes the key and the slot number to 0x00–0x3F and then runs KECCAK256 (alias SHA3) over that range. The opcode itself reads from whatever offset and length it is given and pushes the 32-byte hash onto the stack; it does not write to memory. Opcodes such as CALLDATALOAD or the arithmetic instructions work purely on the stack and never touch scratch space. Crucially, this area is not guaranteed to be zero when you read it: whatever a previous operation left there stays until it is overwritten. Smart contract code (especially inline assembly) can use scratch space for temporary variables but should treat it as ephemeral. When an operation needs more than 64 bytes of temporary space, the compiler places the data at the free pointer without advancing it (complex ABI encoding might do this, for example), which is why memory at the free pointer is not guaranteed to be zeroed either.
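As an illustration of scratch-space use, here is a minimal sketch that hashes two words without allocating anything on the heap, next to a reference version that does allocate. `ScratchHash`, `hashPair`, and `hashPairReference` are hypothetical names chosen for this example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract; names are illustrative assumptions.
contract ScratchHash {
    /// Hashes two words using only the 64-byte scratch space at 0x00-0x3f.
    function hashPair(bytes32 a, bytes32 b) public pure returns (bytes32 h) {
        assembly ("memory-safe") {
            mstore(0x00, a)            // first word of scratch space
            mstore(0x20, b)            // second word of scratch space
            h := keccak256(0x00, 0x40) // hash 64 bytes; the result goes to the stack
        }
    }

    /// Reference version: abi.encodePacked builds its buffer on the heap instead.
    function hashPairReference(bytes32 a, bytes32 b) public pure returns (bytes32) {
        return keccak256(abi.encodePacked(a, b));
    }
}
```

Both functions hash the same 64 bytes, so they return the same digest; the first one simply leaves the free memory pointer untouched.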
Free Memory Pointer (0x40–0x5F)
At address 0x40 (the word 0x40–0x5F), Solidity keeps the free memory pointer, a 32-byte integer equal to the next free memory offset. The EVM knows nothing about it; it is purely a compiler convention. At the beginning of execution, this pointer is set to 0x80. When your code needs to allocate memory, it does so by reading this pointer, writing data at that location, and then updating the pointer. For example, in Yul one might write:
let ptr := mload(0x40) // load the free memory pointer (initially 0x80)
mstore(ptr, 0xDEADBEEF) // store a 32-byte value at [ptr .. ptr+31]
let newPtr := add(ptr, 0x20) // advance pointer by 32 bytes
mstore(0x40, newPtr) // write back the new free memory pointer
Each MSTORE (like every other opcode that touches memory) expands memory if needed. Assuming the pointer started at 0x80, it will be 0xA0 after this sequence, meaning the next allocation will start at 0xA0. Solidity always places new objects at the free pointer and never reuses or frees memory. Forgetting to update the free pointer after a manual allocation lets the next allocation overwrite your data.
Zero Slot (0x60–0x7F)
The word at 0x60–0x7F is the dedicated zero slot. Like all memory, it is zero at the start of each call, and compiler-generated code never writes to it. Solidity relies on this convention for dynamic data: a dynamic memory array that has been declared but not yet allocated points at 0x60, so reading its length yields 0. In short, this slot remains a constant source of zeros. For auditors, any inline assembly that writes to 0x60–0x7F is a red flag, because it silently changes what every such “empty” array or string looks like.
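The zero-slot convention can be observed by asking where an unallocated memory array points. The sketch below assumes a 0.8.x compiler; `ZeroSlotProbe` and `uninitializedArray` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical probe contract; names are illustrative assumptions.
contract ZeroSlotProbe {
    /// Returns the memory address and length of a declared-but-unallocated array.
    function uninitializedArray() external pure returns (uint256 ptr, uint256 len) {
        uint256[] memory arr; // declared, never assigned with `new`
        assembly ("memory-safe") {
            ptr := arr // the memory address the variable holds
        }
        len = arr.length; // read through the pointer: the word stored at `ptr`
    }
}
```

With current compilers the pointer is expected to be 0x60, which is why the length reads as 0 without any allocation having taken place.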
Dynamic Memory Area (0x80 and Beyond)
All memory from 0x80 onward is a general-purpose heap. This is where user data lives: reference-type local variables (arrays, structs), contents of string/bytes, temporary arrays, return buffers, etc. (Value-type locals such as uint256 live on the stack.) Solidity’s convention (and compiler implementation) is to allocate everything from the free pointer onward. For instance, uint256[] memory arr = new uint256[](n) is laid out as a 32-byte length word holding n, followed by n 32-byte elements. The length is stored at the first slot, and then elements occupy consecutive words. Even if the element type is smaller (e.g. uint32), each occupies a full 32-byte word. (The only exceptions are bytes and string, which are treated as byte arrays: they still start with a 32-byte length, but their contents are packed byte by byte rather than padded to one word per element.) Multi-dimensional arrays are arrays of pointers to their sub-arrays in memory, and each pointer is itself a 32-byte value.
It’s important to remember: memory variables are word-aligned. You can index into memory by bytes (e.g. mload(0x95) reads 32 bytes starting at 0x95) but effectively each variable takes up a whole 32-byte slot. This simplifies addressing at the cost of some wasted space for small types.
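The layout described above can be read back word by word. The following sketch uses a `uint32[]` to show the one-word-per-element rule and a `bytes` value to show the packed exception; `ArrayLayoutProbe` and `inspect` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical probe contract; names are illustrative assumptions.
contract ArrayLayoutProbe {
    function inspect()
        external
        pure
        returns (uint256 len, uint256 first, uint256 second, uint256 packed)
    {
        uint32[] memory arr = new uint32[](2);
        arr[0] = 7;
        arr[1] = 9;
        bytes memory raw = hex"aabb";

        assembly ("memory-safe") {
            len := mload(arr)               // word 0: the length
            first := mload(add(arr, 0x20))  // word 1: element 0 in its own 32-byte word
            second := mload(add(arr, 0x40)) // word 2: element 1 in its own 32-byte word
            // bytes are packed: both bytes sit at the very start of the first data word
            packed := shr(240, mload(add(raw, 0x20)))
        }
    }
}
```

The call returns (2, 7, 9, 0xaabb): two small integers spread over two full words, and two raw bytes sharing the top of a single word.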
Memory Operations (MLOAD, MSTORE, etc.)
The EVM provides a handful of opcodes to work with memory:
- MLOAD: Pop an address off the stack, then push the 32-byte word at that memory address. (Cost: 3 gas base, plus expansion cost if this access expands memory).
- MSTORE: Pop an address and a 32-byte value, and store the value at that memory address (overwriting bytes [addr..addr+31]). (Cost: 3 gas base, plus expansion cost).
- MSTORE8: Pop an address and a value, and store the value's low-order byte (value mod 256) at that address. (Cost: 3 gas base, plus expansion cost if the address lies beyond the current memory size).
- MSIZE: Push the current memory size in bytes, which is always a multiple of 32 because memory grows in whole words. For example, if the highest accessed byte is 0x9F, MSIZE returns 0xA0. (Cost: 2 gas).
- RETURN/REVERT: These pop (offset, length) and return or revert with that memory segment as output. They also incur the memory expansion cost if needed.
- CALLDATACOPY/CODECOPY/EXTCODECOPY/RETURNDATACOPY: Copy data into memory (e.g. calldata into memory), also causing expansion as they touch memory.
- KECCAK256: Pop (offset, length), hash the memory range [offset..offset+length-1], and push the 32-byte result onto the stack. It reads memory (and can trigger expansion) but never writes to it. (Cost: 30 gas base plus 6 gas per 32-byte word hashed, plus expansion cost).
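A short sketch of how MSTORE8 and an unaligned MLOAD behave, using temporary memory at the free pointer. `MemoryOpsDemo` and `demo` are hypothetical names, and the two clearing writes are there only because memory at the free pointer may be dirty.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract; names are illustrative assumptions.
contract MemoryOpsDemo {
    function demo() external pure returns (uint256 word, uint256 shifted) {
        assembly ("memory-safe") {
            let ptr := mload(0x40)        // temporary area; the pointer is not advanced
            mstore(ptr, 0)                // clear bytes [ptr .. ptr+31]
            mstore(add(ptr, 0x20), 0)     // clear the next word, which the unaligned read overlaps
            mstore8(ptr, 0x1AB)           // only the low byte 0xAB is written, at address ptr
            word := mload(ptr)            // 0xAB ends up in the most significant byte
            shifted := mload(add(ptr, 1)) // unaligned read of bytes [ptr+1 .. ptr+32]
        }
    }
}
```

`word` comes back as 0xAB shifted left by 248 bits, and `shifted` is 0 because the read window starting one byte later no longer includes the byte that was written.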
Whenever any opcode reads from or writes to a memory address beyond the current MSIZE, the memory is “expanded”. The total cost of a memory that is a words long (size in bytes, rounded up to a multiple of 32, divided by 32) is 3·a + floor(a² / 512) gas, and an expansion is charged the difference between the new total and the old one. The quadratic term rounds down to zero below 23 words (roughly 724 bytes), so small memory grows at a flat 3 gas per word, while very large allocations see a rapidly growing surcharge. For example, expanding from empty to 1 KB (32 words) costs 3·32 + floor(1024 / 512) = 98 gas. Once a region of memory has been expanded, further loads/stores within it cost only the flat fee (3 gas for MLOAD/MSTORE) with no extra.
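The expansion charge can be written down as a small calculator. This sketch implements the standard memory cost formula, 3 gas per word plus words squared divided by 512 (rounded down); `MemoryGasModel`, `totalCost`, and `expansionCost` are hypothetical names, and the contract only models the fee rather than measuring it.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical calculator contract; names are illustrative assumptions.
contract MemoryGasModel {
    /// Total gas attributed to a memory of `words` 32-byte words.
    function totalCost(uint256 words) public pure returns (uint256) {
        return 3 * words + (words * words) / 512;
    }

    /// Gas charged when the memory size grows from `oldBytes` to `newBytes`.
    function expansionCost(uint256 oldBytes, uint256 newBytes) public pure returns (uint256) {
        uint256 oldWords = (oldBytes + 31) / 32; // round up to whole words
        uint256 newWords = (newBytes + 31) / 32;
        if (newWords <= oldWords) return 0;      // already paid for
        return totalCost(newWords) - totalCost(oldWords);
    }
}
```

For instance, `expansionCost(0, 1024)` evaluates to 98, while `expansionCost(0, 1048576)` (1 MiB) evaluates to 2,195,456, which shows how quickly the quadratic term takes over.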
Allocation Flow: Step-by-Step
Putting it all together, a typical memory allocation sequence works like this (often in Yul/assembly):
- Read the free pointer: ptr := mload(0x40). By default this will be 0x80 on first use.
- Use memory at ptr: For example, mstore(ptr, value) or copy data into [ptr..ptr+N-1].
- Compute new pointer: newPtr := ptr + (size_needed). Always round up to a multiple of 32 bytes.
- Update the free pointer: mstore(0x40, newPtr)
This advances the heap for future allocations. For example:
assembly {
    let ptr := mload(0x40)     // step 1: read the free pointer
    mstore(ptr, 0x1234)        // step 2: write 0x1234 at [ptr..ptr+31]
    let next := add(ptr, 0x20) // step 3: compute the new pointer (32 bytes further)
    mstore(0x40, next)         // step 4: update the free memory pointer
}
After this code, memory from ptr to ptr+31 holds 0x1234, and the new free pointer is 32 bytes higher. If multiple words or bytes are stored, advance by the appropriate amount (e.g. add(ptr, 0x40) to allocate 64 bytes, etc.).
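The same flow can be wrapped in a reusable helper that also handles the rounding. This is a minimal sketch, not a hardened allocator: `AllocatorDemo`, `allocate`, and `demo` are hypothetical names, and the helper does not guard against an absurdly large `size` overflowing the addition.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract; names are illustrative assumptions.
contract AllocatorDemo {
    /// Reserves `size` bytes, rounded up to a multiple of 32, and returns their address.
    function allocate(uint256 size) internal pure returns (uint256 ptr) {
        assembly ("memory-safe") {
            ptr := mload(0x40) // read the free pointer
            // round size up to the next multiple of 32, then write the new pointer back
            mstore(0x40, add(ptr, and(add(size, 0x1f), not(0x1f))))
        }
    }

    function demo() external pure returns (uint256 first, uint256 second, uint256 end) {
        first = allocate(5);   // 5 bytes requested, 32 reserved
        second = allocate(33); // 33 bytes requested, 64 reserved
        assembly ("memory-safe") {
            end := mload(0x40) // where the next allocation would start
        }
    }
}
```

In `demo`, the second block starts 0x20 bytes after the first, and the free pointer ends up 0x40 bytes after the second, matching the round-up rule from step 3.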
Gas Implications
From a gas perspective, memory behaves predictably. Each MLOAD or MSTORE has a flat cost of 3 gas. If the operation touches new memory words, a memory expansion cost is added: 3 gas per additional 32-byte word, plus a quadratic surcharge (words² / 512, rounded down) that only becomes nonzero once you use more than about 724 bytes. Concretely, small allocations (a few dozen words) cost almost exactly linearly. Beyond ~700 bytes the cost grows slightly faster, but even 1 KB of memory costs only 98 gas in total. After memory is expanded to a certain size, further reads/writes within that size carry no extra expansion fee, only the 3-gas base.
In contrast, storage costs are orders of magnitude higher: writing a nonzero value to a previously empty storage slot costs about 20,000 gas (22,100 for a cold slot since the Berlin upgrade), and even updating an existing nonzero slot costs about 5,000 gas. This stark difference is why large, temporary data structures should use memory, and only final results (or persistent state) should be written to storage. Auditors should also note that reading memory that was never written is allowed: a fresh word beyond MSIZE reads as zero (and the read itself pays for the expansion), but memory at or past the free pointer may hold leftovers from earlier temporary use. Unwritten storage, by contrast, always reads as zero.
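A rough way to see the gap is to bracket a memory write and a storage write with `gasleft()`. This is only a sketch: the absolute numbers include measurement overhead and depend on compiler settings and on whether the slot is cold, so treat them as indicative. `MemoryVsStorage`, `measure`, and `counter` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical measurement contract; names are illustrative assumptions.
contract MemoryVsStorage {
    uint256 private counter;

    function measure() external returns (uint256 memoryGas, uint256 storageGas) {
        uint256 readBack;

        uint256 startGas = gasleft();
        assembly ("memory-safe") {
            let ptr := mload(0x40)
            mstore(ptr, 1)         // temporary write at the free pointer
            readBack := mload(ptr) // read it back so the write is actually used
        }
        memoryGas = startGas - gasleft();

        startGas = gasleft();
        counter += readBack;       // one storage read plus one storage write
        storageGas = startGas - gasleft();
    }
}
```

On a first call the storage figure should land in the tens of thousands of gas, while the memory figure stays in the tens.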
Best Practices and Takeaways
- Respect Solidity’s conventions: Avoid using the reserved slots (0x00–0x7F) except as allowed (scratch space for short-lived values only), and always maintain the free memory pointer at 0x40. If you write inline assembly that allocates, advance the free pointer past everything you want to keep before handing control back to Solidity code.
- Use memory for transient data: Calculations, temporary buffers, return values, logs, and call inputs all belong in memory. Remember that data in memory is cleared after the call, so don’t try to rely on it across calls.
- Align and manage allocations: Always round up to 32-byte boundaries when allocating. Track msize or the free pointer to know how much memory is in use. Be cautious of excessively large allocations, as they increase gas nonlinearly beyond certain thresholds.
- Profiling memory usage: Tools and manual reviews should confirm that memory expansion is minimized. For example, accumulating data in a bytes array in a tight loop will grow memory and cost more gas; it is often cheaper to reuse one fixed-size buffer or to operate on smaller pieces.
- Understand gas costs: Remember that each MLOAD/MSTORE costs 3 gas, and memory expansion costs ~3 gas/word. Optimize your code to reuse memory space if possible, and limit unnecessary writes.
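To track how much memory a piece of code consumes, one option is to compare the free pointer before and after it runs. The sketch below does this for a loop that hashes through `abi.encode`; the exact growth per iteration depends on the compiler version and optimizer settings, so the figure is something to observe rather than rely on. `AllocationTracker`, `freePtr`, and `growth` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical profiling contract; names are illustrative assumptions.
contract AllocationTracker {
    function freePtr() private pure returns (uint256 p) {
        assembly ("memory-safe") {
            p := mload(0x40) // current free memory pointer
        }
    }

    /// Returns how far the free pointer moved while running `rounds` iterations.
    function growth(uint256 rounds) external pure returns (uint256 bytesAllocated, bytes32 acc) {
        uint256 start = freePtr();
        for (uint256 i = 0; i < rounds; i++) {
            // abi.encode builds a new bytes buffer in memory on each iteration
            acc = keccak256(abi.encode(acc, i));
        }
        bytesAllocated = freePtr() - start;
    }
}
```

Since Solidity never frees memory, buffers allocated inside a loop typically pile up; staging the two words in scratch space instead would keep the pointer where it was.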
In summary, the EVM memory model is simple but powerful: a linear byte-array with a small reserved setup. Mastering the memory layout and allocation patterns is essential for writing efficient contracts and for auditing low-level code. By knowing exactly how the scratch space, free pointer, and heap are organized, auditors can better understand security implications (e.g. pointer overwrites) and gas consequences of memory usage.