Collisions of Solidity Storage Layouts

Author: MixBytes team
This is the second article in the series "Upgradable Smart Contracts: Storage Highlights and Challenges". The first part can be found here.
This time we will take a closer look at Solidity data storage approaches and potential pitfalls of using proxy contracts.
Delegatecall & Collisions of Solidity Storage Layouts
Let me remind you what the EVM storage model looks like and how Solidity uses it to store elementary-type variables, arrays, and mappings.

Ethereum smart contract storage is a mapping from uint256 to uint256. A uint256 value is 32 bytes wide; in the Ethereum context this fixed-size value is called a slot. The model resembles virtual random access memory, except that addresses are 256 bits wide (rather than the usual 32 or 64 bits) and each address holds 32 bytes instead of one. A 256-bit address space leaves enough room for a well-known Solidity trick: any 256-bit hash can be used as an address. We will come back to it later. This approach to data storage is quite extravagant and differs from the one used in WebAssembly, but its efficiency is out of the scope of this article.
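To make the "mapping from uint256 to uint256" model tangible, here is a minimal sketch that reads and writes arbitrary slots with the sload and sstore instructions. The contract name RawStorage and its functions are hypothetical, and we assume a 0.8.x compiler, which is newer than the one used at the time of the audit.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper: exposes raw slot access to show that storage is
// simply a table of 32-byte words indexed by 256-bit slot numbers.
contract RawStorage {
    // Write a 32-byte word into an arbitrary slot.
    function write(uint256 slot, uint256 value) external {
        assembly {
            sstore(slot, value)
        }
    }

    // Read a 32-byte word from an arbitrary slot.
    // Slots that were never written read as zero.
    function read(uint256 slot) external view returns (uint256 value) {
        assembly {
            value := sload(slot)
        }
    }
}
```

No allocation step is needed before write: any of the 2^256 slots can be used right away.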

During the execution of an ordinary computer program, RAM has to be managed so that different variables and data structures do not "collide" and corrupt each other's data. This task is usually handled by memory allocators, which expose an API (malloc, free, new, delete, and other functions). Allocators are also responsible for storing records compactly so that data does not litter the address space.

Solidity has no storage allocator, and these tasks are handled differently. State variables are stored in slots, starting from slot 0 and continuing sequentially. An elementary fixed-size value type occupies at most one slot, and several small values can sometimes be packed into a single slot and unpacked on the fly. To store a dynamic array, Solidity records the element count in the next free slot (let us call it the "header slot"). The elements themselves are located starting at the address equal to the keccak256 hash of the header slot number. This is similar to dynamic array storage in C++ and Java, where the array data lives in a separate memory location referenced by the main structure. The only difference is that Solidity does not store this pointer anywhere. This is possible because we can write to any storage location without allocating it: the whole storage belongs to the contract and is zero-initialized by default.

For instance, after calling the allocate() function in the following code:
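// State variables occupy slots in declaration order.
uint256 foo;      // slot 0
uint256 bar;      // slot 1
uint256[] items;  // slot 2 is the header slot and holds items.length

function allocate() public {
    require(items.length == 0);

    // Since Solidity 0.6.0 `length` is read-only, so the array grows via push().
    items.push(12);  // stored at keccak256(2)
    items.push(42);  // stored at keccak256(2) + 1
}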
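Storage looks like:
slot 0: foo = 0
slot 1: bar = 0
slot 2: items.length = 2
slot keccak256(2): items[0] = 12
slot keccak256(2) + 1: items[1] = 42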
We can easily check an array element address by executing the following JavaScript code:
web3.sha3('0x0000000000000000000000000000000000000000000000000000000000000002', {encoding: 'hex'})
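The same check can be sketched on-chain. The contract below is a hypothetical illustration (the names ArrayLayout, elementSlot and rawElement are ours) and assumes a 0.8.x compiler. It hard-codes header slot 2, which holds only as long as items stays the third state variable.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract mirroring the example: foo, bar, then a dynamic array.
contract ArrayLayout {
    uint256 foo;      // slot 0
    uint256 bar;      // slot 1
    uint256[] items;  // slot 2: header slot holding items.length

    function allocate() external {
        require(items.length == 0, "already allocated");
        items.push(12);
        items.push(42);
    }

    // Slot of items[index]: keccak256(header slot number) + index.
    // ASSUMPTION: the header slot is 2, as in the declaration order above.
    function elementSlot(uint256 index) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(uint256(2)))) + index;
    }

    // Read an element straight from storage, bypassing the Solidity accessor.
    function rawElement(uint256 index) external view returns (uint256 value) {
        uint256 slot = elementSlot(index);
        assembly {
            value := sload(slot)
        }
    }
}
```

After allocate() has been called, rawElement(0) and rawElement(1) should return the values stored in items[0] and items[1], and sload on slot 2 would return the array length.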
A mapping works similarly, with the only difference that every value is located separately and the hash computation also involves the corresponding key. You may worry about possible data collisions, but they are neglected in the same way as collisions of 256-bit hashes. Contract inheritance does not complicate the picture much: for contracts that use inheritance, the ordering of state variables is determined by the C3-linearized order of contracts, starting with the most base-ward contract.
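As a rough sketch of both rules, the hypothetical contracts below compute the slot of a mapping value and show how inherited variables are ordered. All names are ours; a 0.8.x compiler and a value-type key (address) are assumed. For such keys the value slot is the keccak256 hash of the key padded to 32 bytes, concatenated with the header slot number.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract: a mapping declared as the second state variable.
contract MappingLayout {
    uint256 foo;                          // slot 0
    mapping(address => uint256) balances; // slot 1: header slot, stays empty

    function set(address key, uint256 value) external {
        balances[key] = value;
    }

    // Slot of balances[key]: keccak256(padded key ++ header slot number).
    // ASSUMPTION: the header slot is 1, as in the declaration order above.
    function valueSlot(address key) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(key, uint256(1))));
    }

    // Read the value straight from storage.
    function rawValue(address key) external view returns (uint256 value) {
        uint256 slot = valueSlot(key);
        assembly {
            value := sload(slot)
        }
    }
}

// Inheritance: variables are laid out from the most base-ward contract on.
contract BaseA { uint256 a; }                   // a: slot 0
contract BaseB { uint256 b; }                   // b: slot 1 inside Derived
contract Derived is BaseA, BaseB { uint256 c; } // c: slot 2
```

Note that nothing is ever written to the mapping's own header slot; it only serves as an input to the hash.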

The rules described above are called the "Layout of State Variables in Storage" (later referred to as "Storage Layout"); the details can be found and should be checked here. Changing these rules would break backward compatibility, so we are unlikely to see any changes that affect existing smart contracts and libraries.

Now that we are familiar with how a proxy smart contract operates and with the storage layout, let's find out what can go wrong.

A particular code version that writes data to the proxy's storage has its own variables and storage layout. The next version will have its own storage layout as well, and it must be able to handle the data written according to the previous one. That's only half the trouble. Don't forget that the proxy code also has a storage layout and operates in parallel with whichever smart contract version currently gains control. Thus, the storage layouts of the proxy code and of the current smart contract version must not interfere, i.e. they cannot use the same slots for different data.
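A deliberately naive sketch shows how such interference arises. Both contracts below are hypothetical (they are not taken from the audited code) and assume a 0.8.x compiler. Each of them places its first state variable in slot 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical implementation: its first state variable lives in slot 0.
contract CounterLogic {
    uint256 public counter; // slot 0 in the implementation's layout

    function increment() external {
        counter += 1;
    }
}

// Hypothetical naive proxy: its first state variable ALSO lives in slot 0.
contract NaiveProxy {
    address public implementation; // slot 0 in the proxy's layout

    constructor(address impl) {
        implementation = impl;
    }

    // Forward every unknown call to the implementation.
    // delegatecall runs CounterLogic's code against THIS contract's storage.
    fallback() external {
        (bool ok, ) = implementation.delegatecall(msg.data);
        require(ok, "delegatecall failed");
    }
}
```

Calling increment() through the proxy executes counter += 1 against the proxy's storage, so slot 0, which the proxy interprets as the implementation address, is incremented by one. The next forwarded call then goes to a different, most likely empty, address.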

The easiest solution is to store proxy data while sidestepping the usual Solidity storage layout mechanics: use the EVM sstore and sload instructions to write and read slots with pseudorandom numbers, for instance the one returned by the hash function keccak256("my.proxy.version"). This avoids the collision.
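A minimal sketch of this technique follows. The contract and function names are hypothetical, the label "my.proxy.version" is the one from the text, and a 0.8.x compiler is assumed. The constant is copied into a local variable before the assembly block, which is the conservative way to use it there.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical proxy fragment that keeps its own data outside the
// sequential slots 0, 1, 2, ... used by the implementation.
contract UnstructuredProxyStorage {
    // Pseudorandom slot number derived from a label.
    bytes32 private constant VERSION_SLOT = keccak256("my.proxy.version");

    function setVersion(uint256 version) external {
        bytes32 slot = VERSION_SLOT;
        assembly {
            sstore(slot, version)
        }
    }

    function getVersion() external view returns (uint256 version) {
        bytes32 slot = VERSION_SLOT;
        assembly {
            version := sload(slot)
        }
    }
}
```

Because the contract declares no ordinary state variables, it occupies none of the low-numbered slots, and an implementation running via delegatecall would have to hit this exact 256-bit hash to overwrite the value.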

Another approach is to use identical storage layouts and high-level data dispute resolution, as in github.

Let's figure out what may happen if we overlook two "competing" storage layouts. Have a look at the commit 3ad8eaa6f2849dceb125c8c614d5d61e90d465a2 from the AkropolisToken repository. We performed a security audit of the token sale contracts for that company.

We noticed that both TokenProxy and the current AkropolisToken version have their own state variables (those of AkropolisToken are declared in its base contracts). We expect a collision and, consequently, serious trouble. However, a quick test disproves this guess. If we change the paused flag (which AkropolisToken inherits from Pausable) by calling the pause function, the TokenProxy state variables do not change. TokenProxy functions also keep working correctly: when we call the getter functions on the token address, they are served by the proxy itself, because they are defined in the proxy contract. As the proxy has no pause function, that call goes to the fallback function of the base UpgradeabilityProxy contract, which in turn executes a delegatecall to AkropolisToken, where pause is defined. So why is there still no collision?

We'll have to concentrate and draw up a detailed map of the TokenProxy and AkropolisToken slots, taking into account the state variable placement rules described above. We have to determine the correct order of the base contracts and bear in mind that several state variables may be packed into one slot. You can check your results in Remix: send a transaction that writes to storage, debug it, and track the storage changes.

The results are as follows:
Pay attention to slots 3 and 4. In AkropolisToken, as in TokenProxy, slot 3 stores the pendingOwner variable. However, pendingOwner is of type address and is only 20 bytes wide, so it does not fill the whole slot. The boolean flags paused and locked, which take one byte each, are therefore packed into the same slot, and this explains why paused does not collide with name. Slot 4 is the header slot of the whitelist mapping; a mapping stores nothing in its header slot, so name and whitelist do not collide either.
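The packing effect can be reproduced in isolation. The sketch below is a hypothetical contract, not the AkropolisToken code: the three variables land in slot 0 here, whereas in the audited contract they share slot 3. A 0.8.x compiler is assumed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract: an address followed by two bools shares one slot.
contract PackedSlot {
    address public pendingOwner; // slot 0, lowest 20 bytes
    bool public paused;          // slot 0, next byte above the address
    bool public locked;          // slot 0, the byte after that

    constructor() {
        pendingOwner = msg.sender;
    }

    function pause() external {
        paused = true;
    }

    // Return the whole 32-byte word so the packing can be inspected.
    function rawSlot0() external view returns (bytes32 word) {
        assembly {
            word := sload(0)
        }
    }
}
```

Comparing rawSlot0() before and after pause() should show a single byte above the 20 address bytes changing while the address itself stays intact.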

The two contracts almost escaped a collision, but one still occurs in slot 5. To illustrate our point, we wrote the following test:

https://gist.github.com/Eenae/8e9affde78e2e15dfd6e75174eb2880a

The test fails at line 42: the decimals value no longer equals 18, although according to the TokenProxy contract code this value cannot be changed.

The easiest way out is to stop using slot 5, which was done in this commit.
Conclusion
Using low-level instructions such as delegatecall demands a deep understanding of the Solidity storage layout. We briefly reviewed the subject, gave an example of possible issues, and suggested several solutions. Safe contracts to you!
  • Who is MixBytes?
    MixBytes is a team of expert blockchain auditors and security researchers specializing in providing comprehensive smart contract audits and technical advisory services for EVM-compatible and Substrate-based projects. Join us on X to stay up-to-date with the latest industry trends and insights.