It is widely acknowledged that generating secure random numbers on the Ethereum blockchain is difficult due to its deterministic nature. Each time a smart contract’s function is called inside of a transaction, it must be replayed and verified by the rest of the network. This is crucial so that it is not possible for a miner to manipulate the internal state during execution and modify the result for their own benefit. For example, if the Ethereum Virtual Machine (EVM) provided functionality to generate a random number using a cryptographically secure random source on the miner’s system, it would not be possible to confirm that the random number generated had not been manipulated by the miner. Another more important reason however, is that this would not be determinsitic and if ether is transferred or alternative code paths are taken based on decisions made inside the function as a result of the generated number, the contract’s ether balance and storage state may be inconsistent with the view of the rest of the network.
This post is the first in a three-part series where we will look at some of the techniques developers are using to generate numbers that appear to be random in the deterministic Ethereum environment, and look at how it is possible in-practice to exploit these random number generators for our advantage. Our first post will focus on generating random numbers on-chain and what the security implications of doing so are. In the remaining two posts we will review another two commonly used techniques including using oracles and participatory schemes where numbers are provided via multiple participants.
Sources of Entropy in Ethereum
We have proposed that we cannot trust a single miner to generate a “high quality” random number for our smart contract and that if a “random” number is produced, the same number must be produced when other nodes of the network execute the smart contract code for verification. One method that is commonly used is the use of a Pseudorandom Number Generator (PRNG), which will produce a series of bytes that look random in a deterministic way, based on an initial private seed value and internal state.
The Ethereum blockchain provides a number of block properties that are not controllable by a single user of the network and are only somewhat controllable by miners, such as the timestamp and coinbase. When using these block properties as a source of entropy for an initial seed to a PRNG, it may well look sufficient as the output appears to look random and the seed value cannot be directly manipulated by users of the smart contract.
The following block variables are commonly used when generating random numbers on-chain:
block.blockhash(uint blockNumber): hash of the given block (only works for 256 most recent blocks excluding current)
block.number: current block number
block.coinbase: current block miner’s address
block.timestamp: current block timestamp as seconds since unix epoch
The main advantages of using block properties as a seed for randomness is they are simple to implement and the resulting random numbers are immediately available to the smart contract. This simplicity, speed and lack of dependence on external parties or systems makes the use of block properties a desirable option. It is often assumed that when using block properties as a source of randomness, only miners would be in a position to cheat. For example, if the output number did not work in their favour, they can throw away the block and wait for a new block whereby the generated number worked in their favour.
With the assumption that only miners are able to exploit the number generation using block properties as a seed, there are multiple blog posts, Reddit posts, and Stack Overflow threads regarding when it is safe to use these properties for random number generation. These often incorrectly state that it is acceptable to use block properties only when the potential payout is less than the mining reward, as it would not be beneficial for a malicious miner to throw out the block. However, this is not case, as we will see when we analyse and exploit the vulnerable smart contracts below.
Exploiting a Simple Number Guessing Smart-Contract
Firstly we will look a naïve, yet not uncommon implementation using the block.blockhash property. The GuessingGame smart contract allows the participant to guess a randomly generated number. If the participant guesses correctly they win twice their initial bet.
If we look at the badRandom function, we can see how the random number is generated by casting the blockhash of the previous block to an unsigned integer, then performing a modulus operation:
This will appear to provide a random value between 1 and 10 (unfortunately this also introduces a modulo bias meaning that some values are more likely than others). As the previous block number is not controllable by an attacker it cannot be manipulated to produce a random number in the attackers favour… however, the seed is known to the attacker. It is therefore possible to predict what the next winning number will be and beat the house. One potential problem with this approach, is that the attacker needs to take the current block number, get the blockhash, generate the next number and make sure their bet was placed in the very next block.
This isn’t very feasible to do manually, however we can get around this by calculating the next winning number on-chain, then make an external contract call to the GuessingGame with the correct number. The following attacker contract will always predict the winning number when the cheat() function is called.
Another Vulnerable ‘Lottery’ Style Game
The above contract will allow us to always take away the winnings, however, can we still exploit this type of random number generation when the generation takes place at some point in the future? To explore this, we have the following lottery style smart contract where participants can buy a ticket in a draw. When enough tickets have been sold a winner can be selected. A common, but problematic, coding pattern is shown below:
By looking at the buyTicket function below, there is nothing the attacker can control when buying a ticket, other than waiting for specific tickets to be sold and buying theirs at a specific point, such as waiting for 2 to be sold and then attempting to purchase the 3rd.
Lets now look at how the winning ticket is chosen:
Firstly, there is a require statement to ensure that the winner can only be chosen once the required number of tickets have been sold. If this requirement is met the sale is over and a random number is generated. In this case we have no control over what the winnerIndex will be, however we can calculate who the winner will be before invoking the drawWinner() function. Allowing the attacker to wait until a blockhash is used that generates a random number making the attacker the winner.
The problem with this approach is that the attacker needs to know which ticket they have, or at which index in the drawParticipants array their account address is located. Within the blockchain, even private variables are readable by everyone, even if the contract does not directly expose them. The web3.eth.getStorageAt(contractAddress, index) method can be used to look into the contracts persistent storage and identify which ticket is the attackers.
The attacker contract below will take the desired winner index, then calculate if that index is going to win the draw during the current block. If the desired winner is going to be selected, the drawWinner() function is called and the attacker takes home the contract balance. If the attacker is not going to win, the call returns before drawing the winner. The attacker just needs to repeatedly call the cheat(winnerIndex) function until the blockhash outputs a number that results in the correct winner. It is true that this process is going to cost the attacker in transaction fees for each repeated call, however this is likely to be negligible when compared to a games payout.
The primary drawback with this approach is that if the drawWinner() function is called by another participant, then the next winner may be chosen at a blocknumber which does not result in the attacker winning. Another issue is that depending on the number of participants, the attacker may need to submit a large number of transactions before they are chosen.
A partial mitigation?
As games are typically designed to be played by real players, rather than other smart contracts, we could look to identify whether the player’s address is a regular Externally Controlled Account (EOA) or a smart contract account. It appears this can currently be achieved by using inline assembly and the EXTCODESIZE opcode, which returns the size of the CODE property of an external Ethereum account using its address. For example, this could be implemented with the following:
This will restrict specific functions from only being called from Externally Owned Accounts and therefore mitigate the attacks outlined above. However, this does not mitigate against attacks from malicious miners and will likely break under future accounts created under the Ethereum account abstraction proposed in EIP-86 which is scheduled for Constantinople Metropolis stage 2.
The practise of generating pseudo-random numbers using block properties is highly discouraged. We have looked at how an attacker can actually exploit such PRNG implementations via external contract calls, which allow an attacker to predict the next number to be generated in the same block. Whilst a partial mitigation does exist to prevent the specific attacks mentioned, block properties and on-chain data are always public and therefore carry the risk that an attacker may be able to predict the winning number and use it for their advantage.
In the following two parts of this series, we will analyse the use of generating random numbers using participatory schemes where numbers are provided via multiple participants, and through the use of external sources of randomness that are consumed via the use of Oracles.