On December 19, 2020, the Filecoin network experienced an on-chain outage: for a period of time new blocks could still be created, but miners could not reach consensus on the resulting state, because different nodes computed different values for it. Thanks to the rapid response from community members, miners, and developers, a fix was released within four hours, and the network fully recovered within seven hours.

The underlying issue was a potentially non-deterministic iteration over a map of objects in the storage miner actor implementation. The actors are implemented in Go, and iteration over Go maps is known to be non-deterministic. For this reason, the actor code always sorts the results of an iteration before using them (this is enforced by static analysis). Unfortunately, a bug in the comparison function used when sorting two such maps resulted in an invalid sort (see #1335). As a result, different nodes processed the map entries in different orders, leading to different results and different gas consumption.

This code path can only be reached by (a) a miner declaring multiple sectors for termination at once, or (b) a miner recovering from faults across multiple partitions at once. (Two other code paths also reach this point, but they are extremely unlikely in practice.) Neither path had previously been exercised on mainnet with multiple sectors or partitions, so the non-determinism had never been exposed. A simultaneous termination of multiple sectors triggered this stall.

Most importantly, no data was lost during the outage. While the outage temporarily prevented transactions on the network, all data held by storage providers remained safe and is available now that the network is back up and running. It is also worth noting that the Filecoin protocol specification provides for data retrieval even during a chain outage.
In other words, on-chain transactions were not possible for the duration of the event, but the core functionality of the Filecoin network remained intact.

The speed with which the underlying issue was detected, diagnosed, fixed, and deployed was also notable:

1. Automated monitoring triggered an alarm within 15 minutes of the incident.
2. Within 30 minutes, miners and implementation developers had come together to respond.
3. Within four hours, the developers had identified the issue and released a fix.
4. Within seven hours, enough nodes had adopted the fix to exceed the power threshold for majority consensus, putting the network on the path to recovery.

This is an incredibly fast response for a young decentralized network. Even established blockchains experience chain pauses and forks, and the time Filecoin took to resolve this event is comparable to that of blockchains that have been running for years. The entire community should be proud of the speed with which this event was handled.

Building a blockchain is like building a rocket. So many complex technologies are involved that it is hard to get everything right on the first try, and, just as with a real rocket, unexpected events are hard to anticipate. When they do happen, it is important to have the infrastructure in place to resolve the issue as quickly as possible, minimize the impact, and reduce the likelihood of the problem happening again.

To this end, multiple teams worked on writing and carrying out post-mortems, identifying gaps in test coverage for the actors, and improving alerting and issue escalation across network infrastructure and communications to help mitigate future incidents. With the concerted efforts of the entire Filecoin community, this new technology will continue to improve. We believe the network will keep getting better through the process of discovering and solving problems, and will eventually become a stable and reliable "launchable" platform.
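
To illustrate the class of bug described above (sorting the results of a Go map iteration with an invalid comparison function), here is a minimal sketch in Go. It is an assumption-laden illustration and is not taken from the actor code: the function names are hypothetical, and the buggy comparator shown here (comparing slice indices instead of elements) is only one plausible way a sort can become invalid, not necessarily the defect fixed in #1335. The two input slices stand in for the different orders in which two nodes might see the same map entries.

```go
package main

import (
	"fmt"
	"sort"
)

// sortBuggy is a hypothetical example of an invalid comparator: it compares
// the slice *indices* rather than the elements, which in effect tells
// sort.Slice that the input is already in order. The result therefore
// depends on the order in which the keys arrived.
func sortBuggy(keys []uint64) []uint64 {
	out := append([]uint64(nil), keys...) // copy so the caller's slice is untouched
	sort.Slice(out, func(i, j int) bool { return i < j })
	return out
}

// sortCorrect compares the elements themselves, so every node ends up with
// the same order regardless of how the keys were collected.
func sortCorrect(keys []uint64) []uint64 {
	out := append([]uint64(nil), keys...)
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// sortedKeys shows the usual pattern: collect the keys of a map (Go does not
// specify the iteration order) and sort them before doing anything
// order-sensitive with them.
func sortedKeys(m map[uint64]string) []uint64 {
	keys := make([]uint64, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return sortCorrect(keys)
}

// equal reports whether two key slices hold the same values in the same order.
func equal(a, b []uint64) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// The same three keys, as two different nodes might have iterated them.
	nodeA := []uint64{3, 1, 2}
	nodeB := []uint64{2, 3, 1}

	fmt.Println("correct sort, nodes agree:", equal(sortCorrect(nodeA), sortCorrect(nodeB)))
	fmt.Println("buggy sort, nodes agree:  ", equal(sortBuggy(nodeA), sortBuggy(nodeB)))
	fmt.Println("keys from a real map:     ", sortedKeys(map[uint64]string{3: "c", 1: "a", 2: "b"}))
}
```

In a blockchain setting, any order-sensitive processing that follows such a sort (for example, charging gas per entry) would then diverge between nodes, which is the kind of disagreement on state described in this report.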