Observing Cryptocurrency Mining from a Statistical Perspective

Author: Luxor Tech

Translation: Zoe Zhou

Source: Crypto Valley

Simple definition
"Luck" in cryptocurrency mining is essentially a matter of probability. Imagine that each miner receives lottery tickets in proportion to the hash power they provide. For illustration, say you provide 1 EH/s of hash power and the total network hash power is 100 EH/s. You then hold 1 of 100 lottery tickets, so your probability of winning any given block is 1%. Statistically, for every 100 blocks the network finds, you should expect to find 1 of them.
Now imagine you find 2 blocks out of 100. You found more blocks than statistically expected, so you were lucky! If instead you find 0 blocks out of 100, you were unlucky. In the long run you should find about one block in every 100, but in the short run your results will vary.
The above description should give you a basic understanding of what “luck” is, but if you want to learn more about it, keep reading below!
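
The lottery example can be checked with a few lines of Python. This is only a sketch: it assumes each of the 100 blocks is an independent draw with a fixed 1% chance (a binomial model), and the function name `prob_blocks` is made up for illustration.

```python
from math import comb

def prob_blocks(k: int, n: int = 100, p: float = 0.01) -> float:
    """Binomial probability of finding exactly k of the next n network blocks."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p0 = prob_blocks(0)        # unlucky: no block at all
p1 = prob_blocks(1)        # exactly the expected single block
p2_plus = 1 - p0 - p1      # lucky: two or more blocks

print(f"P(0 blocks)   = {p0:.3f}")
print(f"P(1 block)    = {p1:.3f}")
print(f"P(2+ blocks)  = {p2_plus:.3f}")
```

Under these assumptions, finding exactly the "expected" single block happens only a little over a third of the time, which is why short-run results differ so much from the long-run average.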

Key Terms
  • Mining Pools

A mining pool is a group of miners that work together to reduce the volatility of their rewards. Miners contribute their processing power to the pool, which splits the rewards according to the amount of work each miner did toward finding a block and charges a fee that goes to the pool operator.
Once mining difficulty rises to the point where a small miner might need years to find a block alone, joining a mining pool becomes the norm. By pooling their resources, miners find blocks more often and each earn a part of the block reward in proportion to their contribution.
  • PPS model: Pay-Per-Share method - that is, paying for each share

This distribution model pays miners a fixed income each day based on their share of the pool's computing power. Its goal is to remove the "luck" element and reduce the miners' risk by transferring that risk to the pool operator, who charges a fee to compensate for the losses this risk may cause.
  • PPLNS model: Pay-Per-Last-N-Shares method (the purest form of team mining)

That is, income is paid out based on the last N shares. This distribution model takes the pool's actual income for the day and distributes it to miners in proportion to the computing power they contributed. Once the pool finds a block, the coins in that block are allocated to miners in proportion to the number of shares each one contributed.
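
The two payout models can be compared with a small sketch. This is not any pool's real accounting code: the function names, the fee value and the miner names are hypothetical, and the PPS formula assumes each share is paid its expected value (block reward × share difficulty / network difficulty) minus the fee.

```python
from collections import Counter

def pps_payout(shares, share_difficulty, network_difficulty, block_reward, fee):
    """PPS: every share is paid its expected value, whether or not a block is found."""
    value_per_share = block_reward * share_difficulty / network_difficulty
    return shares * value_per_share * (1 - fee)

def pplns_payouts(last_n_shares, block_reward, fee):
    """PPLNS: when a block is found, split the reward over the last N shares."""
    counts = Counter(last_n_shares)          # shares per miner in the window
    distributable = block_reward * (1 - fee)
    total = len(last_n_shares)
    return {miner: distributable * n / total for miner, n in counts.items()}

# Hypothetical window of the last N = 10 shares, listed by submitting miner.
window = ["alice"] * 6 + ["bob"] * 3 + ["carol"] * 1
payouts = pplns_payouts(window, block_reward=12.5, fee=0.02)
for miner, amount in payouts.items():
    print(f"{miner}: {amount:.4f} BTC")
```

In the PPS function the payout depends only on the shares submitted, so the pool carries the variance. In the PPLNS function nothing is paid until a block is actually found, so the miners carry it.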

Calculating “luck”

  • Probability of mining a block

The probability that a single hash mines a block is 1/(2^32 × Difficulty).
As of February 19, 2020, the BTC difficulty was 15,546,745,765,549.
Therefore, the probability that any single hash mines a block is about 1.498 × 10^-21 % (roughly 1.5 × 10^-23 as a fraction).
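
The arithmetic can be reproduced directly. This sketch uses only the difficulty quoted in the article. The implied network hash rate is an extra sanity check and assumes Bitcoin's 10-minute (600 second) target block interval.

```python
DIFFICULTY = 15_546_745_765_549   # BTC difficulty on February 19, 2020

# Expected number of hashes needed to find one block.
expected_hashes = 2**32 * DIFFICULTY

# Probability that one single hash is a valid block.
p_hash = 1 / expected_hashes

# Hash rate implied by one block every 600 seconds on average.
implied_hashrate = expected_hashes / 600

print(f"probability per hash : {p_hash:.4e}")
print(f"as a percentage      : {p_hash * 100:.4e} %")
print(f"implied hash rate    : {implied_hashrate / 1e18:.1f} EH/s")
```

The implied hash rate comes out at a little over 100 EH/s, the same order of magnitude as the 100 EH/s used in the lottery illustration.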
  • "luck"

The correct way to calculate "luck" is to compare the expected number of shares with the actual number of shares in each round, a round being the work done between one found block and the next.
“Luck” = average (expected shares per round / actual shares per round)
The higher the value, the better your mining "luck". 200% luck means you needed to submit only half of the expected shares to find a block.
  • "Luck Statistics"

The “luck statistic” is the inverse ratio. This is how we as pool operators think about “luck”.
"Luck statistic" = average (actual shares per round / expected shares per round)
The smaller the luck statistic, the better your “luck”. A luck statistic of 0.5 means that you submitted only half of the shares expected to be needed to find a block.
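
Both definitions can be written as short functions. This is a sketch only: the function names and the per-round share counts are invented for illustration, and a round is represented as an `(expected_shares, actual_shares)` pair.

```python
from statistics import mean

def luck(rounds):
    """Miner's view: average of expected / actual shares per round (higher is better)."""
    return mean(expected / actual for expected, actual in rounds)

def luck_statistic(rounds):
    """Operator's view: average of actual / expected shares per round (lower is better)."""
    return mean(actual / expected for expected, actual in rounds)

# One round that needed only half of the expected shares.
single = [(1000, 500)]
print(luck(single), luck_statistic(single))

# One lucky and one unlucky round.
mixed = [(1000, 500), (1000, 2000)]
print(luck(mixed), luck_statistic(mixed))
```

For a single round the two numbers are exact reciprocals (200% luck corresponds to a luck statistic of 0.5). Once several rounds are averaged this no longer holds, because an average of ratios is not the reciprocal of the average of the inverse ratios, so the two measures should not be converted into each other after averaging.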
  • Luck variance

When looking at "luck" over a period, we should measure that period in blocks, not in time (hours, days, etc.). The variance of "luck" is particularly important when drawing conclusions from "luck" or when comparing two different miners: one miner may have mined far more blocks than the other, and that sample size greatly affects how we should read their "luck".

Visualizing “luck statistics”

  • Finding the right distribution

Let's start with the basic Poisson distribution, a discrete probability distribution that gives the probability of a given number of events occurring in a fixed interval of time or space, provided the events occur at a known constant average rate and independently of the time since the last event. The problem is that the Poisson distribution is discrete rather than continuous. It deals with the number of events in a fixed time period, and that is not how we look at "luck statistics".
The next step is to examine the gamma distribution, which is continuous. In a Poisson point process, i.e. a process in which events occur continuously and independently at a constant average rate, the gamma distribution describes the waiting time until a given number of events has occurred. It answers the question "how long will it take for n random events to occur?" When the shape parameter of the gamma distribution is an integer, the distribution is called an Erlang distribution. This matters for "luck statistics" data because the number of blocks is always a positive integer.
Strictly speaking, the "luck statistic" follows a negative binomial distribution, whose continuous analogue is the Erlang distribution, so the Erlang distribution can be used as an approximation.
  • Erlang distribution

We don't need to go into the formula for this distribution in detail, but you can think of the Erlang distribution as a generalization of the exponential distribution. It is a continuous distribution that is often used to model the waiting time until an event (here, mining a block).
Using this distribution makes "luck" easier to calculate, and the approximation becomes more accurate as the network difficulty increases. At the current network difficulty, the error is no more than one in a million.
If this is hard to understand, the next section will help you visualize it.
  • Probability Density Distribution Function (PDF)

Using the Erlang distribution, the PDF indicates how likely the "luck statistic" is to take any given value. Because the distribution is continuous, there is a 0% chance that the "luck statistic" equals one exact number (e.g., 1.00000000000). Instead, the PDF is used to obtain the probability that the "luck statistic" falls within a specific range of values (e.g., below 1.0).
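The reference formula is the Erlang density with shape k and rate λ. For the "luck statistic" averaged over n blocks, k = λ = n:
f(x; k, λ) = λ^k · x^(k-1) · e^(-λx) / (k-1)!,  for x ≥ 0
You can calculate the PDF using R or Python, but an easier way is to use the Wolfram Alpha website:
quantile(ErlangDistribution[Number of Blocks, Number of Blocks], optional %)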
With a single block plotted, the PDF shows a very wide range of potential outcomes, and the “luck statistic” is likely to fall well below the mean of 1.0.
If we increase the number of blocks to 14, which is about 10% of the daily network reward, the distribution starts to look more like a normal distribution. Now the "luck statistic" is more likely to be close to 1.0, but there is still a lot of room for variation.
If we increase the number of blocks to 144, or 100% of the daily network reward, we see a normal-looking curve with a fairly narrow range of potential outcomes. Over 144 blocks, the "luck statistic" is unlikely to be below 0.7 or above 1.3.
The PDF helps understand the importance of looking at “luck statistics” in large samples (i.e. averaged over more blocks).
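
The narrowing of the curve with more blocks can be reproduced without special libraries. The sketch below assumes the Erlang model used in this article, with shape and rate both equal to the number of blocks n, so that the mean is 1 and the standard deviation is 1/sqrt(n). The function name `erlang_pdf` is made up for illustration.

```python
from math import exp, lgamma, log, sqrt

def erlang_pdf(x: float, n: int) -> float:
    """Density of the luck statistic averaged over n blocks (Erlang, shape n, rate n)."""
    if x <= 0:
        return 0.0  # simplification: the density is treated as zero at and below 0
    # Work in log space so that large n does not overflow; lgamma(n) = log((n-1)!).
    log_pdf = n * log(n) + (n - 1) * log(x) - n * x - lgamma(n)
    return exp(log_pdf)

for n in (1, 14, 144):
    std = 1 / sqrt(n)  # standard deviation of the luck statistic over n blocks
    print(f"n={n:>3}  std={std:.3f}  "
          f"pdf(0.7)={erlang_pdf(0.7, n):.4f}  "
          f"pdf(1.0)={erlang_pdf(1.0, n):.4f}  "
          f"pdf(1.3)={erlang_pdf(1.3, n):.4f}")
```

As n grows, the density at 1.0 rises while the density at 0.7 and 1.3 collapses toward zero, which is the same picture the plots convey.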
  • Cumulative Distribution Function (CDF)

CDFs are a great way to analyze “luck statistics” data. Let’s say your mining pool had a “luck statistic” of 1.3 over the past 1, 10, and 140 blocks. Is this merely unlucky, or is it so unlikely that it raises other questions?
Again, you can use R or Python to model this, but you can also use the Wolfram Alpha website or Excel.
Wolfram Alpha: CDF[ErlangDistribution[nblocks, nblocks], Luck Statistic]
Excel: =GAMMA.DIST(Luck Statistic, nblocks, 1 / nblocks, 1)

A "luck statistic" of 1.3 over a single block gives a CDF value of 0.727468. This means that if that block were rerun 100 times, we would see a luckier outcome about 73 times and an unluckier outcome about 27 times.
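
The same CDF can be evaluated in Python with the standard library alone. This sketch uses the closed form of the Erlang CDF for integer shape, again with shape and rate both set to the number of blocks. The function name `erlang_cdf` is made up for illustration.

```python
from math import exp

def erlang_cdf(x: float, n: int) -> float:
    """P(luck statistic <= x) over n blocks (Erlang, shape n, rate n)."""
    if x <= 0:
        return 0.0
    rate_x = n * x
    # CDF = 1 - sum_{k=0}^{n-1} exp(-rate_x) * rate_x**k / k!
    term = exp(-rate_x)      # k = 0 term
    total = term
    for k in range(1, n):
        term *= rate_x / k   # build each term from the previous one
        total += term
    return 1.0 - total

for n in (1, 10, 140):
    print(f"luck statistic 1.3 over {n:>3} blocks: CDF = {erlang_cdf(1.3, n):.6f}")
```

For one block this reproduces the 0.727468 quoted above. The closer the value gets to 1 as the number of blocks grows, the harder it becomes to explain a 1.3 average as ordinary bad "luck".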

We put together the following table to show the probability of having a particular "luck statistic" over multiple blocks. As you can see, there are times when the "luck statistic" is so bad that the value is unbelievable (e.g., an average of 1.5 over 60 blocks).
We usually don't care too much about these since we're not losing money. But it might be a reason to check if the data is correct.

Conclusion

  • PPS Mining Pool

As we described at the beginning, PPS pools eliminate mining variance for their miners. Bad "luck" hurts the pool, good "luck" benefits the pool, and miners are never affected. The only worry for miners is that if a pool goes bankrupt, they will not receive their pending payments and will face some downtime until they switch to a new pool.
As a PPS pool operator, we take tracking our "luck" very seriously. We want to make sure there is enough liquidity to cover short-term variance, and we want to make sure everything is running smoothly. Usually, when "luck" is bad, we check whether we are really that unlucky. If we find that our "luck" is worse than in 99% of cases, we start to consider other factors, such as attacks on the mining pool or technical bugs. We discuss this further below with an example of a mining pool attack.
Since we launched our own BTC mining pool, we have seen an average "luck statistic" of 0.502 over the 9 blocks found. As a pool operator we are happy with this, and it is still within the realm of possibility (an outcome at least this good occurs in about 4% of cases). From running 10 other mining pools, we know that bad "luck" is just as possible, so we do not expect this to continue.
  • PPLNS Mining Pool

The PPLNS mining pool operates in the opposite way to the PPS mining pool. Instead of the pool taking on the variance risk, the miners do. A period of bad "luck" means the miners earn less, and a period of good "luck" means they earn more.
In a PPLNS mining pool, miners may leave after a period of bad "luck". This is the "Gambler's Fallacy": the belief that if a particular event has occurred more frequently than usual in the past, it is less likely to occur in the future (and vice versa). In reality, the next mining pool is no more likely to have better "luck".
Miners in a PPLNS pool should nevertheless pay close attention to their pool's "luck". If the results are too unlikely to be explained by bad "luck" alone, the pool may be under attack or have a bug.
  • Block Withholding Attack

This is one of the most common attacks on mining pools, and it usually comes from competing mining pools. It can drive a PPS mining pool into bankruptcy and can cause a PPLNS mining pool's miners to leave.
Block withholding occurs when miners do not return valid block hashes that they have found.
The attacker (a miner) customizes their setup so that it does not return any hash below a preset threshold, which is usually set at about the network target (the target is inversely proportional to the difficulty). The miner therefore still gets paid for the shares they submit, i.e. hashes that meet the pool's share difficulty but not the network difficulty, but never sends the pool a hash that would be a valid block. Because the miner is paid as if they could produce a network block, a PPS pool loses money, and the miners in a PPLNS pool lose income.
There are ways to mitigate these attacks, but there is no clear, foolproof method. Pools usually monitor the individual "luck" of each miner and lock an account once it becomes statistically implausible that the miner has honestly failed to produce any hash below the network target.
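
The monitoring idea can be sketched as a simple probability check. This is an illustrative heuristic, not the method any particular pool uses: the function names are hypothetical, the 1% threshold mirrors the "worse than 99% of cases" rule mentioned earlier, and each share is assumed to be a valid block independently with probability share difficulty / network difficulty.

```python
from math import exp, log1p

def prob_no_block(shares: int, share_difficulty: float, network_difficulty: float) -> float:
    """Probability that an honest miner finds zero blocks in the given number of shares."""
    p_block = share_difficulty / network_difficulty  # chance that one share is a block
    return exp(shares * log1p(-p_block))             # (1 - p_block) ** shares

def looks_suspicious(shares, blocks_found, share_difficulty, network_difficulty,
                     threshold=0.01):
    """Flag a miner who has found no block although honest mining almost surely would have."""
    if blocks_found > 0:
        return False
    return prob_no_block(shares, share_difficulty, network_difficulty) < threshold

# Toy numbers: one share in 1,000 is expected to be a block.
for shares in (1_000, 5_000):
    p = prob_no_block(shares, share_difficulty=1.0, network_difficulty=1_000.0)
    flag = looks_suspicious(shares, 0, 1.0, 1_000.0)
    print(f"{shares} shares, 0 blocks: P = {p:.4f}, suspicious = {flag}")
```

A flag of this kind is only a trigger for investigation. An honest miner will still cross any fixed threshold occasionally, which is one reason no foolproof method exists.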