How does ProgPoW resist ASICs? The development team IfDefElse answers


After receiving some attention from mainstream media, the ProgPoW development team IfDefElse received many questions about the algorithm, and they answered several common questions. With the consent of the original author, Mine Vision translated and reported on it.

1

Q: What is your position on Ethereum governance?

A: We have no position yet. We think many questions should be left to the community to answer, such as whether or when to adopt ProgPoW. We are responsible for proposing new algorithms and are happy to answer technical questions related to them.

2

Q: Where does ProgPoW come from?

A: IfDefElse is a small team that analyzes and optimizes PoW algorithms. We observed that the ETH community has repeatedly asked for a new PoW algorithm under which specialized ASIC miners have little advantage over commodity hardware. It is disheartening to see how many algorithms are vulnerable to ASIC miners, and every time a new ASIC miner comes out, the entire ETH community is frustrated.

So one day in the spring of 2018, we had the idea of modifying the Ethash algorithm so that it matches what GPUs are built to do. After drafting the initial version of the algorithm, we published it on GitHub for further research, development and fine-tuning.

3

Q: Who has evaluated ProgPoW?

A: During the process of collecting feedback on the algorithm, we were lucky to receive feedback emails from engineers at the Ethereum Foundation, Ethereum core R&D engineers, NVIDIA engineers, and AMD engineers. Both NVIDIA and AMD engineers gave generally positive comments on the algorithm.

It is worth mentioning that two algorithm updates and optimizations were made based on the comments of community members mbevand and Schemykh.

4

Q: How has AMD responded?

A: AMD’s response addresses two major concerns:

If ProgPoW replaces the Ethash PoW algorithm, will ASIC manufacturers be prevented from simply studying the open source code and quickly building specialized ASIC miners?

Will ProgPoW make it harder for GPU miners to mine Ethereum?

The AMD engineer's answer was favorable. In theory it is possible to build an ASIC miner for ProgPoW, but doing so requires specialized GPU expertise, particularly in memory controller design.

They also raised a concern about the size of the cache (the local data share on AMD chips).

They wrote in the email that with a cache of 8KB or 16KB, performance differs little between AMD and NVIDIA. At 32KB or 64KB, however, the size could significantly affect both vendors' architectures and would cause incompatibilities between Polaris and Vega.

Based on their feedback, we set the size of PROGPOW_CACHE_BYTES to 16KB.

5

Q: How has NVIDIA responded?

A: NVIDIA engineers generally agree with our approach. They say the algorithm fills the gaps between memory accesses with computation, rather than leaving the GPU idle as little more than a glorified memory controller.

Their main concern is that if too many random operations are added to the algorithm, it will eventually become compute-bound rather than memory-bound, and ASIC miners built for compute-bound algorithms may achieve greater efficiency and gains.

Based on their feedback, we fine-tuned PROGPOW_CNT_CACHE and PROGPOW_CNT_MATH to ensure that the algorithm remains memory bound for most current GPUs.

6

Q: If ProgPoW uses modulo operations in the main loop and calls kiss99() to select random instructions, wouldn't an ASIC designed for this algorithm be more efficient?

A: This is a common misunderstanding when looking at the algorithm for the first time. The modulo operations and kiss99() calls in the main loop are evaluated on the CPU when it generates the random program, which the CPU then compiles. The GPU only executes the optimized code, in which the choice of instructions and of mix state registers has already been resolved.

As Alexey said, ProgPoW generates source code every 50 blocks. An example of the generated program can be found in: kernel.cu.

We will also explain this further in the specification.
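The split between CPU-side program generation and GPU-side execution can be sketched in Python. This is only an illustration under stated assumptions: kiss99 below follows Marsaglia's published KISS99 generator, but the seeding scheme, the operation names in OPS and the helper generate_program are hypothetical and are not taken from the ProgPoW specification.

```python
MASK32 = 0xFFFFFFFF
PERIOD_BLOCKS = 50  # from the FAQ: a new program is generated every 50 blocks

# Hypothetical operation names, for illustration only.
OPS = ["add", "mul", "rotl", "xor", "and", "cache_load"]


class Kiss99:
    """Marsaglia's KISS99 generator, restricted to 32-bit arithmetic."""

    def __init__(self, seed: int):
        # Assumed seeding scheme (not the specification's): four words from one seed.
        self.z = (seed ^ 0x159A55E5) & MASK32
        self.w = (seed ^ 0x1F123BB5) & MASK32
        self.jsr = ((seed ^ 0x075BCD15) & MASK32) or 1  # xorshift state must be non-zero
        self.jcong = (seed ^ 0x05491333) & MASK32

    def next(self) -> int:
        # Two multiply-with-carry generators combined into one 32-bit word.
        self.z = (36969 * (self.z & 0xFFFF) + (self.z >> 16)) & MASK32
        self.w = (18000 * (self.w & 0xFFFF) + (self.w >> 16)) & MASK32
        mwc = ((self.z << 16) + self.w) & MASK32
        # 3-shift xorshift register.
        self.jsr ^= (self.jsr << 17) & MASK32
        self.jsr ^= self.jsr >> 13
        self.jsr ^= (self.jsr << 5) & MASK32
        # Linear congruential generator.
        self.jcong = (69069 * self.jcong + 1234567) & MASK32
        return ((mwc ^ self.jcong) + self.jsr) & MASK32


def generate_program(block_number: int, length: int = 8, regs: int = 32):
    """Runs on the CPU once per period and returns a fixed instruction list."""
    rng = Kiss99(block_number // PERIOD_BLOCKS)
    program = []
    for _ in range(length):
        op = OPS[rng.next() % len(OPS)]  # modulo picks the operation ...
        dst = rng.next() % regs          # ... and the mix state registers
        src = rng.next() % regs
        program.append((op, dst, src))
    return program


if __name__ == "__main__":
    for op, dst, src in generate_program(block_number=7_000_000):
        print(f"{op} mix[{dst}], mix[{src}]")
```

Every block in the same 50-block period yields the same instruction list, so the random number generator and the modulo operations never have to run on the mining hardware itself.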

7

Q: Do miners need to install the AMD or NVIDIA SDK in order to compile the generated source code?

A: No. AMD and NVIDIA drivers include OpenCL, DirectX and Vulkan compilers. For CUDA, binary kernel files are distributed with a small software development kit.

8

Q: Does the ProgPoW algorithm have a preference for GPU architecture?

A: No, the ProgPoW algorithm is designed to ensure fairness as much as possible. There is no difference in execution between OpenCL and CUDA, and a 16KB cache can run smoothly on both architectures.

We avoided operations that are efficient on only one architecture, such as 16-bit or 24-bit math, AMD's indexed register file, or NVIDIA's LOP3. All the operations we use are well supported across generations of both architectures.

A GPU's ProgPoW mining performance will also roughly reflect its average gaming performance.

9

Q: Why is the slowdown from Ethash to ProgPoW greater than the expected 2x on GPUs with a heavily modified VBIOS?

A: ProgPoW reads twice as much memory per hash as Ethash, so the expected hashrate is half the Ethash hashrate. All of the tuning and sample hashrates we reported previously (see "Results: Hashrate") were measured on GPUs running at normal frequencies. A VBIOS that has been heavily modified to lower the core frequency makes the GPU compute-bound rather than memory-bound when running this algorithm.

Users who switch to the new algorithm will need to redo their VBIOS modification and tuning.
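The halving, and the extra slowdown on under-clocked cards, can be checked with a back-of-the-envelope model. The sketch below assumes that a memory-bound hashrate equals memory bandwidth divided by bytes read per hash, capped by whatever the core can compute. The 8 KiB per hash for Ethash (64 DAG reads of 128 bytes), the 256 GB/s bandwidth and the 10 MH/s compute limit are illustrative assumptions, not measurements from the FAQ.

```python
ETHASH_BYTES_PER_HASH = 64 * 128                    # assumed: 64 DAG reads of 128 bytes
PROGPOW_BYTES_PER_HASH = 2 * ETHASH_BYTES_PER_HASH  # FAQ: twice as much memory per hash


def hashrate(bandwidth_bytes_per_s: float,
             bytes_per_hash: int,
             compute_limit_hps: float = float("inf")) -> float:
    """Hashes per second: the lower of the memory limit and the compute limit."""
    memory_limit_hps = bandwidth_bytes_per_s / bytes_per_hash
    return min(memory_limit_hps, compute_limit_hps)


if __name__ == "__main__":
    bandwidth = 256e9  # hypothetical card with 256 GB/s of memory bandwidth

    ethash = hashrate(bandwidth, ETHASH_BYTES_PER_HASH)
    progpow = hashrate(bandwidth, PROGPOW_BYTES_PER_HASH)
    print(f"stock clocks: Ethash {ethash / 1e6:.2f} MH/s, ProgPoW {progpow / 1e6:.2f} MH/s")

    # Hypothetical under-clocked core that can sustain only 10 MH/s of ProgPoW math.
    underclocked = hashrate(bandwidth, PROGPOW_BYTES_PER_HASH, compute_limit_hps=10e6)
    print(f"under-clocked: ProgPoW {underclocked / 1e6:.2f} MH/s, "
          f"slowdown {ethash / underclocked:.2f}x")
```

In this model the slowdown is exactly 2x while the card stays memory-bound, and it grows past 2x as soon as the compute limit drops below the memory limit.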

10

Q: Can you explain how Ethash ASIC miners are twice as efficient as GPU miners?

A: The Ethash algorithm requires only three components:

High bandwidth memory (for DAG access)

Keccak f1600 engine (for initial/final hashing)

A tiny compute core (for the inner-loop FNV and modulo operations)

FPGA data shows that the power consumed by the Keccak calculations is almost negligible. We estimate that when a GPU runs Ethash, only about 1/2 of its power is spent on memory access. In an Ethash ASIC miner, the Keccak engine and compute core draw negligible power, so nearly all of its power goes to memory access. An Ethash ASIC can therefore be about twice as efficient as a GPU.

A quick summary of current Ethash mining hardware:

Except for Titan V, all data comes from whattomine.com and asicminervalue.com.

The first generation of Ethash ASIC miners, Bitmain’s Antminer E3, has no efficiency advantage over GPU miners. This is because its DDR3 memory consumes more power than the GDDR memory of GPU miners.

As far as we know, the yet-to-be-released Innosilicon A10 ETHMaster is said to be more efficient. Because Innosilicon has GDDR6 IP and uses it in this series of miners, the A10 is expected to be about twice as efficient as the RTX 2070, currently the most efficient mining GPU.

11

Q: How practical is HBM?

A: Our initial evaluation of the algorithms assumed the same memory type on both sides for an apples-to-apples comparison. HBM has low power consumption but is expensive. For example, an NVIDIA Titan V with HBM is only slightly less efficient than an A10 ETHMaster, but at $3,000 it is clearly impractical.

AMD Vega cards with HBM are reasonably priced, but for some reason they reach an efficiency of only 175 KH/s/W. We are not sure what limits Vega's efficiency. Increasing the access size improves it significantly (bandwidth utilization goes from 61% to 75%, see "Results: Hashrate"), but Vega cards still use too much power. We expect the just-announced AMD Radeon VII cards, which have double the bandwidth, to improve efficiency significantly.

We estimate that HBM uses about half the power of GDDR6. An expensive Ethash ASIC miner built with HBM would exceed 1 MH/s/W, about 4 times the efficiency of conventional GPUs on the market.

12

Q: How efficient can a ProgPoW ASIC be?

A: ProgPoW is designed to significantly reduce the efficiency gains of dedicated ASIC miners. The algorithm requires the following components:

High bandwidth memory (for DAG access)

Keccak f800 engine (for initial/final hashing)

Large register file (for the mix state)

High-throughput SIMD integer math (for random operations)

High throughput SIMD cache (for random cache access)

The Keccak size is reduced (f800 instead of f1600), so its power consumption on a GPU is negligible. As a result, an ASIC no longer gains a power advantage from a dedicated Keccak engine.

To execute the random sequence, a ProgPoW ASIC miner needs something very similar to a GPU's compute core. The SIMD register accesses, math operations and cache accesses all require a GPU-like execution environment.

Yes, a ProgPoW ASIC ISA can be precisely designed to match the ProgPoW algorithm, such as removing floating point, adding explicit merge() operations, etc. However, this specialization will only provide a small marginal benefit, not an order of magnitude increase in benefits.

Optimistically, we assume that a well-designed ProgPoW ASIC ISA can remove 1/4 of the compute core's power consumption. Since the GPU core is much more active when executing ProgPoW, we estimate that the memory interface consumes only about 1/3 of the GPU power. The relative power consumption of a ProgPoW ASIC miner using GDDR is then:

1/3 (memory) * 1 + 2/3 (computation) * 3/4 = 5/6

The efficiency advantage is 1.2x.

If HBM is used, the relative power consumption of the ProgPoW ASIC miner is:

1/3 (memory) * 1/2 + 2/3 (computation) * 3/4 = 2/3

The efficiency advantage is 1.5x.
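The estimates above, together with the Ethash estimates from questions 10 and 11, all follow one simple model: split GPU power into a memory share and a compute share, then scale each share by what an ASIC could save. The Python sketch below reproduces that arithmetic with exact fractions. The function name relative_power and its parameters are illustrative, and treating the Ethash ASIC's compute power as exactly zero is a simplification of "negligible".

```python
from fractions import Fraction as F


def relative_power(memory_share, memory_scale, compute_scale):
    """ASIC power relative to a GPU running the same algorithm (GPU = 1)."""
    compute_share = 1 - memory_share
    return memory_share * memory_scale + compute_share * compute_scale


# Inputs are the estimates quoted in the FAQ:
# memory share of GPU power, HBM at half the power of GDDR,
# and the fraction of compute power that an ASIC still has to spend.
CASES = {
    "Ethash ASIC, GDDR": relative_power(F(1, 2), 1, 0),
    "Ethash ASIC, HBM": relative_power(F(1, 2), F(1, 2), 0),
    "ProgPoW ASIC, GDDR": relative_power(F(1, 3), 1, F(3, 4)),
    "ProgPoW ASIC, HBM": relative_power(F(1, 3), F(1, 2), F(3, 4)),
}

if __name__ == "__main__":
    for name, power in CASES.items():
        advantage = 1 / power
        print(f"{name}: relative power {power}, advantage {float(advantage):.2f}x")
```

The four cases give relative power of 1/2, 1/4, 5/6 and 2/3, which correspond to the 2x, 4x, 1.2x and 1.5x advantages quoted in the answers.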

13

Q: Can ProgPoW be run on an FPGA?

A: First, there are practical problems with running ProgPoW on an FPGA. Because the random program changes every 12.5 minutes, new bitstreams need to be compiled and loaded frequently. The tools and facilities to accomplish this task are essentially non-existent.

Even ignoring this issue, ProgPoW does not map well to FPGAs, which work well for computationally intensive algorithms such as Keccak or Lyra. These algorithms can significantly improve performance and reduce power consumption by packing multiple operations into a single clock cycle and running multiple operations simultaneously.

The ProgPoW loop interleaves many cache reads that must happen in sequence, which greatly limits how many operations can be packed into a single clock cycle or run in parallel. Packing fewer operations per cycle both lowers performance and lengthens the pipeline. Because the mix state is large (16 lanes * 32 regs * 4 bytes = 2 kilobytes), the longer pipeline is itself a problem.

If this large mix state is copied along at every pipeline stage, a lot of power is wasted. The mix state could instead be kept in a register file, which makes the FPGA's compute core look like that of an ASIC or GPU, but in that case the FPGA will be significantly less efficient than an ASIC.
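The two figures in this answer are easy to reproduce. The sketch below computes the mix state size from the lane and register counts given above, the storage needed if every pipeline stage holds its own copy, and the program period in minutes. The 15-second block time and the 100-stage pipeline are assumptions for illustration and are not stated in the FAQ.

```python
LANES = 16
REGS_PER_LANE = 32
BYTES_PER_REG = 4
MIX_STATE_BYTES = LANES * REGS_PER_LANE * BYTES_PER_REG  # 2048 bytes = 2 KB

PERIOD_BLOCKS = 50  # from the FAQ: a new random program every 50 blocks
BLOCK_TIME_S = 15   # assumed average Ethereum block time in seconds


def pipeline_register_bytes(stages: int) -> int:
    """Storage needed if every pipeline stage carries its own copy of the mix state."""
    return stages * MIX_STATE_BYTES


def period_minutes() -> float:
    """How often a new bitstream would have to be compiled and loaded."""
    return PERIOD_BLOCKS * BLOCK_TIME_S / 60


if __name__ == "__main__":
    print(f"mix state: {MIX_STATE_BYTES} bytes")
    # Hypothetical 100-stage pipeline, purely to show how the cost scales.
    print(f"100 stages: {pipeline_register_bytes(100) / 1024:.0f} KiB of pipeline registers")
    print(f"program period: {period_minutes()} minutes")
```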

14

Q: All the above questions and answers seem very lengthy. Can you give a brief summary?

A: Of course


Relative efficiency of mining hardware

Our initial estimates of 2x for Ethash and 1.2x for ProgPoW assumed an apples-to-apples comparison using the same memory type. While writing this article, we realized that because most GPUs use GDDR, we also need to compare across memory types, for example against ASIC miners that use HBM.

Original link:

https://medium.com/@ifdefelse/progpow-faq-6d2dce8b5c8b

Original author: IfDefElse Translator & proofreader: Youtiaoyu

This article is translated and edited by Mine Vision. If you need to reprint it, please indicate the source.
