
OpenAI Engineer Critiques DeepSeek V4 Hardware Advice

Apr 24, 2026 · 16:06

OpenAI engineer Clive Chan has publicly questioned the hardware recommendations in the recently released DeepSeek V4 technical report. While acknowledging the report's overall high quality, Chan described the hardware section as "surprisingly mediocre and even flawed," a stark contrast to the widely praised V3 report.
The hardware Q&A section of the V3 report was a major topic at the ISCA academic conference and included specific suggestions relevant to interconnect standards. Chan finds the V4 advice far more generic by comparison.

🔍 Point-by-Point Rebuttals

Chan provided a detailed critique of several key recommendations:
  • On power consumption: The report suggests that software optimizations allow chips to run at full capacity, and advises manufacturers to allocate more power headroom. Chan argues this is counterproductive, as a higher power budget often forces a lower operating frequency to stay within thermal limits, ultimately reducing computational power.

  • On data transfer: The report advocates a "pull" model (the GPU actively reads data) over a "push" model (data is sent to the GPU), citing the high notification overhead of the latter. Chan disagrees, stating that the "pull" method is inherently slower and that improving the network card's processing capability would be a better solution. The two may be addressing different aspects of the problem: the report focuses on notification overhead, while Chan is concerned with transmission latency.

  • On activation functions: The report recommends replacing the complex SwiGLU activation function with a simpler one to reduce computational load. Chan finds this unnecessary, pointing out that the Sonic MoE architecture has already shown that SwiGLU can achieve optimal performance.
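
For readers unfamiliar with the activation function at issue, here is a minimal sketch of the elementwise part of SwiGLU. The article does not say which simpler function the report proposes, so the ReLU-gated variant (ReGLU) below is only an illustrative stand-in, and the function names are hypothetical. The sketch assumes the two linear projections (`gate` and `up`) have already been computed.

```python
import math


def silu(x: float) -> float:
    """SiLU (Swish): x * sigmoid(x). Needs one exponential and one division."""
    return x / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """ReLU: a single comparison, with no exponential."""
    return max(0.0, x)


def swiglu(gate: list[float], up: list[float]) -> list[float]:
    """Elementwise SwiGLU gating: silu(gate) * up."""
    return [silu(g) * u for g, u in zip(gate, up)]


def reglu(gate: list[float], up: list[float]) -> list[float]:
    """Illustrative simpler alternative (an assumption, not the report's
    stated choice): relu(gate) * up."""
    return [relu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    gate = [-1.0, 0.0, 2.0]
    up = [3.0, 5.0, 1.0]
    print("SwiGLU:", swiglu(gate, up))
    print("ReGLU: ", reglu(gate, up))
```

The extra cost of SwiGLU relative to such a variant is the exponential and division per element; whether that cost matters in practice is exactly what the two sides dispute.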

Given these points, Chan speculates that DeepSeek may have intentionally weakened this particular section of the report.

