CITIC Securities: Continually optimistic about the growth trend of storage innovation
A China Galaxy Securities research report argues that in the Agent AI era, storage capacity is the core driver pushing the storage industry into a long-cycle paradigm shift. On the supply-and-demand side, AI inference drives a sharp surge in token consumption, which in turn produces a linear explosion in KV Cache; demand spikes, misaligned with original equipment manufacturers' (OEMs) capacity-expansion cycles, make stockouts the norm. Supply is expected to fall short of demand until 2027, with price increases running throughout 2026. Technically, against a backdrop of extreme shortages and soaring costs for HBM and DRAM, manufacturers are sharing NAND innovation solutions to help absorb the demand pressure on memory capacity. China Galaxy Securities remains bullish on the growth trend of storage innovation.
The full text follows.
Storage | Looking at storage development trends from the Flash Memory Summit
In the Agent AI era, storage capacity is the core driver pushing the storage industry into a long-cycle paradigm shift. On the supply-and-demand side, AI inference drives a sharp surge in token consumption, which in turn produces a linear explosion in KV Cache; demand spikes, misaligned with OEM capacity-expansion cycles, make stockouts the norm. Supply shortages are expected to continue until 2027, with price increases throughout 2026. Technically, against a backdrop of extreme shortages and soaring costs for HBM and DRAM, manufacturers are sharing NAND innovation solutions to help absorb the demand pressure on memory capacity. We continue to look favorably on the growth trend of storage innovation.
▍ The 2026 China Flash Memory Market Summit will be held, focusing on storage innovation in the AI era and opportunities for upgrading the industrial chain.
On March 27, 2026, CFMS MemoryS 2026, the global storage industry's annual event, will be held in Shenzhen. A bellwether for the industry, this year's summit centers on the theme "Crossing the cycle and unlocking value," with an in-depth focus on technological innovation and coordinated upgrading across the industrial chain. It draws dozens of leading global companies, including Samsung Electronics, Everspin Technologies, Kioxia, Solidigm, Intel, and Tencent Cloud, spanning the entire chain from storage-chip original manufacturers to controller design, module manufacturing, and cloud services. Running high-end forums and technology exhibitions in parallel, the summit takes up the evolving market outlook, focuses on the explosion of capacity demand as token and KV Cache volumes surge in the Agent AI era, and digs into cutting-edge storage innovations driven by PCIe 5.0/6.0 SSDs, breakthroughs in ultra-high-capacity QLC technology, and other AI-driven advances, while showcasing more than 100 innovative products.
▍ AI inference drives a breakout in storage demand, and structural mismatches become the norm. The supply-demand imbalance is expected to persist at least until 2027, with price increases throughout 2026.
Demand side: According to CFM (China Flash Market) data, server shipments will grow about 15% year over year in 2026, with AI servers accounting for over 20% of the total. As large models move from the training stage to the inference stage, the surge of Agent applications sharply increases token consumption: when sequence length grows from 1k to 128k tokens, per-request KV Cache usage grows from 0.5GB to 64GB (at BF16/FP16 precision). Under long contexts plus high concurrency, storage demand rises linearly with token count and concurrency. CFM forecasts HBM capacity will grow more than 90% and 35% year over year in 2025 and 2026, respectively. Meanwhile, the tiering-down of KV Cache, combined with demand overflow driven by HDD supply shortages, will make eSSD the largest downstream segment for NAND in 2026 (with share rising to 37%).
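The linear relationship between sequence length and KV Cache footprint can be made concrete with a back-of-the-envelope calculation. The model dimensions below (layer count, KV heads, head size) are illustrative assumptions, not figures from the report; the point is that the cache grows strictly linearly with context length, so a 128x longer context needs 128x the capacity.

```python
# Rough per-request KV-cache sizing for a transformer.
# Model dimensions are illustrative assumptions, not from the report.

def kv_cache_bytes(seq_len, n_layers=80, n_kv_heads=8, head_dim=128,
                   bytes_per_elem=2):
    """Bytes needed to cache keys and values for one request.

    The factor of 2 covers both the K and V tensors; bytes_per_elem=2
    corresponds to BF16/FP16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

short = kv_cache_bytes(1_024)     # 1k-token context
long = kv_cache_bytes(131_072)    # 128k-token context
print(f"1k ctx:   {short / 2**30:.2f} GiB")
print(f"128k ctx: {long / 2**30:.2f} GiB")
print(f"growth factor: {long // short}x")  # linear in sequence length
```

With these assumed dimensions the 1k-context cache is about 0.3 GiB per request; the exact gigabyte figures depend on the model, but the 128x scaling factor between 1k and 128k contexts does not.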
Supply side: Misaligned capacity-expansion cycles mean shortages and price hikes will persist long-term. Storage original manufacturers are broadly holding to price-stabilization strategies, steering advanced-capacity investment toward high-gross-margin AI storage products. According to CFM, the share of relatively high-end DRAM capacity, including HBM/DDR5/LP5X/LP6, rises from under 50% in 2024 to over 85% in 2026, while mature-node and consumer-grade capacity keeps getting squeezed. Industry inventory falls from 10-12 weeks in Oct-Dec 2023 to 8-10 weeks in Aug-Oct 2024 and to 4 weeks in 2026, dropping below historical safety lines. Because storage capacity-expansion cycles run as long as 18-24 months, no supply inflection point can arrive in 2H26; Huisheng Technology believes 2027 will be the "darkest moment" of the storage shortage. Starting from 2H25, storage prices have entered an epic round of increases, and CFM expects DRAM and NAND ASPs to keep rising throughout 2026. With storage capacity at the core of the AI inference era, storage is undergoing a long-cycle paradigm shift: this is secular growth, not a cyclical rebound.
▍ The storage industry chain accelerates value reconfiguration.
At the recent GTC conference, NVIDIA emphasized "token factory economics." Its core significance is to cement storage's strategic position within AI infrastructure, which also means storage's profit ceiling is being opened up for the long term. According to CFM data, eSSD ASPs in 26Q1 have already reached more than twice consumer-grade NAND ASPs. For storage original manufacturers, the key lies in media upgrades and system-architecture-level reconfiguration, and this forum's keynote speeches focus mainly on the enterprise market. For storage solution vendors, the industry's question shifts from "who is cheaper" to "who can actually deliver product." At the same time, leading vendors such as Phison Electronics are accelerating the transition toward customized, high-value-added modules built on in-house controller design and expanding into enterprise SSDs, redefining storage value and breaking away from the traditional model of relying on low-cost inventory.
▍ AI cloud (enterprise-grade) storage trends: a surge in high-capacity QLC and rapid interface evolution, reshaping the compute engine.
AI is accelerating the shift from the training stage to the inference stage; the ratio of inference servers to training servers is expected to reach as high as 10:1 to 50:1. Currently, constrained by storage bandwidth bottlenecks, GPU cluster utilization sits at only about 46% to 50%, making GPU memory (VRAM) expansion a core requirement. Meanwhile, multiple manufacturers at this summit shared approaches to reallocating functions through compute-storage coordination: the eSSD's role is moving from a passive data container to a core compute engine and an extended memory layer. On the training side, ultra-high-capacity QLC eSSDs used to store checkpoints can greatly improve GPU runtime efficiency. On the inference side, eSSDs take on tasks such as managing massive long-context state, vector database queries, and model shard loading by caching KV Cache in tiers. Measured results show that offloading the KV cache to SSD and eliminating prefill computation can cut time to first token (TTFT) by a factor of 41. Enterprise storage is showing the following technical trends:
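The "eSSD as extended memory layer" idea above can be sketched minimally: persist a prompt's prefill KV cache to disk so a repeat request loads it instead of recomputing. Everything here is illustrative, including the function names and the pickle-based format; production inference engines use purpose-built cache formats and tiering policies, not this.

```python
# Illustrative sketch only: reuse a persisted KV cache to skip prefill.
import hashlib
import pickle
import time
from pathlib import Path

CACHE_DIR = Path("kv_cache")  # stands in for the SSD tier

def expensive_prefill(prompt: str):
    """Stand-in for the prefill pass that builds the KV cache."""
    time.sleep(0.05)  # pretend this is GPU work
    return {"keys": [len(prompt)], "values": [sum(map(ord, prompt))]}

def get_kv_cache(prompt: str):
    """Return the KV cache, loading from disk when a copy exists."""
    CACHE_DIR.mkdir(exist_ok=True)
    name = hashlib.sha256(prompt.encode()).hexdigest() + ".pkl"
    path = CACHE_DIR / name
    if path.exists():                    # cache hit: no prefill compute
        return pickle.loads(path.read_bytes()), "ssd"
    kv = expensive_prefill(prompt)       # cache miss: compute, then persist
    path.write_bytes(pickle.dumps(kv))
    return kv, "computed"

kv1, src1 = get_kv_cache("long shared context ...")
kv2, src2 = get_kv_cache("long shared context ...")
print(src1, src2)  # first call computes, second loads from disk
```

The first request pays the prefill cost once; subsequent requests with the same long context read the cache back from flash, which is the mechanism behind the TTFT reductions the report cites.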
Facing the overflow-caching demand from massive AI data and KV Cache, high-density QLC is becoming a key medium, and ultra-high-capacity QLC solutions in the hundreds-of-terabytes class are the preferred choice. Kioxia (245.76TB), Marvell (245TB), and SanDisk (SN670 solution, up to 256TB) have all showcased breakthrough ultra-high-capacity QLC products above the 200TB level, greatly improving space efficiency and TCO.
Controller chips are moving toward hardware-software co-design to fill gaps in the media's native capabilities. For inference scenarios where KV Cache creates high-frequency random read/write and bandwidth pressure, controller chips are actively upgrading. Pingtouge's (T-Head's) ZhenYue 510 natively supports the ZNS protocol and system-level coordination to help QLC reach commercial scale, with cumulative shipments exceeding 500,000 units. Etron Technology introduces technologies such as a KV acceleration engine and predictive prefetching, turning controllers from data movers into proactive, intelligent resource schedulers.
Interfaces are iterating rapidly, and liquid-cooling innovations are arriving to support ultra-large GPU clusters of up to 100,000 cards. Facing the massive data-throughput and high-density heat challenges of thousand-card, ten-thousand-card, and even 100,000-card clusters, Samsung showed the PM1763, a 16-channel PCIe 6.0 SSD whose I/O performance steps up by a factor of 2.0, while FADU's PCIe Gen6 controller "Lhotse" has already taped out, with sequential read performance set to reach 28.5GB/s.
▍ AI terminal (consumer-grade) storage trends: on-device AI acceleration rolls out, and compute-storage fusion breaks the memory-usage bottleneck.
On-device environments impose stringent constraints on hardware BOM cost, system power consumption, and DRAM usage. Shifting inference pressure from memory (DRAM) to flash (NAND) through compute-storage fusion, intelligent hardware-software scheduling, and advanced caching techniques has therefore become an important complement for breaking the bottleneck of deploying large models on-device.
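One concrete way inference pressure moves from DRAM to flash is memory-mapping: weights stay on flash and only the pages a forward pass actually touches are paged into DRAM. The sketch below uses NumPy's memmap support as a stand-in; the file name and tensor shape are made up for illustration and are unrelated to any vendor's implementation.

```python
# Illustrative sketch: map weights on flash instead of loading into DRAM.
import numpy as np

SHAPE = (1_000, 4_096)  # a made-up weight matrix, ~16 MB in FP32

# One-time export: write the weights out to flash.
weights = np.arange(SHAPE[0] * SHAPE[1], dtype=np.float32).reshape(SHAPE)
np.save("weights.npy", weights)

# Inference side: memory-map rather than np.load the whole array.
mapped = np.load("weights.npy", mmap_mode="r")
row = np.asarray(mapped[42])    # pages in just this row's data
print(row.shape, float(row[0]))
```

Resident DRAM then scales with the working set rather than total model size, which is the same trade the report's hybrid DRAM/NAND schemes make, at the cost of flash-latency reads on cold pages.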
AI PCs and local large models: hybrid techniques relieve the pressure of surging DRAM capacity requirements. Running 10B- to 100B-parameter large models on-device makes memory a huge challenge. Jiangbo Long introduced a 5nm storage processing unit (SPU) and an iSA storage intelligent agent; in joint tuning and validation, they enabled local deployment of a 397B model on a PC host and cut DRAM usage by nearly 40% in 256K-context scenarios. Phison Electronics introduced its Phison Hybrid AI SSD and aiDAPTIV+ technology, expected to reduce DRAM usage by over 50%, enabling cost-controlled, secure local inference.
Intelligent vehicles and edge computing: moving toward centralized pooled architectures on a unified platform base. Embodied intelligence and advanced driver assistance impose global coordination requirements on the underlying architecture. XPeng Motors has pointed out explicitly that with compute reaching 2250 TOPS, DRAM bandwidth has become the core bottleneck for inference latency. The automotive-grade LPDDR6 era is approaching, and automotive NAND storage is moving from isolated per-domain deployments toward centralized pooling and software-defined architectures.
Smartphones and AIoT (Internet of Things): deep integration of high-speed interfaces and advanced caching technologies. To meet the responsiveness and battery-life requirements of mobile devices and emerging wearables, Everspin Technologies plans to launch a new-generation UFS 4.1 controller, the SM2755, and accelerate its push into AIoT markets such as smartwatches and glasses. SanDisk uses SmartSLC caching technology to achieve high-throughput UFS 4.1 operation at only about 2W, and Jiangbo Long is bringing HLC advanced caching technology to embedded devices to lower terminal BOM costs.
▍ Risk factors:
Risks of global macroeconomic weakness; downstream demand falling short of expectations; innovation falling short of expectations; changes in the international industrial environment and intensified trade frictions; compute-capacity upgrades progressing more slowly than expected; and cloud providers' capital expenditures falling short of expectations.
▍ Investment strategy:
We are bullish on the Agent AI era's storage-capacity upgrade driving trends across the compute and storage industries, are enthusiastic about near-memory computing, and favor the HBM and CUBE industry chains. At the same time, with storage tight, mainstream and niche storage alike will see broad shortages and price increases; multiple manufacturers report that 26Q2 price increases over Q1 remain similar in magnitude, and we expect industry supply to stay short of demand at least through the end of 2027. Core recommendations: storage module companies with strong near-term earnings upside, and storage original manufacturers plus design companies closely aligned with them.
(Source: Jiemian News)