Monday, August 7, 2017

The Future of Non-Volatile Memory in Cognitive Computing

The confluence of Big Data, IoT, Real-Time Analytics and Cognitive Computing is forcing a complete rethinking of the hardware computing baseline from the ground up. Pressure to stay current with ever larger investments in cloud and enterprise hardware has vendors second-guessing whether they will be overcome, or more appropriately engulfed, by tomorrow’s architectural developments.

At the base of this stack is the storage crossover from hard disk drives to solid state drives. This change alone decreased the latency of data available for processor execution by 100 times. Standardized protocols such as NVMe have expedited the removal of a good deal of the latency-inducing software stack. XPoint memory from Intel/Micron decreases latency again by 10 to 100 times. Loading entire databases into Memory Channel DRAM DIMMs to support In-Memory and Real-Time Analytics requires that the DRAM be backed up with nonvolatile memory (NVDIMMs) to protect against loss of data during a power outage.
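As a quick check on how the two latency figures quoted above compound, here is a minimal Python sketch. It uses only the factors stated in the text (100 times, then 10 to 100 times); the constant and function names are illustrative, and it assumes the XPoint improvement is measured relative to solid state drives.

```python
# Relative latency improvement factors quoted in the text.
HDD_TO_SSD = 100            # hard disk drive -> solid state drive
SSD_TO_XPOINT = (10, 100)   # SSD -> XPoint, low and high ends of the quoted range


def cumulative_speedup(first: float, second: float) -> float:
    """Successive latency reductions multiply rather than add."""
    return first * second


# Assumption: the XPoint factor applies on top of the SSD improvement.
low = cumulative_speedup(HDD_TO_SSD, SSD_TO_XPOINT[0])
high = cumulative_speedup(HDD_TO_SSD, SSD_TO_XPOINT[1])

print(f"HDD -> XPoint: {low:,.0f}x to {high:,.0f}x lower latency")
```

Under that reading, the two steps together imply a latency reduction of roughly three to four orders of magnitude relative to hard disk drives.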

New NVDIMM designs are in progress that not only increase the backup density but also expand the nonvolatile memory to 4 to 16 times the DRAM capacity for low latency system use. The resulting “expanded memory space” NVDIMM contains an on-board controller that shadows DRAM memory accesses and bursts data from a prefetch buffer when a DRAM main memory miss is registered, thus bringing ever more low latency memory capacity to the server’s multi-core processors at reduced cost. In short, the conquest of latency and the depth of the memory channel are seen as major accelerators for tomorrow’s Enterprise and Cloud servers for In-Memory Database and Real-Time Analytics.
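One way to reason about such an expanded-memory NVDIMM is as a two-tier memory whose average latency depends on how often an access is served from DRAM. The Python sketch below is a simplified model, not a description of any vendor's controller: the function name is hypothetical, and the latency values are placeholder assumptions chosen only to show the shape of the trade-off.

```python
def average_latency_ns(dram_hit_ratio: float, dram_ns: float, nvm_ns: float) -> float:
    """Weighted average access latency for a two-tier DRAM + NVM module.

    Hits are served from DRAM; misses are served from the nonvolatile
    tier (for example, through the controller's prefetch buffer).
    """
    if not 0.0 <= dram_hit_ratio <= 1.0:
        raise ValueError("dram_hit_ratio must be between 0 and 1")
    return dram_hit_ratio * dram_ns + (1.0 - dram_hit_ratio) * nvm_ns


# Placeholder latencies, NOT measured values (assumptions for illustration).
DRAM_NS = 100.0
NVM_NS = 1000.0

for hit_ratio in (0.99, 0.95, 0.90):
    latency = average_latency_ns(hit_ratio, DRAM_NS, NVM_NS)
    print(f"DRAM hit ratio {hit_ratio:.0%}: about {latency:.0f} ns on average")
```

The model suggests why a well-tuned prefetch path matters: as long as most accesses hit DRAM, the larger nonvolatile tier adds capacity while the average latency stays close to that of DRAM.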

The introduction of the NVMe-oF standard protocol for All-Flash-Arrays is accelerating their development and market acceptance, fulfilling the need for low latency storage closely coupled to the In-Memory NVDIMM memory arrays. XPoint devices from Micron Technology (QuantX), as they reach commercial availability, are currently under consideration for future All-XPoint-Arrays, a new category offering near-DRAM performance for system access to multi-terabyte memory arrays. WebFeet’s demand forecast expects these arrays to remain in short supply for an extended period, through the year 2020. A wave of new product announcements is expected in this category through 2018.

The picture becomes less clear when it comes to implementing Cognitive Computing. The greatly reduced average latency achieved by enlarging both the DRAM and the nonvolatile memory on NVDIMMs appears sufficient for the time being. A standing issue with the expanded Memory Channel is how it will affect the ability to run AI applications. System partitions already surround server-based applications, of which Nvidia’s DGX-1 and Google’s TPU TensorFlow Pods are representative. The implication is that these leading AI application engines are being run as Application Servers and have yet to be woven into the hyperscale space. Intel’s Xeon Phi-based servers are intended to be the first to integrate closely coupled MPP (Massively Parallel Processing) with enlarged persistent memory channels (NVDIMMs). The potential of Intel’s solution remains undelivered at this writing; it does, however, bring a greater range of AI applications to a wider installed base (the upgrade market). What is still unsettling to many is that there is currently no real candidate methodology for including AI technology in hyperscale systems. Building AI servers is one solution, but WebFeet’s analysis indicates that it is still a “bag on the side” rather than an integral part of a massively parallel compute model.

From a computing architecture perspective, WebFeet recognizes that the entire eco-system (sensors, gateways, edge, and cloud/data center) is required to run at, or very near to, Real-Time to remain competitive. Software layers, large memory, virtual loading of processors, solid state storage and fast interconnect enable today’s von Neumann computing databases. Unfortunately, the architecture is rapidly becoming unable to process the ever-increasing data volumes. Leading research now recognizes that change must also begin with the agents of source data. The use of pattern processing at the endpoints not only improves the quality of the data but also reduces the quantity of information exchange necessary to maintain autonomy and integration into the Internet of Things.

Current AI edge solutions run the gamut from In-Memory to a USB stick, covering a wide range of capabilities and power requirements. WebFeet views this market as being segmented into multiple categories, which range from power-limited (mobile) devices to rack-mounted hyperscale Cloud and Enterprise application servers. The need to reduce power by orders of magnitude while increasing the device-level cognitive span represents a major industry challenge to the implementation of neuromorphic-equipped devices and systems. A forward solution path involving persistent resistive memory and persistent gate-level technologies is viewed by WebFeet Research as having a high probability of providing the technology base necessary to achieve the low cost, low power and high availability required for the low-end and mid-range segments.

Cognitive Computing systems that mimic the way the human brain works are thought by many to still be in the early stages of development. An interesting scenario has developed which may completely shatter this attitude. Several low-profile, long-term research efforts, underway for well over a decade, are now nearing their market entry milestones. These systems are based on principles of the human neocortex and represent the first semiconductor devices able to perform near real-time learning and pattern identification on heterogeneous data streams. WebFeet forecasts that these market entries will radically change the complexion of what is now considered to be the High Performance Cognitive Computing market within the next several years. These entries will not only bring the ability to process vast amounts of heterogeneous data but will also be able to attain predictive, time-sequenced data event extraction (something only dreamed of at present), again all within the next several years. WebFeet projects that this same technology will produce a major revision in how the semiconductor market responds to Artificial Intelligence, especially where the von Neumann architecture intersects the memory-like architecture of pattern computing.

WebFeet’s research in this area provides both qualitative and quantitative analysis of this major architectural change. WebFeet has developed application models that track and forecast Flash and NVM at each stage of the eco-system:

Sensors - emNVM and standalone NVM components
Gateways - emNVM, NVM components, EFD (Embedded Flash Drives - eMMC, UFS)
Edge devices - NVM components, EFD and emSoC (Memory in Module), SSDs
Cloud/Data Center - NVM components, EFD, SSD, NVDIMM, AFA (All Flash Arrays), AXA (All XPoint Arrays)


WebFeet provides focused discussions and studies on the trends, technical requirements, and market sizing for automotive, IoT, Edge: Mobile, Cloud/Enterprise and other emerging applications. We also include analyses and forecasts on the viability of the emerging NVM technologies, when they will reach commercial production, and for which applications. Finally, WebFeet has an In-Memory Database Computing Research Service that explores the transition from von Neumann computing (symbol) to Cognitive Computing (pattern) and Artificial Intelligence, along with its impact on semiconductors, especially memory and logic. Since these are ongoing topics, we would like to engage in a bi-monthly forum to provide you with discussions of our written reports, debriefings from trade shows, company releases and targeted research.
