Project
Research and development on the utilization of various AI semiconductors and highly efficient computing resources
Research and development will focus on evaluating AI semiconductors and building a testbed to assess their performance, energy efficiency, and usability. A tool for predicting AI workload execution performance and a high-efficiency inference system will also be developed.
Overview
To advance the domestic computing infrastructure that supports the capacity to develop and supply AI, we will research and develop technologies for evaluating a variety of AI semiconductors and realizing high-efficiency computing resources. We will build a testbed comprising multiple emerging AI semiconductors expected to offer high performance and low power consumption, conduct multifaceted evaluations of their performance, energy efficiency, usability, and operability for AI development, and clarify utilization guidelines for each application. In addition, to improve the efficiency of computing resources, including these AI semiconductors, we will develop a tool for predicting AI workload execution performance and a high-efficiency, high-performance inference system.
Participants
National Institute of Advanced Industrial Science and Technology, 1FINITY Ltd., AI Fukushima Co., Ltd., ELEMENTS Co., Ltd., Fujitsu Limited, TEPCO Systems Corp., RUTILEA Inc., Zeureka Inc.
① R&D in evaluation and efficient utilization for various AI semiconductors
- Testbed construction and development of a fair AI benchmark set to clarify the characteristics and advantages of diverse and cutting-edge AI semiconductors
- Development and evaluation of efficient technologies used in actual scenarios (drug discovery workloads)
- Inference model porting and execution technologies for diverse AI semiconductors, and advanced assistant functions using generative AI
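A fair benchmark set of the kind described in ① typically fixes the workload and measurement protocol so that different chips can be compared on equal terms. The following is a minimal illustrative sketch of such a harness (the function names, run counts, and the stand-in workload are all assumptions for illustration, not the project's actual benchmark set):

```python
import time
import statistics

def benchmark(workload, runs=10, warmup=3):
    """Time a workload callable, discarding warmup runs, and report
    the median latency plus a simple derived throughput figure."""
    for _ in range(warmup):          # warmup: exclude one-time costs
        workload()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        latencies.append(time.perf_counter() - start)
    median = statistics.median(latencies)  # median resists outliers
    return {"median_s": median, "throughput_per_s": 1.0 / median}

# Stand-in workload: a CPU-bound loop as a placeholder for a real
# inference or training step on the device under test.
result = benchmark(lambda: sum(i * i for i in range(100_000)))
```

In a real testbed the same idea would be paired with energy measurement (e.g. board power counters) so that performance-per-watt can be compared across devices.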
② R&D of technologies for improving the efficiency of large-scale computing resource utilization
②-1) Improving resource utilization efficiency based on execution performance prediction
- Development of high-precision execution performance prediction technology for training/inference
- Continuous performance evaluation of training/inference processing and improvement of performance prediction datasets
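At its simplest, execution performance prediction fits a model that maps workload features to measured runtimes, then extrapolates to unprofiled configurations. A minimal single-feature sketch (the profiling numbers below are hypothetical, and real predictors would use many features and richer models):

```python
def fit_linear(x, y):
    """Ordinary least squares for y ≈ a*x + b with a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    a = cov / var
    b = my - a * mx
    return a, b

# Hypothetical profiling data: (GFLOPs per step, measured seconds per step).
gflops = [10, 20, 40, 80]
seconds = [0.11, 0.21, 0.41, 0.81]

a, b = fit_linear(gflops, seconds)
predicted = a * 160 + b  # predicted runtime of a larger, unprofiled workload
```

The "continuous performance evaluation" bullet corresponds to refitting such a model as new (workload, runtime) measurements accumulate in the prediction dataset.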
②-2) Research and development of highly efficient, high-performance inference systems
- Input data selection control: development of technologies to reduce input data volume from diverse data sources and improve query processing efficiency
- LLM inference API: development of technology to select the appropriate inference engine using speculative decoding for inputs combining Japanese and data manipulation languages
- Configuration management software: development of technologies using dynamic optimization to provision and configure computing resources based on demand
- Scheduler: development of advance scheduling technology to improve the utilization efficiency of AI semiconductors
- KV cache offload: establishment of design guidelines for improving the scalability and efficiency of LLM inference
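Speculative decoding, mentioned in the LLM inference API bullet, works by letting a cheap draft model propose several tokens at once, which the expensive target model then verifies in a single pass. A toy greedy sketch of the idea (the deterministic "models" below are illustrative stand-ins, not the project's actual engines):

```python
def speculative_decode(draft, target, prompt, k=3, max_new=6):
    """Toy speculative decoding: the draft model proposes k tokens,
    the target model keeps the longest agreeing prefix and then
    contributes one token of its own (greedy, deterministic variant)."""
    seq = list(prompt)
    limit = len(prompt) + max_new
    while len(seq) < limit:
        # 1. Draft proposes k tokens autoregressively (cheap calls).
        ctx, proposal = list(seq), []
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. Target verifies the proposal, accepting the agreeing prefix.
        ctx = list(seq)
        for t in proposal:
            if target(ctx) != t:
                break
            ctx.append(t)
        # 3. Target always emits one token itself (fix-up or continuation).
        ctx.append(target(ctx))
        seq = ctx
    return seq[:limit]

# Toy models: "next token = current position", with the draft model
# making a deliberate mistake at position 4 that the target corrects.
target = lambda ctx: len(ctx) % 10
draft = lambda ctx: 99 if len(ctx) == 4 else len(ctx) % 10

out = speculative_decode(draft, target, [0, 1])  # → [0, 1, 2, 3, 4, 5, 6, 7]
```

When the draft model agrees with the target most of the time, each expensive target pass yields several tokens instead of one, which is the source of the speedup this bullet targets.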