Integrated System Design in 5nm Era

September 1-5, 2025

Registration deadline: August 1, 2025
Payment deadline: August 22, 2025

Course material will be distributed only if fees have been paid by the deadline for payment.
MONDAY, September 1
8:30 am-12:00 pm	Trends in Digital Design and Design Methodologies	Jan Rabaey
1:30-5:00 pm	New Open-Source HW (Single and Multi-Core) Platforms for AI/ML	Frank K. Gurkaynak
TUESDAY, September 2
8:30-10:00 am	New Accelerator-Based Open-Source Hardware and Co-Design Flows	David Atienza
10:30 am-12:00 pm	The Open-Source Hardware Ecosystem: Current State and Future Prospects	Davide Schiavone
1:30-5:00 pm	New Memory Technologies (3D Stacking, Ultra-Low Power, etc.)	Andreas Burg
WEDNESDAY, September 3
8:30 am-12:00 pm	High-Level Synthesis for Optimal and Fast Design of Digital Systems	Nanni De Micheli
1:30-5:00 pm	HANDS-ON Developing and Testing Heterogeneous Accelerator-Based Platforms	David Atienza
THURSDAY, September 4
8:30 am-12:00 pm	The Future of Computing Hardware through Heterogeneous 3D Integration: N3XT 3D MOSAIC, Illusion Scaleup, Co-Design	Subhasish Mitra
1:30-5:00 pm	Design and System Technology Co-Optimization Beyond 5nm CMOS	Julien Ryckaert
FRIDAY, September 5
8:30 am-12:00 pm	Neuro-Vector-Symbolic Architectures: An Algorithmic-Hardware Framework Towards Artificial General Intelligence	Abbas Rahimi
1:30-3:00 pm	Overview of Various System-Level Industrial Case Studies Designs	David Atienza

Scroll to Top

Abstracts

Integrated System Design in 5nm Era
September 1-5, 2025
EPFL Premises, Lausanne, Switzerland

The rapid growth of Artificial Intelligence (AI) and the Internet of Things (IoT), combined with the end of Moore’s Law, is transforming society and industry. Indeed, to sustain our growth with AI and IoT, new hardware architectures and methodologies for designing and optimizing complex integrated systems are needed to exploit the latest nanoscale technologies.

Moreover, the future is even more challenging, with the emergence of AI at the Edge requiring dedicated, sustainable, energy-efficient electronic hardware. Billions of autonomous systems at the Edge are foreseen to feature increased intelligence under a constrained amount of energy, high heterogeneity in technology integration, and a complex packaging and large memory capacity. These systems are critical enablers of future Industry 4.0, autonomous cars, robots, personalized and preventive healthcare, and environmental monitoring for smart cities and a smarter planet.

This course addresses the aforementioned challenges of designing the next generation of digital systems in a highly multidisciplinary manner. It proposes a complete program to understand nanoscale integrated systems’ latest technologies, architectures, and design flows. Lectures will start by covering the progress in nanoscale technologies for computation and storage, with a particular focus on chiplet-based or 2.5D, 3D stacked system design and monolithic integration to boost density and lower wiring congestion by exploiting the vertical dimension. Then, it will include a complete coverage of new open-source computing architectures inspired by the brain’s adaptive information processing and the latest design flows using high-level synthesis (HLS) languages. Next, new open-source hardware architectures and co-design flows will be presented to integrate diverse computing accelerators (like in-memory, systolic arrays, and coarse-grained reconfigurable accelerators) with single- and multi-core system-in-chip architectures. The course will conclude by presenting new design flows to build CMOS circuits using nanosheets and exploiting analog vs digital in-memory computing. On Wednesday, it will include a hands-on session.

This course is part of a successful series of short weeklong courses focused on various aspects of integrated circuit design intended primarily for industry. They have been offered by Mead Education (https://mead.ch/mead/) since 1999, and typically take place at the campus of EPFL (Lausanne, Switzerland), although some virtual courses have been offered since the pandemic.

Trends in Digital Design and Design Methodologies (2 lectures)
Jan Rabaey, UC Berkeley, USA

Digital logic is an essential part of every mixed-signal system-on-a-chip solution. With continued scaling of CMOS technology and the integration of more and more functionality, it has become an ever-important part. Addressing the resulting increase in complexity, caused both by technology and functionality, requires revisiting the prevailing design methodologies. While an in-depth overview in emerging trends is hard in a two-lecture sequence, we will outline some of the major trends and techniques.

Evolution in Digital CMOS Technology and its Impact on Design
Jan Rabaey, UC Berkeley, USA

CMOS technology has continued scaling at a pace set by Moore’s law. However, the nature of that scaling has changed substantially over the past decade. The minimum length of a transistor has pretty much plateaued around 12-13 nm. What has continued increasing is the density (transistors/mm2). This further into an increase in performance and energy efficiency, albeit at a slower pace than before. In this presentation, we will discuss these scaling trends, how they are being accomplished, and how they may extend into the future. We further elaborate on how these developments impact the way digital circuits are designed, optimized and verified.

Design Methodologies for Systems-on-a-Chip
Jan Rabaey, UC Berkeley, USA

Digital circuits are becoming exceedingly complex and now feature many billions of transistors. The most advanced systems-on-a-chip combine a broad range of processors (CPU, GPU), accelerators and neural processors, dedicated memory systems, networks-on-a-chip and fast input-output interfaces. Designing integrated systems of this complexity requires an evolution in design methodology, using higher levels of abstraction and more complicated building blocks. Yet, little of this is reflected in the design flows offered by the major EDA vendors today. This lecture will elaborate on how some of the emerging ideas on how design flows could evolve, including public domain components such as RISC-V, open flows and higher abstraction levels.

New Open-Source HW (Single and Multi-Core) Platforms for AI/ML
Frank K. Gurkaynak, ETHZ, Switzerland

In this lecture, based on the experience of the Parallel Ultra Low Power (PULP) platform project, I will explain the principles that we followed to obtain energy-efficient computing architectures that span a wide range of applications from simpler edge-IoT processors to hardware accelerators for demanding data-centric workloads. All presented architectures have been developed as open-source projects and are available using a permissive license. We will discuss the motivations and how this open-source approach has helped our innovations as well.

New Accelerator-Based Open-Source Hardware and Co-Design Flows
David Atienza, EPFL, Switzerland

This session will discuss novel co-design flows to effectively conceive the next generation of edge AI computing architectures in nano-scale technologies by taking inspiration from how the brain processes incoming information and adapts to changing conditions. In particular, these novel edge AI architectures include two key concepts. First, it exploits the idea of exploiting computing inexactness at the system level to integrate multiple computing accelerators at the hardware level for higher energy efficiency. Second, these edge AI architectures can operate ensembles of neural networks at the software level to improve machine learning (ML) and Deep Learning (DL) outputs robustness at system level, while minimizing memory footprint for the target final application. These two concepts have enabled the creation of new open-source hardware platforms, such as the eXtended and Heterogeneous Energy-Efficient Hardware Platform (called X-HEEP). X-HEEP will be showcased in this presentation to effectively create commercial edge AI systems for different healthcare applications.

The Open-Source Hardware Ecosystem: Current State and Future Prospects
Davide Schiavone, OpenHW Group, Switzerland

Open-Source Hardware (OSH) is a key revolution in the technology landscape, democratizing access to hardware design and fostering innovation through collaboration. This lecture will explore the current state of the OSH ecosystem, examining its key members, the technology-readiness level of the projects, and the community’s impact. We will delve into the successes and challenges faced by OSH initiatives, highlighting key developments that have propelled the movement forward.
As we look to the future, the lecture will discuss potential trajectories for OSH, including heterogeneous systems ranging from ultra-low-power to high-performance SoCs.

New Memory Technologies (3D Stacking, Ultra-Low Power, etc.)
Andreas Burg, EPFL, Switzerland

Any form of computing both in data centers and at the edge always involves both logic as well as memory. While logic has rapidly advanced providing higher density, greater speed, and better energy efficiency with each technology generation, it has also become clear that memory and memory access is now the performance limiting factor. This limitation is known as the memory wall. Large amounts of data (as required for example for AI) must often be stored separately from the compute die, which comes at very high penalty for latency and energy per access. Large on-chip caches mitigate the resulting memory bottleneck, but consume quickly a dominant portion of the compute-die area and power, leading to high cost and limited cache capacity. What makes this situation worse is fact that on-chip memories have stopped scaling at the 5nm node. In fact, a 5nm bitcell is only about 25% smaller compared to a 7nm bitcell and the bitcell size in 2nm can be even larger than its counter part in 5nm. The reason for this trend is the fact that design-technology co-optimization (DCTO) has almost exclusively focused on the continuation of scaling for logic, neglecting the impact on the so important on-chip memories.
In this presentation, I will discuss the memory wall and the latest portfolio of solutions for efficiently storing and accessing data for different system requirements. We will consider the imminent limitations and challenges of on-chip memories associated with scaled technology nodes and corresponding solutions and emerging technologies. I will further review the technologies and trends for realizing and connecting high-density large-scale off-chip random access memory with sufficient bandwidth and good energy efficiency using 2.5D and 3D integration.

High-Level Synthesis for Optimal and Fast Design of Digital Systems
Nanni De Micheli, EPFL, Switzerland

High-level synthesis is a method for transforming models of circuits and systems into an interconnection of modules that implement a data path and a related control unit that executes the operations on the data path. Models are usually expressed in a hardware description language and are first transformed into a set of representative graphs. Scheduling and resource sharing assign then a temporal and spatial dimension to the graphs, thus determining when operations execute and which hardware resource implements each operation. High-level synthesis tools enable designers to quickly map high-level models into circuit structures, determine the series/parallel execution of the operation, thus optimizing latency and hardware resources. Users benefit from high-level synthesis tools by reducing design time, and thus improving their productivity, which is essential for nanometer-scale design. This methodology is particularly helpful to design hardware accelerators for specific functions.

HANDS-ON Developing and Testing Heterogeneous Accelerator-Based Platforms
David Atienza, EPFL, Switzerland

This session (including hands-on exercises and demos) presents the new hardware-software co-design techniques to develop heterogeneous multi-core embedded systems including specialized hardware accelerators in latest technology nodes. The hands-on exercises will be developed using FPGAs and will present how to construct complex multi-core architectures including hardware accelerators described with high-level synthesis tools (HLS) that reduce total execution time and energy consumption for complex tasks. Also, it will cover how to perform fine-grained optimizations in HLS: loops, arrays, memory accesses and scheduling to optimize the final generated hardware, as well as coarse-grained optimizations in HLS: dataflow model, tasks and scheduling to create parallel hardware workers to effectively exploit software parallelization. Finally, this session will explain how to accurately measure performance at system-level, as well as how to perform simulation and on-chip debugging for complete multi-core heterogeneous platforms.

The Future of Computing Hardware through Heterogeneous 3D Integration: N3XT 3D MOSAIC, Illusion Scaleup, Co-Design
Subhasish Mitra, Stanford, USA

The computation demands of 21st-century abundant-data workloads, such as AI/Machine Learning, far exceed the capabilities of today’s computing systems. The next leap in computing performance requires the next leap in integration. Just as integrated circuits brought together discrete components, this next level of integration must seamlessly fuse disparate parts of a system – e.g., compute, memory, inter-chip connections – synergistically for large energy and throughput benefits. Transformative NanoSystems exploit unique characteristics of nanotechnologies for new chip architectures through ultra-dense 3D integration of logic and memory – the N3XT 3D approach. Multiple N3XT 3D chips are integrated through a continuum of chip stacking/interposer/wafer-level integration — the N3XT 3D MOSAIC — to scale with growing problem sizes. Several hardware prototypes, built in industrial facilities resulting from lab-to-fab activities, demonstrate the effectiveness of our approach. We target 1,000X system-level Energy-Delay-Product benefits, especially for abundant-data workloads.

Design and System Technology Co-Optimization Beyond 5nm CMOS
Julien Ryckaert, IMEC, Belgium

In this lecture we will dive into the mechanisms that enable the scaling of technologies in the nm CMOS era. Indeed, beyond the 20nm node, miniaturization alone could not provide an overall system PPA scaling anymore and needed to be assisted by a careful design and system centric approach to technology research. This has led to the so-called Design-Technology Co-Optimization (DTCO) framework. Nowadays, it is accepted that more than 50% of scaling is driven by DTCO. As we scale further we need to enrich our understanding of how technology impacts circuits and systems and in some cases pushes problems to be studied at system level in actual workload conditions. This evolution is referred to as System-Technology Co-Optimization (STCO) and is expected to drive scaling further deep into the angstrom era. This lecture will explain this evolution and describe the various methods put in place to enable DTCO and STCO. We will explore how these frameworks have impacted the course of technology scaling and extrapolate these into the future of scaling.

Neuro-Vector-Symbolic Architectures: An Algorithmic-Hardware Framework Towards Artificial General Intelligence
Abbas Rahimi, IBM, Switzerland

Emerging neuro-symbolic AI approaches display both perception (System I) and reasoning (System II) capabilities, but inherit the limitations of their individual deep learning and classical symbolic AI components. By synergistically combining neural networks and machinery of vector-symbolic architectures, we propose the concept of neuro-vector-symbolic architecture (NVSA). NVSA exploits its powerful operators on high-dimensional composable distributed representations that serve as a common language between neural nets and symbolic AI. We elaborate how NVSA can solve challenging tasks such as few-shot continual learning, visual abstract reasoning, and computationally hard problems (e.g., factorization of perceptual representations, or exhaustive searches involved in abstract reasoning) faster and more accurately than other state-of-the-art methods. Further, we show how efficient realization of NVSA can be informed and benefitted by the physical properties of analog in-memory computing hardware, including O(1) matrix-vector-multiplications, in-situ progressive crystallization, and intrinsic stochasticity of phase-change memory devices.

Overview of Various System-Level Industrial Case Studies Designs
David Atienza, EPFL, Switzerland

The session will cover the major challenges and requirements for the design of different industial case studies based on AI/ML systems. Classical hardware design concepts, such as accuracy, cost, validation or design exploration time have evolved in the new AI/ML era and are drive which hardware systems are used in different industrial applications. These concepts will be illustrated with examples of latest edge AI systems for home automation and healthcare systems.

Scroll to Top