HW Accelerated Machine Learning at the Edge

    On-Line Class
    CET – Central European Time Zone

    Download One-Page Schedule Here

    Week 1: January 16-20, 2023

    Week 2: January 23-27, 2023

    Registration deadline: January 5, 2023
    Payment deadline: January 10, 2023

    registration

    TEACHING HOURS

    DAILY      Central European Time (CET)   Eastern Standard Time (EST)   Pacific Standard Time (PST)   India Standard Time (IST)
    Module 1   4:00-5:30 pm                  10:00-11:30 am                7:00-8:30 am                  8:30-10:00 pm
    Module 2   6:00-7:30 pm                  12:00-1:30 pm                 9:00-10:30 am                 10:30 pm-12:00 am

    WEEK 1: January 16-20

    Monday, January 16

    4:00-5:30 pm Context: ML Applications, Scenarios and Constraints for the Edge Marian Verhelst,
    KU Leuven
    6:00-7:30 pm Context: ML Algorithms and Resulting Challenges Marian Verhelst,
    KU Leuven

    Tuesday, January 17

    4:00-5:30 pm Algorithms: Neural Network Compression for the Edge Tijmen Blankevoort,
    Qualcomm
    6:00-7:30 pm Algorithms: Neural Network Quantization for the Edge Tijmen Blankevoort,
    Qualcomm

    Wednesday, January 18

    4:00-5:30 pm HW, CPU: Specializing Processors for ML Luca Benini,
    Uni Bologna/ETHZ
    6:00-7:30 pm HW, CPU: From Single to Multi-Core Low-Power SoCs for ML Luca Benini,
    Uni Bologna/ETHZ

    Thursday, January 19

    4:00-5:30 pm HW, Digital: Concepts Towards ML Acceleration Marian Verhelst,
    KU Leuven
    6:00-7:30 pm HW, Digital: Exploiting Quantization and Sparsity at the HW Level Marian Verhelst,
    KU Leuven

    Friday, January 20

    4:00-5:30 pm HW, Analog: Analog/Mixed-Signal Acceleration Naveen Verma,
    Princeton
    6:00-7:30 pm HW, Tech: Architectural Integration of Emerging Compute Models and Technologies Naveen Verma,
    Princeton

    WEEK 2: January 23-27

    Monday, January 23

    4:00-5:30 pm Tools: Model-centric TinyML Vijay Janapa Reddi,
    Harvard
    6:00-7:30 pm Tools: Data-centric TinyML Vijay Janapa Reddi,
    Harvard

    Tuesday, January 24

    4:00-5:30 pm Tools: Landscape of DL Compilers and Challenges for Inference Tushar Krishna,
    Georgia Tech, &
    Prasanth Chatarasi,
    IBM
    6:00-7:30 pm Tools: Mapping and HW Co-optimization Tushar Krishna,
    Georgia Tech, &
    Prasanth Chatarasi,
    IBM

    Wednesday, January 25

    4:00-5:30 pm System: Efficient Execution of Approximated AI Algorithms on Heterogeneous Edge AI Systems David Atienza,
    EPFL
    6:00-7:30 pm Use Cases: Application-Driven System Design and Optimization Flow of Edge AI Use Cases in Industrial and Medical Domains David Atienza,
    EPFL

    Thursday, January 26

    4:00-5:30 pm Emerging ML Paradigms: Neuro-Inspired Computing Jan Rabaey,
    Berkeley
    6:00-7:30 pm Emerging ML Paradigms: Towards Cognitive Systems Jan Rabaey,
    Berkeley

    Friday, January 27

    4:00-5:30 pm Practical Use Cases: Energy Efficient ML Applications for Metaverse Huichu Liu,
    Facebook
    6:00-7:30 pm Panel Discussion Eduard Alarcon,
    UPC



    Abstracts

    HW Accelerated Machine Learning at the Edge
    On-Line Class
    January 16 – 27, 2023

    Course Abstract

    Machine learning workloads are becoming increasingly important for IoT devices and intelligent extreme-edge devices, driven by the desire to move ever more intelligence into the edge. Yet these workloads come with significant computational complexity, which until recently made their execution feasible only on power-hungry server or GPU platforms.
    In this course, we cover all these aspects of machine learning at the edge, with a deeper dive into hardware optimization opportunities. Ample time is also reserved for practical case studies and end-to-end optimizations, along with an introduction to the vibrant tinyML® cross-functional ecosystem.

    ML Applications, Scenarios and Constraints for the Edge
    Marian Verhelst, KU Leuven, Belgium

    – Overview of applications
    – Cloud vs Edge vs tinyML (extreme edge)
    – Inference vs learning vs federated learning
    – Application constraints and scenarios
    – Flavors of ML and AI (types of models)

    ML Algorithms and Resulting Challenges
    Marian Verhelst, KU Leuven, Belgium

    Computational consequences and HW requirements for the EDGE of:
    – Probabilistic models
    – decision trees
    – SVMs
    – NNs (deep and non-deep; layer types, …)
    – NN training
    – Hyperdimensional computing
    Challenges and requirements for efficient AI at the edge.

    Neural Network Compression for the Edge
    Tijmen Blankevoort, Qualcomm, The Netherlands

    – Take any network, how can we make it smaller structurally?
    – Neural Network Pruning
    – Structured Compression
    – Neural Architecture Search as a compression method.
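
    The pruning idea listed above can be sketched in a few lines. The function below is an illustrative example only (the names and the 90% sparsity target are ours, not from the course materials):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)              # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > threshold            # keep only weights above the cutoff
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w_pruned = magnitude_prune(w, sparsity=0.9)
print((w_pruned == 0).mean())                     # ≈ 0.9 of the weights are now zero
```

    Structured compression instead removes whole channels or filters, which maps more directly onto edge hardware than scattered zeros.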

    Neural Network Quantization for the Edge
    Tijmen Blankevoort, Qualcomm, The Netherlands

    – Quantization Introduction and Simulation
    – Quantization-aware training
    – Post-training quantization techniques
    – Mixed Precision.
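
    As a rough sketch of the post-training quantization idea in this lecture, the toy functions below implement asymmetric uniform 8-bit quantization (function names and the min/max calibration are illustrative assumptions, not Qualcomm's implementation):

```python
import numpy as np

def quantize_uint8(x):
    """Asymmetric uniform quantization: map the observed float range onto 256 levels."""
    scale = (x.max() - x.min()) / 255.0
    zero_point = np.round(-x.min() / scale)       # integer that represents float 0
    q = np.clip(np.round(x / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.linspace(-1.0, 1.0, 100, dtype=np.float32)
q, s, z = quantize_uint8(x)
x_hat = dequantize(q, s, z)
print(np.abs(x - x_hat).max())                    # error on the order of scale/2
```

    Quantization-aware training simulates exactly this round trip inside the forward pass so the network learns to tolerate the rounding error.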

    Specializing Processors for ML
    Luca Benini, Università di Bologna, Italy/ETHZ, Switzerland

    – Classical instruction set architectures (ISAs) limitations for ML
    – ISA Extensions for ML
    – Micro-architecture of ML-specialized cores
    – PPA (power performance area) optimization and implementation techniques.

    From Single to Multi-Core Low-Power SoCs for ML
    Luca Benini, Università di Bologna, Italy/ETHZ, Switzerland

    – Single-core ML SoCs – architecture, implementation, PPA analysis
    – Multi-core ML SoCs – architecture, implementation, PPA analysis
    – Integration of cores and Hardwired ML accelerators
    – Memory hierarchy: challenges and solutions.

    Concepts Towards ML Acceleration
    Marian Verhelst, KU Leuven, Belgium

    – ML models / CNN / DNN recap formalization; GeMM
    – GeMM on traditional CPU / GPU
    – Energy/latency losses and opportunities
    – Concepts towards more efficient ML acceleration on single core/single layer
    – Parallelization (spatial unrolling optimization)
    – Stationarity (temporal unrolling optimization)
    – Extending spatial and temporal unrolling to higher levels.
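
    The stationarity concept above can be illustrated with a toy tiled GeMM in which each weight tile is fetched once and reused across all input rows (a weight-stationary temporal unrolling; the code is our sketch, not course material):

```python
import numpy as np

def gemm_weight_stationary(A, B, tile=8):
    """Tiled GeMM C = A @ B: each tile of B stays 'stationary' while it is
    reused for every row of A, amortizing the cost of loading weights."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for k0 in range(0, K, tile):
        for n0 in range(0, N, tile):
            B_tile = B[k0:k0 + tile, n0:n0 + tile]   # loaded once per tile
            # reuse B_tile across all M rows before fetching the next tile
            C[:, n0:n0 + tile] += A[:, k0:k0 + tile] @ B_tile
    return C

rng = np.random.default_rng(1)
A = rng.normal(size=(16, 32))
B = rng.normal(size=(32, 24))
assert np.allclose(gemm_weight_stationary(A, B), A @ B)
```

    Output-stationary or input-stationary dataflows simply reorder these loops, trading which operand enjoys the reuse.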

    Exploiting Quantization and Sparsity at the HW Level
    Marian Verhelst, KU Leuven, Belgium

    – Concepts towards more efficient AI acceleration on single core/single layer (ctu)
    – Sparse workloads
    – Quantization – analog domain
    [Optional: – Concepts towards more efficient AI acceleration on multi-core/multi-layer]. 
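
    The zero-skipping that sparse-workload hardware performs can be mimicked in software with a compressed format such as CSR; the sketch below (our own illustration) issues MACs only for nonzero weights:

```python
import numpy as np

def to_csr(W):
    """Compress a sparse matrix: store only nonzeros plus index metadata."""
    values, col_idx, row_ptr = [], [], [0]
    for row in W:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def spmv(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product: MACs happen only where weights are nonzero,
    which is exactly what zero-skipping accelerators exploit."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

rng = np.random.default_rng(2)
W = rng.normal(size=(8, 16)) * (rng.random((8, 16)) < 0.2)   # ~80% zeros
x = rng.normal(size=16)
vals, cols, ptr = to_csr(W)
assert np.allclose(spmv(vals, cols, ptr, x), W @ x)
```

    The index metadata is the price of compression: below a certain sparsity level, storing and decoding it costs more than the skipped MACs save.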

    Analog/Mixed-Signal Acceleration
    Naveen Verma, Princeton University, USA

    – Review of key ops to accelerate (MACs) and potential energy savings through analog
    – Overview of Approaches for MACs
    • Electronic (current, voltage, charge summing)
    • Optical
    – Overheads and limitations
    • Memory accessing -> motivates in-memory computing
    • Data conversion
    • Technology integration with digital engines/memory
    – Fundamental tradeoffs
    • Energy/throughput vs SNR
    – In-memory computing
    • Different memory techs and approaches.
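
    The energy-versus-SNR tradeoff above can be made concrete with a toy noise model (ours, purely illustrative): an analog MAC returns the ideal dot product plus noise whose power is set by the SNR the hardware pays for.

```python
import numpy as np

rng = np.random.default_rng(4)

def analog_mac(x, w, snr_db):
    """Toy model of an analog MAC: ideal dot product plus Gaussian noise at a
    chosen SNR. In real analog compute, a higher SNR costs more energy."""
    y = float(np.dot(x, w))
    noise_std = abs(y) / np.sqrt(10 ** (snr_db / 10))
    return y + rng.normal(scale=noise_std)

x = rng.normal(size=256)
w = rng.normal(size=256)
ideal = float(np.dot(x, w))
err_hi = abs(analog_mac(x, w, snr_db=40) - ideal) / abs(ideal)   # ~1% typical
err_lo = abs(analog_mac(x, w, snr_db=10) - ideal) / abs(ideal)   # far larger typically
print(err_hi, err_lo)
```

    Since many NN layers tolerate a few percent of output noise, the accelerator can run at a modest SNR and bank the energy savings.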

    Architectural Integration of Emerging Compute Models and Technologies
    Naveen Verma, Princeton University, USA

    – Dataflow bottlenecks (weight/state loading)
    – Structured memory accessing and benefits of emerging memory
    – Co-design with trainers.

    Training Frameworks for ML at the Edge
    Vijay Janapa Reddi, Harvard University, USA

    – Automatic Dataset Generation
    – Few-shot Keyword Spotting
    – Multilingual Spoken Words Corpus.

    Deploying ML Models at the Edge: from CPU to Accelerator
    Vijay Janapa Reddi, Harvard University, USA

    – TensorFlow Lite Micro
    – CFU Playground
    – Benchmarking Data- and Model-centric ML
    – DataPerf
    – MLPerf

    Tools: Landscape of DL Compilers and Challenges for Inference
    Tushar Krishna, Georgia Tech, USA
    Prasanth Chatarasi, IBM, USA

    TVM, compilers for custom accelerators, runtimes, and mappers.

    Mapping and HW Co-optimization
    Tushar Krishna, Georgia Tech, USA
    Prasanth Chatarasi, IBM, USA

    – Mapping/HW optimization using rapid cost models
    – HW-aware NAS.

    Efficient Execution of Approximated AI Algorithms on Heterogeneous Edge AI Systems
    David Atienza, EPFL, Switzerland

    i. Major challenges in designing energy-efficient edge AI architectures due to the complexity of AI/CNN.
    ii. Design options to reduce complexity (pruning, quantization, etc.) and benefits of operating edge AI architectures at sub-nominal conditions.
    iii. New architectural design methodologies for edge AI systems, called Embedded Ensemble CNNs (E2CNNs), which build ensembles of pruned CNNs with improved robustness against memory errors compared to single-instance pruned/quantized ML/CNN implementations.
    iv. Experimental evaluation of compression methods and design space exploration to produce an ensemble of CNNs for edge AI devices with the same memory requirements as the original architectures but improved error robustness (in different types of memories) for sub-threshold operation.

    Application-Driven System Design and Optimization Flow of Edge AI Use Cases in Industrial and Medical Domains
    David Atienza, EPFL, Switzerland

    i. Overview of major key challenges in different industrial case studies for AI/ML systems (computation vs communication and other trade-offs to consider particularly for medical applications in the context of Big Data healthcare).
    ii. Different design options for AI/ML hardware systems using centralized vs. federated approaches on edge AI systems.
    iii. Mapping options for ULP multi-core embedded systems with neural network accelerators for energy-scalable software layers based on target applications.
    iv. Examples of next-generation of smart wearable devices in the healthcare context
    v. Examples of industrial edge AI systems for home automation.

    Emerging ML Paradigms: Neuro-Inspired Computing
    Jan Rabaey, UC Berkeley, USA

    Lessons from the brain and what it means for:
    – ML hardware,
    – Neuromorphic,
    – Hyper-dimensional, etc
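
    Since hyperdimensional computing recurs in this lecture, here is a minimal sketch of its two core operators, bind and bundle (the record encoding is our own toy example, not from the course):

```python
import numpy as np

D = 10_000                                        # hypervector dimensionality
rng = np.random.default_rng(3)

def random_hv():
    """Random bipolar hypervector; at high D, any two are nearly orthogonal."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    return a * b                                  # binding associates two concepts

def bundle(*hvs):
    return np.sign(np.sum(hvs, axis=0))           # bundling superposes several concepts

def similarity(a, b):
    return np.dot(a, b) / D                       # ~1 similar, ~0 unrelated

# Encode a tiny record {color: red, shape: round} as a single hypervector
color, shape, red, rnd = (random_hv() for _ in range(4))
record = bundle(bind(color, red), bind(shape, rnd))

# Unbinding the 'color' role recovers something close to 'red'
recovered = bind(record, color)
print(similarity(recovered, red))                 # clearly above 0; ≈ 0 for unrelated vectors
```

    Because every operation is an elementwise add, multiply, or sign, HDC maps naturally onto simple, massively parallel, low-power hardware.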

    Emerging ML Paradigms: Towards Cognitive Systems
    Jan Rabaey, UC Berkeley, USA

    – Autonomous sensor-control-actuation
    – Model versus data driven
    – Reinforcement learning
    – Symbolic reasoning
    – Probabilistic learning, graphs

    Practical Case Studies: “Energy Efficient ML Applications for Metaverse”
    Huichu Liu, Facebook, USA

    – Overview of AR/VR system features and energy constraints
    – Breakdown of the different applications running on AR/VR HW and their related ML algorithms
    – HW techniques to enable energy efficient NN execution
    – HW-SW techniques to enable efficient NN mapping
    – Algorithm techniques for practical applications
    – Future applications/challenges/research directions



