FPL 2023


Title:  Reconfigurable Dataflow Accelerators for the Foundation Model Era
Date: 06 Sep 2023, 08:45-09:40 CET

Abstract: Generative AI applications, with their ability to produce natural language, computer code and images, are transforming all aspects of society. These applications are powered by huge foundation models such as GPT-3, which are trained on massive unlabeled datasets. Foundation models have tens of billions of parameters and have achieved state-of-the-art quality in natural language processing, vision and speech applications. These models are computationally challenging because they require hundreds of petaFLOPS of computing capacity for training and inference. Future foundation models will have even greater capabilities, provided by more complex model architectures with longer sequence lengths, irregular data access (sparsity) and irregular control flow. In this talk I will describe how the evolving characteristics of foundation models will impact the design of the optimized computing systems required for training and serving these models. I will explain how Reconfigurable Dataflow Accelerators (RDAs) can accelerate a broad set of data-intensive applications, including foundation models. RDAs efficiently exploit the hierarchical dataflow parallelism that exists in ML models while minimizing off-chip communication. I will also explain how RDAs can be used to accelerate irregular applications using a new execution model called Dataflow Threads.

Biography: Kunle Olukotun is the Cadence Design Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. He founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems. Afara was acquired by Sun Microsystems, and its multicore, multithreaded processor, called Niagara, now powers Oracle's SPARC-based servers. Olukotun co-founded SambaNova Systems, a machine learning and artificial intelligence company, where he continues to serve as Chief Technologist.

Olukotun is the Director of the Stanford Pervasive Parallelism Laboratory and a member of the Data Analytics for What's Next (DAWN) Lab, which develops infrastructure for usable machine learning. He is a member of the National Academy of Engineering, an ACM Fellow, and an IEEE Fellow, recognized for contributions to multiprocessor-on-a-chip design and the commercialization of this technology. He has received the ACM-IEEE CS Eckert-Mauchly Award and the IEEE Harry H. Goode Memorial Award.

Title: Grand Challenges for FPGAs in the Next Decade
Date: 06 Sep 2023, 13:30-14:20 CET

Abstract: FPGAs are well suited to applications such as wireless radios, embedded vision, video transcoding, and machine learning because of their ability to deliver high data bandwidth and computing performance. The parallel architecture of FPGAs and their high-speed I/O provide significant benefits for these applications. However, to fully exploit this potential, developers must bridge the gap between the algorithm-centric world of applications and the hardware-centric world of FPGAs.

As we approach the next decade, there are major obstacles concerning FPGAs that academia and industry must tackle together. In this presentation, I will discuss the challenges in FPGA silicon design and EDA software that must be addressed to realize future applications.

Biography: Nabeel Shirazi is a Senior Director at Intel's Programmable Solutions Group, where he leads the Quartus software development organization. Before Intel, he worked at Xilinx for 22 years. He was one of the initial developers of Xilinx's System Generator for DSP in 1998, helped take it to market, and took full responsibility for the tool in 2005. In 2013, his team launched Vivado IP Integrator, the design tool for all Zynq, Zynq UltraScale+, and Versal embedded designers. In 2017, he co-invented Model Composer, Xilinx's next-generation MATLAB- and Simulink-based design tool. He is now responsible for Intel's next-generation EDA tools for FPGA design.

He holds a Ph.D. in Computing from Imperial College London, where his dissertation was on Automating the Production of Run-Time Reconfigurable Designs. His MSEE at Virginia Tech covered ground-breaking work on implementing a 2-D Fast Fourier Transform on an FPGA-based custom computing platform, which led to one of the first implementations of arbitrary floating-point arithmetic on FPGAs. He has 43 patents, and while at Xilinx his team was nominated four times for the Technical Innovation of the Year award, winning twice.

Title: Computing Near Storage
Date: 07 Sep 2023, 09:00-09:50 CET

Abstract: We live in an age where enormous amounts of data are being collected constantly because of smartphones, the ubiquitous presence of sensors, and the widespread use of social media. Useful and cost-effective analysis of this data is the biggest economic driver for the IT industry. Such analyses are often done in data centers or on clusters of machines because they involve applying sophisticated algorithms to terabyte-size graphs, which are extremely irregular and sparse. We will show how low-power appliances for such analyses can be built using flash storage and hardware accelerators. Such appliances are likely to be 10X cheaper than 16-32 node server clusters and will come in the form factor of an SSD that can be plugged into your laptop.

Biography: Arvind is the Head of the Computer Science Faculty and the Charles and Jennifer Johnson Professor of Computer Science and Engineering at MIT. Arvind's group, in collaboration with Motorola, built the Monsoon dataflow machines and their associated software in the late eighties. In 2000, Arvind started Sandburst, which was sold to Broadcom in 2006. In 2003, Arvind co-founded Bluespec Inc., an EDA company producing tools for high-level synthesis. Arvind's current research focus is enabling the rapid development of embedded systems and the design of complex digital chips with associated correctness proofs. Arvind is a Fellow of the IEEE and ACM, and a member of the National Academy of Engineering and the American Academy of Arts and Sciences.

Title: Computing challenges at the HL-LHC
Date: 07 Sep 2023, 13:30-14:20 CET (Cancelled)

Abstract: As the Large Hadron Collider (LHC) program steps into the exascale epoch, a luminosity upgrade is scheduled for 2029 (HL-LHC) that will yield an estimated exabyte of data annually from each detector. This significant escalation in data volume and complexity presents an unparalleled computational challenge. In anticipation of this landscape, the LHC experiments have initiated an ambitious research and development (R&D) campaign. Concurrently, the field of computing is experiencing multiple transformative technological shifts: the advent of exascale technologies, the proliferation of accelerated heterogeneous hardware, the AI/machine learning revolution and the convergence of AI with high-performance computing (HPC), and the environmentally crucial green revolution, which emphasizes reducing carbon footprint and improving efficiency. For the past two decades, CERN openlab has been instrumental in harnessing such technology revolutions by forming symbiotic relationships with industry partners, reinforcing its unique capacity to drive innovative R&D. This presentation will delve into the preparatory computational work for the HL-LHC and the research domains being explored in collaboration with industry partners.

Biography: As the current Head of CERN openlab, Maria provides strategic leadership in pioneering research and development in areas such as high-performance computing (HPC), artificial intelligence (AI), and advanced storage solutions. Maria holds a PhD in particle physics and has extensive knowledge of computing for high-energy physics experiments, having worked in scientific computing since 2002. She worked for many years on the development and deployment of services and tools for the Worldwide LHC Computing Grid (WLCG), the global grid computing system used to store, distribute, and analyse the data produced by the experiments on the Large Hadron Collider (LHC). Maria founded the WLCG operations coordination team, which she also previously led; this team is responsible for overseeing core operations and commissioning new services.

Throughout 2014 and 2015, Maria was the software and computing coordinator for CMS, one of the four main LHC experiments. She was responsible for about seventy computing centres on five continents and managed a distributed team of several hundred people. From 2016 to early 2023, Maria was CERN openlab CTO. Prior to joining CERN, Maria was a Marie Curie fellow and research associate at Imperial College London, where she worked on hardware development and data analysis for another of the LHC experiments, LHCb, as well as for ALEPH, an experiment built on the accelerator that preceded the LHC.

Title: Adaptive Architectures for Efficient Compute
Date: 08 Sep 2023, 09:00-09:50 CET

Abstract: The AMD XDNA architecture, developed by Xilinx as part of its Versal adaptive SoC portfolio, is an innovative, highly configurable, and scalable platform for heterogeneous computing. Now that Xilinx is fully integrated into AMD, XDNA brings these unique FPGA and adaptive SoC innovations to the existing AMD portfolio.

In this talk, I will share product examples illustrating how the synergy between teams and technologies has opened new possibilities in silicon architecture. The combination has created a broad set of opportunities, with new capabilities for addressing the semiconductor industry’s toughest challenges. I will discuss those challenges and how we, as an industry, can apply adaptable technologies to deliver more efficient solutions.

Biography: Trevor Bauer, AMD's Corporate VP of Silicon Architecture, heads the engineering team responsible for product definition of FPGAs and Adaptable SoCs within AMD's Advanced and Embedded Computing Group. He began his career at Xilinx in 1994 after completing his studies at MIT, where he demonstrated FPGAs for reconfigurable computing in his master's thesis. Trevor holds over 70 patents in the field of programmable logic. Additionally, he serves as the site leader for AMD's engineering facility in Longmont, Colorado.