Thakkar, Ishan G
Ishan G Thakkar Lab
Introduction: Statement of Research Activities in PI Thakkar's Group
PI Thakkar's group works on innovative solutions for the design of fundamentally high-speed, energy-efficient, reliable, and secure computing systems. We believe in forging new, unconventional but stronger technological pillars for the evolution of computing systems by employing design principles inherently found in nature, such as integrated heterogeneity, parallelism, physical unclonability, and reconfigurability. We reason that technological pillars grounded in these principles can sustain the performance-evolution trajectory of computing systems far into the future. Our specific unconventional technology interests include (1) silicon photonics; (2) optical computing; (3) neuromorphic computing; (4) in-memory computing; (5) stochastic computing; (6) monolithic 3D (M3D) integration; and (7) photonic devices and sensors based on polymers and transparent conductive oxides. Our rationale for employing these specific technologies is that they have shown disruptive potential to overcome the limitations of traditional electronic computer designs built on Moore's law scaling and the von Neumann architecture, and they offer manifold benefits when employed to propel the performance evolution of computing systems.
Our research at UK aims to mature these unconventional technological pillars toward widespread, feasible adoption by focusing on contributions in three specific research projects. A summary of these projects, with their specific computational requirements, is provided below.
Energy-Efficient, Reliable, and Secure Multi-Terabit Photonic Interconnects:
One of the perils of von Neumann computing systems is that they must employ interconnection subsystems as an integral part of their design, to deliver to their constituent processing subsystems the data and program instructions that typically reside in separate constituent memory subsystems. With the recent deluge of data-centric workloads (related to AI, IoT, and cybersecurity) and the paradigm shift to manycore computing architectures, the demand for performance and energy efficiency from these interconnection subsystems is increasing dramatically. Meeting this demand requires interconnection subsystems with multi-terabit (multiple tera/10^12 bits per second per watt) performance. However, traditional electronic interconnects are deemed unfit to meet this demand because of their impedance-dependent, poor scalability.
To realize multi-terabit interconnects, our research at UK has focused on exploring Integrated Photonics (IPh) as an unconventional technological pillar of computing evolution. IPh is widely regarded as the most mature unconventional technology that can potentially enable multi-terabit interconnects. However, several energy-efficiency, reliability, and security roadblocks remain, stemming from the high optical power consumption caused by unavoidable optical signal losses, the high static power consumption required for thermal stability, an increased vulnerability to data-snooping-based security attacks, and high crosstalk noise due to non-ideal spectral effects. To overcome these roadblocks, our research at UK takes a holistic approach, with cross-coupled contributions across the device, circuit, and architecture layers of the hardware design stack.
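To make the crosstalk roadblock concrete: in a wavelength-multiplexed photonic link, each microring resonator filter has a roughly Lorentzian passband, so a portion of every neighboring channel's power leaks into each filter's drop port. The following minimal Python sketch (a behavioral illustration only, with assumed channel spacing and quality-factor values, not our actual device models) estimates this inter-channel crosstalk:

import numpy as np

def mrr_drop_transmission(wavelength_nm, resonance_nm, q_factor):
    # Lorentzian approximation of a microring resonator's drop-port response
    fwhm_nm = resonance_nm / q_factor                    # 3-dB bandwidth of the resonance
    detuning = wavelength_nm - resonance_nm
    return 1.0 / (1.0 + (2.0 * detuning / fwhm_nm) ** 2)

# Illustrative 4-channel DWDM link around 1550 nm (channel spacing and Q are assumed values)
channel_spacing_nm = 0.8
q_factor = 8000
resonances_nm = 1550.0 + channel_spacing_nm * np.arange(4)

# Crosstalk power that each aggressor channel leaks into the filter tuned to channel 0
victim_nm = resonances_nm[0]
for i, aggressor_nm in enumerate(resonances_nm[1:], start=1):
    leak = mrr_drop_transmission(aggressor_nm, victim_nm, q_factor)
    print(f"channel {i} -> channel 0 crosstalk: {10 * np.log10(leak):.1f} dB")

With the assumed 0.8 nm channel spacing and Q of 8000, the nearest neighbor leaks roughly -18 dB of its power into the victim channel, which illustrates why spectral non-idealities must be managed jointly at the device and circuit layers.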
Students:
UK-supported research Ph.D. advisees:
Venkata Sai Praneeth Karempudi
Samrat Patel
Postdoc advisees:
Dr. Justin Woods
Undergraduate advisees:
David Pippen, UG, Added on 5/25/2022 on LCC resources
Bobby Bose, UG, Added on 5/25/2022 on LCC resources
Computational requirements:
Polynomial-time algorithms; benchmark applications executed on high-level behavioral models (C++, SystemC, or Python based) of multi-core GPUs and CPUs
Software requirements: Matlab, Ansys/Lumerical (Windows), internally built Python-based software (Unix/Linux)
UK Collaborators:
Dr. J. Todd Hastings (CeNSE ECE UK, KY Multiscale)
Extreme-Scale Computing with Integrated Photonics
To date, Moore's law has guided the advancement of computing hardware performance. In recent years, however, Moore's law has been facing serious challenges, as nanofabrication technology runs into physical limitations imposed by the exceedingly small size of transistors. This has motivated the need for a new paradigm that can replace Moore's law and continue to provide progressively faster and more efficient computing hardware for many years to come. Fortunately, Integrated Photonics (IPh) has been identified as one such promising technology. IPh technology is CMOS compatible and provides several advantages, such as sub-picosecond speeds, low power consumption, and the ability to multiplex multiple wavelengths in a single waveguide. In recent years, IPh based computing has gained momentum, and several prototypes of IPh based circuits and processing units for computing have been introduced.
Our research at UK in this project has so far employed IPh as the unconventional technology for computing evolution to make the following contributions: (i) design of a specialized hardware accelerator for delayed-feedback reservoir computing based time-series analysis; (ii) silicon-photonic microring resonator based polymorphic logic gates and arithmetic-logic units; and (iii) design space exploration of silicon-photonic microring resonator based neuromorphic hardware architectures for high-speed tensor processing. Our IPh based computing architectures, as well as several such units from the prior works of other research groups, have been shown to provide disruptive throughputs of a few POPs/sec (peta/10^15 operations per second), with footprint efficiencies in the tens of TOPs/s/mm^2. To further improve the performance of IPh based computing architectures, by endowing them with high error resilience, reduced area consumption, and increased precision flexibility, our research group has recently started exploring a merger of IPh based computing circuits with another unconventional technology, Stochastic Computing. Our efforts in this new direction have recently received research grant support from the National Science Foundation (NSF).
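As an illustration of the multiply-accumulate principle that underlies microring resonator (MRR) based tensor processing, the short Python sketch below models one analog photonic dot product: each input value modulates the optical power carried on one wavelength, each weight sets the transmission of the MRR tuned to that wavelength, and a photodetector incoherently sums the weighted powers. The model, including the assumed 8-bit quantization, is a simplified behavioral sketch and not our actual architecture:

import numpy as np

def mrr_weighted_dot_product(inputs, weights, bits=8):
    # Behavioral model of an analog photonic multiply-accumulate:
    # inputs modulate per-wavelength optical power, weights set per-MRR transmission,
    # and the photodetector sums the weighted powers of all wavelengths.
    levels = 2 ** bits - 1
    x = np.round(np.clip(inputs, 0, 1) * levels) / levels    # quantized optical power levels (assumption)
    w = np.round(np.clip(weights, 0, 1) * levels) / levels   # quantized MRR transmission factors (assumption)
    return float(np.sum(x * w))                              # incoherent summation at the photodetector

# Example: one dot product of a 4-element input vector with a 4-element weight row
x = np.array([0.2, 0.7, 0.5, 0.9])
w = np.array([0.1, 0.4, 0.8, 0.3])
print(mrr_weighted_dot_product(x, w))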
Students:
Sairam Sri Vatsavai
Samrat Patel
Venkata Sai Praneeth Karempudi
Computational requirements:
Polynomial-time algorithms; benchmark applications executed on high-level behavioral models (C++, SystemC, or Python based) of multi-core GPUs and CPUs
Software requirements: Matlab, Ansys/Lumerical (Windows), internally built Python-based software (Unix/Linux)
UK Collaborators:
Dr. J. Todd Hastings (CeNSE ECE UK, KY Multiscale)
Dr. Sayed Ahmad Salehi (ECE UK)
Grants:
OSPA-managed project (through an NSF EAGER grant) https://nsf.gov/awardsearch/showAward?AWD_ID=2139167&HistoricalAwards=false
Processing-Using-Memory Based Computing Architectures
One critical disadvantage of von Neumann computing systems is that they require enormous amounts of data movement between their constituent memory and processing subsystems, which drastically increases the latency and energy consumption of information/data processing. Because of this disadvantage, the demand for increasingly high computational capacity and energy efficiency from von Neumann computing systems may soon become unsustainable. To avoid this outcome, alternative computing paradigms and architectures that can provide radical improvements in computational energy efficiency are being pursued. To this end, the recently introduced paradigm of Processing-Using-Memory (PUM) has gained increased attention, because PUM based computing architectures, which are typically used to accelerate deep learning workloads, exploit the operating principles and analog computing capabilities of memory modules to implement basic arithmetic functions (e.g., multiplication, addition) directly inside the memory modules, thereby eliminating unnecessary data movements between processor and memory units. Several prototypes of PUM based computing architectures from prior works have demonstrated computational energy efficiencies of a few tens to hundreds of TOPs/J (tera/10^12 operations per joule), which are ~10^2-10^3x better than traditional von Neumann computing architectures.
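For context, one widely cited PUM primitive from the prior literature (Ambit-style triple-row activation) works by activating three DRAM rows simultaneously, so that each bitline settles to the majority value of the three connected cells; presetting the third row to all zeros or all ones then yields a bitwise AND or OR across an entire row without moving data to the processor. The following Python snippet is only a behavioral sketch of that published primitive, not a model of any specific accelerator:

import numpy as np

def triple_row_activation(row_a, row_b, row_c):
    # Behavioral model of the PUM primitive: activating three DRAM rows at once
    # resolves each bitline to the majority of the three connected cells.
    return (row_a.astype(int) + row_b.astype(int) + row_c.astype(int)) >= 2

def in_dram_and(row_a, row_b):
    # Majority with a control row preset to all zeros computes bitwise AND
    return triple_row_activation(row_a, row_b, np.zeros_like(row_a))

def in_dram_or(row_a, row_b):
    # Majority with a control row preset to all ones computes bitwise OR
    return triple_row_activation(row_a, row_b, np.ones_like(row_a))

# Example with 8-bit rows (real DRAM rows span thousands of columns)
a = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
b = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
print(in_dram_and(a, b).astype(int))   # [1 0 0 1 0 0 1 0]
print(in_dram_or(a, b).astype(int))    # [1 1 1 1 0 1 1 0]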
Prior works on PUM based deep learning architectures have mainly focused on the following circuit-level and logic-level optimizations for improving processing throughput and energy efficiency: (i) decreasing energy consumption and latency by reorganizing DRAM cell arrays and peripherals; and (ii) simplifying implementation by employing novel computing strategies (e.g., lookup tables, stochastic computing, ternary computing). However, these prior PUM architectures incur very long execution times to perform single-frame inference of deep learning models (e.g., CNNs), which hinders their applicability to latency-critical applications (e.g., autonomous vehicles). To address this shortcoming, our research at UK has explored a merger of two unconventional technologies: Processing-Using-Memory (PUM) and Stochastic Computing. By merging these two technologies, we have been able to reduce the latency and execution time of deep learning inference tasks by up to 7x. Our specific contributions in this area include: (i) highly scalable and error-resilient stochastic number generation with precision-independent latency, and (ii) a novel bit-pArallel sTochastic aRithmetic based In-DRAM Accelerator (ATRIA) for energy-efficient and high-speed inference of convolutional neural networks (CNNs).
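To illustrate the stochastic-computing arithmetic that such a PUM merger builds on: a value in [0, 1] is encoded as the fraction of 1s in a bitstream, multiplication of two values reduces to a bitwise AND of their independent bitstreams, and the product is recovered by counting the 1s. In a PUM setting, that AND can be applied to an entire memory row at once, which is the intuition behind bit-parallel stochastic arithmetic. The Python sketch below (with an assumed stream length and a software random number generator) is illustrative only, not our in-DRAM implementation:

import numpy as np

rng = np.random.default_rng(0)

def to_stochastic(value, length=256):
    # Encode a value in [0, 1] as a unipolar stochastic bitstream:
    # each bit is 1 with probability equal to the value.
    return rng.random(length) < value

def stochastic_multiply(value_a, value_b, length=256):
    # Multiplication in stochastic computing is a bitwise AND of two
    # independent bitstreams; the product is the fraction of 1s in the result.
    stream_a = to_stochastic(value_a, length)
    stream_b = to_stochastic(value_b, length)
    product_stream = stream_a & stream_b          # one AND per bit position (or one row-wide in-memory op)
    return product_stream.mean()                  # popcount / stream length

print(stochastic_multiply(0.5, 0.6))              # close to 0.30, within stochastic noise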
Students:
Type of research support: UK supported research
Ph.D. advisees:
Supreeth Mysore Shivanandamurthy
Sairam Sri Vatsavai
Elaheh Hosseinkhani
Computational requirements:
Polynomial-time algorithms; benchmark applications executed on high-level behavioral models (C++, SystemC, or Python based) of multi-core GPUs and CPUs
Software requirements: Matlab (Windows), Mentor Graphics or Cadence tools, internally built Python-based software (Unix/Linux)
UK Collaborators:
Dr. Sayed Ahmad Salehi (ECE UK)
Center for Computational Sciences