In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar
Cai, Jack, Kaleem, Muhammad Ahsan, Genov, Roman, Azghadi, Mostafa Rahimi, and Amirsoleimani, Amirali (2024) In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar. In: Proceedings of the IEEE International Symposium on Circuits and Systems. From: ISCAS 2024: IEEE International Symposium on Circuits and Systems, 19-22 May 2024, Singapore.
Abstract
Transformers have emerged as the state-of-the-art architecture for natural language processing (NLP) and computer vision. However, they are inefficient in both conventional and in-memory computing architectures as doubling their sequence length quadruples their time and memory complexity due to their self-attention mechanism. Traditional methods optimize self-attention using memory-efficient algorithms or approximate methods, such as locality-sensitive hashing (LSH) attention that reduces time and memory complexity from O(L²) to O(L log L). In this work, we propose a hardware-level solution that further improves the computational efficiency of LSH attention by utilizing in-memory computing with semi-passive memristor arrays. We demonstrate that LSH can be performed with low-resolution, energy-efficient 0T1R arrays performing stochastic memristive vector-matrix multiplication (VMM). Using circuit-level simulation, we show our proposed method is feasible as a drop-in approximation in Large Language Models (LLMs) with no degradation in evaluation metrics. Our results set the foundation for future works on computing the entire transformer architecture in-memory.
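The LSH attention the abstract refers to can be illustrated with a minimal software sketch: tokens are hashed by a random-projection vector-matrix multiplication (the VMM step that the paper maps onto a memristor crossbar), and attention is then computed only within shared hash buckets. This is a hypothetical illustration of Reformer-style angular LSH, not the authors' circuit; all function names and parameters here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_buckets(x, n_hashes=4):
    """Hash each row of x into one of 2*n_hashes angular buckets.

    x: (L, d) token vectors. The product x @ R is the random-projection
    VMM that a memristor crossbar would compute in-memory.
    """
    L, d = x.shape
    R = rng.standard_normal((d, n_hashes))  # random projection matrix
    proj = x @ R                            # the VMM step
    # Angular LSH: concatenating proj and -proj and taking the argmax
    # assigns each vector to the nearest of 2*n_hashes directions.
    rotated = np.concatenate([proj, -proj], axis=-1)
    return np.argmax(rotated, axis=-1)      # bucket id per token

def lsh_attention(q, k, v, n_hashes=4):
    """Softmax attention restricted to tokens sharing a bucket,
    approximating full O(L^2) attention at reduced cost."""
    buckets = lsh_buckets(q, n_hashes)      # Reformer ties q and k hashing
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ k[idx].T / np.sqrt(q.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[idx] = w @ v[idx]
    return out

# Toy usage: 16 tokens of dimension 8, with shared q/k as in Reformer.
L, d = 16, 8
q = rng.standard_normal((L, d))
v = rng.standard_normal((L, d))
out = lsh_attention(q, q, v)
print(out.shape)  # (16, 8)
```

Because only same-bucket pairs are scored, the quadratic pairwise score matrix is never fully formed; with roughly balanced buckets this is what drives the cost toward O(L log L) in the full algorithm (which additionally sorts and chunks by bucket).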
| Item ID: | 87423 |
|---|---|
| Item Type: | Conference Item (Research - E1) |
| ISBN: | 9798350330991 |
| ISSN: | 0271-4310 |
| Keywords: | Backpropagation, In-Memory, Memristor, Neural Network Training, Self-Attention, Transformer |
| Copyright Information: | © 2024 IEEE |
| Date Deposited: | 26 Nov 2025 23:49 |
| FoR Codes: | 40 ENGINEERING > 4008 Electrical engineering > 400801 Circuits and systems @ 70% 46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461104 Neural networks @ 30% |
| SEO Codes: | 28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280110 Expanding knowledge in engineering @ 100% |
