emacs28 commented on Karatsuba Matrix Multiplication and Its Efficient Hardware Implementations   arxiv.org/abs/2501.08889... · Posted by u/emacs28
oofbey · 5 months ago
They're proposing "new hardware architectures" to take advantage of this idea. Can anybody with a background in GPU floating-point math comment on how realistic this is?
emacs28 · 5 months ago
First author here. The hardware architectures are realistic: we developed and evaluated real example hardware implementations of them, validated them on FPGA, and in a full deep learning accelerator system they achieved state-of-the-art ResNet performance compared to prior accelerators evaluated on similar FPGAs. See the associated accelerator system source code here:

https://github.com/trevorpogue/algebraic-nnhw

The hardware architectures focused on in the paper are systolic array designs, an efficient type of hardware design for matrix multiplication (the Google TPU uses one, for example), as opposed to more SIMD-like vector architectures such as GPUs. It may be possible to extend the proposed KMM algorithm to other types of hardware architectures in future work. Regarding floating point: this work targets integer matrix multiplication acceleration; extending the concept to floating-point data types may also be possible in future work.
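For anyone curious about the core idea, here is a minimal NumPy sketch of it (a simplification for illustration, not the exact KMM formulation or hardware mapping from the paper): each w-bit integer entry is split into high and low halves, and the Karatsuba identity recovers the full-precision product from three half-width matrix multiplications instead of four.

    import numpy as np

    def kmm_sketch(A, B, w=16):
        """Simplified Karatsuba-style matrix multiply for w-bit unsigned integer entries.

        Splits each entry into high/low (w/2)-bit halves and recovers the full
        product from three half-width matrix multiplications instead of four.
        (Illustrative only; the paper's KMM and hardware design differ in detail.)
        """
        h = w // 2
        mask = (1 << h) - 1
        A_hi, A_lo = A >> h, A & mask
        B_hi, B_lo = B >> h, B & mask

        P_hi = A_hi @ B_hi                       # high x high
        P_lo = A_lo @ B_lo                       # low x low
        P_mid = (A_hi + A_lo) @ (B_hi + B_lo)    # combined term (Karatsuba trick)

        return (P_hi << w) + ((P_mid - P_hi - P_lo) << h) + P_lo

    # Check against conventional matrix multiplication on unsigned 16-bit data
    rng = np.random.default_rng(0)
    A = rng.integers(0, 1 << 16, size=(64, 64), dtype=np.int64)
    B = rng.integers(0, 1 << 16, size=(64, 64), dtype=np.int64)
    assert np.array_equal(kmm_sketch(A, B), A @ B)

The multiplications in the decomposition are all done at half the operand bit-width, which is roughly the property the systolic-array designs in the paper exploit.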

emacs28 commented on Show HN: Matrix Multiplication with Half the Multiplications   github.com/trevorpogue/al... · Posted by u/emacs28
adastra22 · a year ago
I’ve only glanced at it so someone correct me if I’m wrong, but IIUC this is not a replacement for matrix multiplication but rather an approximation that only gives decent-ish results for the types of linear systems you see in AI/ML. But for that use case it is totally fine?
emacs28 · a year ago
It produces results that are identical/bit-equivalent to conventional/naive matrix multiplication for integer/fixed-point data types.
emacs28 commented on Show HN: Matrix Multiplication with Half the Multiplications   github.com/trevorpogue/al... · Posted by u/emacs28
pclmulqdq · a year ago
There are a lot of matrix multiplication algorithms out there with a lot of pluses and minuses. It's always a balance of accuracy, runtime, and scaling. This one probably has bad accuracy in floating point.
emacs28 · a year ago
For everyone discussing the reduced accuracy/numerical stability of the algorithms in floating point: this is true. Note, however, that the work explores applying the algorithms to fixed-point MM / quantized integer NN inference, not floating-point MM/inference. Hence, for that application there is no reduction in accuracy compared to conventional fixed-point MM.
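To make that concrete, here is a small NumPy sketch of a halved-multiplication inner product in the style of Winograd's 1968 algorithm (my simplified illustration; the formulation in the repo differs in its details). The correction sums are computed once per row and once per column, so each output element costs roughly half the multiplications; with integer inputs the result matches naive matrix multiplication bit-for-bit, while in floating point the extra additions change the rounding behavior.

    import numpy as np

    def fip_matmul(A, B):
        """Matrix multiply via a halved-multiplication inner product.

        Uses the identity (for even inner dimension n):
            x . y = sum_i (x[2i] + y[2i+1]) * (x[2i+1] + y[2i])
                    - sum_i x[2i]*x[2i+1] - sum_i y[2i]*y[2i+1]
        The two correction sums depend only on one row or one column, so each
        output element needs ~n/2 multiplications instead of n. Exact for
        integers; in floating point the extra additions alter rounding.
        (Illustrative sketch only; not the exact formulation used in the repo.)
        """
        n = A.shape[1]
        assert n % 2 == 0
        # Per-row and per-column correction terms, computed once
        row_corr = (A[:, 0::2] * A[:, 1::2]).sum(axis=1)    # shape (M,)
        col_corr = (B[0::2, :] * B[1::2, :]).sum(axis=0)    # shape (N,)
        # Main term: one multiplication per pair of inner-dimension indices
        main = (A[:, 0::2, None] + B[None, 1::2, :]) * (A[:, 1::2, None] + B[None, 0::2, :])
        return main.sum(axis=1) - row_corr[:, None] - col_corr[None, :]

    rng = np.random.default_rng(0)
    A = rng.integers(-128, 128, size=(32, 64), dtype=np.int64)
    B = rng.integers(-128, 128, size=(64, 16), dtype=np.int64)
    assert np.array_equal(fip_matmul(A, B), A @ B)   # bit-exact for integer inputs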
