emacs28 commented on Karatsuba Matrix Multiplication and Its Efficient Hardware Implementations   arxiv.org/abs/2501.08889... · Posted by u/emacs28
oofbey · 5 months ago
They're proposing "new hardware architectures" to take advantage of this idea. Can anybody with a background in GPU floating-point math comment on how realistic this is?
emacs28 · 5 months ago
First author here. The hardware architectures are realistic: we developed and evaluated real example hardware implementations of them, validated them on FPGA, and in a full deep learning accelerator system they achieved state-of-the-art ResNet performance compared to prior accelerators evaluated on similar FPGAs. See the associated accelerator system source code here:

https://github.com/trevorpogue/algebraic-nnhw

The hardware architectures focused on in the paper are systolic array designs, an efficient type of hardware design for matrix multiplication (the Google TPU uses one, for example), as opposed to more SIMD-like vector architectures such as GPUs. It may be possible to extend the proposed KMM algorithm to other types of hardware architectures in future work. Regarding floating point: this work targets integer matrix multiplication acceleration; extending the concept to floating-point data types may also be possible in future work.
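For anyone curious about the core idea, here is a minimal NumPy sketch of it (a simplification for illustration, not the exact KMM formulation or hardware mapping from the paper): each w-bit integer entry is split into high and low halves, and the Karatsuba identity recovers the full-precision product from three half-width matrix multiplications instead of four.

    import numpy as np

    def kmm_sketch(A, B, w=16):
        """Simplified Karatsuba-style matrix multiply for w-bit unsigned integer entries.

        Splits each entry into high/low (w/2)-bit halves and recovers the full
        product from three half-width matrix multiplications instead of four.
        (Illustrative only; the paper's KMM and hardware design differ in detail.)
        """
        h = w // 2
        mask = (1 << h) - 1
        A_hi, A_lo = A >> h, A & mask
        B_hi, B_lo = B >> h, B & mask

        P_hi = A_hi @ B_hi                       # high x high
        P_lo = A_lo @ B_lo                       # low x low
        P_mid = (A_hi + A_lo) @ (B_hi + B_lo)    # combined term (Karatsuba trick)

        return (P_hi << w) + ((P_mid - P_hi - P_lo) << h) + P_lo

    # Check against conventional matrix multiplication on unsigned 16-bit data
    rng = np.random.default_rng(0)
    A = rng.integers(0, 1 << 16, size=(64, 64), dtype=np.int64)
    B = rng.integers(0, 1 << 16, size=(64, 64), dtype=np.int64)
    assert np.array_equal(kmm_sketch(A, B), A @ B)

The multiplications in the decomposition are all done at half the operand bit-width, which is roughly the property the systolic-array designs in the paper exploit.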

emacs28 commented on Show HN: Matrix Multiplication with Half the Multiplications   github.com/trevorpogue/al... · Posted by u/emacs28
adastra22 · a year ago
I’ve only glanced at it so someone correct me if I’m wrong, but IIUC this is not a replacement for matrix multiplication but rather an approximation that only gives decent-ish results for the types of linear systems you see in AI/ML. But for that use case it is totally fine?
emacs28 · a year ago
It produces results that are identical/bit-equivalent to conventional/naive matrix multiplication for integer/fixed-point data types.
emacs28 commented on Show HN: Matrix Multiplication with Half the Multiplications   github.com/trevorpogue/al... · Posted by u/emacs28
pclmulqdq · a year ago
There are a lot of matrix multiplication algorithms out there with a lot of pluses and minuses. It's always a balance of accuracy, runtime, and scaling. This one probably has bad accuracy in floating point.
emacs28 · a year ago
For everyone discussing the reduced accuracy/numerical stability of the algorithms in floating point: this is true. Note, however, that the work explores applying the algorithms to fixed-point MM / quantized integer NN inference, not floating-point MM/inference. Hence, for that application there is no reduction in accuracy compared to conventional fixed-point MM.
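To make that concrete, here is a small NumPy sketch of a halved-multiplication inner product in the style of Winograd's 1968 algorithm (my simplified illustration; the formulation in the repo differs in its details). The correction sums are computed once per row and once per column, so each output element costs roughly half the multiplications; with integer inputs the result matches naive matrix multiplication bit-for-bit, while in floating point the extra additions change the rounding behavior.

    import numpy as np

    def fip_matmul(A, B):
        """Matrix multiply via a halved-multiplication inner product.

        Uses the identity (for even inner dimension n):
            x . y = sum_i (x[2i] + y[2i+1]) * (x[2i+1] + y[2i])
                    - sum_i x[2i]*x[2i+1] - sum_i y[2i]*y[2i+1]
        The two correction sums depend only on one row or one column, so each
        output element needs ~n/2 multiplications instead of n. Exact for
        integers; in floating point the extra additions alter rounding.
        (Illustrative sketch only; not the exact formulation used in the repo.)
        """
        n = A.shape[1]
        assert n % 2 == 0
        # Per-row and per-column correction terms, computed once
        row_corr = (A[:, 0::2] * A[:, 1::2]).sum(axis=1)    # shape (M,)
        col_corr = (B[0::2, :] * B[1::2, :]).sum(axis=0)    # shape (N,)
        # Main term: one multiplication per pair of inner-dimension indices
        main = (A[:, 0::2, None] + B[None, 1::2, :]) * (A[:, 1::2, None] + B[None, 0::2, :])
        return main.sum(axis=1) - row_corr[:, None] - col_corr[None, :]

    rng = np.random.default_rng(0)
    A = rng.integers(-128, 128, size=(32, 64), dtype=np.int64)
    B = rng.integers(-128, 128, size=(64, 16), dtype=np.int64)
    assert np.array_equal(fip_matmul(A, B), A @ B)   # bit-exact for integer inputs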
