Abstract: In this paper, we design and develop a hardware accelerator for computing the LU decomposition of an input matrix. Our accelerator consists of two simple linear arrays of Processing Engines ...