Matrix multiplication means multiplying matrices.
A vector can be viewed as a particular sort of matrix, with one dimension equal to 1. So matrix-vector multiplications are a special case of matrix-matrix multiplications.
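To make that concrete, here is a minimal pure-Python sketch. The names `matmul` and `matvec` are made up for illustration; the point is only that the matrix-vector product is computed by treating the vector as an n-by-1 matrix and reusing the matrix-matrix routine.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"
    return [[sum(A[i][s] * B[s][j] for s in range(k)) for j in range(m)]
            for i in range(n)]


def matvec(A, v):
    """Matrix-vector product, done as a matrix-matrix product."""
    column = [[x] for x in v]              # view v as a len(v)-by-1 matrix
    return [row[0] for row in matmul(A, column)]


A = [[1, 2], [3, 4]]
print(matvec(A, [5, 6]))   # [17, 39]
```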
A tensor is a possibly-higher-dimensional generalization of a matrix. A scalar is a rank-0 tensor, a vector is a rank-1 tensor, a matrix is a rank-2 tensor, and then there are higher ranks as well.
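In the "array of numbers" sense, the rank is just how many indices you need to pick out one number. A rough sketch, assuming tensors are stored as non-empty rectangular nested lists (the helpers `shape` and `rank` are hypothetical names, not from any library):

```python
def shape(t):
    """Shape of a rectangular nested list; a bare number has shape ()."""
    if not isinstance(t, list):
        return ()
    return (len(t),) + shape(t[0])   # assumes every level is non-empty


def rank(t):
    """Number of indices needed to address a single element."""
    return len(shape(t))


print(rank(3.0))                         # 0: scalar
print(rank([1, 2, 3]))                   # 1: vector
print(rank([[1, 2], [3, 4]]))            # 2: matrix
print(rank([[[1], [2]], [[3], [4]]]))    # 3: shape (2, 2, 1)
```

Note that this "rank" is the number of indices, which is unrelated to the rank of a matrix in the linear-algebra sense.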
In actual mathematics, vectors and tensors are not mere arrays of numbers; they are objects that live in “vector spaces” or “tensor products of vector spaces”, and the numbers are their coordinates; you can change coordinate system and the numbers will change in certain well-defined ways. But when e.g. Nvidia sell you a GPU with “tensor cores” they just mean something that can do certain kinds of matrix arithmetic quickly.
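A small illustration of "the numbers change in well-defined ways", using the standard change-of-basis rules: if the columns of P are the new basis vectors, a vector's coordinates become P⁻¹v and a linear map's matrix becomes P⁻¹AP. The particular P, A and v below are arbitrary values picked for the example.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


P = [[1, 1], [0, 1]]         # columns are the new basis vectors
P_inv = [[1, -1], [0, 1]]    # inverse of P, worked out by hand

A = [[2, 0], [0, 3]]         # a linear map, in the old coordinates
v = [3, 4]                   # a vector, in the old coordinates

v_new = matvec(P_inv, v)                  # same vector, new coordinates
A_new = matmul(P_inv, matmul(A, P))       # same map, new coordinates

# Applying the map and then changing coordinates gives the same numbers
# as changing coordinates first and then applying the transformed map.
print(matvec(P_inv, matvec(A, v)))   # [-6, 12]
print(matvec(A_new, v_new))          # [-6, 12]
```

The arrays of numbers differ between the two coordinate systems, but they describe the same underlying objects, which is the distinction the mathematical definition cares about.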
In e.g. one version of Google’s TPUs, there’s a big systolic array of multiply-accumulate units, which is good for dot-product-like operations, and you program it with instructions that do things like an Nx256-by-256x256 matrix multiplication, for whatever value of N you choose. If you need to handle arrays of different sizes, you’d build the calculations out of those units.
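Here is a sketch of what "build the calculations out of those units" could look like. It is not TPU code: `unit_matmul` is a hypothetical stand-in for the fixed-size hardware operation, and the tile size `T` is set to 2 instead of 256 so the example stays readable. Matrices that don't fit are zero-padded and the partial products are accumulated.

```python
T = 2  # tile size; stands in for the 256 mentioned above


def unit_matmul(X, W):
    """Hypothetical fixed-size unit: (N x T) times (T x T), any N."""
    assert all(len(r) == T for r in X)
    assert len(W) == T and all(len(r) == T for r in W)
    return [[sum(x[s] * W[s][j] for s in range(T)) for j in range(T)]
            for x in X]


def pad(M, rows, cols):
    """Zero-pad M up to rows x cols."""
    return [[M[i][j] if i < len(M) and j < len(M[0]) else 0
             for j in range(cols)] for i in range(rows)]


def tiled_matmul(A, B):
    """General (n x k) times (k x m) product built only from unit_matmul."""
    n, k, m = len(A), len(B), len(B[0])
    kp = -(-k // T) * T          # round k and m up to multiples of T
    mp = -(-m // T) * T
    Ap, Bp = pad(A, n, kp), pad(B, kp, mp)
    C = [[0] * mp for _ in range(n)]
    for j0 in range(0, mp, T):           # which block of output columns
        for s0 in range(0, kp, T):       # which block of the inner dimension
            X = [row[s0:s0 + T] for row in Ap]
            W = [row[j0:j0 + T] for row in Bp[s0:s0 + T]]
            partial = unit_matmul(X, W)
            for i in range(n):
                for j in range(T):
                    C[i][j0 + j] += partial[i][j]
    return [row[:m] for row in C]        # drop the padding columns


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(tiled_matmul(A, B))   # [[58, 64], [139, 154]]
```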
An example of a systolic algorithm might be designed for matrix multiplication. One matrix is fed in a row at a time from the top of the array and is passed down the array, the other matrix is fed in a column at a time from the left hand side of the array and passes from left to right. Dummy values are then passed in until each processor has seen one whole row and one whole column. At this point, the result of the multiplication is stored in the array and can now be output a row or a column at a time, flowing down or across the array. (Source: https://en.wikipedia.org/wiki/Systolic_array)
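A toy simulation of that idea, as a sketch rather than a model of any real chip. It assumes one common arrangement (sometimes called output-stationary): each cell keeps its own running sum, values of A enter from the left and move right, values of B enter from the top and move down, and zeros play the role of the dummy values. Which matrix enters from which side, and the function name `systolic_matmul`, are choices made for this example.

```python
def systolic_matmul(A, B):
    """Simulate an n-by-m grid of multiply-accumulate cells, one clock tick at a time."""
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]     # running sum held in each cell
    a_reg = [[0] * m for _ in range(n)]   # value currently travelling right
    b_reg = [[0] * m for _ in range(n)]   # value currently travelling down

    for t in range(k + n + m - 2):        # ticks until the last cell is done
        # Visit cells bottom-right first so each one reads its neighbours'
        # values from the previous tick.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    s = t - i             # row i is delayed by i ticks
                    a_in = A[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    s = t - j             # column j is delayed by j ticks
                    b_in = B[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                a_reg[i][j], b_reg[i][j] = a_in, b_in
                acc[i][j] += a_in * b_in  # the multiply-accumulate step
    return acc


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(systolic_matmul(A, B))   # [[58, 64], [139, 154]]
```

The staggered start times are what make matching elements meet: cell (i, j) sees A[i][s] and B[s][j] on the same tick, so it only ever needs to talk to its immediate neighbours.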