Matrices map between biproducts
Why are linear functions between finite-dimensional vector spaces representable by matrices? And why does matrix multiplication compose the corresponding linear maps? There is geometric intuition for this, presented e.g. by 3Blue1Brown. I will instead present a category-theoretic analysis. The short version: in the category of vector spaces and linear maps, products are also coproducts (hence biproducts); and in categories with biproducts, maps between biproducts decompose as (generalized) matrices. These generalized matrices align with traditional numeric matrices and matrix multiplication in the category of vector spaces. The category-theoretic lens reveals matrices as an elegant abstraction, contra The New Yorker.
I’ll use the standard notion of a vector space over the field R: a vector space has addition, a zero vector, and scalar multiplication, satisfying the standard commutativity, associativity, and distributivity laws. The category Vect has as objects vector spaces over R, and as morphisms linear maps. A linear map f:U→V between vector spaces U, V satisfies f(u1+u2)=f(u1)+f(u2) and f(au)=af(u). (Advanced readers may see the nLab page on Vect.)
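To make the two linearity equations concrete, here is a minimal Python sketch (the example map and helper names are mine, not part of the formal development) that spot-checks them on sample vectors:

```python
# A minimal sketch: spot-checking the two linearity equations for a candidate
# map f : R^2 -> R^2, using plain tuples as vectors.

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(a, u):
    return tuple(a * x for x in u)

def f(u):
    # an example linear map: (x, y) |-> (2x + y, 3y)
    x, y = u
    return (2 * x + y, 3 * y)

u1, u2, a = (1.0, 2.0), (-3.0, 0.5), 4.0
assert f(add(u1, u2)) == add(f(u1), f(u2))  # f(u1 + u2) = f(u1) + f(u2)
assert f(scale(a, u1)) == scale(a, f(u1))   # f(a*u) = a*f(u)
```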
Clearly, R is a vector space, as is 0 (the vector space whose only element is the zero vector). 0 is both an initial and a terminal object: there is exactly one linear map out of 0 into any space (it must send zero to zero), and exactly one linear map from any space into 0. Therefore, by definition, 0 is a category-theoretic zero object.
If U and V are vector spaces, then the direct sum U⊕V, which has as elements pairs (u, v) with u∈U,v∈V, and for which addition and scalar multiplication are element-wise, is also a vector space. The direct sum is both a product and a coproduct.
To show that the direct sum is a product, let U and V be vector spaces, and let π1:U⊕V→U and π2:U⊕V→V be the projections of the direct sum onto its components. Let us suppose a third vector space T and linear maps f:T→U and g:T→V. Let ⟨f,g⟩:T→U⊕V be defined as ⟨f,g⟩(t)=(f(t),g(t)). Now ⟨f,g⟩ is the unique linear map satisfying π1∘⟨f,g⟩=f and π2∘⟨f,g⟩=g, i.e. it makes the product diagram commute.
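As a concrete sketch (helper names are mine), the pairing and projections can be written in Python, spot-checking that π1∘⟨f,g⟩=f and π2∘⟨f,g⟩=g:

```python
# A sketch of the product structure: pairing <f, g> and the two projections.

def pair(f, g):
    return lambda t: (f(t), g(t))

pi1 = lambda p: p[0]
pi2 = lambda p: p[1]

f = lambda t: 2.0 * t      # f : R -> R
g = lambda t: (t, -t)      # g : R -> R (+) R
h = pair(f, g)             # <f, g> : R -> R (+) (R (+) R)

t = 3.0
assert pi1(h(t)) == f(t)   # pi1 . <f, g> = f
assert pi2(h(t)) == g(t)   # pi2 . <f, g> = g
```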
To show that the direct sum is a coproduct, let U and V be vector spaces, let i1:U→U⊕V be defined as i1(u)=(u,0), and similarly let i2:V→U⊕V be defined as i2(v)=(0,v). Let us suppose a third vector space W and linear maps f:U→W and g:V→W. Let [f,g]:U⊕V→W be defined as [f,g](u,v)=f(u)+g(v). Now [f,g] is the unique linear map satisfying [f,g]∘i1=f and [f,g]∘i2=g, i.e. it makes the coproduct diagram commute.
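Similarly for the coproduct side (a sketch with my own names; I take U = V = W = R so that vector addition is ordinary addition):

```python
# A sketch of the coproduct structure: injections and copairing [f, g].

i1 = lambda u: (u, 0.0)                     # i1 : U -> U (+) V
i2 = lambda v: (0.0, v)                     # i2 : V -> U (+) V

def copair(f, g):
    return lambda uv: f(uv[0]) + g(uv[1])   # [f, g](u, v) = f(u) + g(v)

f = lambda u: 2.0 * u
g = lambda v: -3.0 * v
h = copair(f, g)

u, v = 1.5, 4.0
assert h(i1(u)) == f(u)   # [f, g] . i1 = f  (since g(0) = 0)
assert h(i2(v)) == g(v)   # [f, g] . i2 = g  (since f(0) = 0)
```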
The upshot is that the direct sum is both a product and a coproduct. Let 0_{A,B}:A→B denote the zero map, which sends everything to zero; since the direct sum also satisfies the identities π1∘i1=idU, π2∘i2=idV, π1∘i2=0_{V,U}, π2∘i1=0_{U,V}, it is by definition a biproduct. We can now abstract from the category Vect to semiadditive categories, which are by definition categories with a zero object and all pairwise biproducts. Let C stand for any semiadditive category, with biproduct ⊕.
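These four identities can be spot-checked directly (a sketch with U = V = R):

```python
# A sketch: checking the four biproduct identities on the direct sum R (+) R.

i1 = lambda u: (u, 0.0)
i2 = lambda v: (0.0, v)
pi1 = lambda p: p[0]
pi2 = lambda p: p[1]

u, v = 2.5, -3.0
assert pi1(i1(u)) == u      # pi1 . i1 = id_U
assert pi2(i2(v)) == v      # pi2 . i2 = id_V
assert pi1(i2(v)) == 0.0    # pi1 . i2 = 0_{V,U}
assert pi2(i1(u)) == 0.0    # pi2 . i1 = 0_{U,V}
```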
Biproducts enable powerful decomposition of morphisms (such as linear maps). Given h:T→U⊕V (in C), we may uniquely decompose it as h=⟨f,g⟩ for some f:T→U and g:T→V, specifically f=π1∘h,g=π2∘h. And similarly, we may uniquely decompose h:U⊕V→W as h=[f,g] for some f:U→W and g:V→W, specifically f=h∘i1,g=h∘i2.
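A sketch of the first decomposition in Vect (with T = R and U = V = R): any map into a direct sum is recovered as the pairing of its projected components.

```python
# A sketch: h : R -> R (+) R decomposes as <pi1 . h, pi2 . h>.

def pair(f, g):
    return lambda t: (f(t), g(t))

h = lambda t: (2.0 * t, -t)   # some map into a direct sum
f = lambda t: h(t)[0]         # f = pi1 . h
g = lambda t: h(t)[1]         # g = pi2 . h

for t in (0.0, 1.5, -2.0):
    assert pair(f, g)(t) == h(t)   # h = <f, g>
```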
Biproducts generalize from binary to n-ary. Suppose n is natural and Ui is an object for each natural 1≤i≤n. Now the n-ary biproduct is ⨁_{i=1}^n Ui = U1⊕…⊕Un. We take the empty biproduct to be 0. We can also generalize the projections πi, the injections ii, the “row-wise” combination ⟨f,g⟩, and the “column-wise” combination [f,g], from binary to n-ary.
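The n-ary pairing and copairing are easy to sketch as well (my names; the copairing needs the target’s addition, so I take the target to be R):

```python
# A sketch of n-ary pairing <f_1, ..., f_n> and copairing [f_1, ..., f_n].

def npair(fs):      # <f_1, ..., f_n> : T -> U_1 (+) ... (+) U_n
    return lambda t: tuple(f(t) for f in fs)

def ncopair(fs):    # [f_1, ..., f_n] : U_1 (+) ... (+) U_n -> R
    return lambda us: sum(f(u) for f, u in zip(fs, us))

# three maps R -> R, with slopes 1, 2, 3 (the a=a default captures each slope)
fs = [lambda t, a=a: a * t for a in (1.0, 2.0, 3.0)]
assert npair(fs)(2.0) == (2.0, 4.0, 6.0)
assert ncopair(fs)((1.0, 1.0, 1.0)) == 6.0
```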
This generalization to n-ary biproducts enables conceiving of matrices categorically. Let m, n be natural, and let Ui and Vj be objects in C, for natural 1≤i≤m and 1≤j≤n. Suppose h:⨁_{i=1}^m Ui → ⨁_{j=1}^n Vj. We first decompose h “row-wise”, as h=⟨h1,…,hn⟩ where hj=πj∘h. Then we decompose each row “column-wise”, as hj=[hj,1,…,hj,m] where hj,i=πj∘h∘ii. We can now write h in matrix style, as h=⟨[h1,1,…,h1,m],…,[hn,1,…,hn,m]⟩; the notation ⟨…⟩ can be visualized as vertical matrix concatenation, and […] can be visualized as horizontal matrix concatenation.
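Here is a sketch of this decomposition in Vect (helper names mine): extracting the entries hj,i from a concrete h:R^2→R^2, and spot-checking that they determine h, since the j-th output of h is the sum over i of hj,i applied to the i-th input.

```python
# A sketch: the matrix entries h_{j,i} = pi_j . h . i_i of h : R^2 -> R^2.

def inj(i, m):
    """The injection i_i : R -> R^m, padding with zeros."""
    return lambda x: tuple(x if k == i else 0.0 for k in range(m))

def entry(h, j, i, m):
    """The entry h_{j,i} = pi_j . h . i_i : R -> R."""
    return lambda x: h(inj(i, m)(x))[j]

h = lambda u: (u[0] + 2 * u[1], 3 * u[0] - u[1])   # h : R^2 -> R^2

u = (1.5, -2.0)
for j in range(2):
    # row-wise/column-wise decomposition: h(u)_j = sum_i h_{j,i}(u_i)
    assert sum(entry(h, j, i, 2)(u[i]) for i in range(2)) == h(u)[j]
```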
This is the core abstract idea, but how do we apply it more concretely? Back in Vect, we can form the Euclidean space R^n = ⨁_{i=1}^n R. Now a map h:R^m→R^n decomposes as an n×m matrix of linear maps hj,i:R→R. This is not quite a traditional matrix, but note that a linear map of type R→R is always multiplication by a constant real slope. Representing each hj,i by its slope yields a traditional numeric matrix.
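Concretely (a sketch, with my own helper name): the slope of hj,i is hj,i(1), so the numeric matrix of h can be read off by applying h to the standard basis vectors; column i of the matrix is h(e_i).

```python
# A sketch: reading off the numeric matrix of a linear map h : R^m -> R^n.
# Entry [j][i] is the slope h_{j,i}(1), i.e. the j-th component of h(e_i).

def numeric_matrix(h, m, n):
    cols = [h(tuple(1.0 if k == i else 0.0 for k in range(m)))
            for i in range(m)]                       # cols[i] = h(e_i)
    return [[cols[i][j] for i in range(m)] for j in range(n)]

h = lambda u: (u[0] + 2 * u[1], 3 * u[0] - u[1])     # h : R^2 -> R^2
assert numeric_matrix(h, 2, 2) == [[1.0, 2.0], [3.0, -1.0]]
```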
We can generalize the matrix representation of linear maps to all finite-dimensional vector spaces (which by definition have finite bases), by noting that each of these is isomorphic to R^n for some natural n. Specifically, if a space U has a basis {u1,…,un}, then the linear map f:R^n→U defined as f(x1,…,xn) = ∑_{i=1}^n xi·ui is an isomorphism. Hence, matrix representation extends to maps between finite-dimensional vector spaces.
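For instance (a sketch, with a basis I chose for illustration): R^2 with the non-standard basis u1 = (1, 1), u2 = (1, -1) gives a coordinate map whose inverse recovers the coordinates, so it is an isomorphism.

```python
# A sketch: the coordinate isomorphism f(x1, x2) = x1*u1 + x2*u2 for the
# basis u1 = (1, 1), u2 = (1, -1) of R^2, together with its inverse.

def f(x1, x2):
    return (x1 + x2, x1 - x2)          # x1*(1, 1) + x2*(1, -1)

def f_inv(a, b):
    return ((a + b) / 2.0, (a - b) / 2.0)

x1, x2 = 3.0, -1.5
assert f_inv(*f(x1, x2)) == (x1, x2)   # f is invertible, hence an isomorphism
```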
So far, we have a treatment of matrix-vector multiplication, but not matrix-matrix multiplication. We would like to show that composition of linear maps leads to the matrix representations multiplying in the expected way. Let m, n, p be natural, and let Ui, Vj, Wk be objects in C (for naturals 1≤i≤m, 1≤j≤n, 1≤k≤p). Suppose we have maps f:⨁_{i=1}^m Ui → ⨁_{j=1}^n Vj and g:⨁_{j=1}^n Vj → ⨁_{k=1}^p Wk. We can write f in matrix form (fj,i=πj∘f∘ii), and similarly g (gk,j=πk∘g∘ij).
Now we wish to find the matrix form of the composition h=g∘f. Fix i and k and consider the entry hk,i=πk∘g∘f∘ii. Note that πk∘g=[gk,1,…,gk,n] and f∘ii=⟨f1,i,…,fn,i⟩. Therefore
hk,i=[gk,1,…,gk,n]∘⟨f1,i,…,fn,i⟩
This expression is a row-column matrix multiplication, similar to a vector dot product. In the case of Vect, we can more explicitly write:
hk,i(u) = ∑_{j=1}^n gk,j(fj,i(u))
Since in Vect the hom-set Hom(U,V), the set of linear maps from U to V, naturally forms a vector space (so linear maps can be added), hk,i can also be written:
hk,i = ∑_{j=1}^n (gk,j∘fj,i)
In the case where each Ui,Vj,Wk is R, this aligns with traditional matrix multiplication; composing linear maps of type R→R multiplies their slopes.
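To tie it together, here is a sketch (the concrete maps and helper names are mine) checking numerically that the matrix of g∘f is the matrix product of the matrices of g and f:

```python
# A sketch: the matrix of a composition equals the product of the matrices,
# checked for concrete linear maps f : R^3 -> R^2 and g : R^2 -> R^2.

def matrix_of(h, m, n):
    """Entry [j][i] is h_{j,i}(1): the j-th component of h(e_i)."""
    def e(i):
        return tuple(1.0 if k == i else 0.0 for k in range(m))
    return [[h(e(i))[j] for i in range(m)] for j in range(n)]

def matmul(A, B):
    """Traditional matrix product: (A*B)[k][i] = sum_j A[k][j] * B[j][i]."""
    return [[sum(A[k][j] * B[j][i] for j in range(len(B)))
             for i in range(len(B[0]))] for k in range(len(A))]

f = lambda u: (u[0] + 2 * u[1], 3 * u[2])   # f : R^3 -> R^2
g = lambda v: (v[0] - v[1], 4 * v[1])       # g : R^2 -> R^2
gf = lambda u: g(f(u))                      # h = g . f : R^3 -> R^2

assert matrix_of(gf, 3, 2) == matmul(matrix_of(g, 2, 2), matrix_of(f, 3, 2))
```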
There is a way to generalize the row-column multiplication [gk,1,…,gk,n]∘⟨f1,i,…,fn,i⟩ = ∑_{j=1}^n (gk,j∘fj,i) to semiadditive categories in general: every semiadditive category carries a canonical addition on its hom-sets. For details, see Wikipedia on additive categories.
To summarize general lessons about semiadditive categories:
A map between biproducts h:⨁_{i=1}^m Ui → ⨁_{j=1}^n Vj can be represented in matrix form as ⟨[h1,1,…,h1,m],…,[hn,1,…,hn,m]⟩ with hj,i=πj∘h∘ii.
If we have maps f:⨁_{i=1}^m Ui → ⨁_{j=1}^n Vj and g:⨁_{j=1}^n Vj → ⨁_{k=1}^p Wk, the matrix entry hk,i of the composition h=g∘f is the row-column multiplication [gk,1,…,gk,n]∘⟨f1,i,…,fn,i⟩.
And in the case of Vect, these imply the standard results:
Linear maps between finite-dimensional vector spaces, with fixed bases, are uniquely represented as numeric matrices.
The matrix representation of a composition of linear maps equals the product of the matrices representing these maps.
This is a nice way of showing the standard results, and the abstract results generalize to other semiadditive categories, such as the category of Abelian groups. For more detailed category-theoretic study of linear algebra, see Filip Bár’s thesis, “On the Foundations of Geometric Algebra”. For an even more abstract treatment, see “Graphical Linear Algebra”.