Why Linear Algebra is the Natural Language of Quantum Mechanics

Quantum mechanics is perhaps one of the more counter-intuitive fields in physics, as it goes against our classical understanding of the physical world. However, not only is the physics quite bizarre, I believe the math is too when you first learn it. While in most other fields of physics the connection between the physics and the mathematics tends to be relatively clear, when it comes to quantum mechanics the connection is really not obvious. How can an area about abstract vector spaces and matrices have anything to do with the probabilistic nature of particles? Well, there is actually quite a deep and eye-opening relation, as we will see in this post.

This will definitely be one of my longer posts as it is impossible to get through everything in a short amount of time but if you are interested feel free to read until the end. Also, just as a prerequisite, throughout this post I will be assuming a high school level understanding of quantum mechanics and some understanding of linear algebra from vector spaces to eigenvectors.

The Wave Function

I think the best way to understand the intuition behind the math of quantum mechanics is to begin with the (position) wave function Ψ(x, t). The wave function can essentially be thought of as a mathematical way to represent the state of a particle, and has the special property that its squared modulus, |Ψ|², is the probability density of the position:

$$P(a, b) = \int_a^b |\Psi(x, t)|^2 \, dx$$

Where I let P(a, b) be the probability of finding the particle between the points x = a and x = b.

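Now, if you know a bit of statistics you may know that the expectation value of a quantity with a given probability density can be found in the following way:

$$\langle x \rangle = \int_{-\infty}^{\infty} x \, |\Psi|^2 \, dx = \int_{-\infty}^{\infty} \Psi^* \, x \, \Psi \, dx$$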

Where ⟨x⟩ denotes the expectation value of the position. I rewrote the expression by using the fact that |Ψ|² = Ψ*Ψ, where Ψ* is the complex conjugate of Ψ (the reason for rewriting it like this will become clear later). To give a bit of intuition on what ⟨x⟩ represents, it is essentially the average of position measurements made on a very large number of particles that are all prepared in the same state Ψ. Interestingly enough, we can get a very similar equation for the expectation value of the momentum, p, which looks like the following (I will skip the derivation as it is not really relevant to this post):

$$\langle p \rangle = \int_{-\infty}^{\infty} \Psi^* \left( -i\hbar \frac{\partial}{\partial x} \right) \Psi \, dx$$

Where i is the imaginary unit and ħ is the reduced Planck constant. You may be wondering why I haven’t factored out those constants, but I have kept it in this form to compare it with the case of ⟨x⟩. As you can probably see, these two equations are in the exact same form, with some expression being sandwiched between Ψ* and Ψ and then integrated with respect to x. If we then focus on the expression being sandwiched, we can call these the ‘operators’ associated with whatever we are measuring. For example, the position operator is x and the momentum operator is -iħ∂/∂x. In fact, any dynamical quantity, Q, has some operator associated with it, which yields the expected value of said quantity if sandwiched between Ψ* and Ψ and then integrated. Note that if the operator contains a derivative, for example, it acts on the function to its right (Ψ here), which is why the order does actually matter, although it doesn’t in the case of ⟨x⟩. While this notion of ‘operators’ probably seems quite strange and too good to be true, everything does work out quite nicely from them, and they lead naturally into the underlying linear algebra of quantum mechanics.
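To make the two sandwich integrals a bit more concrete, here is a minimal numerical sketch of my own (not from the derivation above). It assumes natural units with ħ = 1 and a Gaussian wave packet whose parameters x0, k0 and sigma are purely illustrative, and it approximates the integrals with simple sums on a grid. The helper name `sandwich` is hypothetical.

```python
import numpy as np

hbar = 1.0                      # natural units (assumption)
x0, k0, sigma = 1.5, 2.0, 0.7   # illustrative packet parameters (assumptions)

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

# Normalized Gaussian wave packet centred at x0 with mean wavenumber k0
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(
    -(x - x0) ** 2 / (4 * sigma**2) + 1j * k0 * x
)

def sandwich(psi, op_psi, dx):
    """Approximate the integral of psi* (Q psi) dx with a Riemann sum."""
    return np.sum(np.conj(psi) * op_psi) * dx

# Position operator: multiply by x. Momentum operator: -i*hbar*d/dx,
# with the derivative approximated by finite differences.
x_op_psi = x * psi
p_op_psi = -1j * hbar * np.gradient(psi, dx)

norm = sandwich(psi, psi, dx).real
mean_x = sandwich(psi, x_op_psi, dx).real
mean_p = sandwich(psi, p_op_psi, dx).real

print("norm:", norm)
print("<x>:", mean_x, "(packet centre x0 =", x0, ")")
print("<p>:", mean_p, "(hbar * k0 =", hbar * k0, ")")
```

For this particular packet, ⟨x⟩ should come out close to x0 and ⟨p⟩ close to ħk0, up to the small error of the finite-difference derivative.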

Generalizing Vectors and Matrices to Functions and Operators

We can now finally begin to connect quantum mechanics to linear algebra. Oftentimes when someone thinks of linear algebra they think about the idea of vectors. While at a basic level vectors can be thought of as something with magnitude and direction, a more general definition people use is that they are an array of numbers. However, in quantum mechanics we actually generalize vectors even further to functions. Yes, functions. This seems absurd considering our previous definitions of a vector, but it is actually completely valid if we consider the idea of a vector space. A vector space has certain conditions (such as being closed under vector addition and scalar multiplication) that need to be met for it to be a valid space, but these conditions make no reference to how we wish to define a vector. In fact, there is a space known as “Hilbert space”, which for our purposes is the space of a specific type of function (namely square-integrable functions). This space happens to satisfy the conditions to be a vector space, making such functions valid generalizations of vectors.

Here’s where the operators we discussed earlier come into play. While functions are generalizations of vectors, we can consider operators as generalizations of matrices. Just like how a matrix transforms a vector into another vector, an operator transforms a function into another function. For example, consider the momentum operator I defined earlier. If we have the operator act on some function, f(x), we will get -iħ∂f/∂x, which is indeed another function of x.

We can also generalize the idea of the inner product by defining it in the following way:

$$\langle f | g \rangle = \int_{-\infty}^{\infty} f(x)^* \, g(x) \, dx$$

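Notice that if we have some operator act on g in the above equation and then let both f and g be Ψ, we will get the expected value we were talking about earlier:

$$\langle Q \rangle = \langle \Psi | \hat{Q} \Psi \rangle = \int_{-\infty}^{\infty} \Psi^* \, \hat{Q} \, \Psi \, dx$$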

Where Q is some quantity we are measuring and Q̂ (the Q with the hat) is the corresponding operator.
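One way to see the "functions are vectors, operators are matrices" idea quite literally is to sample a function on a grid. The sketch below is only an illustration under my own assumptions (ħ = 1, a finite grid, and a central-difference approximation to the derivative): the function becomes an array, the position operator becomes a diagonal matrix, the momentum operator becomes a matrix built from finite differences, and the inner product becomes a sum. The names `inner`, `X` and `P` are hypothetical.

```python
import numpy as np

hbar = 1.0                       # natural units (assumption)
n = 801
x = np.linspace(-8.0, 8.0, n)
dx = x[1] - x[0]

def inner(f, g):
    """Discretized inner product <f|g>, i.e. the integral of f* g dx."""
    return np.vdot(f, g) * dx    # np.vdot conjugates its first argument

# Position operator: multiplying by x is a diagonal matrix on the grid
X = np.diag(x)

# Momentum operator: -i*hbar*d/dx using a central difference for d/dx
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)
P = -1j * hbar * D

# A sample square-integrable function, stored as a plain vector
f = np.exp(-x**2)

# Matrix times vector gives another vector, i.e. another function
Pf = P @ f
exact = -1j * hbar * (-2 * x * f)          # analytic -i*hbar*df/dx
max_error = np.max(np.abs(Pf - exact))

# Expectation value of position as <psi| X psi> for the normalized function
psi = f / np.sqrt(inner(f, f).real)
mean_x = inner(psi, X @ psi).real

print("largest deviation from the analytic derivative:", max_error)
print("<x> for a Gaussian centred at the origin:", mean_x)
```

The grid version is only an approximation of the true operators, but it shows that applying an operator to a function really is the same kind of operation as multiplying a vector by a matrix.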

Now that we have generalized vectors and matrices to functions and operators, we will see why such generalizations are useful.

Hermiticity of Observable Operators

Naturally, the operators that we are most interested in are the ones that correspond to a measurable quantity (such as position and momentum). Such quantities are often called observables, as we are able to ‘observe’ them. Perhaps one of the most important properties of operators for observables is that they are hermitian. If you have studied linear algebra you may have come across this term before, but to put it simply, an operator is hermitian if:

$$\langle \hat{Q} f | f \rangle = \langle f | \hat{Q} f \rangle$$

For any function f in Hilbert space. Actually, it is not too hard to see why observable operators are hermitian, because if we let f be Ψ above, the right-hand side, ⟨Ψ|Q̂Ψ⟩, would just be ⟨Q⟩. The left-hand side, ⟨Q̂Ψ|Ψ⟩, would instead be ⟨Q⟩*, since we can reverse the order of the inner product at the expense of taking the complex conjugate. Now, since the expectation value of an observable quantity must be purely real (it makes no sense to have an imaginary position or energy), ⟨Q⟩ must be equal to ⟨Q⟩* and hence the operator is hermitian. This is important because the eigenvalues and eigenvectors (eigenfunctions in this case) of a hermitian operator have very special properties, as we will see. Just to refresh your memory on the eigenvalue problem, here is the equation that defines it:

$$\hat{Q} f_n = q_n f_n$$

Eigenfunctions and eigenvalues of an operator

The eigenfunctions of some operator are the functions such that transforming them using the operator yields a scalar multiple of the original eigenfunction. In the equation above, fₙ denotes the n’th eigenfunction and qₙ denotes the corresponding eigenvalue.
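As a quick worked example of the eigenvalue problem, take the momentum operator -iħ∂/∂x from earlier and let it act on the plane wave e^{ikx}, where k is a real constant that I am introducing purely for illustration:

```latex
\hat{p}\, e^{ikx}
  = -i\hbar \frac{\partial}{\partial x} e^{ikx}
  = -i\hbar \,(ik)\, e^{ikx}
  = \hbar k \, e^{ikx}
```

So the plane wave comes back unchanged apart from the factor ħk, which is the eigenvalue. One caveat: a plane wave is not square integrable, so strictly speaking it does not live in Hilbert space, and the example should be read as an illustration of the eigenvalue equation rather than as a physically realizable state.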

So, what’s so special about the eigenfunctions of hermitian operators? Well, first of all, eigenfunctions belonging to distinct eigenvalues are orthogonal to one another (i.e. their inner product is zero), and eigenfunctions that share an eigenvalue can always be chosen to be orthogonal. Secondly, for observable operators they are complete, meaning that any other function in Hilbert space can be expressed as a linear combination of the eigenfunctions. Combining these two properties, we can say that the (normalized) eigenfunctions of a hermitian operator form an orthonormal basis that spans Hilbert space. Just like how the i, j, k unit vectors are an orthonormal basis for ℝ³, the eigenfunctions of an observable operator make up an orthonormal basis within Hilbert space.
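In the finite-dimensional case these properties are easy to check numerically. The sketch below is my own illustration and uses a small random hermitian matrix as a stand-in for an observable operator (the matrix and its size are arbitrary assumptions). It checks the hermiticity condition in inner-product form, then uses NumPy's `np.linalg.eigh` to confirm that the eigenvalues are real and that the eigenvectors are orthonormal and complete.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                   # arbitrary small dimension

# A random hermitian matrix: Q equals its own conjugate transpose
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Q = (A + A.conj().T) / 2

# Hermiticity in inner-product form: <Qf|g> = <f|Qg> for arbitrary f and g
f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)
lhs = np.vdot(Q @ f, g)
rhs = np.vdot(f, Q @ g)

# eigh is the eigen-solver for hermitian matrices; the columns of V
# are the eigenvectors and the eigenvalues are returned as real numbers
eigenvalues, V = np.linalg.eigh(Q)

# Orthonormality: the matrix of all inner products should be the identity
gram = V.conj().T @ V

# Completeness: any vector is recovered from its components along the basis
coeffs = V.conj().T @ f                 # c_n = <f_n|f>
rebuilt = V @ coeffs                    # sum over n of c_n f_n

print("hermitian check:", np.isclose(lhs, rhs))
print("orthonormal:", np.allclose(gram, np.eye(n)))
print("complete:", np.allclose(rebuilt, f))
```

For operators on an infinite-dimensional space the same statements need more care, so this is only meant to build intuition for the claims in the text.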

If we look back at our wave function, Ψ, the above properties allow us to express it as a linear combination of the eigenfunctions of the position operator. In fact, the general state of the system can be expressed in the basis of any observable’s eigenfunctions, whether that be momentum or even energy. Additionally, it turns out that each eigenfunction that makes up our state corresponds to one definite measurement result. To see what I mean, let’s say the state of our particle is exactly one of the eigenfunctions, fₙ, of some observable operator Q̂. Then, the expectation value is:

$$\begin{aligned}
\langle Q \rangle &= \langle f_n | \hat{Q} f_n \rangle \\
&= \langle f_n | q_n f_n \rangle \\
&= q_n \langle f_n | f_n \rangle = q_n
\end{aligned}$$

Where in the last line I used the fact that fₙ is normalized, so its inner product with itself must be one. Additionally, it can be shown that the standard deviation of Q is zero in the state fₙ, meaning that the measured value doesn’t deviate at all from the mean value, which is qₙ. Hence, if the particle is exactly in the state fₙ, we must always get the value qₙ when we measure Q. Therefore, if our particle is in a more general state that we express as a linear combination of different eigenfunctions, it makes sense that a measurement could have multiple possible results, since the state is composed of different eigenfunctions that correspond to different values. This means the state of any particle is by nature a linear combination of the eigenfunctions of any observable operator, and the squared modulus of the coefficient attached to each eigenfunction gives the probability of obtaining the corresponding eigenvalue when making the measurement. This is precisely the probabilistic nature of particles I was talking about before: instead of having just one definite value, the measurable quantities of a particle are in general a combination of different possible outcomes.
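Here is a small sketch of that idea in code, using a hypothetical three-level observable (the matrix `Q` and the state `psi` are made up for illustration). The state is expanded in the eigenvectors of `Q`, the squared moduli of the coefficients are treated as probabilities, and the probability-weighted average of the eigenvalues is compared with the sandwich ⟨Ψ|Q̂Ψ⟩.

```python
import numpy as np

# Hypothetical 3-level observable, chosen only for illustration (hermitian)
Q = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# q[n] are the eigenvalues, F[:, n] the normalized eigenvectors
q, F = np.linalg.eigh(Q)

# An arbitrary normalized state
psi = np.array([1.0, 1.0j, 0.5])
psi = psi / np.linalg.norm(psi)

# Expansion coefficients c_n = <f_n|psi> and the associated probabilities
c = F.conj().T @ psi
probabilities = np.abs(c) ** 2

# The average over outcomes agrees with the sandwich <psi|Q psi>
mean_from_probs = np.sum(probabilities * q)
mean_from_sandwich = np.vdot(psi, Q @ psi).real

# In a single eigenstate the spread of Q vanishes: <Q^2> - <Q>^2 = 0
f0 = F[:, 0]
variance_in_eigenstate = (
    np.vdot(f0, Q @ Q @ f0).real - np.vdot(f0, Q @ f0).real ** 2
)

print("probabilities:", probabilities, "sum:", probabilities.sum())
print("<Q> from probabilities:", mean_from_probs)
print("<Q> from sandwich:", mean_from_sandwich)
print("variance in an eigenstate:", variance_in_eigenstate)
```

The probabilities add up to one because the state is normalized and the eigenvectors form an orthonormal basis.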

What’s so amazing about this link between linear algebra and quantum mechanics is that it provides a much more illuminating picture of the underlying physics than just fancy and confusing techniques for solving differential equations. For example, there is an operator known as the Hamiltonian (you may have heard of this before from classical mechanics) that corresponds to the total energy of the system. It turns out that for a bound system, such as the electron in a hydrogen atom, the eigenvalues of this operator are discrete, meaning that the energy can only take specific values, as opposed to, say, position, which has a continuous spectrum of eigenvalues. Since the eigenvalues correspond to what the operator represents (total energy in this case), this shows that a bound particle can only have a discrete spectrum of possible energy values, which is perhaps something you have heard before. This also explains why the absorption/emission spectrum of the hydrogen atom is discrete.
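To see discrete energies come out of an eigenvalue problem, here is a minimal sketch for a system that is simpler than hydrogen: a particle trapped in a one-dimensional box of width L with impenetrable walls. This example is my own choice, not something derived above, and it assumes natural units (ħ = m = L = 1) and a finite-difference approximation of the second derivative in the kinetic energy operator -ħ²/(2m) ∂²/∂x². The well-known analytic energies for this system are Eₙ = n²π²ħ²/(2mL²).

```python
import numpy as np

hbar, m, L = 1.0, 1.0, 1.0      # natural units (assumptions)
n = 500                         # number of interior grid points
dx = L / (n + 1)

# Inside the box the potential is zero, so the Hamiltonian is just the
# kinetic term. With the wave function fixed to zero at both walls, the
# second derivative becomes a tridiagonal matrix on the interior points.
main = np.full(n, 2.0)
off = np.full(n - 1, -1.0)
H = (hbar**2 / (2 * m * dx**2)) * (
    np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
)

# eigvalsh returns the eigenvalues of a hermitian matrix in ascending order
energies = np.linalg.eigvalsh(H)[:4]

# Analytic levels for comparison
levels = np.arange(1, 5)
analytic = (levels * np.pi * hbar) ** 2 / (2 * m * L**2)

for k, e_num, e_exact in zip(levels, energies, analytic):
    print(f"n = {k}: numerical {e_num:.4f}, analytic {e_exact:.4f}")
```

The lowest few eigenvalues of the matrix line up with the analytic levels and are clearly separated from one another, which is the discreteness described above.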

Closing Remarks

While there is still so much more I could talk about, I hope you can see how generalizing vectors to functions and expressing the state of a particle as a linear combination of eigenfunctions has a deep connection with the probabilistic nature of quantum mechanics. While I didn’t mention it in this post, ordinary vectors and matrices are also used (instead of functions and operators) to represent a property known as spin (try looking into the Pauli spin matrices, for example). Additionally, there is a more convenient and useful notation called Dirac notation, which I never properly used in this post but which is also something you might want to look into if you are interested.

Otherwise, that is all I have to say for now. Thank you for reading.

References

Griffiths, D. J., & Schroeter, D. F. (2020). Introduction to quantum mechanics. Cambridge University Press.