On a crisp fall New England day during my junior year of college, I was walking past a subway entrance when a math problem caught my eye. A man was standing near a few brainteasers he had scribbled on the wall, one of which asked for the construction, with an imaginary straightedge and compass, of a cube with a volume twice that of a different, given cube.
This stopped me in my tracks. I had seen this problem before. In fact, the challenge is more than two millennia old, attributed to Plato by way of Plutarch. A straightedge can be used to extend a line segment in any direction, and a compass can be used to draw a circle with any radius from the chosen center. The catch for this particular puzzle is that any points or lengths appearing in the final drawing must have been either present at the start or constructible from previously provided information.
To double a cube’s volume, you start with its side length. Here that value might as well be 1 because it is the only unit of measurement given. To construct the larger cube, you have to figure out a way to draw one of its sides with the new required length, which is ∛2 (the cube root of two), using just the straightedge and compass as tools.
It is a tough problem. For more than 2,000 years no one managed to solve it. Finally, in 1837, Pierre Laurent Wantzel explained why no one had succeeded by proving that it was impossible. His proof used cutting-edge mathematics of the time, the foundations of which were laid by his French contemporary Évariste Galois, who died at 20 in a duel that may have involved an unhappy love affair. At the ripe old age of 20 myself, I had considerably less impressive mathematical accomplishments to my name, but I at least understood Wantzel’s proof.
Here is the idea: Given a point as the origin and a length of distance 1, it is relatively straightforward to use the straightedge and compass to construct all points on a number line whose coordinates are rational numbers (ignoring, as mathematicians tend to do, the impossibility of actually plotting infinitely many points in only a finite amount of time).
Wantzel showed that if one uses only these tools, each newly constructed point must be a solution to a quadratic polynomial equation ax² + bx + c = 0 whose coefficients a, b and c are among the previously constructed numbers. In contrast, the point ∛2 is a solution to the cubic polynomial x³ − 2 = 0, and Galois’s theory of “field extensions” proves decisively that you can never get the solution to an irreducible cubic polynomial by solving quadratic equations, essentially because the number 3 does not divide any power of 2.
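The arithmetic heart of this obstruction can be checked directly. The sketch below (an illustration in Python, not part of Wantzel’s original argument) applies the rational root test: any rational root p/q of x³ − 2 = 0, in lowest terms, must have p dividing the constant term 2 and q dividing the leading coefficient 1, leaving only four candidates.

```python
from fractions import Fraction

# Rational root test for x^3 - 2 = 0: the only possible rational roots
# are +/-1 and +/-2, and a cubic with no rational root is irreducible
# over the rational numbers.
candidates = [Fraction(p) for p in (1, -1, 2, -2)]
roots = [r for r in candidates if r**3 == 2]
print(roots)  # [] -- no rational root, so x^3 - 2 is irreducible over Q
# Hence the cube root of 2 has degree 3 over the rationals, while any
# straightedge-and-compass construction produces only numbers whose
# degree is a power of 2 -- and 3 divides no power of 2.
```

Exact rational arithmetic (via `Fraction`) matters here: floating-point cube roots would only show approximate inequality, not the exact fact the proof needs.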
Armed with these facts, I could not restrain myself from engaging with the man on the street. Predictably, my attempt to explain how I knew his problem could not be solved did not really go anywhere. Instead he claimed that my education had left me closed-minded and unable to “think outside the box.” Eventually my girlfriend managed to extricate me from the argument, and we continued on our way.
But an interesting question remains: How was I, a still-wet-behind-the-ears undergraduate in my third year of university study, able to learn to comfortably manipulate abstract number systems such as Galois’s fields in just a few short weeks? This material came at the end of a course filled with symmetry groups, polynomial rings and related treasures that would have blown the minds of mathematical giants such as Isaac Newton, Gottfried Leibniz, Leonhard Euler and Carl Friedrich Gauss. How is it that mathematicians can quickly teach every new generation of undergraduates discoveries that astonished the previous generation’s experts?
Part of the answer has to do with recent developments in mathematics that provide a “bird’s-eye view” of the field through ever increasing levels of abstraction. Category theory is a branch of mathematics that explains how distinct mathematical objects can be considered “the same.” Its fundamental theorem tells us that any mathematical object, no matter how complex, is entirely determined by its relationships to similar objects. Through category theory, we teach young mathematicians the latest ideas by using general rules that apply broadly to categories across mathematics rather than drilling down to individual laws that apply only in a single area.
As mathematics continues to evolve, mathematicians’ sense of when two things are “the same” has expanded. In the past few decades many other researchers and I have been working on an extension of category theory to make sense of this new expanded notion of sameness. These new categories, called infinity categories (∞-categories), broaden category theory to infinite dimensions. The language of ∞-categories gives mathematicians powerful tools to study problems in which relations between objects are too nuanced to be defined in traditional categories. The perspective of “zooming out to infinity” offers a novel way to think about old concepts and a path toward the discovery of new ones.
Like many other mathematicians I know, I was drawn into the subject partly because of my poor memory. This confounds many people who remember high school mathematics as rife with formulas to memorize—the trigonometric identities come to mind. But I took comfort in the fact that the most commonly used formulas could be rederived from sin²θ + cos²θ = 1, which itself has an elegant geometric explanation: it is an application of the Pythagorean theorem to a right triangle with a hypotenuse of length 1 and an acute angle of θ degrees.
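As a small illustration of this rederivability (a numerical sanity check with an arbitrarily chosen angle, not a proof), dividing both sides of the Pythagorean identity by cos²θ recovers another standard identity, 1 + tan²θ = sec²θ:

```python
import math

theta = 0.7  # an arbitrary angle in radians
# The Pythagorean identity itself:
assert abs(math.sin(theta) ** 2 + math.cos(theta) ** 2 - 1) < 1e-12
# Dividing through by cos^2(theta) rederives 1 + tan^2 = sec^2:
lhs = 1 + math.tan(theta) ** 2
rhs = 1 / math.cos(theta) ** 2
assert abs(lhs - rhs) < 1e-12
```

The tolerance accounts for floating-point rounding; the underlying identity is of course exact.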
This utopian vision of mathematics where everything just “makes sense” and nothing needs to be memorized falls apart to some extent at the university level. At that point students get to know the zoo of mathematical objects that have been conjured into existence in the past few centuries. “Groups,” “rings” and “fields” belong to an area of mathematics known as algebra, a word derived from a ninth-century book by Persian mathematician and astronomer Muhammad ibn Musa al-Khwarizmi, the title of which is sometimes translated as The Science of Restoring and Balancing. Over the next millennium, algebra evolved from the study of the nature of solutions to polynomial equations to the study of abstract number systems. Because no real number x satisfies the equation x² + 1 = 0, mathematicians built a new number system—now known as the complex numbers—by adding an imaginary number i and imposing the stipulation that i² + 1 = 0.
Algebra is only one of the subjects in a mathematics undergraduate’s curriculum. Other cornerstones include topology—the abstract study of space—and analysis, which begins with a rigorous treatment of the calculus of real functions before branching into the more exotic terrains of probability spaces and random variables and complex manifolds and holomorphic functions. How is a student supposed to make sense of it all?
A paradoxical idea in mathematics is that of simplification through abstraction. As Eugenia Cheng puts it in The Art of Logic in an Illogical World, “a powerful aspect of abstraction is that many different situations become the same when you forget some details.” Modern algebra was created in the early 20th century when mathematicians decided to unify their studies of the many examples of algebraic structure that arose in the consideration of solutions to polynomial equations or of configurations of figures in the plane. To connect investigations of these structures, researchers identified “axioms” that describe their common properties. Groups, rings and fields were introduced to the mathematical universe, along with the idea that a mathematical object could be described in terms of the properties it has and explored “abstractly,” independently of the scaffolding of particular examples or constructions.
John Horton Conway famously pondered the curious ontology of mathematical things: “There’s no doubt that they do exist but you can’t poke and prod them except by thinking about them. It’s quite astonishing and I still don’t understand it, despite having been a mathematician all my life. How can things be there without actually being there?”
But this world of mathematical objects that can exist without actually being there created a problem: Such a world is vastly too large for any person to comprehend. Even within algebra, there are just too many mathematical things to study for there to be time to make sense of them all. Around the turn of the 20th century, mathematicians began to investigate so-called universal algebra, referring to a “set,” which could be a collection of symmetries, of numbers in some system or something else entirely, together with various operations—for instance, addition and multiplication—satisfying a list of relevant axioms such as associativity, commutativity or distributivity. By making different choices—Is an operation partially or totally defined? Is it invertible?—one arrives at the standard algebraic structures: the groups, rings and fields. But the subject is not constrained by these choices, which represent a vanishingly small portion of an infinite array of possibilities.
The proliferation of new abstract mathematical objects brings its own complexity. One way to simplify is to introduce a further level of abstraction where, astonishingly, we can prove theorems about a wide variety of mathematical objects simultaneously without specifying exactly what kinds of objects we are talking about.
Category theory, which was created in the 1940s by Samuel Eilenberg and Saunders Mac Lane, does just this. Although it was originally introduced to give a rigorous definition of the colloquial term “natural equivalence,” it also offers a way to think universally about universal algebra and other areas of mathematics. With Eilenberg and Mac Lane’s language, we can now understand that every variety of mathematical object belongs to its own category, which is a specified collection of objects together with a set of transformations depicted as arrows between the objects. For example, in linear algebra one studies abstract vector spaces such as three-dimensional Euclidean space. The corresponding transformations in this case are called linear transformations, and each must have a specified source and target vector space indicating which kinds of vectors arise as inputs and outputs. Like functions, the transformations in a category can be “composed,” meaning you can apply one transformation to the results of another transformation. For any pair of transformations f: A → B (read as “f is a transformation from A to B”) and g: B → C, the category specifies a unique composite transformation, written as g ∘ f: A → C (read as “g composed with f is a transformation from A to C”). Finally, this composition law is associative, meaning h ∘ (g ∘ f) = (h ∘ g) ∘ f. It is also unital: each object B has an “identity transformation” commonly denoted by 1_B with the property that g ∘ 1_B = g and 1_B ∘ f = f for any transformations g and f whose source and target, respectively, equal B.
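These axioms can be made concrete in a few lines of code. The sketch below models one familiar category, that of finite sets and functions between them; the names `FinMap`, `identity` and `compose` are my own illustrative choices, not standard library API.

```python
# A concrete category: objects are finite sets, transformations are
# functions between them, stored as dicts.

class FinMap:
    """A transformation f: source -> target between finite sets."""
    def __init__(self, source, target, mapping):
        # Every element of the source must be mapped, into the target.
        assert set(mapping) == source and set(mapping.values()) <= target
        self.source, self.target, self.mapping = source, target, mapping

def identity(A):
    """The identity transformation 1_A on the set A."""
    return FinMap(A, A, {a: a for a in A})

def compose(g, f):
    """g after f; defined only when the source of g equals the target of f."""
    assert f.target == g.source
    return FinMap(f.source, g.target,
                  {a: g.mapping[f.mapping[a]] for a in f.source})

A, B, C = {1, 2}, {"x", "y"}, {True, False}
f = FinMap(A, B, {1: "x", 2: "y"})
g = FinMap(B, C, {"x": True, "y": False})
k = FinMap(C, A, {True: 1, False: 2})

# The unit laws: composing with an identity changes nothing.
assert compose(g, identity(B)).mapping == g.mapping
assert compose(identity(B), f).mapping == f.mapping
# Associativity: k after (g after f) equals (k after g) after f.
assert compose(k, compose(g, f)).mapping == compose(compose(k, g), f).mapping
```

Note that the category axioms show up as assertions: composition is only defined when sources and targets line up, and the unit and associativity laws hold automatically for functions.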
How do categories help the hapless undergraduate confronted with too many mathematical objects and not enough time to learn about them all? Any class of structures you can define in universal algebra may be distinct from all others, but the categories these objects inhabit are very similar in ways that can be expressed precisely through categorical language.
With sufficient experience, mathematicians can know what to expect when they encounter a new type of algebraic structure. This idea is reflected in modern textbooks on the subject that develop the theories of groups, rings and vector spaces in series, essentially because the theories are parallel. There are other, looser analogies among these categories and the ones students encounter in topology or analysis courses, and these similarities enable them to absorb the new material more quickly. Such patterns allow students to spend more time exploring the special topics that distinguish individual mathematical subdisciplines—although research advances in mathematics are often inspired by new and surprising analogies between previously unconnected areas.
The cascading levels of abstraction, from concrete mathematical structures to axiomatic systems and then beyond to the general objects that belong to categories, present a new challenge: it is no longer very clear what it means to say that one thing is “the same” as another thing. Consider, for instance, a group, which in math is an abstract collection of symmetries whose elements Amie Wilkinson of the University of Chicago likes to describe as “moves” that flip or rotate an object before settling it into something like the original position.
For example, we might explore the symmetries of a T-shirt. One symmetry can be thought of as the “identity move,” where a person simply wears the T-shirt as it is usually worn. Another symmetry corresponds to a move where the wearer takes their arms out of the arm holes and, with the T-shirt still around their neck, rotates the shirt 180 degrees to put their arms in the opposite holes: the T-shirt remains right-side out but is now being worn backward. Another symmetry corresponds to a move where the T-shirt is removed entirely, flipped inside out and put back on in such a way that each arm goes through the hole it was originally in. The T-shirt is now inside out and backward. A final symmetry combines these two moves: atypically for groups, these moves can be performed in any order without changing the end result. Each of these four moves counts as a “symmetry” because they result in the shirt being worn in essentially the same way as when you started.
Another group is the “mattress-flipping group,” which describes the symmetries of a mattress. In addition to the identity move, which applies when the mattress is left in its original position, a person can move the mattress by rotating it top to bottom, flipping back to front or performing both moves in sequence. (Mattresses typically are not square, but if they were, there would be more symmetries than described here.) Although a T-shirt does not have much to do with a mattress, there is a sense in which the two symmetry groups have the same “shape.” First, both groups of symmetries have the same number of moves (in this case, four), and, crucially, you can pair each move in the T-shirt group with a move in the mattress-flipping group such that the compositions of corresponding moves also correspond. In other words, you can match up moves from the two groups (match the identity with the identity, the flip with the flip, the rotation with the rotation, and so on). Second, if you take two moves from one group and perform them in sequence, the final position will match with the end result of performing the corresponding moves from the other group in sequence. In technical terms, these groups are connected by an “isomorphism,” a term whose etymology—from the Greek isos, meaning “equal,” and morphe, meaning “form”—indicates its meaning.
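The isomorphism between these two four-element groups can be verified mechanically. In this sketch the encodings are my own: a T-shirt move is a bit pair (backward?, inside out?) composing by XOR, since repeating a move undoes it, while a mattress move is a pair of signs composing by multiplication.

```python
from itertools import product

# T-shirt moves as bit pairs (backward?, inside_out?); composing XORs
# each coordinate, because doing the same move twice undoes it.
def compose_shirt(a, b):
    return (a[0] ^ b[0], a[1] ^ b[1])

# Mattress moves as sign pairs (rotate, flip); composing multiplies signs.
def compose_mattress(a, b):
    return (a[0] * b[0], a[1] * b[1])

shirt_moves = list(product((0, 1), repeat=2))  # all four T-shirt moves

def pair(move):
    """The proposed matching of T-shirt moves with mattress moves."""
    return ((-1) ** move[0], (-1) ** move[1])

# Isomorphism check: pairing two moves and then composing gives the same
# result as composing first and then pairing.
iso = all(
    pair(compose_shirt(a, b)) == compose_mattress(pair(a), pair(b))
    for a in shirt_moves for b in shirt_moves
)
print(iso)  # True
```

The check runs over all 16 pairs of moves, which is exactly what the definition of isomorphism demands for a four-element group.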
We can define the notion of isomorphism in any category, which allows us to transport this concept between mathematical contexts. An isomorphism between two objects A and B in a category is given by a pair of transformations, f: A → B and g: B → A, with the property that the composites g ∘ f and f ∘ g equal the respective identities 1_A and 1_B. In the category of topological spaces, the categorical notion of isomorphism is represented by an inverse pair of continuous functions. For instance, there is a continuous deformation that would allow you to convert an unbaked doughnut into a shape like a coffee mug: the doughnut hole becomes the handle, and the cup is formed by a depression you make with your thumb. (For the deformation to be continuous, you must do this without tearing the dough, which is why the doughnut should not be baked before the experiment is attempted.)
This example inspired the joke that a topologist cannot tell the difference between a coffee mug and a doughnut: as abstract spaces, these objects are the same. In practice, many topologists are arguably much less observant than this because it is common to adopt a more flexible convention concerning situations when two spaces are “the same,” identifying any two spaces that are merely “homotopy-equivalent.” This term refers to the notion of isomorphism in the more exotic homotopy category of spaces. A homotopy equivalence is another type of continuous deformation, but in this case, you can identify distinct points. For instance, imagine starting with a pair of pants and then shrinking the lengths of the legs until you are left with a G-string, another “space” with the same fundamental topological structure—there are still two holes for legs—even though the original two-dimensional garment has been shrunk down to a one-dimensional bit of string.
Another homotopy equivalence collapses the infinite expanse of three-dimensional Euclidean space down to a single point via a “reverse big bang” in which each point flies back to its origin, with the speed of this motion increasing with the distance from the location of the initial big bang.
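The reverse big bang is just the straight-line homotopy H(x, t) = (1 − t)x, sketched here in three dimensions for illustration:

```python
# The "reverse big bang" contraction of Euclidean space: at time t, the
# point x has moved to (1 - t) * x along the straight line to the origin.
def H(x, t):
    return tuple((1 - t) * coord for coord in x)

p = (3.0, -4.0, 12.0)
assert H(p, 0.0) == p                    # at time 0, nothing has moved
assert H(p, 1.0) == (0.0, 0.0, 0.0)      # at time 1, everything is at the origin
# A point at distance d travels the full distance d in one unit of time,
# so farther points move faster -- exactly as described above.
```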
The intuition that we can substitute isomorphic things for one another without fundamentally changing the nature of a construction or an argument is so strong that in fact category theorists have redefined the word “the” to mean something closer to “a” in colloquial English. For example, there is a concept known as the disjoint union of two sets A and B. Like the ordinary union, the disjoint union A ⨆ B has a copy of every element of A and a copy of every element of B. Unlike in the ordinary union, however, if A and B have an element in common, then the disjoint union A ⨆ B has two copies of that element, one of which somehow remembers that it came from A, and the other somehow remembers it came from B.
There are many different ways to construct the disjoint union using the axioms of set theory, which will not produce exactly the same set but will, necessarily, produce isomorphic ones. Rather than wasting time arguing about which construction is the most canonical, it is more convenient to just sweep this ambiguity under the rug and refer to “the” disjoint union when one means to consider any particular set that satisfies the desired universal property. In another example, mathematicians refer to both the T-shirt symmetry group and the mattress-flipping group as “the Klein four-group.”
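One standard set-theoretic construction tags each element with a label recording its set of origin. In this sketch the tags 0 and 1 are an arbitrary choice; any other pair of distinct tags yields a different but isomorphic set, which is exactly the ambiguity the categorical “the” sweeps away.

```python
# Disjoint union via tagging: elements remember which set they came from.
def disjoint_union(A, B):
    return {(0, a) for a in A} | {(1, b) for b in B}

A, B = {1, 2, 3}, {3, 4}
U = disjoint_union(A, B)
print(len(U))  # 5 -- the shared element 3 appears twice, once per tag
assert (0, 3) in U and (1, 3) in U
```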
An oft-told story about the origin of the fundamental theorem of category theory is that a young mathematician named Nobuo Yoneda described a “lemma,” or helper theorem, to Mac Lane at the Gare du Nord train station in Paris in 1954. Yoneda began explaining the lemma on the platform and continued it on the train before it departed the station. The consequence of this lemma is that any object in any category is entirely determined by its relation to the other objects in the category as encoded by the transformations to or from this object. So we can characterize a topological space X by probing it with continuous functions f: T → X mapping out other spaces T. For instance, the points of the space X correspond to continuous functions x: * → X, whose domain is a space with a single point. We can answer the question of whether the space X is connected or disconnected by considering mappings p: I → X, whose domain is an interval I = [0,1]. Each such mapping defines a parameterized “path” in the space X from the point p(0) to the point p(1), which can be thought of as a possible trajectory an ant might take when walking around the space X.
We can use the points and paths of a space to translate problems of topology into problems of algebra: each topological space X has an associated category π₁X called the “fundamental groupoid” of X. The objects of this category are the points of the space, and the transformations are paths. If one path can be deformed into another in the space while its end points remain fixed, the two paths define the same transformation. These deformations, which are technically called homotopies, are necessary for the composition of paths to define an associative operation, as is required by a category.
A key advantage of the fundamental groupoid construction is that it is “functorial,” meaning that a continuous function f: X → Y between topological spaces gives rise to a corresponding transformation π₁f: π₁X → π₁Y between the fundamental groupoids. This assignment respects composition and identities, meaning π₁(g ∘ f) = π₁g ∘ π₁f and π₁(1_X) = 1_(π₁X), respectively. These two properties, which collectively go by the name “functoriality,” suggest that the fundamental groupoid captures some essential information about topological spaces. In particular, if two spaces are not homotopy-equivalent, then their fundamental groupoids are necessarily inequivalent.
The fundamental groupoid is not a complete invariant, however. It can easily distinguish between a circle and the solid disk that circle bounds. In the fundamental groupoid of the circle, the different wiggling versions of a path between two points can be labeled by integers that record the number of times the trajectory winds around the circle and a + or − sign indicating, respectively, a clockwise or counterclockwise direction of transit. In contrast, in the fundamental groupoid of the disk, there is only one path up to homotopy between any pair of points. The fundamental groupoid of the space formed by the inflatable exterior of a beach ball, a sphere in topological terms, also has this description: there is a unique path up to homotopy between any two points.
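This winding-number bookkeeping can be modeled with ordinary integers: up to homotopy, a path in the circle between two fixed points is a signed winding number, and following one path by another adds the windings. A toy sketch:

```python
# Paths in the circle's fundamental groupoid, up to homotopy, modeled as
# signed winding numbers; composing paths adds the windings.
def compose_paths(p, q):
    return p + q

clockwise_loop, counterclockwise_loop = 1, -1
# Going around one way and then the other is homotopic to staying put:
assert compose_paths(clockwise_loop, counterclockwise_loop) == 0
assert compose_paths(2, 3) == 5
# In the disk or the sphere, by contrast, the analogous model has a single
# label: every path between two points is homotopic to every other.
```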
The big problem with the fundamental groupoid is that points and paths do not detect the higher-dimensional structure of a space, because the point and interval are themselves zero- and one-dimensional, respectively. A solution is to also consider continuous functions from the two-dimensional disk, called homotopies, and “higher homotopies,” defined by continuous functions from the solid three-dimensional ball and similarly for other balls in 4, 5, 6 or more dimensions.
It is natural to ask what kind of algebraic structure the points, paths, homotopies and higher homotopies in a space X form: this structure π∞X (“pi infinity X”), referred to as the fundamental ∞-groupoid of X, defines an example of an ∞-category, an infinite-dimensional analogue of the categories first introduced by Eilenberg and Mac Lane. Like an ordinary category, an ∞-category has objects and transformations visualized as one-dimensional arrows, but it also contains “higher transformations” depicted by two-dimensional arrows, three-dimensional arrows, and so on. For example, in π∞X the objects and arrows are the points and the paths—no longer considered up to wiggling—while the higher-dimensional transformations encode the higher homotopies. As in an ordinary category, the arrows in any fixed dimension can be composed: if you have two arrows f: X → Y and g: Y → Z, there must also be an arrow g ∘ f: X → Z. But there is a catch: in attempts to capture natural examples such as the fundamental ∞-groupoid of a space, the composition law must be weakened. For any composable pair of arrows, there must exist a composite arrow, but there is no longer a unique specified composite arrow.
This failure of uniqueness makes it challenging to define ∞-categories in the classical set-based foundations of mathematics because we can no longer think of composition as an operation resembling those appearing in universal algebra. Although ∞-categories are increasingly central to modern research in many areas of mathematics, from quantum field theory to algebraic geometry to algebraic topology, they are often considered “too hard” for all but specialists and are not featured regularly in curricula, even at the graduate level. Nevertheless, many others and I see ∞-categories as a revolutionary new direction that can enable mathematicians to dream of new connections that would otherwise have been impossible to rigorously state and prove.
The Future Horizon
Historical experience suggests, however, that the most exotic mathematics of today will eventually be thought of as easy enough to teach to mathematics undergraduates in the future. It is fun to speculate, as a researcher in ∞-category theory, about how this subject could be simplified. In this case, there is a linguistic trick—a supercharged version of the categorical “the”—that could make ∞-categories as easy for late 21st-century undergraduates to think about as ordinary categories are today. The key axiom in an ordinary category is the existence of a unique composite transformation g ∘ f: X → Z for each composable pair of transformations f: X → Y and g: Y → Z, chosen from all the elements of the set of transformations from X to Z. In contrast, in an ∞-category, there is a space of arrows leading from X to Z, which in the fundamental ∞-groupoid can be understood as a kind of “path space.” The correct analogue of the uniqueness of composites in an ordinary category is the assertion that in an ∞-category, the space of composites is “contractible,” meaning that each of its points can be continuously collapsed via a reverse big bang to a single point of origin.
Note that contractibility does not imply that there is a unique composite: indeed, as we have seen in the fundamental ∞-groupoid, there can be a large number of composite paths. But contractibility guarantees that any two composite paths are homotopic, any two homotopies relating two composite paths are connected by a higher homotopy, and so on.
This idea of uniqueness as a type of contractibility condition is a central one in a new foundation system for mathematics proposed by Vladimir Voevodsky and others. Mathematicians around the world are collaborating to develop new computer-based “proof assistants” that can check a formal proof of a mathematical result line by line. These proof assistants have a mechanism that mimics the common mathematical practice of transferring information about one thing to another thing that is understood to be the same via an explicit isomorphism or homotopy equivalence. In this case, the mechanism allows the user to transport a proof involving one point in a space along a path that connects it to any other point, giving a rigorous formulation of the topological notion of sameness.
In a 1974 essay, mathematician Michael Atiyah wrote, “The aim of theory really is, to a great extent, that of systematically organizing past experience in such a way that the next generation, our students and their students and so on will be able to absorb the essential aspects in as painless a way as possible, and this is the only way in which you can go on cumulatively building up any kind of scientific activity without eventually coming to a dead end.” Category theory arguably plays this role in modern mathematics: if mathematics is the science of analogy, the study of patterns, then category theory is the study of patterns of mathematical thought—the “mathematics of mathematics,” as Eugenia Cheng of the School of the Art Institute of Chicago has put it.
The reason that we can cover so much ground in an undergraduate course today is that our understanding of various mathematical concepts has been simplified through abstraction, which might be thought of as the process of stepping back from the specific problem being considered and taking a broader view of mathematics. A lot of fine details are invisible from this level—numerical approximations, for instance, or really anything having to do with numbers at all—but it is a remarkable fact that theorems in algebra, set theory, topology and algebraic geometry sometimes are true for the same underlying reason, and when this is the case, these proofs are expressed in the language of category theory.
What is on the horizon for the future? The emerging consensus in certain areas of mathematics is that the natural habitats of 21st-century mathematical objects are ∞-categories in the same way that 20th-century mathematical objects inhabit ordinary categories. The hope is that the dizzying tower of arrows in each dimension that one needs to do deep work in an ∞-category will at some point recede into the background of the collective mathematical subconscious, with each contractible space of choices collapsed down to a unique point. And one can only wonder: If this much progress was made during the 20th century, where will mathematics be at the end of the 21st?