Kruskal-Katona

Kruskal-Katona

If G is a finite abstract simplicial complex, a finite set of non-empty sets closed under the operation of taking non-empty subsets, we can ask about what f-vectors f=(f_0,f_1, \dots, f_d) can occur if f_k = |G_k| counts the number of sets of cardinality k in G. The case of the complete complex G=K_n with f_k=B(n,k) := \left( \begin{matrix} n \\ k \end{matrix} \right) gives a hint that the answer is given in terms of Binomial coefficients. Joseph Kruskal (1928-2010) (the brother of Martin Kruskal known in soliton theory) in 1962 and Gyula Katona (1941- ) in 1968 gave the answer.

Similarly to the binary expansion of an integer n = 2^{n_1} + 2^{n_2} + \cdots + 2^{n_m} one can define for a given k \geq 0 a k-cascade expansion B(n_k,k) + B(n_{k-1},k-1) + \cdots + B(n_m,m) with n_k > n_{k-1} > \cdots  > n_m. This expansion is unique (one could demand k \leq n_k/2 to see the uniqueness more easily, but this actually follows). Now define the k-shadow map \delta_k n = B(n_k,k-1) + B(n_{k-1},k-2) + \cdots + B(n_m,m-1). While |G_{k-1}| \leq k |G_k| is obvious, one has the

Kruskal-Katona Theorem: |G_{k-1}| \geq \delta_k |G_k|.

In the talk, I illustrate it with an example given in the TAOCP of Donald Knuth: how many triangles can a graph with a million edges have? The Kruskal-Katona theorem gives the answer. First write n=1'000'000=B(1414,2)+B(1009,1) then invert the 3-shadow map to get \delta_3^{-1} n = B(1414,3)+B(1009,2) = 470'700'300. I explain it with my favorite graph G=K(1,2,1), the kite graph which has n=5 edges. Now since 5=B(3,2)+B(2,1) we have maximally B(3,3) + B(2,2) = 1+1 triangles. Any graph with 5 edges has either 0,1 or 2 triangles. Actually, the kite graph is the only one with a maximal number.

I did not get into the proof of the theorem. Peter Frankel, a student of Katona gave in 1984 a very short proof using double induction with respect to k and n. Because the paper is so short, why not just give it here in a gallery below. Click on a page to see it larger.

By the way, at the beginning of the talk, I again stress the generality of delta sets. It is actually bewildering, how many different finite approaches have been considered. It is always a bit tricky to talk about “generality” and “what structure is the most general”. For me, delta sets are the most general because almost any finite constructs which has been considered over the centuries are delta sets. A simplicial set for example is a delta set, because one can just forget at about the s maps. Both simplicial sets and delta sets are functor categories but because the simplex category with strict inclusion has less morphisms, there are more functors possible. One could argue that a finite set is more general than a delta set because every delta set becomes just a set by forgetting about the Dirac operator D on it, but one can see it also differently: the category of finite sets is included in the category of delta sets by just looking at zero dimensional delta sets, where automatically, the Dirac matrix must be the zero matrix. To make my point: delta sets are the right set-up, I just shifted the language a bit and instead of clunky face maps d_j^n which really nobody really uses in an actual implementation, talk about the Dirac matrix D which incorporates everything. While the path from the face maps to the exterior derivative d and so D =d+d* is obvious, the other direction, constructing face maps $d_i^n$ from the Dirac matrix D is a bit more cumbersome because the axioms ask for exactly n+2 face maps from G_{n+1} \to G_n. This requires some of the inclusion maps to be bundled. A 3-dimensional cell in the Cartesian product K_2 \times K_2 \times K_2 for example has 6 faces while in a tetrahedron we have n+2=4 faces. Like discrete CW complexes (which are also easily seen to be delta sets if one looks at the Dirac matrix (G,D,r) data structure rather than the face map picture), delta sets allow cells to be much more general than forcing them to be simplicial like in a simplicial complex. Now, in the 1990ies, especially in the context of digital topology, one has come up with other terminology but some of these approaches fail to get a decent cohomology. For me, a geometric structure needs a reasonable non-trivial chain complex. It is the absolute minimum which a “geometry” must have. A general set of sets (called hypergraph) does not provide it. The Dirac matrix data structure immediately cuts to the what we actually need: a calculus! And a calculus is what is given by an exterior derivative. And an exterior derivative can be best encoded by a single matrix D. If we compare the the tensor calculus needed in the continuum,the discrete is so much simpler. The reason of course is that in the discrete we have direct access to the individual cells. In the continuum, we need sheaf theoretical methods like differential forms to access the smallest k-dimensional parts of a d-dimensional space.

The pictures were taken early in the morning on May 8th 2024 in Winchester. A bad weather front was just starting to move in and later in the morning it would rain. It was also the day of the final exam of my 1a calculus course which I enjoyed a lot teaching: it was a nice class. The exam was in the afternoon and went well.