Do circles have the “largest” area (Part 2)

So where did we leave? Ah here:

Nice! I can “see” why the fact is true. Can we do better? Can you make me “feel” why it’s true? We’ll see that in the next post!

Part 1

And the answer is…… Yes, we can! But we would have to consult the physicists. We are unlucky in one aspect: relativity. They (“who?” is also a rather interesting question to ask) have taken out all the things we see in our surroundings out of math, and made it abstract. But the fact that we can always take them back, is a plus point.

Gases in trapped in the Curve

Imagine a rope enclosing a two-dimensional gas, with vacuum outside the rope. The gas will expand, pushing the rope to enclose a maximal area at equilibrium.

When the system is at equilibrium, the tension in the rope must be constant, because if there were a tension gradient at some point, there would be a non-zero net force at that point in the direction of the rope, but at equilibrium the net force must be zero in all directions.

The gas exerts a force outward on the rope, so tension must cancel this force. Take a small section of rope, so that it can be thought of as a part of some circle, called the osculating circle. The force on this rope segment due to pressure is PL where ‘P’ is pressure and ‘L’ length. The net force due to tension is 2T sin⁡(L/2R) with ‘T’ tension and ‘R’ the radius of the osculating circle.

Net force from gas is upward, and the angle theta and length L tend to zero (we focus on an infinitesimal length on the boundary of the curve)

Because the pressure is the same everywhere, and the force from pressure must be canceled by the force from tension, the net tension force must be the same for any rope segment of the same length. That means the radius of the osculating circle is the same everywhere, so the rope must be a circle.

A big merit of the above solution is that you can actually imagine and see the gas molecules pushing the curve and making it round. Plot twist: the Greeks could see planets in the sky, they did not have to imagine them. But being believers of the geocentric theory, they wanted to explain the retrograde motion of the planets. So they found an argument using epicycles that was extremely complicated. But soon it was discovered that

“Epicycles are everything”

Fourier Analysis

So let us use the snake’s oil- Fourier Analysis– in our next proof. A neat proof using Green’s Theorem and Cauchy-Schwarz is given in [3] (These two proof are really the same :P).

Fourier Analysis or…

Nah, it’s not that fancy.

… what?

And no, not that stupid. I would prefer to have a brief recap of linear algebra that we need here rather than diving right in because that would illustrate the point I want to make better.

Here we go!

What really are independent elements? The definition of linear independence that ∑λ_i v_i ↔ λ=0, but what does it really mean?

Linear independence is a symbol of heterogeneity. You can really “separate” out the different parts as you can in a heterogeneous mixture of substances. The only problem in every particular case is how.

Suppose we are in real numbers, and we want to filter out √2 out of 9√2+7√3 then how should we go about doing this? One way is to define an additive  f:R→R using Hamel basis (you can also check out handouts of Evan Chen in case you don’t know what this term means) of \mathbb{R} over \mathbb{Q} such that f(√2)≔√2 and f(h)≔0 for all other basis elements. Then f(9√2+7√3)=9f(√2)+7f(√3)=9√2 done! But this is not the only way we can separate the elements. There is a more “intrinsic” way of separation. And that is …

Come on!

The idea at heart of Fourier Analysis is rather nice – orthogonality. We will need some linear algebra preliminaries (learning which is left as an exercise for the diligent reader).

Orthogonal Elements

Fourier Analysis wants to use the orthogonal basis {1,sin(⁡x),sin(⁡2x), …} and {1,cos⁡(x),cos(⁡2x), …} for real-valued functions and (e^{ix},e^{i2x}, \ldots) for complex-valued functions. Their respective inner products are \int_0^{2\pi}fg and \int_0^{2\pi}f\overline{g} . Can you show that the elements are orthogonal with respect to the respective inner products?

Cool Style, huh?

All unspecified sums are from -\infty to \infty . Let us assume that f(x) has the Fourier expansion \sum a_n e^{inx} . Then notice the following curious thing: \overline{f(x)}=\sum a_n e^{-inx} and so,

|f(x)|^2=\sum |a_n |^2+\sum_{k\neq j}a_k a_j e^{i(k-j)x}

Now we know that when t \in \mathbb{Z} and t\neq 0  then \int_0^{2\pi} e^{itx} dx=0 and therefore integrating in that range,

\int_0^{2\pi}|f(x)|^2 dx=2\pi\sum |a_n |^2

The idea of bilinear Parseval’s, informally, is not very different. In that case assuming g(x) has the Fourier expansion \sum b_n e^{inx} we have

f(x) \overline{g(x)}=\sum a_n \overline{b_n}+\sum_{k+j\neq 0}a_k \overline{b_j} e^{i(k+j)x}  

So similarly,

\int_0^{2\pi} f(x) \overline{g(x)} dx=2\pi \sum a_n \overline{b_n}

The above manipulations disregard convergence issues. We urge the interested  reader to look it up in [5]. We remark that in case of real-valued functions, a_{-n}=\overline{a_n} .

Epicycles are almighty

Get happy people! Your hard work will now pay off. Let us call the original curve C = (x(s),y(s)) where 0≤s≤2π and normalize L=2π. Then we can parametrize it by arc-length so that the speed \sqrt{x'(s)^2+y'(s)^2}=1 always.

A parametrization of a convex curve in which C does not move with uniform speed (which we had in the earlier parametrization).

Let us consider their Fourier series

x(s) \sim \sum a_n e^{ins}

y(s)\sim\sum b_n e^{ins}

We clearly have

\int_0^{2\pi}(x'(s)^2+y'(s)^2) ds=2\pi

Thus, by Parseval’s identity,

\sum n^2 (|a_n |^2+|b_n |^2 )=1

By continuous shoelace formula (for geometers) By Green’s formula (for others) and bilinear Parseval’s Identity,

A=0.5 |\int_0^{2 \pi} (xy'-x'y) ds|= \pi | \sum n (a_n \overline{b_n}-\overline{a_n} b_n) |

Now by the triangle inequality,

|a_n \overline{b_n}-\overline{a_n}b_n |\leq 2|a_n b_n |\leq |a_n |^2+|b_n |^2

Thus, because |n|≤|n|^2, therefore,

A≤π ∑ n^2 (|a_n |^2+|b_n |^2 ) = π

And yay, we are done! All these have been short and tricky proofs. There are many more that I will tell about later because you have to internalize the ideas, and I do not wish to test your patience. Anyway, here are some exercises to test whether you understood what I just said above. In case you are able to solve them, kudos! But don’t post your solutions below until enough people have seen the post (alternately, use something like spoiler tags in discord, but I don’t know about their availability here).

Exercise 1. How can the inner product be used for “separating” orthogonal elements?

Exercise 2. I said that 1,e^{ix},e^{i2x},\ldots are linearly independent and orthogonal. Considering functions that are finite sums of these, can you conclude their Fourier coefficients will be unique?

Exercise 3. Prove that any two elements of the set {1,sin(x),cos(x),sin(2x),cos(2x),…} are orthogonal with respect to the real inner product.

Exercise 4. Does the “dents to bumps” process take arbitrarily moves to terminate in general if we are not wise? Can you provide an example?

Exercise 5. “Separate” √2 and √3 using modular arithmetic. Can you generalize your idea?

Exercise 6. Justify, “Epicycles are everything”.

Exercise 7. Complete the hand-wavy sketches and find answers to in-text questions. Also, try reading the formal proofs and details.

Exercise 8. Prove the following inequality (or explain modulo formalism). It gives yet another proof of the theorem. Can you find it?

Try before googling

References:

[1] Jakub Konieczny https://math.stackexchange.com/users/10674/jakub-konieczny

Among all shapes with the same area, a circle has the shortest perimeter, URL (version: 2017-04-13): https://math.stackexchange.com/q/389362

[2] Mark Eichenlaub https://math.stackexchange.com/users/3305/mark-eichenlaub

Why does a circle enclose the largest area?, URL (version: 2020-07-26): https://math.stackexchange.com/q/26213

[3] Agustí Roig https://math.stackexchange.com/users/664/agust%c3%ad-roig

Why does a circle enclose the largest area?, URL (version: 2011-10-11): https://math.stackexchange.com/q/4818

[5]Fourier Analysis, Elias M. Stein and Rami Shakarchi

Advertisement

Do circles have the “largest” area? (Part 1)

Today we are going to look at a very “intuitive” result – the isoperimetric problem. Such results are routinely called “verification of intuition” by many math people. Let us see what Wikipedia has to tell us about the problem.

Wikipedia: Isoperimetric Inequality

Quantitative bounds are good, but as Steele says it in his wonderful book The Cauchy-Schwarz Master Class,

“Mathematical progress depends on the existence of a continuous stream
of new problems, yet the processes that generate such problems may
seem mysterious. To be sure, there is genuine mystery in any deeply
original problem, but most new problems evolve quite simply from well-
established principles. One of the most productive of these principles
calls on us to expand our understanding of a quantitative result by first
focusing on its
qualitative inferences.”

Steele

The classical isoperimetric problem dates back to antiquity. The problem can be stated as follows:

This problem is conceptually related to the principle of least action in physics, in that it can be restated: what is the principle of action which encloses the greatest area, with the greatest economy of effort?

The first progress towards a solution was made by Swiss (hardcore) geometer Jakob Steiner in 1838, using a geometric method called Steiner symmetrization. But his solution did not show the existence: it only showed that existence implies circle. He began by showing convexity and then symmetry. The one shape that is perfectly convex and symmetrical is the circle, although this, in itself, does not represent a rigorous proof of the isoperimetric theorem.

Steiner Symmetrization

Firstly, we can assume that the figure is convex. Why? Because we can “flip the dent out”.

Making bumps out of dents

Now we use an ingenious argument to show symmetry. This can be called descent, but not exactly. Let us assume the existence of the optimal curve for now (we’ll get back to it in a moment). Call this curve C. Choose A,B on C such that AB divides the perimeter into two equal halves of length, say, L/2. We claim that the two parts have equal area.

Flipping the Dominant Partner

If one of the parts is larger then just flip it over AB. The overall area will be increased, while the perimeter will become 2\cdot L/2=L. This contradicts the assumption that C is the optimal shape! It follows that half of our solution curve C must solve the following problem:

The Reduced Problem

Now we’ll show that the solution to this new problem is a semi-circle, so that the solution to the original problem must be a full circle.

The curve after perturbing

Suppose the arc AOB shown at the left below solves this new problem. It is sufficient to show that every inscribed angle, such as the one at O, is a right angle. Suppose, to the contrary, that the angle at O is not 90 degrees. Then we can view the arc as hinged at O, and either open it or close it to make the angle at O exactly 90. This will not change the length of the arc, nor change the two shaded areas, but will increase the triangular area (why?), and therefore increase the total area enclosed by the arc AOB and the line segment AB. But this was already maximized by the original arc, so its inscribed angle at O is 90 and the arc itself a semi-circle. We are done.

Another geometric method due to Steiner is called “Four Hinges”. Steiner, figuratively, found certain properties of the circle. Then he showed that the optimal curve must have those properties. Finally, if the properties were restrictive enough, then he concluded that the optimal figure was a circle. Guess what property he used in the “proof” below.

“If the curve bounding the region is not a circle, then there are four points on the boundary which are not cyclic. Consider the quadrilateral with these vertices, and regard as the rest of the domain rigidly attached to the sides of the quadrilateral. Suppose that the boundary can be articulated at these points. Then, flexing the quadrilateral so to move the points to a circle results in a larger area by the quadrilateral inequality, with the same boundary length.”

The “quadrilateral inequality” in the above quote refers to

Steiner argued that for non-circular figures, we could always increase the area and therefore the circle has the largest area. But this is not the complete proof. We have to show existence. Why?

It’s like trying to argue that 1 is the greatest natural number by
showing that for every other number X there is a larger one,
namely X^2 .

O. Perron

Existence was first proved by Weierstraß(1875) and Schwarz(1884) using the calculus of variations.
Caratheodory and Study(1909) gave a rigorous treatment of Steiner’s method without the calculus of variations.

“Calculation replaces thinking, while geometry simulates it.”

Jakob Steiner

Now you realize why I said he was a hardcore geometer. So in that spirit, here’s a way to prove existence geometrically-by the method of “polygonal approximation”.

Among all 2n-gons with a given perimeter, the regular polygon has the largest area

Suppose that two adjacent edges AB and BC had different lengths. Then we could cut off triangle ABC from our polygon, and replace it with an isosceles triangle AB’C in which AB’ + B’C = AB + BC, and which has a larger area It follows that on the area-maximizing 2n-gon, all the edges are equal.

Following Steiner’s pattern, we cut such a maximizer into two polygonal n-gons, each of length L/2, and then show that each is inscribed in a semi-circle through its endpoints. It follows that the entire maximizer is inscribed in a circle, and hence (since its edges are all equal), is regular.

Now we can assert that equality holds only for a circle by taking a series of polygonal approximations that get closer to the curve as the number of sides →∞. At one point, the precision will be so great that we can say that the curve has lesser area than the corresponding circle if it were not a circle.

Nice! I can “see” why the fact is true. Can we do better? Can you make me “feel” why it’s true? We’ll see that in the next post!

Announcing the W. W. Sawyer Prize for Mathematical Exposition

The art of reasoning consists in getting hold of the subject at the right end, of seizing on the few general ideas that illuminate the whole, and of persistently organizing all subsidiary facts round them.

W. W. Sawyer, Prelude to Mathematics

We are pleased to announce a new prize for math blogging – the W. W. Sawyer Prize!

How it works

Readers are invited to send blogposts on a mathematical topic to be published on the Monsoon Math Blog. We are happy to help with editing and feedback. It is totally fine if your post has already been published somewhere else, such as your own personal blog, or AoPS.

Every month, one of the blogposts published that month will be chosen by Monsoon Math staff to receive the W. W. Sawyer Prize for that month. The prize consists of Rs 1000 + unspecified goodies. 

Who was W. W. Sawyer ?

Walter Warwick Sawyer (1911-2003) was a mathematician and educator who taught in England, Ghana, New Zealand, the United States, and Canada. In 1943 he wrote “Mathematician’s Delight”, one of the first (and arguably one of the best) “pop math” books.

He was well known for his strikingly clear and well-motivated approach to mathematical exposition, and we hope the Prize and its blog posts will reflect similar qualities.

The website https://www.wwsawyer.org/default.html links to many of his shorter articles. It may also be possible to find Prelude to Mathematics, and Mathematician’s Delight, his most well known general works, somewhere online.

Note: Monsoon Math and the Prize are not endorsed by and have no relationship with the Sawyer estate.

Monsters and Tiling Problems

Sutanay Bhattacharya

This is based on a lecture I gave at MOOC 2021.

1. Cold Open

You know how sometimes movies or TV shows start at some random point halfway into the plot, and then follow that up with a title card and a proper beginning, in a joint effort to tease what’s to come and to feign profundity where there’s none? Weird, right? On a completely unrelated note, we will start by looking at a problem that we will only partially attack for now, and return to it later once we have developed enough machinery.

Here’s the problem:

Characterise all rectangles that can be tiled by squares.

Let’s try to make a reasonable guess at what the answer must be. Of course, rectangles that are squares themselves are tileable. So are {2\times 1} rectangles: that’s just two equal squares glued together; same goes for any {n\times 1} squares. In fact all {m\times n} rectangles can be tiled, {m,n} being positive integers: we simply arrange {mn} unit squares in an {m\times n} grid.

The next observation is that scaling the whole rectangle changes nothing; so it’s only the ratio of the sides of the rectangle that matters. This says rectangles whose ratio of height and width is {m/n} for integers {m,n} can be tiled by squares. This leaves rectangles whose ratio is irrational: say, a {1\times \sqrt 2} sized rectangles.

Some fiddling with possible construction strategies should quickly convince you no such tiling exists for this. But how do we prove something like that? To answer that question, we have to review some facts about, oddly enough, Cauchy’s functional equation.

2. May contain additives

2.1. Beyond rationality

Functions that satisfy Cauchy’s classical functional equation {f(x+y)=f(x)+f(y)}, so called additive functions, are not hard to determine over the integers, or even the rationals. As soon as we enter the real world, things get out of whack (life lesson!).

So suppose we are looking for functions {f:{\mathbb R}\rightarrow {\mathbb R}} so that {f(x+y)=f(x)+f(y)}. You are perhaps familiar with the proof that this implies {f(q)=qf(1)} for {q\in{\mathbb Q}}. But there’s no way to use this equation to pin down the value of, say {f(\sqrt 2)}. From a glass-half-full perspective, this means, we can declare {f(\sqrt 2)} to be anything we like, and that will never cause a contradiction with the already known values. One can, for example, require {f(1)=1}, {f(\sqrt 2)=2021}, {f(\sqrt 3)=42}, {f(\pi)=e}, and there should be an additive function that does satisfy these constraints; after all, these constraints are not self-contradictory in any obvious way. In a way, we have a certain amount of freedom in constructing an additive function.

One can’t be too nonchalant in declaring such things, however. If we, for example, require {f(1)=1} and {f(\sqrt 2)=2021}, then {f(1+\sqrt 2)} can’t be whatever we want; it’s value is determined by {f(1+\sqrt 2)=f(1)+f(\sqrt 2)}. Similarly, it would be foolish to also impose {f(1-2\sqrt 2)=\frac{22}7} or {f(2\sqrt 2)=3.14}; we are not free to choose these as soon as we fix {f(1)} and {f(\sqrt 2)}. A nifty branch of mathematics called linear algebra makes this intuitive notion of freedom precise.

2.2. The true meaning of independence

A set of reals {S} is called linearly dependent over {{\mathbb Q}} if it’s possible to find elements {x_1,\cdots, x_n} in {S} and rational numbers {q_1,\cdots, q_n} (not all zero), so that

\displaystyle q_1x_1+q_2x_2+\cdots+q_nx_n=0.

The set {{5,1/2}} is linearly dependent over {{\mathbb Q}}, because one can write {\frac35\cdot 5+(-6)\cdot \frac12=0}. So is the set {{\sqrt 2, 5, 3-2\sqrt 2, 19}} since {2\cdot \sqrt 2+\frac{-3}{5}\cdot 5+1\cdot (3-2\sqrt 2)}.

Two things are worth pointing point out here: the {q_i}‘s obviously don’t need to be inside {S}, and not all elements need to appear as {x_i}‘s. In particular, an infinite set {S} can be linearly dependent if some finite subset {\{x_1,\cdots, x_n\}} satisfies the above equation. To make sure the definition makes sense, try to convince yourself of the following: (1) any set containing {0} is linearly independent, (2) if set {S} is linearly dependent, so it any set {T} containing {S}, (3) any set with two or more rational numbers is linearly independent.

A set {S} is called linearly independent over {{\mathbb Q}} if, you guessed it, it’s not linearly dependent over {{\mathbb Q}}.

This is exactly the kind of dependence we observed above when we tried to extend our {f}. Of course we can’t fix {f} on {1,\sqrt 2} and {1+\sqrt 2} simultaneously: the set {\{1,\sqrt 2,1+\sqrt 2\}} is linearly dependent, which means there are non-trivial relations between these numbers, and they impose relations on the {f}-values of these numbers; in this case, the relation is {f(1)+f(\sqrt 2)+(-1)f(1+\sqrt 2)=0}. The set {\{1,\sqrt 2,\sqrt 3,\pi\}} is, however, linearly independent, so no such difficulty arises. (Not easy to prove from scratch, so just trust me on this one.)

2.3. The rational basis review

Take some linearly independent set, say {S=\{1,\sqrt 2,\sqrt 3\}}. There many numbers we can form by forming rational linear combinations of these: {1+\sqrt 2-\sqrt 3, \frac34\cdot 1-2\cdot\sqrt 3}, {\sqrt 2+\sqrt 3} and so on. More importantly, any such presentation of a number is unique: we’ll never run into a situation where two different expressions of this sort turn out to be equal. Something bizarre like

\displaystyle \frac 25\cdot 1-\frac37\sqrt 2+\frac65\cdot\sqrt 3=-\frac67 1+\frac{12}{17}\sqrt 2-5\cdot \sqrt 3

would never hold (do you see why this is a consequence of their linear independence?).

But of course, this doesn’t cover all real numbers: there’s no way you could write, say, {\sqrt 5} or {e} like this. If it did, we’d have what’s called a basis. In other words: {\small A set {S\subset {\mathbb R}} is called a Hamel basis (of {{\mathbb R}} over {{\mathbb Q}}) if it is linearly independent, and any number {r\in {\mathbb R}} can be written as a finite sum {q_1x_1+\cdots+q_nx_n} where {q_i}‘s are non-zero rationals, and {x_i}‘s are in {S}. Because of linear independence, such an expression would always be unique.} Since a Hamel basis is linearly independent by definition, one should be able to define an additive function arbitrarily on each of its elements. And since every number {r} can be written as {q_1x_1+\cdots+q_nx_n}, {f(x)} must then be {q_1f(x_1)+\cdots+q_nf(x_n)} (see if you can prove this from the additivity of {f}). This definition of {f} does end up giving an additive function; I’ll leave the checking to someone with a higher boredom tolerance than me.

The only question now is, does there even exist a Hamel basis? The answer turns out to be yes. In fact, the following is true:

Let {S\subset {\mathbb R}} be linearly independent over {{\mathbb Q}}. Then {S} is contained in a Hamel basis.

We won’t be seeing a proof of this here; it requires invoking the Axiom of Choice in some form, and the formalism is best left to an actual textbook. But the simple statement has an important consequence for us:

Given any linearly independent subset of {{\mathbb R}}, say {\{1,\sqrt 2,\sqrt 3\}}, we can extend that to a Hamel basis and get an additive function whose values on the basis elements we can choose freely (using the recipe outlined above). In other words, for any linearly independent subset {S\subset R}, we can find an additive function whose values on elements of {S} we are free to choose. We already surmised some freedom of this sort above when we were trying to find {f} with {f(1)=1} and {f(\sqrt 2)=2021}; the above proposition makes this notion precise, and will be of importance in what’s to come. Following Evan’s article, such functions will be called monsters.

3. Back to square one

Let’s return to the problem we started with. We are hoping to show an {1\times x} rectangle isn’t tileable by squares, where {x} is irrational.

The key idea is looking at areas. Given any {a\times b} rectangles, we are used to calling {ab} its area. This has the remarkable property that when you tile a rectangle with smaller ones, the area adds up: the area of the whole rectangle is the sum of the smaller ones; one could think of this as some form of “additivity”.

As we have seen before, the boring linear functions are hardly ever the only examples of additive functions, especially when irrationals are involved. What happens if we define the area of a {a\times b} rectangle to be {f(a)\times f(b)}, where {f} is some monster? Leaving aside the natural question of `why on earth would we do that’, one notices that this new notion of area also has “additivity” as shown before. Indeed, if rectangles of equal height are joined end-to-end, the area clearly adds up (check!), and same goes for when the rectangles are arranged in a grid-like manner. For a general tiling, you can always turn that into a grid-like tiling by adding some extra lines as in the figure.

Every one of the old tiles get split into a grid of subtiles, so the area of those small subtiles add up to the area of the original tiles. But the subtiles themselves also cover the big rectangle as a grid, so the total area does add up.

Now let’s see what this means in our situation. Each tile is a square, and a square has non-negative area: if it is an {a\times a}, its area is {f(a)\times f(a)=(f(a))^2\ge 0}. Hence the total area of the rectangle, {f(1)f(x)}, must be non-negative. But recall we have some freedom on the values of {f}: since {\{1,x\}} is linearly dependent one could have picked a monster so that {f(1)f(x)} is in fact negative (setting {f(1)=1,f(x)=-1} would work). This is a contradiction!

4. Other areas of interest

We have seen that {f(x)f(y)} is a nice definition for the area of an {x\times y} that adds up in tilings. The next natural generalisation is {f(x)g(y)} for any two additive functions {f,g}. In fact, they don’t even have to be non-linear monsters: {xg(y)} and {f(x)y} are also valid areas that suit our purpose. The proof that this area does add up nicely over tilings is simple to check: see if you can follow the strategy shown before to complete the argument. Further, the sum of two valid area functions is one as well, so this gives us a way to construct a fairly large family of nice areas.

Expanding our arsenal of area functions lets us tackle more difficult questions that would be hard to deal with even with the power of monsters. Consider this for example: what if we allowed “negative” tiles?

Assume we also allow “negative” tiles. One is allowed to stack tiles, but when tiles of different signs are stacked together, they cancel each other out. This is illustrated in the following diagram, where we have used red to denote positive tiles and cyan for negative. Darker shades of a color indicate a deeper stacking.

Is it possible to tile a {1\times x} rectangle by square tiles now? By tiling, we mean that every point outside the rectangle must have zero depth, and everything inside must have a constant non-zero depth. } The answer is still no, but the strategy in the last section does not apply directly. The area of a negative tile of dimensions {a\times b} would have to be {-f(a)f(b)} to have the all-important additivity, but then tiles can have negative areas and the rest of the argument falls apart?

The clever idea is to make the areas of the individual tiles not just nonnegative, but zero; since now we have a wider array of area functions, this should be possible. Indeed, look at the function area defined by {af(b)-bf(a)}; this is zero when {a=b} for a square, but for the entire rectangle, this is {f(x)-xf(1)}; since we are free to choose {f(x)} and {f(1)}, this can be made non-zero easily.

If you are comfortable with some linear algebra, we can actually prove something much stronger than “squares can’t tile a rectangle”. The following is a result by Max Dehn

Suppose {R} is a {a\times b} rectangle tiled by finitely many rectangles {R_1,\cdots, R_n}, where {R_i} has dimensions {a_i\times b_i}. If {r_i=a_i/b_i}, then {r=a/b} is a rational function in the {r_i}‘s.

Footnote: If you have heard of the so-called Dehn invariant used to solve Hilbert’s Third Problem, the name should ring a bell. Coincidentally, the Dehn invariant is another clever piece of math that can be written in terms of monsters!

In case you happen to have some familiarity with fields and vector spaces, you can actually try proving this yourself: I’ll leave some hints in the footnote\footnote{Suppose {F} is the field of rational functions of {r_1,\cdots, r_n}, and extend {\{a,b\}} to a basis of {{\mathbb R}} over {F}. Let {f} and {g} be {F}-linear functions so that {f(a)=g(b)=1}, and they send every other basis element to zero. Now use {f(x)g(y)-f(y)g(x)} as the area of a {x\times y} rectangle.}. If not, you can still silently marvel at the sheer power of techniques developed out of good old Cauchy’s FE.

We end with a problem with a slightly different flavor than the last few ones.

Suppose {R} is a rectangle tiled by finitely many rectangles, each with a rational perimeter. Show that finitely many copies of {R} can be used to tile a rectangle with rational perimeter.

Intuitively, the tiles look something like {\sqrt 2\times (2-\sqrt 2)}; maybe it has some other interestings irrational instead of {\sqrt2}, but this is the rough form. If you play around with some small cases, you’d notice tiles of this form can only make up rectangles of the form {a\sqrt 2\times (b-c\sqrt 2)}. In this case, we can form a {c\times a} grid of {R}‘s, and the entire thing would have dimensions {ca\sqrt 2\times (ab-ca\sqrt 2)}, which does have rational perimeter.

How do we formalise this idea? Let’s start with our good old area function {f(x)f(y)}. If {x+y=r\in{\mathbb Q}}, this is {f(x)f(r-x)=f(x)(r)-f(x)^2}. If we can make {f(r)=0} for rational {r\in{\mathbb Q}}. This is perfectly possible, in fact we can make {f} zero at precisely the rational inputs by extending {{1}} to a Hamel basis, sending {1} to {0} and every other basis element to themselves (check the details by yourself here). This means every tile has non-positive area (it being zero if and only if the tile has rational sides), so {R} has non-positive area too. Let {R} be an {A\times B} rectangle; this means {f(A)f(B)\le0}.

Can we equality hold? For {f(A)f(B)=0} to happen, we need equality everywhere: so every tile has rational sides, and then so does {R}. There’s really not much to prove if that happens.

So let’s assume {f(A)f(B)<0}. If {{1,A,B}} was linearly independent, then {f} could have been defined as above on a basis extending this set, then {f(A)f(B)} would be {AB>0}. This means the set is in fact not linearly independent. So one can find rationals (and hence by scaling, integers) {m,n,p} so that {mA+nB=p.}

Well, that’s pretty good; if {R} was of the form {a\sqrt 2\times (b-c\sqrt 2)} as we surmised, we would indeed have {mA+nB=p} for some integers. The problem is that the same would be true in case {R} was of the form {a\sqrt 2\times (b\sqrt 2-c)}, and there seems to be no obvious way to tile a rational-perimeter rectangle with {R}‘s of this second form. So the observation {mA+nB=p} isn’t enough to finish on it own.

Peering a bit deeper, one sees that there’s a crucial difference between the two possibilities discussed above: in the favorable {a\sqrt 2\times (b-c\sqrt 2)}, {m,n} both have the same signs; however, in the other annoying case {a\sqrt 2\times (b\sqrt 2-c)} they are of different signs. Is it somehow impossible for that to happen?

The final step isn’t too hard now: make an {m\times n} grid out of copies of {R}. This has dimensions {mA} and {nB}, and thus has rational perimeter!