\( \newcommand\cyc{\sum_{cyc}} \newcommand\pcyc{\prod_{cyc}} \newcommand\R{\mathbb{R}} \newcommand\N{\mathbb{N}} \newcommand\Z{\mathbb{Z}} \newcommand\median{\mathbf{median}} \newcommand\dot{\mathbf{dot}} \newcommand\dx[1]{\,\mathbf{d#1}} \) The Primal Trick vs. The Dual Trick

The Primal Trick vs. The Dual Trick

This document introduces the distinction between the primal trick and the dual trick: The primal trick is the trick used to solve the problem, while the dual trick is the trick used to generate a puzzle that the primal trick solves. They have an interesting relationship, which we now discuss.

Example proof

Consider the following proof from AoPS: Let \( a, b, c > 0 \) and prove that \[ \cyc \frac{a^2}{b^2 + c^2} \geq \cyc \frac{a}{b+c} \] Proof. \begin{align*} \cyc \left( \frac{a^2}{b^2 + c^2} - \frac{a}{b+c} \right) & = \cyc \frac{a^2(b+c) - a(b^2+c^2)}{(b^2+c^2)(b+c)} & \text{merge} \\ & = \cyc \frac{a^2b + a^2c - ab^2 - ac^2}{(b^2+c^2)(b+c)} & \text{expand} \\ & = \cyc \frac{(a^2b - ab^2) + (a^2c - ac^2)}{(b^2+c^2)(b+c)} & \text{group (magic)}\\ & = \cyc \frac{ab(a - b) + ac(a - c)}{(b^2+c^2)(b+c)} & \text{factor}\\ & = \cyc \left( \frac{ab(a-b)}{(b^2+c^2)(b+c)} + \frac{ac(a-c)}{(b^2+c^2)(b+c)} \right) & \text{split} \\ & = \cyc \left( \frac{ab(a-b)}{(b^2+c^2)(b+c)} - \frac{ca(c-a)}{(b^2+c^2)(b+c)} \right) & \text{negate (magic)} \\ & = \cyc \left( \frac{ab(a-b)}{(b^2+c^2)(b+c)} - \frac{ab(a-b)}{(c^2+a^2)(c+a)} \right) & \text{cycle (magic)} \\ & = \cyc \frac{ab(a-b)\left[ (c^2+a^2)(c+a) - (b^2+c^2)(b+c) \right]}{(b^2 + c^2)(b+c)(c^2+a^2)(c+a)} & \text{merge} \\ & = \cyc \frac{ab(a-b)\left[ (a-b)(\cyc a^2 + \cyc ab)\right]}{(b^2 + c^2)(b+c)(c^2+a^2)(c+a)} & \text{factor} \end{align*} What magic!

The primal trick

In this case, the primal trick is arguably something like:

Action sequence: split the numerator, cycle one of the summands, and re-merge.
Precondition: subterm matching \( \cyc \frac{\sum_i ?x_i}{?y} \).
Triggers: unclear.
Progress signs: cycling exposes a shared factor.

It is relatively easy to implement the nondeterministic action sequence: given the pattern \( \cyc \frac{\sum_i ?x_i}{?y} \), choose any (presumably balanced) 2-partition of the addends, split, cycle, and re-merge. It is also easy to implement the progress sign mentioned above.

However, the challenge with writing the primal trick is that the pattern appears often in proof steps. Do we need to try all \(\mathrm{nCk}(n,n/2)\) possibilities at every proof step? Do we need to continue each possibility until the progress sign is determined to be absent? Accumulated wisdom suggests that it is hard to write down appropriate heuristics for when to try, how to group, and when to give up on the path.

Also, is it even its own trick? Grouping, splitting, cycling, merging are all in the action space already. We will consider this question again later on.

The dual trick

In this case, the dual trick seems to be something like the following. Start with the theorem: \[ \cyc \frac{f(a,b,c) (a-b)^2 g(a,b,c)}{h(a,b,c) h(b,c,a)} \geq 0\] where \[ (a-b) g(a,b,c) = h(b,c,a) - h(a,b,c) \] and \( f, g, h \) are all non-negative. Given any \( g, h \) that satisfy the constraint, and any \( f \), we can generate a puzzle as follows: \begin{align*} \cyc \frac{f(a,b,c) (a-b)^2 g(a,b,c)}{h(a,b,c) h(b,c,a)} & = \cyc \frac{f(a,b,c) (a-b)(h(b,c,a)-h(a,b,c))}{h(a,b,c) h(b,c,a)} \\ & = \cyc \frac{f(a,b,c) (a-b)}{h(a,b,c)} - \frac{f(a,b,c) (a-b)}{h(b,c,a)} \\ & = \cyc \frac{f(b,c,a) (b-c)}{h(b,c,a)} - \frac{f(a,b,c) (a-b)}{h(b,c,a)} \\ & = \cyc \frac{f(b,c,a) (b-c) - f(a,b,c)(a-b)}{h(b,c,a)} \end{align*} producing the problem: \begin{align*} \cyc \frac{f(b,c,a) (b-c) - f(a,b,c)(a-b)}{h(b,c,a)} & \geq 0 \end{align*} which can be further obfuscated e.g. by expanding the numerator.

It is not immediately obvious how to generate many variants of the puzzle. In the case of this problem, the problem of finding the \( h \) looked like this: \begin{align*} (a-b) g(a,b,c) &= (a-b) \cyc (a^2 + ab) \\ &= a^3 + ab^2 + ac^2 + a^2b + abc + a^2 c - ba^2 - b^3 - bc^2 - b^2a - b^2c - bca & \text{expand} \\ &= a^3 + ac^2 + a^2 c - b^3 - bc^2 - b^2c & \text{cancel} \\ &= a^3 + ac^2 + a^2 c - b^3 - bc^2 - b^2c + (c^3 - c^3) & \text{add0 (magic)} \\ &= (c^3 + a^3 + ac^2 + a^2 c) - (c^3 + b^3 - bc^2 - b^2c) & \text{group (magic)}\\ &= (c^2+a^2)(c+a) - (b^2+c^2)(b+c) & \text{factor} \\ &= h(a,b,c) - h(b,c,a) \end{align*} This seems as amazing as the primal trick, and does not seem like a recipe that would easily yield a large amount of data.

Big tricks vs. small tricks

One preliminary takeaway is that the big primal trick showcased above is unlikely to ever be used again, because it requires a corresponding big dual trick to create the perfect circumstances for it. Ideally we can find less constrained dual processes that produce proofs requiring small (primal) tricks applied in various ways in various orders. As always, it may be challenging to find unconstrained processes that produce sufficiently interesting proofs.

An unconstrained dual trick

Let \( a,b,c > 0\) and prove that: \[ 2 (\cyc a)^2 (\cyc \frac{1}{a(a+b)}) \geq 27 \] Proof. \begin{align*} 2 (\cyc a)^2 (\cyc \frac{1}{a(a+b)}) & = 2 (\cyc a) (\cyc a) (\cyc \frac{1}{a(a+b)}) & \text{expand} \\ & = (\cyc 2a) (\cyc a) (\cyc \frac{1}{a(a+b)}) & \text{push} \\ & = (\cyc (a + a)) (\cyc a) (\cyc \frac{1}{a(a+b)}) & \text{defuse (magic1)} \\ & = (\cyc (a + b)) (\cyc a) (\cyc \frac{1}{a(a+b)}) & \text{cycle (magic2)} \\ & \geq \left( \cyc \frac{(a+b)a}{a(a+b)} \right)^3 & \text{holder} \\ & = 3^3 \\ & = 27 \end{align*}

Arguably the primal trick is

Action sequence: massage products of sums (with some fractions) so everything cancels after Holder's.
Precondition: product of sums, upto some "noise"
Triggers: I can only speculate.
Progress signs: cancellation after Holder.

What about the dual trick? Arguably it is: \begin{align*} n & = \sum_{i=1}^n 1 \\ & = \sum_{i=1}^n \prod_{j=1}^m f_{ij}(x) f_{ij}(x)^{-1} \\ & = \sum_{i=1}^n \prod_{k=1}^K f_{ik}(x) & \text{group (arbitrarily)} \\ & = \sum_{i=1}^n \prod_{k=1}^K (f_{ik}(x)^{K})^{1/K} & \text{group (arbitrarily)} \\ & = \sum_{i=1}^n \prod_{k=1}^K h_{ik}(x)^{1/K} & \text{define} \\ & \leq \prod_{k=1}^K \left( \sum_{i=1}^n h_{ik}(x) \right)^{1/K} & \text{holder} \\ & = \left( \prod_{k=1}^K \sum_{i=1}^n h_{ik}(x) \right)^{1/K} \end{align*} and then obfuscate the \( \sum_{i=1}^n h_{ik}(x) \) , e.g. \( \cyc (a+b) \to \cyc 2a \to 2 \cyc a \).

In contrast to the first dual trick discussed above, this process is unconstrained. As long as the obfuscation is reasonably interesting, it seems this dual trick can produce an infinite number of puzzles with solutions.

The value of an unconstrained dual trick

What is the value of such a dual trick? We can generate an infinite number of puzzles, but it is unlikely that our prover will ever face a puzzle from this family in the wild. So what do we gain by training on it? For that matter, what does a human gain from seeing the primal trick shown above? To keep these documents small and modular, we discuss this in a follow-up document.