How to derive formulas for the sum of powers?

Gagarine Yaikhom

9 September 2012, Sunday

I have wondered how the closed form for the sum of squares for the first \(n\) natural numbers was derived. Given the formula for the sum,

\[1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}\]

I learned to prove its correctness using mathematical induction. However, I never understood how the formula was derived in the first place! Knuth writes in The Art of Computer Programming, Volume 1, Third Edition, pp. 32 [Addison-Wesley, June 2009]:

“Most textbooks would simply state those formulas, and prove them by induction. Induction is, of course, a perfectly valid procedure; but it does not give any insight into how on earth a person would ever have dreamed the formula up in the first place, except by some lucky guess. …”

Sum of the first \(n\) natural numbers

Carl Friedrich Gauss used the following approach, apparently while still in primary school, to derive a closed form for the sum of the first \(n\) positive natural numbers,

\[S = 1 + 2 + 3 + \cdots + (n - 2) + (n - 1) + n.\]

He first reordered the sum as

\[S' = n + (n - 1) + (n - 2) + \cdots + 3 + 2 + 1,\]

and added \(S\) and \(S'\), giving

\[S + S' = (1 + n) + (1 + n) + \cdots + (1 + n) = n(1 + n).\]

Since \(S = S'\), we get the sum as \(S = n(n + 1) / 2\). So, are there similarly simple derivations for the sum of squares? As it turns out, there are several.

A simple derivation of the sum of squares

While randomly browsing the web, I fortunately stumbled upon a bulletin message1 that gave a simple derivation of the closed form, as follows:

Let \(S\) be the sum of squares of the first \(n\) natural numbers, such that

\[S = 1^2 + 2^2 + 3^2 + \cdots + n^2.\]

Our aim is to derive a closed form formula for \(S\) in terms of \(n\). We have

\[ \begin{aligned} (x+1)^3 & = (x + 1)(x^2 + 2x + 1)\\ & = x^3 + 2x^2 + x + x^2 + 2x + 1\\ & = x^3 + 3x^2 + 3x + 1. \end{aligned} \]

Using this, we write the following sequence of equations where \(1 \le x \le n\):

\[ \begin{aligned} (1+1)^3 & = 1^3 + 3.1^2 + 3.1 + 1\\ (2+1)^3 & = 2^3 + 3.2^2 + 3.2 + 1\\ (3+1)^3 & = 3^3 + 3.3^2 + 3.3 + 1\\ & \cdots\\ (n+1)^3 & = n^3 + 3.n^2 + 3.n + 1 \end{aligned} \]

If we add up each of the five columns separately, we get

\[ \begin{aligned} (1+1)^3 + (2+1)^3 + \cdots + (n+1)^3 = \, & 1^3 + 2^3 + \cdots + n^3 +\\ & 3.(1^2 + 2^2+\cdots+n^2) +\\ & 3.(1 + 2 + 3+ \cdots +n) +\\ & 1 + 1 + 1 + \cdots + 1. \end{aligned} \]

After simplification and substitution, we get

\[ \begin{aligned} 2^3 + 3^3 + \cdots + (n+1)^3 = \, & 1^3 + 2^3 + \cdots + n^3 +\\ & 3S + 3n(n+1)/2 + n \end{aligned} \]

which, after subtracting the sum of cubes, gives us

\[ \begin{aligned} (n+1)^3 & = 1 + 3S + 3n(n+1)/2 + n \mbox{, or}\\ 3S & = (n+1)^3 - 3n(n+1)/2 - n - 1 \mbox{, or}\\ 6S & = 2n^3 + 3n^2 + n, \quad\mbox{finally giving us}\\ S & = n(n+1)(2n+1)/6. \end{aligned} \]

Some historical notes

After studying further, I found out that there are several methods for deriving this and similar results. In Concrete Mathematics by Graham, Knuth, and Patashnik, Second Edition, pp. 41–46 [Addison-Wesley, December 2004], several different approaches are discussed. Some involve simple manipulations of sums, while others require integral calculus.

As the history goes, Archimedes derived the formula for the sum of squares a long time ago. I have investigated the derivations and proofs in two principal translations of Archimedes’ work: The Works of Archimedes by T. L. Heath [Cambridge University Press, 1897], and Archimedes by E. J. Dijksterhuis [Copenhagen: Ejnar Munksgaard, 1956].

Dijksterhuis provides a more faithful translation of the original writing in Greek; however, the work due to Heath presents a fuller study using modern formulation. In fact, with regards to understanding the derivations, Heath’s exposition is easier to follow, as compared to Dijksterhuis’ use of cryptic notations (for instance, Dijksterhuis denotes the square of a value \(x\) as \(T(x)\) instead of as \(x^2\), which makes the equations harder to follow). Furthermore, while Dijksterhuis’ treatment ignores some of the steps in the derivation, Heath gives a fuller account of each step.

At the end of the introduction to the treatise On Conoids and Spheroids (pp. 119 in Dijksterhuis’ translation of De Conoidibus et Sphaeroidibus), Archimedes mentions the following lemma without proof:

“If any number of magnitudes be given, which exceed one another by an equal amount equal to the least, and also other magnitudes, equal in number to the former, but each equal in quantity to the greatest, all the magnitudes each of which is equal to the greatest will be less than the duplicate of all those exceeding one another by an equal amount and more than the duplicate of these minus the greatest.”

This is another way of saying \(\Sigma_{i=1}^n i = n(n+1)/2\), which we discussed previously using Gauss’ derivation2. Then, in proposition 10 of the treatise On Spirals (Dijksterhuis’ translation of De lineis Spiralibus), Archimedes states and proves the sum of squares:

“If a series of any number of lines be given, which exceed one another by an equal amount; and the difference be equal to the least, and if other lines be given equal in number to these and in quantity to the greatest, the squares on the lines equal to the greatest, plus the square on the greatest and the rectangle contained by the least and the sum of all those exceeding one another by an equal amount will be the triplicate of all the squares on the lines exceeding one another by an equal amount.”

In modern formulation, as given on pp. 107–109 of Heath’s translation, the derivation proceeds from:

“If \(A_1, A_2, A_3, \ldots, A_n\) be \(n\) lines forming an ascending arithmetical progression in which the common difference is equal to the least term \(A_1\), then \[(n + 1) A_n^2 + A_1(A_1 + A_2 + A_3 + \cdots + A_n)\\ = 3 (A_1^2 + A_2^2 + A_3^2 + \cdots + A_n^2)\]

to produce, \[(n + 1) (na)^2 + a(a + 2a + \cdots + na) = 3 [a^2 + (2a)^2 + \cdots + (na)^2]\]

where \(a\) is the common difference. From this, Archimedes finally derives the result:

\[\sum_{i=1}^n i^2= \frac{n(n+1)(2n+1)}{6}.\]

Derivation of other powers

Further to the derivation of the sum of squares above, here are the derivations for the sum of cubes and the sum of the fourth powers. The same technique may be used for deriving the sum of higher powers.

Sum of cubes

Let \(S_3\) be the sum of cubes of the first \(n\) natural numbers, such that

\[S_3 = 1^3 + 2^3 + 3^3 + \cdots + n^3.\]

Our aim is to derive a closed form formula for \(S_3\) in terms of \(n\). We have

\[ \begin{aligned} (x+1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1. \end{aligned} \]

Using this, we write the following sequence of equations where \(1 \le x \le n\):

\[ \begin{aligned} (1+1)^4 & = 1^4 + 4.1^3 + 6.1^2 + 4.1 + 1\\ (2+1)^4 & = 2^4 + 4.2^3 + 6.2^2 + 4.2 + 1\\ (3+1)^4 & = 3^4 + 4.3^3 + 6.3^2 + 4.3 + 1\\ & \cdots\\ (n+1)^4 & = n^4 + 4n^3 + 6n^2 + 4n + 1 \end{aligned} \]

If we add up each of the six columns separately, we get

\[ \begin{aligned} (1+1)^4 + (2+1)^4 + \cdots + (n+1)^4 = \, & 1^4 + 2^4 + 3^4 + \cdots + n^4\\ & + 4(1^3 + 2^3 + 3^3 + \cdots + n^3)\\ & + 6(1^2 + 2^2 + 3^2 + \cdots + n^2)\\ & + 4(1 + 2 + 3+ \cdots +n)\\ & + 1 + 1 + 1 + \cdots + 1. \end{aligned} \]

After subtracting the fourth powers on both sides of the equation, and substituting the closed form for the sum of the lower powers, we get

\[ \begin{aligned} (n+1)^4 = 1 + 4S_3 + 6\left[\frac{n(n+1)(2n+1)}{6}\right] + 4\left[\frac{n(n+1)}{2}\right] + n \end{aligned} \]

or,

\[ \begin{aligned} n^4 + 4n^3 + 6n^2 + 4n + 1 = 1 + 4S_3 + (2n^3 + 3n^2 + n) + (2n^2 + 2n) + n, \end{aligned} \]

which, after further simplification gives

\[ \begin{aligned} S_3 = \frac{n^4+2n^3+n^2}{4} = \left[\frac{n(n+1)}{2}\right]^2. \end{aligned} \]

Sum of the fourth powers

Let \(S_4\) be the sum of the fourth powers of the first \(n\) natural numbers, such that

\[S_4 = 1^4 + 2^4 + 3^4 + \cdots + n^4.\]

Our aim is to derive a closed form formula for \(S_4\) in terms of \(n\). We have

\[ \begin{aligned} (x+1)^5 = x^5 + 5x^4 + 10x^3 + 10x^2 + 5x + 1. \end{aligned} \]

Using this, we write the following sequence of equations where \(1 \le x \le n\):

\[ \begin{aligned} (1+1)^5 & = 1^5 + 5.1^4 + 10.1^3 + 10.1^2 + 5.1 + 1\\ (2+1)^5 & = 2^5 + 5.2^4 + 10.2^3 + 10.2^2 + 5.2 + 1\\ (3+1)^5 & = 3^5 + 5.3^4 + 10.3^3 + 10.3^2 + 5.3 + 1\\ & \cdots\\ (n+1)^5 & = n^5 + 5.n^4 + 10n^3 + 10n^2 + 5n + 1 \end{aligned} \]

If we add up each of the seven columns separately, we get

\[ \begin{aligned} (1+1)^5 + (2+1)^5 + \cdots + (n+1)^5 = \, & 1^5 + 2^5 + 3^5 + \cdots + n^5\\ & + 5(1^4 + 2^4 + 3^4 + \cdots + n^4)\\ & + 10(1^3 + 2^3 + 3^3 + \cdots + n^3)\\ & + 10(1^2 + 2^2 + 3^2 + \cdots + n^2)\\ & + 5(1 + 2 + 3+ \cdots +n)\\ & + 1 + 1 + 1 + \cdots + 1. \end{aligned} \]

After subtracting the fifth powers on both sides of the equation, and substituting the closed form for the sum of the lower powers, we get

\[ \begin{aligned} (n+1)^5 = 1 + 5S_4 + 10\left[\frac{n(n+1)}{2}\right]^2 + 10\left[\frac{n(n+1)(2n+1)}{6}\right] + 5\left[\frac{n(n+1)}{2}\right] + n \end{aligned} \]

or,

\[ \begin{aligned} n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 = \, & 1 + 5S_4 + \frac{5}{2}(n^4+2n^3+n^2)\\ & + \frac{5}{3}(2n^3 + 3n^2 + n) + \frac{5}{2}(n^2 + n) + n, \end{aligned} \]

which, after further simplification gives

\[ \begin{aligned} S_4 = \frac{6n^5 + 15n^4 + 10n^3 - n}{30} = \frac{n}{30}(6n^4 + 15n^3 + 10n^2 - 1) \end{aligned} \]

Sum of the \(n\)th powers

Using binomial coefficients, we generalise our technique as follows:

\[(n+1)^k = \binom{k}{0} + \binom{k}{1}S_{k-1} + \binom{k}{2}S_{k-2} + \cdots + \binom{k}{k}S_0\textrm{, where }S_j = \sum_{x=1}^nx^j.\]

Using this equation, we calculate the value of \(S_{k-1}\) recursively by applying the sum of the lower powers as follows:

\[ \begin{aligned} S_{k-1} = \frac{(n+1)^k - 1 - \displaystyle\sum\limits_{i=0}^{k-2} \binom{k}{k-i}S_i}{k}\textrm{, and }S_0 = n. \end{aligned} \]

Another approach to finding the sum of squares

In the previous sections, I was interested in finding simple approaches for determining the closed form formula for the sum of squares for the first \(n\) natural numbers: \[1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}\]

While studying nonparametric statistical analysis methodologies, I came across another approach, which I think is beautiful. The method is described in Practical Nonparametric Statistics, Second Edition, pp. 39-40 [John Wiley and Sons, 1980] by W.J. Conover.

Let \(S\) be the sum of squares of the first \(n\) natural numbers, such that

\[S = 1^2 + 2^2 + 3^2 + \cdots + n^2.\]

This can be rewritten as:

\[ \begin{aligned} S = 1 + 2 + 3 + 4 + \cdots + n + &\\ 2 + 3 + 4 + \cdots + n + &\\ 3 + 4 + \cdots + n + &\\ 4 + \cdots + n + &\\ \cdots &\\ n &. \end{aligned} \]

Each row of sums is the sum of the first \(n\) natural numbers, minus the sum of the first \((i - 1)\) natural numbers, where \(i\) is the row number. For instance, the third row of sums \((3 + 4 + \cdots + n)\) can be rewritten as \((1 + 2 + 3 + 4 + \cdots + n) - (1 + 2)\). In general, therefore, the sum of row \(i\) can be written as:

\[R_i = \frac{n(n + 1)}{2} - \frac{i(i - 1)}{2} = \frac{n^2 + n + i - i^2}{2}.\]

Since we have \(n\) rows, the sum of squares is given by:

\[ \begin{aligned} S &= \sum_{i = 1}^{n}R_i\\ &= \sum_{i = 1}^{n}(\frac{n^2 + n + i - i^2}{2})\\ &= \sum_{i = 1}^{n}\frac{n^2}{2} + \sum_{i = 1}^{n}\frac{n}{2} + \sum_{i = 1}^{n}\frac{i}{2} - \sum_{i = 1}^{n}\frac{i^2}{2}\\ &= \frac{n^3}{2} + \frac{n^2}{2} + \frac{n(n+1)}{4} - \frac{S}{2}\\ &= \frac{2n^3 + 2n^2 + n^2 + n - 2S}{4}\\ &= \frac{2n^3 + 3n^2 + n - 2S}{4}. \end{aligned}\]

This gives us

\[4S + 2S = 2n^3 + 3n^2 + n = n(n+1)(2n+1),\]

and therefore,

\[S = \frac{n(n+1)(2n+1)}{6}.\]

Update history


  1. I have discussed the cited works in the order that I discovered them, beginning with the bulletin message http://mathforum.org/library/drmath/view/56920.html I read on 9 June 2010 Wednesday, 23:49.↩︎

  2. Did Gauss know about Archimedes’ work? In Dijksterhuis’ translation, a diagrammatic argument is presented using two sequences of bars, each representing a sequence of natural number, one ascending and the other descending, which when combined produce a rectangle. It is easy to see that this is a visual representation of Gauss’ derivation.↩︎