11.1 Solving Equations

If we have a linear equation, such as \(5x - 3 = 0\), there is a straightforward procedure for solving it. You apply "the golden rule of equations": do unto the left side exactly what you do unto the right side. And you do it until all you have on the left is \(x\).

Thus with this example you would add \(3\) to both sides, getting rid of the \(-3\) on the left, and then divide by \(5\), with the result: \(x = \frac{3}{5}\).
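(If you like, you can check such algebra by machine. Here is a one-line check using Python's sympy library, which is my own choice of tool, not something the method requires:

    from sympy import symbols, solve

    x = symbols('x')
    print(solve(5*x - 3, x))   # prints [3/5]
)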

Suppose however, we have a more complicated equation, such as

\[\sin(x) - \exp(x) + 2 = 0\]

Our task here is to find a solution, or all the solutions, of such an equation. We are assuming that the functions in our equation are continuous and differentiable in the domain of interest to us.

First note that it is always a good idea to plot the left-hand side here and observe, crudely, where it changes sign or comes very near to \(0\). This will tell you roughly where it becomes \(0\).

In the old days this was an extremely tedious task, in general, and people tried to solve equations without plotting, which is a bit like flying blind. It’s OK if you can do it, but why try if you don't have to do so?
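For example, here is one way to make such a plot in Python with the numpy and matplotlib libraries (a sketch under my own choice of tools and plotting window; any plotter will do):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(-2, 4, 400)
    y = np.sin(x) - np.exp(x) + 2    # the left-hand side of our equation
    plt.plot(x, y)
    plt.axhline(0, color='gray')     # draw the x axis to make sign changes easy to spot
    plt.xlabel('x')
    plt.show()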

There is a standard technique for solving such equations that apparently goes back to Newton. And here it is.

You start with a guess for the solution you seek, picking an argument, call it \(x_0\). You then find the linear approximation to your function, \(f\), at argument \(x_0\), and solve the equation that states that this linear approximation is \(0\). Call the argument for which the linear approximation is \(0\), \(x_1\).

Now you do exactly the same thing, starting at \(x_1\): you find the linear approximation to \(f\) at \(x_1\) and solve the equation that states that this linear approximation is \(0\) to determine \(x_2\). And you continue this as long as you need to.

In the old days this was an extremely tedious thing to do, for any function. Finding \(x_{j+1}\) from \(x_j\) is quite easy, but doing it over and over again is a real bore.

Now with a spreadsheet, you can set this up and find solutions, with practice, in under a minute. You only have to do each step once, and copy.

How?

First let's see how to get \(x_{j+1}\) from \(x_j\).

The linear approximation to \(f\) at \(x_j\) is given by

\[Lf_j(x) = f(x_j) + (x-x_j) f'(x_j)\]

If we set this to \(0\) at argument \(x_{j+1}\) we get

\[f(x_j) + (x_{j+1} - x_j) f'(x_j) = 0\]

which has the solution, obtained by subtracting \(f(x_j)\) from both sides, dividing by \(f'(x_j)\), and then adding \(x_j\):

\[x_{j+1} = x_j - \frac{f(x_j)}{f'(x_j)}\]
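In a programming language rather than on a spreadsheet, this iteration is only a few lines. Here is a minimal Python sketch (the function name, argument names, and fixed number of steps are my choices, not part of the method itself):

    def newton(f, fprime, x0, steps=20):
        """Iterate x_{j+1} = x_j - f(x_j) / f'(x_j), starting from x0."""
        x = x0
        for _ in range(steps):
            x = x - f(x) / fprime(x)   # one Newton step
        return x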

So what do I do on a spreadsheet?

Suppose we put our first guess in box A1. We will put it and subsequent guesses in column A, starting, say, with A3 (just to leave room for labels).

We can then put \(f\) in column B and \(f'\) in column C.

To do this we need to make the following entries:

In A3, enter =A1 (this puts starting guess \(x_0\) in A3)
In B3, =f(A3) (this computes \(f(x_0)\))
In C3, =f'(A3) (this computes \(f'(x_0)\))
In A4, =A3-B3/C3 (this applies the algorithm to get the new guess)

If you now copy A4 (not A3!) and B3 and C3 down the A, B and C columns, you have implemented the algorithm.

You can change your starting guess by changing A1, and change your function by changing B3 and C3 appropriately, and copying the results down.

Does this really work?

This method converges very rapidly most of the time. If you start near a \(0\) of \(f\) and are on "the good side," it will always converge. Otherwise it stands a good chance of doing so, but strange things can happen.

What is the "good side"?

Suppose you start above the solution, call the solution \(z\), so \(x_0\) is greater than \(z\). Then if \(f\) and the second derivative of \(f\) are both positive between \(z\) and \(x_0\), you are on the good side.

Why?

That the second derivative of \(f\) is positive between \(z\) and \(x_0\) means that the first derivative of \(f\) is increasing there, which means that the slope of \(f\) is biggest, over the whole range between \(z\) and \(x_0\), right at \(x_0\).

All this means that the linear approximation to \(f\) at \(x_0\) will dive down to \(0\) faster than \(f\) does as you near the solution \(z\), so that \(x_1\) will lie somewhere between \(z\) and \(x_0\), and each successive \(x_j\) will lie between \(z\) and the previous one. As we get closer to \(z\), \(f\) will look more and more like a straight line, which means it will look more and more like its linear approximation, so you will get closer and closer to \(z\) faster and faster.
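To see this concretely, take \(f(x) = x^2 - 2\), whose positive solution is \(z = \sqrt{2}\) (my own illustrative example, not one from the text above). Starting at \(x_0 = 2\) puts you on the good side, since \(f\) and \(f''\) are both positive between \(z\) and \(x_0\):

    # f(x) = x^2 - 2, f'(x) = 2x; solution z = sqrt(2) = 1.41421356...
    x = 2.0
    for j in range(5):
        x = x - (x * x - 2) / (2 * x)
        print(j + 1, x)

Each printed value lies between \(\sqrt{2}\) and the previous one, and the number of correct digits roughly doubles with every step.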

Suppose we want to solve the equation \(\sin x - e^x + 2 = 0\), and we start with \(x = 0.3\) as a guess. The derivative of the left-hand side is \(\cos x - e^x\).

Our spreadsheet instructions when filled in should look as follows:

In A1, enter 0.3
In A2, enter xj. In B2, f(xj). In C2, f'(xj).
In A3, =A1. In B3, =sin(A3)-exp(A3)+2. In C3, =cos(A3)-exp(A3)
In A4, =A3-B3/C3. In B4, =sin(A4)-exp(A4)+2. In C4, =cos(A4)-exp(A4)
Copy down columns A, B, and C
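If you prefer to mimic those three columns in Python, a sketch might look like this (each pass of the loop plays the role of one spreadsheet row):

    from math import sin, cos, exp

    x = 0.3                           # the starting guess from A1
    for j in range(8):
        f = sin(x) - exp(x) + 2       # column B
        fprime = cos(x) - exp(x)      # column C
        print(j, x, f)
        x = x - f / fprime            # column A of the next row

The iterates should settle down near \(x \approx 1.05\), which you can check against your plot.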

[Table: number of steps vs. number of digits after the decimal point]

What happens when you start at \(5\) instead of \(0.3\)? At \(0\)? At \(10\)?

Exercises:

11.1 Suppose \(f\) is negative at \(x_0\) and \(x_0\) is bigger than \(z\). What condition on \(f''\) between \(z\) and \(x_0\) will mean you are on the good side? What is the condition for being on the good side, as discussed here, when \(f\) is positive at \(x_0\) but \(x_0\) is less than \(z\)?

11.2 What will happen if \(f''\) has the wrong sign, but the same sign throughout, between your guess and \(z\)?

Still and all, the method can do bizarre things. If \(f' = 0\) at a guess, the iteration won't even make sense because you will divide by \(0\) in it. If \(f'\) is very near \(0\), the new guess will be very far from the old one, and it can zip around weirdly.

The following applet allows you to plot and view the method just by entering the function (which is only slightly simpler than starting from scratch with a spreadsheet).

Exercises:

11.3 What happens if you look for the solution to \(x^{-\frac{1}{3}} - 0.001 = 0\), and you try to use this method directly? How about \(\tan x = 1\)?

11.4 Find all solutions to \(\sin (x) - \exp(x) + 3 = 0\) for \(x\) positive, accurate to ten decimal places.

Do I have to differentiate \(f\) to apply this algorithm?

No! You can choose a value of \(d\) that is very small relative to the scale of your function and put =(f(A3+d)-f(A3-d))/(2*d) in C3 instead of =f'(A3).

This will do just about as well as the regular Newton's method, practically always.
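In Python this replacement looks like the following sketch (the default step d = 1e-6 is my arbitrary choice; scale it to your problem):

    def newton_numerical(f, x0, d=1e-6, steps=20):
        """Newton's method with a centered difference standing in for f'."""
        x = x0
        for _ in range(steps):
            slope = (f(x + d) - f(x - d)) / (2 * d)   # replaces the entry =f'(A3)
            x = x - f(x) / slope
        return x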

Exercise 11.5 Redo exercise 11.4 with the entry =(f(A3+B$1)-f(A3-B$1))/(2*B$1) in C3, where B1 holds your chosen value of \(d\). How is your answer affected?

What can go wrong?

For one thing, it is possible that our equation has no real solutions. In that case neither method can ever find one. Plotting \(f(x)\) against \(x\) will confirm this.

Another problem arises when your equation has more than one solution. Then the one you get depends on where you start. The applet illustrates this.

In general, if you arrive at a point \(x_j\) at which \(f'(x_j)\) is near zero, while \(f\) is not near zero, \(x_{j+1}\) will be very far away and the successive values of \(x\) could easily zoom around like crazy, even when you were once near the solution you were looking for.
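The function of this very section shows the effect: \(f'(x) = \cos x - e^x\) is exactly \(0\) at \(x = 0\) while \(f(0) = 1\), so a guess just to the right of \(0\) gets flung far away. A quick check in Python:

    from math import sin, cos, exp

    x = 0.01                      # here f' is nearly 0 but f is close to 1
    f = sin(x) - exp(x) + 2
    fprime = cos(x) - exp(x)
    print(x - f / fprime)         # about 99: the next guess lands far from the solution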

But the method is fun to use anyway, and you can easily tell when it is not working.

Divide and Conquer

There is another method for solving an equation that gets closer to a solution by a factor of \(2\) at each iteration, provided you can find two arguments at which your function has opposite signs. You then evaluate the function at the midpoint between them, and replace the endpoint at which the function has the same sign as it does at the midpoint with that midpoint.
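Here is a minimal Python sketch of this bisection idea (the names and the fixed step count are mine):

    def bisect(f, a, b, steps=50):
        """Assumes f(a) and f(b) have opposite signs; halves the interval each pass."""
        for _ in range(steps):
            mid = (a + b) / 2
            if f(a) * f(mid) > 0:
                a = mid            # f has the same sign at a and mid, so replace a
            else:
                b = mid            # otherwise replace b
        return (a + b) / 2

Note that each pass cuts the interval containing the sign change in half, which is the factor-of-\(2\) improvement mentioned above.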

Exercise: Figure out how to implement this method on a spreadsheet. (Hint: you can enter things like =if(D5*F5>0,C5,A5), which gives C5 if D5 and F5 have the same sign, and A5 otherwise.)