If you have a function defined on a manifold,
and you want to find maxima and minima of that function
restricted to some compact submanifold, you can use Lagrange multipliers.

For example, on the xy plane, consider the squared distance function \rho = x^2 + y^2.
(Maximizing the squared distance is the same as maximizing the distance.)
Suppose you have an ellipse given by x^2/a^2 + y^2/b^2 = 1,
and you want to know which points in or on the ellipse are farthest from the origin.
There are two possibilities for such a point:
either it is on the ellipse itself or it is in the interior of the ellipse.

If the point isn't on the ellipse itself, it must be a local maximum of \rho on the plane,
so d\rho = 0 there, and you can find such points by differentiating x^2 + y^2.
d\rho = d(x^2 + y^2) = 2x dx + 2y dy = (2x, 2y) . (dx, dy).
This is 0 only when 2x = 2y = 0, at the origin.
The origin is in the ellipse, so keep track of it;
the value there is \rho = 0^2 + 0^2 = 0.
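
As a quick sanity check on the differentiation step, here is a minimal Python sketch
that compares a central-difference gradient of \rho with the formula (2x, 2y).
The helper name numerical_gradient and the sample points are assumptions made
for illustration only.

```python
def rho(x, y):
    """The function being extremized: x^2 + y^2."""
    return x**2 + y**2


def numerical_gradient(f, x, y, h=1e-6):
    """Central-difference approximation to (df/dx, df/dy) at (x, y)."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy


# Compare the numerical gradient with the analytic one, (2x, 2y).
for point in [(0.0, 0.0), (1.0, -2.0), (0.5, 0.25)]:
    x, y = point
    print(point, numerical_gradient(rho, x, y), (2 * x, 2 * y))
```

Of the sample points, only the origin gives a gradient of (0, 0), as expected.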

For the points *on* the ellipse, use Lagrange multipliers.
First, find a function F such that F = 0 precisely on the ellipse.
Obviously, you want F = x^2/a^2 + y^2/b^2 - 1 (or a multiple thereof).
At each extremum point on the ellipse, there is a (usually) unique number \lambda
(the Lagrange multiplier) such that d\rho = \lambda dF at that point.
(This is the theorem Lagrange proved, which justifies the method.)
dF = d(x^2/a^2 + y^2/b^2 - 1) = 2x dx/a^2 + 2y dy/b^2 = (2x/a^2, 2y/b^2) . (dx, dy).
If d\rho = \lambda dF, then (2x, 2y) = \lambda (2x/a^2, 2y/b^2).
This, along with the equation F = 0, is a system of equations
whose solutions for (x, y, \lambda) are (+-a, 0, a^2) and (0, +-b, b^2).
(I assume a^2 != b^2; if a^2 = b^2, anything on the circle is a solution.)
If (x, y) = (+-a, 0), \rho = a^2; if (x, y) = (0, +-b), \rho = b^2.
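
To check the system by hand, a small Python sketch can recover \lambda from each
component of (2x, 2y) = \lambda (2x/a^2, 2y/b^2) at a point on the ellipse and see
whether the components agree. The function names and the sample values a = 3, b = 2
are assumptions chosen for illustration.

```python
import math


def F(x, y, a, b):
    """Constraint function; F = 0 exactly on the ellipse."""
    return x**2 / a**2 + y**2 / b**2 - 1


def multiplier_at(x, y, a, b, tol=1e-9):
    """Recover lambda from (2x, 2y) = lambda * (2x/a^2, 2y/b^2).

    Assumes (x, y) lies on the ellipse, so x and y are not both zero.
    Returns None if the two components demand different values of lambda,
    i.e. the point does not solve the system.
    """
    estimates = []
    if x != 0:
        estimates.append(2 * x / (2 * x / a**2))  # from the dx component
    if y != 0:
        estimates.append(2 * y / (2 * y / b**2))  # from the dy component
    if len(estimates) == 2 and abs(estimates[0] - estimates[1]) > tol:
        return None
    return estimates[0]


a, b = 3.0, 2.0
# The four axis points, plus one generic point on the ellipse for contrast.
points = [(a, 0.0), (-a, 0.0), (0.0, b), (0.0, -b),
          (a * math.cos(1.0), b * math.sin(1.0))]
for x, y in points:
    print((x, y), "F =", F(x, y, a, b), "lambda =", multiplier_at(x, y, a, b),
          "rho =", x**2 + y**2)
```

With a^2 != b^2, the generic point gives None, since its two components
demand different multipliers; only the axis points solve the system.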

Therefore, there is a minimum at (x, y) = (0, 0),
and the maxima are at (x, y) = (+-a, 0) if a^2 > b^2 or (0, +-b) if b^2 > a^2.
(If a^2 = b^2, then \rho = a^2 = b^2 at any point on the circle,
so they are all maxima.)
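
The final comparison can be sketched in Python: evaluate \rho at every candidate
(the interior critical point and the four points on the ellipse), then cross-check
the largest value against a brute-force sample of the filled ellipse.
The function names, grid sizes, and sample values of a and b are assumptions
for illustration, not part of the method itself.

```python
import math


def rho(x, y):
    return x**2 + y**2


def best_candidates(a, b):
    """Compare rho at the origin and at the four axis points of the ellipse."""
    candidates = [(0.0, 0.0), (a, 0.0), (-a, 0.0), (0.0, b), (0.0, -b)]
    values = [rho(x, y) for x, y in candidates]
    lo, hi = min(values), max(values)
    minima = [p for p, v in zip(candidates, values) if v == lo]
    maxima = [p for p, v in zip(candidates, values) if v == hi]
    return minima, maxima


def sampled_max(a, b, n_angle=720, n_radius=50):
    """Brute-force maximum of rho over sample points (r*a*cos t, r*b*sin t), 0 <= r <= 1."""
    best = 0.0
    for i in range(n_radius + 1):
        r = i / n_radius
        for k in range(n_angle):
            t = 2 * math.pi * k / n_angle
            best = max(best, rho(r * a * math.cos(t), r * b * math.sin(t)))
    return best


for a, b in [(3.0, 2.0), (2.0, 3.0), (2.0, 2.0)]:
    minima, maxima = best_candidates(a, b)
    print("a =", a, "b =", b, "min at", minima, "max at", maxima,
          "sampled max =", sampled_max(a, b))
```

The brute-force value is only a cross-check; it should agree with max(a^2, b^2)
up to rounding.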

In general, when you use Lagrange multipliers,
you have a function like \rho and one or more constraints like F.
If the constraint functions are F_1, ..., F_n,
then Lagrange's theorem is that there are numbers \lambda_1, ..., \lambda_n
such that d\rho = \lambda_1 dF_1 + ... + \lambda_n dF_n at each extremum
(provided dF_1, ..., dF_n are linearly independent there).
This equation would give you just enough to solve for the extremum points
if you knew the n \lambdas; you don't, but you get n more equations from F_i = 0.
So, you can solve the equations (at least in principle) to get your answer.
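
To illustrate the counting with more than one constraint, here is a minimal Python
sketch that builds the residuals of d\rho = \lambda_1 dF_1 + ... + \lambda_n dF_n
together with F_i = 0, and checks them at a solution worked out by hand.
The example problem (extremizing \rho = z on the circle where the unit sphere
meets the plane x + y + z = 0) and all function names are assumptions chosen
for illustration; they do not come from the discussion above.

```python
import math


def lagrange_residuals(grad_rho, constraints, grad_constraints, point, lambdas):
    """Residuals of d(rho) = sum_i lambda_i dF_i, followed by the F_i themselves.

    All residuals are zero exactly when (point, lambdas) solves the system.
    """
    g = grad_rho(point)
    grads = [dF(point) for dF in grad_constraints]
    stationarity = [
        g[k] - sum(lam * dF[k] for lam, dF in zip(lambdas, grads))
        for k in range(len(point))
    ]
    feasibility = [F(point) for F in constraints]
    return stationarity + feasibility


# Hypothetical example: rho = z, F_1 = x^2 + y^2 + z^2 - 1, F_2 = x + y + z.
def grad_rho(p):
    return [0.0, 0.0, 1.0]

def F1(p):
    return p[0]**2 + p[1]**2 + p[2]**2 - 1

def F2(p):
    return p[0] + p[1] + p[2]

def dF1(p):
    return [2 * p[0], 2 * p[1], 2 * p[2]]

def dF2(p):
    return [1.0, 1.0, 1.0]


# Hand-derived maximum of z: x = y = -1/sqrt(6), z = 2/sqrt(6),
# with lambda_1 = sqrt(6)/6 and lambda_2 = 1/3.
s = 1 / math.sqrt(6)
point = [-s, -s, 2 * s]
lambdas = [math.sqrt(6) / 6, 1 / 3]

residuals = lagrange_residuals(grad_rho, [F1, F2], [dF1, dF2], point, lambdas)
# 3 coordinates + 2 multipliers = 5 unknowns, matched by 3 + 2 = 5 equations.
print(len(residuals), residuals)
```

Here there are five unknowns (x, y, z, \lambda_1, \lambda_2) and five equations,
so the system is (at least in principle) solvable, as described above.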