Scaling an image to an appropriate number of pixels for your project

Today I'm writing a Python module that works with image data (check back in a few days for my post about it) and I wanted to arrive at the optimal dimensions an image must be to have a certain number of pixels.

For example, if I want to see how fast my algorithm is for, say, 1 million RGB vectors, then I want to scale my example image to 1 megapixel.

In this post I'm working with an image that has dimensions \((3840,2160)\), or a size of about 8.3 million pixels. But what should the \(X\) and \(Y\) dimensions of my image be then, if I want it to be as close to 1 million pixels as possible?

First Attempt

Let's write a Python function that preserves the aspect ratio of an image's dimensions and allows us to vary its size.
from typing import Tuple


def scale(X: int, Y: int, factor: float =1.0) -> Tuple[float, Tuple[float, float]]:
    """
    Scale our image's dimensions by some factor, preserving aspect ratio.
    """
    X_, Y_ = factor * X, factor * Y
    return X_ * Y_, (X_, Y_)
With this function we can make guesses until we arrive at a solution that's 'close enough'. But what if we want a nearly-exact answer?

Second Attempt: Gradient Descent

The cost or error of our scale function is the difference between our expected number of pixels and the product of the resultant dimensions after applying some scale factor \(\gamma\in\mathbb{R}^{+}\). In mathematical terms, \begin{align}\text{cost}\left(\gamma\right)&=\text{expected}-\gamma^{2}XY.\end{align}In this equation we'd like to optimize the variable \(\gamma\). But there's a problem - if we take the partial derivative of this function as-is, we'll lose an important constraint! Specifically, the value of '\(\text{expected}\)'; so let's square our error function.
\begin{align}\text{cost}\left(\gamma\right)&=\dfrac{1}{2}\left[\text{expected}-\gamma^2XY\right]^2.\end{align}Applying a partial derivative to this function, we arrive at
\begin{align}\partial_{\gamma}\,\text{cost}\left(\gamma\right)&=2XY\gamma\left[\gamma^2XY-\text{expected}\right].\end{align}This is just as simple in Python.
def nabla_scale(pixels: float, expected: float, factor: float =1.0) -> float:
    return 2 * pixels * factor * (factor ** 2 * pixels - expected)
Now let's use the gradient descent algorithm I provided a definition for in a previous post to arrive at an optimal value of \(\gamma\).
\begin{align}\gamma_{\kappa+1}:=\gamma_{\kappa}-2XY\gamma_{\kappa}\left[\gamma_{\kappa}^2XY-\text{expected}\right].\end{align} In Python this is just looping and updating.
def gd() -> None:
    scalingFactor = 1.0     # the factor we wish to optimize
    iterations = 100
    rate = 0.15             # `learning' rate

    X, Y = 3840, 2160
    pixels = X * Y          # number of pixels currently in the image

    expectedValue = 1e6     # this is the number of pixels we want

    # Scale the data to [0,1] or this problem will blow up :P
    expectedValue /= pixels
    pixels = 1.0

    for _ in range(iterations):
        scalingFactor -= rate * nabla_scale(pixels, expectedValue, scalingFactor)

    print(f'{scalingFactor:.4f}', (round(scalingFactor * X), round(scalingFactor * Y)))


## 0.34729 (1334, 750)
Running this function, I arrive at an optimal scaling factor of \(0.3472...\) and image dimensions (\(1334\), \(750\)), whose product is \(1000500\), which is pretty darn close.

But is this the best solution? Playing around with the values, I found the dimensions \(1333,750\) to be even closer at \(999750\) pixels, or half the error. The solution to this problem is approaching the optimal \(\gamma\) from top and bottom. I've modified the values and function some to reflect this below.
def gd() -> None:
    scaleFactorAbove = 1.0  # the factors we wish to optimize
    scaleFactorBelow = 0.01
    iterations = 100
    rateAbove = 0.15        # `learning' rate
    rateBelow = 0.8

    X, Y = 3840, 2160
    pixels = X * Y          # number of pixels currently in the image

    expectedValue = 1e6     # this is the number of pixels we want

    # Scale the data to [0,1] or this problem will blow up :P
    expectedValue /= pixels
    pixels = 1.0

    # Plot the values vs. iteration
    above = []
    below = []

    for _ in range(iterations):
        above.append(scaleFactorAbove)
        scaleFactorAbove -= rateAbove * nabla_scale(pixels, expectedValue, scaleFactorAbove)
        below.append(scaleFactorBelow)
        scaleFactorBelow -= rateBelow * nabla_scale(pixels, expectedValue, scaleFactorBelow)

    print(f'{scaleFactorAbove:.4f}', (round(scaleFactorAbove * X),
                                      round(scaleFactorAbove * Y)))
    print(f'{scaleFactorBelow:.4f}', (round(scaleFactorBelow * X),
                                      round(scaleFactorBelow * Y)))

## 0.3473 (1334, 750)
## 0.3472 (1333, 750)  # Notice this approaches the more-optimal value
Also, here's a plot of the scale factor vs. time, so we can see how it asymptotically approaches the optimal value.

Third Attempt: Algebra

But of course there's an even simpler solution we can work out by setting
\begin{align}0&=\text{cost}\left(\gamma\right)\\&=\text{expected}-\gamma^2XY\\\implies\gamma&=\sqrt{\dfrac{\text{expected}}{XY}}.\end{align}This problem has a closed-form solution, in other words. In Python this is almost trivial.
def gd() -> None:
    X, Y = 3840, 2160
    pixels = X * Y          # number of pixels currently in the image
    expectedValue = 1e6     # number of pixels we want
    solution = (expectedValue / (X * Y)) ** (1/2)
    print(f'{solution:.4f}', (round(solution * X), round(solution * Y)))

## 0.3472 (1333, 750)
I like this post because it shows my thought process: at first, I think I have a pretty difficult problem that requires some moderately-complex hardware to solve. Then, after sitting on the problem overnight and taking a 5 mile walk, I realize it required little more than basic math after all.

It's still a cool example problem I think others would appreciate who are just getting started with optimization, because we can check ourselves with basic tools.