Digital Media File Sizes May Be About to Change. Radically

Since the dawn of the digital age, there’s been one truism for digital media: size matters. File size, that is.

From digital music to digital photos and videos, the size of a file generally corresponds to its quality: the larger the file, the better the quality. As compression techniques have advanced, this relationship has become less stark, but the trade-off remains. For visual content in particular, the pressure on hardware manufacturers always points in the same direction: more. Sensors must be packed with more pixels. Image and video codecs in capture devices must preserve more and more information (preferably all of it). Storage devices, both in-camera and external, must grow bigger and faster to accommodate this never-ending push for more data.

But that may be about to change in potentially radical ways.

Developments in artificial intelligence, specifically neural networks and generative adversarial networks (GANs), are beginning to reshape the contours of what’s considered usable information inside a digital media file. These developments could have profound ramifications for the entire digital ecosystem, from capture to storage, display, and transmission.

To understand just how transformative the change may be, it’s worth examining the one area where this technology is most mature and has recently been commercialized: digital imaging.

In 2016 and 2017, academic papers detailing how to generate “super-resolution” images from low-resolution inputs began appearing in open-access online journals. While the precise approaches varied, the overarching technique involved a pair of neural networks. One network was given low-resolution input images with the goal of turning them into higher-resolution outputs. The second, so-called “adversarial” network was trained to spot imperfections in the high-resolution outputs generated by the first network, the kinds of flaws a human eye would notice. Then the first network was given a second job: generate high-resolution images from low-resolution inputs that could pass muster with the adversarial network.

What the academics discovered is that as these two neural networks churned away in an iterative process, they groped their way toward generating very high-resolution images from low-resolution inputs that would look pleasing to human observers. But they did more than that: they produced results whose quality far surpassed upscaling performed using older, non-AI methods.
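To make the mechanics concrete, here is a heavily simplified sketch, in PyTorch, of that two-network arrangement: a generator that enlarges an image and an adversary that learns to flag unconvincing results. The layer sizes, loss weights, and random stand-in images are assumptions for illustration, not the architecture of any particular paper or product.

```python
# A heavily simplified sketch of the two-network (GAN) setup described above.
# Layer sizes, loss weights, and the random "images" are illustrative placeholders.
import torch
import torch.nn as nn

SCALE = 4  # upscaling factor (low-res -> high-res)

# Network 1 (the generator): maps a low-resolution image to a higher-resolution one.
generator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3 * SCALE * SCALE, kernel_size=3, padding=1),
    nn.PixelShuffle(SCALE),  # rearranges channels into an image SCALE times larger
)

# Network 2 (the adversary): scores how convincing a high-resolution image looks.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1),
)

adv_loss = nn.BCEWithLogitsLoss()   # "does this look real?" loss
pixel_loss = nn.L1Loss()            # "does this match the original?" loss
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def train_step(low_res, high_res):
    """One round of the iterative back-and-forth between the two networks."""
    batch = low_res.size(0)
    real, fake_label = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1. Teach the adversary to tell real high-res images from generated ones.
    fake = generator(low_res).detach()
    d_loss = adv_loss(discriminator(high_res), real) + adv_loss(discriminator(fake), fake_label)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2. Teach the generator to fool the adversary while staying close to the target.
    fake = generator(low_res)
    g_loss = pixel_loss(fake, high_res) + 1e-3 * adv_loss(discriminator(fake), real)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

# Dummy data: eight 32x32 low-res crops paired with their 128x128 originals.
lr_batch, hr_batch = torch.rand(8, 3, 32, 32), torch.rand(8, 3, 32 * SCALE, 32 * SCALE)
print(train_step(lr_batch, hr_batch))
```

Real systems use far deeper generators and blend the adversary’s feedback with perceptual losses, but the push-and-pull between the two networks is the same.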

Armed with this research, developers at several imaging software companies set about building their own models and neural nets to create upscaling software that could run on local computers (the academic researchers had the luxury of powerful servers). One of the first fruits of this work came from Topaz Labs, which this year launched a program called A.I. Gigapixel that can upscale an image by up to 600 percent. Other imaging software developers are preparing to launch programs built on similar AI-powered approaches.

While kinks remain in bringing this technology to the masses, the implications of this form of upscaling software are profound. It erodes, if not completely erases, the competitive advantage of having a higher-megapixel digital camera. If a 12-megapixel photo can become a 72-megapixel photo once it’s in a computer, what’s the point of having a 72-megapixel image sensor? What’s the point of buying terabytes worth of storage to house these images?

In the future, the final size of an image file can be dynamic: if you need to produce a print, you can upscale as necessary, but if you want to share on a smartphone, you need only send a file sized for the phone’s display. When it comes to longer-term digital archiving, you can save a very small file (preserving disk space), knowing full well that you can upscale it in the future if necessary.
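The arithmetic behind that dynamic sizing is simple enough to sketch. The figures below, a 300-DPI print and a 1080p-class phone screen, are common rules of thumb rather than fixed requirements.

```python
# Back-of-the-envelope pixel budgets for different output media (assumed rules of thumb).
def pixels_needed(width_in, height_in, dpi=300):
    """Pixel dimensions required to render at a given physical size and density."""
    return int(width_in * dpi), int(height_in * dpi)

w, h = pixels_needed(8, 10)                  # 8x10-inch print at 300 DPI
print(w, h, round(w * h / 1e6, 1), "MP")     # 2400 3000 7.2 MP

# A 1920x1080 phone screen needs only about 2.1 MP, so a small archived file can be
# shared to the phone as-is and AI-upscaled later only if a large print is required.
print(round(1920 * 1080 / 1e6, 1), "MP")     # 2.1 MP
```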

This form of upscaling needn’t be restricted to still images; it can be applied to digital music and video as well. The video use case is particularly compelling (and challenging) for two reasons: first, file sizes are so large, and second, streaming services are constantly angling for ways to encode media more efficiently for travel across bandwidth-constrained networks. Next month, we’ll examine how these techniques could reshape the video market.