Researchers find way to boost self-supervised AI models’ robustness

In self-supervised learning, an AI technique where the training data is automatically labeled by a feature extractor, that extractor often exploits low-level features (known as “shortcuts”) that cause it to ignore useful representations. In search of a technique that might remove those shortcuts automatically, researchers at Google Brain developed a framework, which they call a “lens,” that modifies input images so that self-supervised models trained with it outperform those trained in a conventional fashion.

As the researchers explain in a preprint paper published this week, in self-supervised learning, extractor-generated labels are used to create a pretext task that requires learning abstract, semantic features. A model pre-trained on the task can then be transferred to tasks for which labels are expensive to obtain, for example by fine-tuning the model for a given target task. But defining pretext tasks is often challenging because models are biased toward exploiting the simplest features, like logos, watermarks, and color fringes caused by camera lenses.

Fortunately, the features a model can use to solve a pretext task can also be used by an adversary to make the pretext task harder. The researchers’ framework, which targets self-supervised computer vision models, processes images with a lightweight image-to-image model called a lens, which is trained adversarially to reduce pretext task performance. Once trained, the lens can be applied to unseen images, so it can be used when transferring the model to a new task. In addition, the lens can help to visualize the shortcuts by spotlighting the differences between the input and the output images, providing insight into how shortcuts differ.
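The adversarial setup can be illustrated with a toy sketch. It is not the researchers’ implementation: the “images” are two-number feature vectors, the lens is a hypothetical per-feature gain rather than an image-to-image network, the extractor is a logistic model, and the reconstruction penalty `lam` is an assumption added here to keep lens outputs close to the inputs. All function names are invented for illustration.

```python
import math

# Toy data: ([shortcut, semantic], label). The shortcut feature always agrees
# with the label; the semantic feature is informative but noisy.
DATA = [([1.0, 2.0], 1), ([1.0, -0.5], 1), ([-1.0, -2.0], 0), ([-1.0, 0.5], 0)]

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def apply_lens(gains, x):
    """Hypothetical lens: scale each input feature by a gain in [0, 1]."""
    return [g * xi for g, xi in zip(gains, x)]

def logit(weights, gains, x):
    return sum(w * xi for w, xi in zip(weights, apply_lens(gains, x)))

def pretext_loss(weights, gains, data=DATA):
    """Mean cross-entropy of the logistic extractor on lens-processed inputs."""
    total = 0.0
    for x, y in data:
        u = logit(weights, gains, x)
        margin = u if y == 1 else -u
        total += math.log1p(math.exp(-margin))
    return total / len(data)

def extractor_step(weights, gains, lr=1.0, data=DATA):
    """One gradient step that lowers the pretext loss (lens held fixed)."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sigmoid(logit(weights, gains, x)) - y
        for j, xj in enumerate(apply_lens(gains, x)):
            grad[j] += err * xj / len(data)
    return [w - lr * g for w, g in zip(weights, grad)]

def lens_step(weights, gains, lr=1.0, lam=0.1, data=DATA):
    """One gradient step that raises the pretext loss (extractor held fixed),
    with an assumed penalty lam * (lens(x) - x)^2 on changing the input."""
    grad = [0.0] * len(gains)
    for x, y in data:
        err = sigmoid(logit(weights, gains, x)) - y
        for j, xj in enumerate(x):
            adversarial = -err * weights[j] * xj  # d(-pretext loss)/d(gain_j)
            reconstruction = 2.0 * lam * (gains[j] - 1.0) * xj * xj
            grad[j] += (adversarial + reconstruction) / len(data)
    # Clamp so the lens can only attenuate features, never amplify or flip them.
    return [min(1.0, max(0.0, g - lr * dg)) for g, dg in zip(gains, grad)]

def train(steps=20):
    """Alternate extractor and lens updates, starting from an identity lens."""
    weights, gains = [0.0, 0.0], [1.0, 1.0]
    for _ in range(steps):
        weights = extractor_step(weights, gains)
        gains = lens_step(weights, gains)
    return weights, gains
```

In this toy, the first lens update attenuates the shortcut feature more than the semantic one, because the extractor leans on the shortcut most heavily. How the gains evolve after that depends on the learning rates and on `lam`. The per-feature difference between `x` and `apply_lens(gains, x)` plays the role of the input-versus-output comparison that the article says can be used to visualize shortcuts.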

In experiments, the researchers trained a self-supervised model on an open source data set, CIFAR-10, and tasked it with predicting the correct orientation of images rotated slightly. To test the lens, they added shortcuts to the input images that were designed to contain directional information and to allow the rotation task to be solved without learning object-level features. They report that representations learned by the model without the lens from the data with synthetic shortcuts performed poorly, while feature extractors trained with the lens performed “dramatically” better overall.
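To see why a directional shortcut undermines a rotation task, consider the following minimal sketch. It assumes rotations in quarter turns (a common formulation that the article does not specify) and uses a hypothetical shortcut: a single marker pixel placed in the top-left corner before rotation. The marker value, the helper names, and the tiny random “image” are all illustrative assumptions, not the synthetic shortcuts used in the study.

```python
import random

SHORTCUT = 9  # hypothetical marker value; ordinary pixels below stay in 0..8

def rotate90(img, k):
    """Rotate a square image (list of rows) counter-clockwise by k quarter turns."""
    out = [list(row) for row in img]
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        out = [list(row) for row in zip(*out)][::-1]
    return out

def add_shortcut(img):
    """Return a copy of the image with the marker in the top-left corner."""
    out = [list(row) for row in img]
    out[0][0] = SHORTCUT
    return out

def make_example(img, k, with_shortcut=True):
    """Build one pretext example: (rotated image, rotation label k)."""
    if with_shortcut:
        img = add_shortcut(img)
    return rotate90(img, k), k

def shortcut_solver(img):
    """Predict the rotation from the marker position alone, ignoring content."""
    n = len(img)
    # Where the original top-left pixel lands after 0, 1, 2, 3 quarter turns.
    corners = [(0, 0), (n - 1, 0), (n - 1, n - 1), (0, n - 1)]
    for k, (r, c) in enumerate(corners):
        if img[r][c] == SHORTCUT:
            return k
    return None  # no marker found: the shortcut is unavailable

rng = random.Random(0)
image = [[rng.randint(0, 8) for _ in range(4)] for _ in range(4)]
```

With the marker present, `shortcut_solver` recovers every rotation label without looking at anything else in the image, which is the failure mode the synthetic shortcuts were meant to provoke. Without the marker it returns `None`, so a model would have to rely on the image content instead.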

In a second test, the team trained a model on over a million images in the open source ImageNet corpus and had it predict the relative location of one or more patches contained within the images. They say that for all tested tasks, adding the lens led to an improvement over the baseline.

“Our results show that the benefit of automatic shortcut removal using an adversarially trained lens generalizes across pretext tasks and across data sets. Furthermore, we find that gains can be observed across a wide range of feature extractor capacities,” wrote the study’s coauthors. “Apart from improved representations, our approach allows us to visualize, quantify and compare the features learned by self-supervision. We confirm that our approach detects and mitigates shortcuts observed in prior work and also sheds light on issues that were less known.”

In future research, they plan to explore new lens architectures and see whether the technique could be applied to further improve supervised learning algorithms.
