An Adversarial Example Method Based on Dropping Information to Improve Attack Capability

Deep neural networks are now widely used in many research fields. As research on artificial intelligence has deepened, however, researchers have found that AI technology based on deep neural networks brings security risks along with its convenience. For example, by adding slight perturbations to a clean image with an adversarial example method, an attacker can cause an image classification model to output a wrong result with high confidence. In contrast to earlier methods, which generate adversarial examples by adding extra information to an image, Ranjie Duan et al. recently proposed the AdvDrop algorithm, which generates adversarial examples by deleting existing information from the image through adjustment of the quantization step. In its quantization process, however, AdvDrop does not consider that different gradient values of the quantization table affect the adversarial effect differently. To address this, we propose AdvDrop+, which in each iteration updates the quantization table according to its gradient values, scaled by a scaling factor. To find a suitable scaling factor, AdvDrop+ finds the most frequent gradient value in the gradient histogram and computes its logarithm; the result is the scaling factor. Experiments show that, in the targeted-attack setting and with nearly the same image distortion, AdvDrop+ achieves better attack performance than AdvDrop. At the same time, AdvDrop+ retains AdvDrop's information-dropping property.
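
The scaling-factor step described above can be illustrated with a minimal NumPy sketch. This is one possible reading of the abstract, not the authors' implementation: the function names `scaling_factor` and `update_q_table`, the use of gradient magnitudes, the base-10 logarithm and its absolute value, the bin count, the descent-style update, and the clipping range are all assumptions made for illustration.

```python
import numpy as np


def scaling_factor(grad, bins=100, eps=1e-12):
    """Hypothetical helper: log-based factor from the gradient histogram.

    Assumptions (not from the source): gradient magnitudes are
    histogrammed, the logarithm is base 10, and its absolute value
    is used as the factor.
    """
    mags = np.abs(np.asarray(grad, dtype=float)).ravel()
    counts, edges = np.histogram(mags, bins=bins)
    k = int(np.argmax(counts))                    # index of the fullest bin
    mode = 0.5 * (edges[k] + edges[k + 1])        # centre of that bin
    return float(abs(np.log10(max(mode, eps))))   # guard against log(0)


def update_q_table(q_table, grad, lr=1.0, q_min=1.0, q_max=100.0):
    """Hypothetical update: scaled gradient step, then clip to a valid range."""
    alpha = scaling_factor(grad)
    q = np.asarray(q_table, dtype=float)
    g = np.asarray(grad, dtype=float)
    return np.clip(q - lr * alpha * g, q_min, q_max)


# Toy 8x8 quantization table and a synthetic gradient, for illustration only.
rng = np.random.default_rng(0)
q_table = np.full((8, 8), 20.0)
grad = rng.normal(scale=1e-3, size=(8, 8))
new_q = update_q_table(q_table, grad)
```

In an actual attack loop the gradient would come from backpropagating the attack loss to the quantization table; here it is random data, so the sketch shows only the mechanics of the histogram-based scaling.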