To introduce semantic correlations between labels into multi-label image classification, ADD-GCN (Attention-Driven Dynamic Graph Convolutional Network) generates a dynamic graph for each image. Its Dynamic Graph Convolutional Network (D-GCN) models the relationships between the content-aware category representations produced by the Semantic Attention Module (SAM), which avoids frequency bias. However, ADD-GCN cannot automatically learn to focus selectively on the important information in its input, its data propagation is prone to instability, and it has difficulty capturing all semantic relationships between labels. To address these problems, a new multi-label image recognition model based on a multi-scale dynamic graph convolutional network is proposed, built on ADD-GCN. The model updates the multi-scale feature extraction module and integrates the Convolutional Block Attention Module (CBAM) to focus automatically on important information. To improve data propagation through SAM and D-GCN, the Gaussian Error Linear Unit (GELU) is used as the activation function that maps neuron inputs to outputs, and Adaptive Moment Estimation (Adam) and Binary Cross-Entropy With Logits Loss (BCEWithLogitsLoss) are introduced so that data propagates stably. The model reaches an average accuracy of 87.1% on the MS-COCO dataset and 95.6% on the PASCAL VOC dataset, improving detection and recognition accuracy for multi-label images over the original ADD-GCN model.
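
As an illustration of two components named in the abstract, the sketch below computes the exact GELU activation and a numerically stable binary cross-entropy on raw logits, which is the quantity a loss such as BCEWithLogitsLoss evaluates for multi-label targets. It uses only the Python standard library; the function names `gelu` and `bce_with_logits` are hypothetical and are not taken from the paper's implementation.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed directly from raw logits.

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which is
    algebraically equal to -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(z)
    but does not overflow for large |z|.
    """
    if len(logits) != len(targets) or not logits:
        raise ValueError("logits and targets must be non-empty and equal in length")
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Illustrative values only: one image, three candidate labels.
# Each label is an independent yes/no decision, so the loss is averaged
# over labels rather than normalized across them with a softmax.
example_logits = [2.0, -1.0, 0.5]
example_targets = [1.0, 0.0, 1.0]
example_loss = bce_with_logits(example_logits, example_targets)
```

Working on logits rather than on sigmoid outputs keeps the loss finite even when a logit is very large in magnitude, which is one plausible reading of the stability the abstract attributes to this loss.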
@article{l13102024ijsea13101022,
Title = "Multi-Label Image Recognition Based on Attention and Multi-Scale Dynamic Graph Convolutional Network",
Journal = "International Journal of Science and Engineering Applications (IJSEA)",
Volume = "13",
Issue = "10",
Pages = "102--109",
Year = "2024",
Author = "Luo Litao and Lin Qihuang"}