Abstract: To improve image annotation performance, an image semantic annotation method based on a multi-modal relational graph was proposed. The relationships among the low-level features of image regions, annotation words, and images were represented by an undirected graph. Semantic information was extracted by combining the similarity measured in the region feature space with the correlation between annotation words, which improved the accuracy of the extracted semantics. Inverse Document Frequency (IDF) was introduced to adjust the weights of the edges between each image node and its annotation word nodes, in order to counteract the bias caused by high-frequency words. Experimental results on the Corel image datasets show that the proposed approach improves the quality of image annotation.
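
The IDF adjustment of image-to-word edges can be illustrated with a minimal sketch. It assumes the textbook form IDF(w) = log(N / df(w)), where N is the number of annotated images and df(w) is the number of images annotated with word w; the abstract does not state the exact formula used, so this form, the function name `idf_edge_weights`, and the toy annotations are all hypothetical.

```python
import math
from collections import Counter


def idf_edge_weights(annotations):
    """Return {(image_id, word): weight} for every image-word edge.

    annotations maps an image id to its list of annotation words.
    Assumption: weight = log(N / df(word)), the textbook IDF form,
    which may differ from the weighting used in the paper.
    """
    n_images = len(annotations)
    # Document frequency: number of images annotated with each word.
    doc_freq = Counter(
        word for words in annotations.values() for word in set(words)
    )
    weights = {}
    for image_id, words in annotations.items():
        for word in set(words):
            weights[(image_id, word)] = math.log(n_images / doc_freq[word])
    return weights


# Hypothetical toy annotations, not taken from the Corel datasets.
annotations = {
    "img1": ["sky", "water", "tiger"],
    "img2": ["sky", "grass"],
    "img3": ["sky", "water"],
}
weights = idf_edge_weights(annotations)
```

In this toy example "sky" labels every image, so its edges receive the lowest weight, while the rare word "tiger" receives the highest. Under this unsmoothed form a word attached to every image gets weight zero, so a smoothed variant may be preferable when such edges should remain in the graph.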