|Title:||Attention-driven image interpretation, annotation and retrieval|
|Authors:||Fu, Hong|
|Degree:||Ph.D.|
|Issue Date:||2007|
|Abstract:||This thesis presents novel attention-driven techniques for image interpretation, annotation and retrieval. It reports four main contributions: (1) an attention-driven image interpretation method with application to image retrieval; (2) an efficient algorithm for attention-driven image interpretation from image segments; (3) a pre-classification technique that separates "attentive" images from "non-attentive" images, together with a combined retrieval strategy; and (4) a semantic network for image annotation that models attentive objects and their correlations. In the first investigation, we propose an attention-driven image interpretation method that pops out visually attentive objects from an image iteratively by maximizing a global attention function. In this method, an image is interpreted as containing several perceptually attended objects as well as the background, and each object is measured by an attention value. The attention values of attentive objects are then mapped to importance measures to facilitate the subsequent image retrieval. An attention-driven matching algorithm is proposed, based on a retrieval strategy that emphasizes attended objects. Experiments on 7376 Hemera color images annotated by keywords show that the retrieval results of our attention-driven approach compare favorably with those of conventional methods, especially when important objects are seriously concealed by irrelevant background. In the second investigation, the computational cost of attention-driven image interpretation is addressed. The object reconstruction is a combinatorial optimization problem with a complexity of 2^N, which is computationally very expensive when the number of segments N is large. We formulate the attention-driven image interpretation process using a matrix representation. An efficient algorithm based on elementary transformations of the matrix is proposed to reduce the computational complexity to 3N(N-1)^2/2. Experimental results on both synthetic and real data show an acceptably small degradation in the accuracy of object formulation, while the processing speed increases significantly. In the third investigation, an all-season image retrieval system is proposed. The system can handle both images with clearly attentive objects and images without them. In the first step, a neural network trained with the Back Propagation Through Structure (BPTS) algorithm classifies an image as "attentive" or "non-attentive" based on its visual contrasts and spatial information. In the second step, an "attentive" image is processed by an attentive retrieval strategy that emphasizes attentive objects, while a "non-attentive" image is processed by a fusing-all retrieval system. This combined system yields improved performance. In the fourth investigation, we propose an image annotation method based on attentively interpreted images. First, a number of interpreted images are annotated manually as training samples. Second, a semantic network is constructed, which stores both the visual classifiers of attentive objects and the correlations among concepts. Finally, an annotation strategy is proposed that uses the semantic network to annotate objects. Experimental results show that the trained semantic network produces good annotation results, especially when a visual classifier does not produce a precise concept.|
|Subjects:||Hong Kong Polytechnic University -- Dissertations. Image processing -- Mathematics.|
|Pages:||xx, 169 leaves : ill. (chiefly col.) ; 30 cm.|
|Appears in Collections:||Thesis|
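
The two complexity figures quoted in the abstract can be compared numerically. The sketch below is only illustrative arithmetic, not the thesis's algorithm: it assumes the figures read as 2^N for exhaustive object reconstruction and 3N(N-1)^2/2 for the matrix-based method (the superscripts are an interpretation of the record's text), and the function names `exhaustive_cost` and `reduced_cost` are hypothetical.

```python
def exhaustive_cost(n: int) -> int:
    """Count for exhaustive search over N segments, read as 2^N (assumed)."""
    return 2 ** n


def reduced_cost(n: int) -> int:
    """Count quoted for the matrix-based method, read as 3N(N-1)^2/2 (assumed).

    N(N-1) is always even, so the integer division is exact.
    """
    return 3 * n * (n - 1) ** 2 // 2


# Print both counts side by side for a few segment counts N.
for n in (5, 10, 11, 20, 30):
    print(f"N={n:2d}  exhaustive={exhaustive_cost(n):>12,d}  reduced={reduced_cost(n):>8,d}")
```

Under this reading, the polynomial count is the smaller of the two for every N of 11 or more, and the gap widens quickly as N grows. These are abstract operation counts, not measured running times.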
View full-text via https://theses.lib.polyu.edu.hk/handle/200/160