The project proposes a new scheme for
image analysis and retrieval in a large image database. The proposed
retrieval system integrates both low-level image features, such as
colour and texture, and textual information, such as the image file name
or metadata, to improve retrieval performance. Most existing work
focuses on low-level image features for image retrieval and ignores
the textual information associated with the images. The proposed system
attempts to narrow the gap between content-based image retrieval and
semantic-based image retrieval. The research focuses on how to use
higher-level textual information to improve current content-based
image retrieval, and also proposes several new content-based techniques
such as the colour-spatial histogram, histogram dimension reduction and
the texture histogram.
Images are very rich in information. While some information is conveyed
in text description, other information is captured by their dominant
colours, object shapes and texture composition. It is unlikely that an
image can be described to users’ expectations using a single image
feature. Therefore, an effective image retrieval system should use a
combination of these features. The approach of this project is to
identify promising techniques for extracting these features and then to
combine the best of them, improving existing techniques or developing
new ones where needed, into an integrated image retrieval system.
In conventional content based image retrieval techniques, textual
information is not considered. However, textual information is very
important because it can capture semantics and high level abstraction
in images. It reflects human knowledge on the image data. The higher
level image features are extracted in two ways. The first is to extract
the higher level information from metadata. All multimedia data
comes with some type of metadata, e.g., the file name at
the lowest level, and alternative text in web documents. The second
is to extract the higher level information from the data itself. For
instance, specific colours such as red, green, blue, pink, yellow in green
extracted from image data can be represented as higher level semantic
features rather than as arbitrary numbers, as in conventional content-based
image retrieval. This embedded knowledge in the data is of great help
for data description and retrieval. Once textual features are
extracted, they can be used for image retrieval using the latest
information retrieval techniques [8, 9]. In implementation, images can
be retrieved first using textual features and then refined by content
features, or vice versa. This can improve both retrieval efficiency and
effectiveness. Textual features can also be used to provide a
semantic/keyword-based retrieval interface, which is more natural to
users than the common query-by-example (QBE) retrieval interface
in content-based image retrieval (CBIR).
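The two-stage retrieval described above (filter by textual features, then refine by content features) can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the keyword-overlap filter, the toy database and its three-element feature vectors are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def text_then_content(
    query_keywords: Set[str],
    query_features: Sequence[float],
    database: Dict[str, dict],
    top_k: int = 10,
) -> List[str]:
    """Filter images by keyword overlap, then rank the survivors by L1 distance."""
    # Stage 1 (textual): keep images that share at least one keyword with the query.
    candidates = [
        name for name, entry in database.items()
        if query_keywords & entry["keywords"]
    ]
    # Stage 2 (content): rank the remaining candidates by feature distance.
    candidates.sort(
        key=lambda name: l1_distance(query_features, database[name]["features"])
    )
    return candidates[:top_k]


# Hypothetical toy database: keywords stand in for file-name or alt-text terms,
# and the feature vectors stand in for small normalized colour histograms.
database = {
    "red_rose.jpg": {"keywords": {"red", "rose"}, "features": [0.8, 0.1, 0.1]},
    "rose_garden.jpg": {"keywords": {"rose", "garden"}, "features": [0.3, 0.6, 0.1]},
    "blue_sky.jpg": {"keywords": {"blue", "sky"}, "features": [0.1, 0.1, 0.8]},
}

results = text_then_content({"rose"}, [0.7, 0.2, 0.1], database)
print(results)  # ['red_rose.jpg', 'rose_garden.jpg']
```

Because the textual filter discards non-matching images before any distance is computed, the content-based ranking only runs on a subset of the database. The reverse order (content first, text second) would swap the two stages.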
A widely used image retrieval technique is the colour histogram [11].
In the histogram technique, a chosen colour space is divided into n bins.
For each image, a histogram is built by counting the
number of pixels classified into each bin. The histogram becomes the
feature vector/descriptor of the image. During retrieval, images
are retrieved and ranked according to the histogram distances between
the query image and the images in the database. The distances commonly
used are the Manhattan distance (L1) and the Euclidean distance (L2).
However, common histogram techniques have several problems, such as bin
correlation, spatial correlation and high dimension. Past research has
addressed the bin correlation problem by considering the relationship
between neighbouring bins [12, 13]. Recently, the spatial correlation
problem has drawn extensive attention. A number of studies attempt a
region-based approach that makes use of the latest segmentation results [9, 14,
15]. In this research, we propose a technique called colour-spatial
histogram which is a joint histogram of both the conventional colour
histogram and the spatial histogram. In conventional colour histogram,
the value of each bin is the total number of pixels having that bin
colour, no spatial information is included in the value. In the
proposed colour-spatial histogram, however, image space or subspace is
quantized into a number of sections, say 4 or 16 sections. In the
succeeding count of pixels for each colour bin, rather than counting
the pixels irrespective of their spatial sections, pixels are put into
spatial sub-bins within that colour bin based on the pixel locations in
the image space. In other words, pixels falling into each colour bin
are further sorted according to their spatial locations in the image
space. The subsequent normalization and matching are similar to the
conventional histogram technique. The dimension of the colour-spatial
histogram will be higher than that of the conventional histogram; however,
the dimension can be reduced using the dimension reduction technique
proposed below. The technique is fundamentally different from
existing methods. Rather than attempting to group pixels into regions,
which is complex to implement and not robust, the proposed method
computes a histogram that incorporates both colour information and
spatial information. The proposed method will be compared with
region-based techniques.
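The colour-spatial histogram described above can be sketched as follows. This is only an illustration under stated assumptions: the proposal does not fix a colour space, a quantization scheme or a section layout, so the sketch assumes uniform RGB quantization with `levels` steps per channel and a regular `grid` x `grid` split of the image. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def colour_spatial_histogram(
    image: Sequence[Sequence[Pixel]],
    levels: int = 2,  # quantization levels per RGB channel (assumed)
    grid: int = 2,    # grid x grid spatial sections, e.g. 2 gives 4 sections
) -> List[float]:
    """Joint histogram: each colour bin is split into spatial sub-bins."""
    height, width = len(image), len(image[0])
    sections = grid * grid
    hist = [0.0] * (levels ** 3 * sections)
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # Colour bin: uniform quantization of each channel.
            rq, gq, bq = (c * levels // 256 for c in (r, g, b))
            colour_bin = (rq * levels + gq) * levels + bq
            # Spatial sub-bin: the grid cell that contains the pixel.
            section = (y * grid // height) * grid + (x * grid // width)
            hist[colour_bin * sections + section] += 1
    total = height * width
    return [count / total for count in hist]  # normalize to sum to 1


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def colour_marginal(hist: Sequence[float], sections: int) -> List[float]:
    """Collapse the spatial sub-bins to recover the conventional colour histogram."""
    return [sum(hist[i:i + sections]) for i in range(0, len(hist), sections)]


RED, BLUE = (255, 0, 0), (0, 0, 255)
image_a = [[RED, BLUE], [RED, BLUE]]   # red on the left, blue on the right
image_b = [[BLUE, RED], [BLUE, RED]]   # same colours, mirrored layout

hist_a = colour_spatial_histogram(image_a)
hist_b = colour_spatial_histogram(image_b)

# The conventional colour histograms are identical, so their distance is 0.
print(l1_distance(colour_marginal(hist_a, 4), colour_marginal(hist_b, 4)))  # 0.0
# The colour-spatial histograms separate the two layouts.
print(l1_distance(hist_a, hist_b))  # 2.0
```

With 8 colour bins and 4 sections the descriptor has 32 dimensions instead of 8, which illustrates the dimension growth the text mentions.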
To solve the high dimension problem, a spectral image descriptor based
on a spectral transform of the derived colour or colour-spatial histogram
will be proposed. Our previous experience with shape transforms has shown
that a spectral transform can effectively and significantly reduce feature
dimensions [10, 16]. Spectral features are also more robust than
spatial features. The errors caused by the colour quantization process
can also be reduced through the use of a spectral transform, because more
colours can be used to derive the colour histogram. The reduction of
feature dimension is a significant issue in image retrieval. Once a
solution for dimension reduction is found, more bins can be used in
colour histograms or other histogram-based features; as a result, more
accurate features can be used to describe images.
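One way to picture the dimension reduction is sketched below. The proposal does not say which spectral transform will be used, so this sketch assumes a one-dimensional discrete Fourier transform of the histogram and keeps only the magnitudes of the lowest-frequency coefficients. The function name, the `keep` parameter and the sample histogram are hypothetical.

```python
import cmath
from typing import List, Sequence


def spectral_descriptor(hist: Sequence[float], keep: int) -> List[float]:
    """Return the magnitudes of the `keep` lowest-frequency DFT coefficients."""
    n = len(hist)
    descriptor = []
    for k in range(keep):
        # Naive DFT coefficient k: sum of hist[i] * exp(-2*pi*j*k*i/n).
        coeff = sum(
            value * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, value in enumerate(hist)
        )
        descriptor.append(abs(coeff))
    return descriptor


# Hypothetical 32-bin histogram, normalized so that the bins sum to 1.
raw = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.0] * 4
total = sum(raw)
hist = [value / total for value in raw]

descriptor = spectral_descriptor(hist, keep=8)
print(len(hist), "->", len(descriptor))  # 32 -> 8
```

The first coefficient equals the sum of the bins (1 for a normalized histogram), so in practice it carries no discriminating information and could be dropped. Keeping magnitudes only is lossy, since phase is discarded; whether that trade-off suits histogram matching is something the proposed research would need to evaluate.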
Other features, such as texture histograms and shape features, can also
be incorporated into the retrieval to improve performance.
3. Qualifications
Generally, the applicant must hold a bachelor's degree with honours,
or a master's degree, in a related area. The applicant should have good
programming skills in Java or C.
International students must pass an English test, either IELTS or
TOEFL: IELTS (International English Language
Testing System - academic) - minimum test score of 6.5 with a score of
at least 6 for each individual band; TOEFL (Test of English as a
Foreign Language) - minimum test score of 575 with a TWE (Test of
Written English) score of 5. Students can apply straight away;
there is no time restriction on postgraduate enrolment.
If a student is awarded the scholarship, he/she is likely to obtain the
visa very soon.
The $30,000 scholarship is sufficient to cover both the tuition fee and
living expenses for one year. In Australia, a research master's degree
takes only one year; the student works only on the research project and
writes up a thesis, and no coursework is required. After one year, the
student can either complete the master's research to obtain the master's
degree or transfer to a PhD program before obtaining the master's degree.
4. Outcomes
A master's thesis on multimedia information retrieval and a master's
degree in computing are the direct results of this research. The research
is expected to produce one or two research papers published at
international conferences in the multimedia area. The thesis and
publications will pave the way for a multimedia career in either industry
or research. A student who completes this project will gain expertise in
content-based image retrieval and the MPEG-7 standard, as well as general
knowledge of multimedia computing, image processing and analysis.
5. References
[1] W. B. Frakes and R. Baeza-Yates (eds.), “Information
Retrieval: Data Structures and Algorithms”, Prentice Hall, 1992.
[2] G. Salton, “Automatic Text Processing—The Transformation, Analysis,
and Retrieval of Information by Computers”, Addison-Wesley Publishing
Company, 1989.
[3] G. Lu, “Multimedia Database Management Systems”, Artech House, 1999.
[4] B. S. Manjunath, P. Salembier and T. Sikora, “Introduction to
MPEG-7: Multimedia Content Description Interface”, John Wiley &
Sons Publisher, 2002.
[5] M. Flickner et al., “Query by Image and Video Content: the QBIC
System”, IEEE Computer 28(9):23-32, 1995.
[6] J. R. Bach et al., “Virage Image Search Engine: An Open Framework
for Image Management”, SPIE Conf. On Storage and Retrieval for Image
and Video Databases IV, San Jose, CA, pp.76-87, 1996.
[7] J. Feder, “Towards Image Content-based Retrieval for the World-Wide
Web”, Advanced Imaging 11(1):26-29, 1996.
[8] J. Yang, L. Wenyin, H. Zhang and Y. Zhuang, “Thesaurus-aided
Approach for Image Browsing and Retrieval”, In Proc. of IEEE
International Conference on Multimedia and Expo (ICME01), pp.313-316,
Tokyo, Japan, 2001.
[9] Y. Liu, D. S. Zhang and G. Lu, “Narrowing Down The ‘Semantic Gap’
in Content-Based Image Retrieval—A Survey”, Submitted to IEEE Trans. on
Multimedia, October, 2004.
[10] D. S. Zhang and G. Lu, “Evaluation of MPEG-7 Shape Descriptors
Against Other Shape Descriptors”, ACM Journal of Multimedia Systems,
Accepted in July 2002.
[11] M. J. Swain and D. H. Ballard, “Colour Indexing”, International
Journal of Computer Vision, 7(1):11-32, 1991.
[12] G. Lu and J. Phillips, “Using Perceptually Weighted Histograms For
Colour-Based Image Retrieval”, In Proc. of the 4th International
Conference on Signal Processing, pp.1150-1153, Beijing, China, October,
1998.
[13] J. Huang, S. Kumar, M. Mitra, W. Zhu, and R. Zabih, “Image
Indexing Using Colour Correlograms”, In Proc. of IEEE Conference on
Computer Vision and Pattern Recognition, pp.762-768, San Juan, Puerto
Rico, June 1997.
[14] G. Pass, R. Zabih and J. Miller, “Comparing Images Using Colour
Coherence Vectors”, In Proc. of the 4th ACM International Multimedia
Conference, pp.65-73, 1996.
[15] D. S. Zhang and G. Lu, “Segmentation of Moving Objects in
Image Sequence: A Review”, Circuits, Systems and Signal Processing,
20(2):143-183, 2001.
[16] D. S. Zhang and G. Lu, “Shape Based Image Retrieval Using Generic
Fourier Descriptors”, Signal Processing: Image Communication,
17(10):825-848, 2002.