Human Recognition, Visually Mediated Interaction, Communication and Surveillance

Shaogang Gong, Queen Mary University of London


Human Recognition, Video Content Analysis and Search (funded by MOD, EPSRC and EU FP7):

 

We are developing robust models for fully automated semantic content analysis of CCTV video, based on detecting and recognising human presence, abnormal events and activities, and on video topic spotting and scene-change detection without meta-data. We are also developing algorithms for automated selective zooming and super-resolution in CCTV recordings given variable levels of input resolution. We aim to synthesise, in arbitrary virtual views, good-quality close-up images of partially occluded low-resolution objects captured in live CCTV, in order to improve the accuracy of automatic face recognition in video under realistic operational conditions. Download the QMUL i-LIDS Re-identification Dataset, QMUL Underground Re-Identification (GRID) Dataset, QMUL Road Traffic Datasets, QMUL Mall Dataset, and QMUL Junction Dataset.
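As a minimal illustration of the person re-identification setting (a toy appearance baseline, not the Relative Distance Comparison model in the publications below), the sketch describes each detected person by a colour histogram and ranks a gallery against a probe by descriptor distance; all images here are synthetic random data.

```python
import numpy as np

def colour_histogram(image, bins=8):
    """Flattened per-channel colour histogram as a simple appearance
    descriptor; `image` is an (H, W, 3) uint8 array."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / (h.sum() + 1e-9)

def rank_gallery(probe, gallery):
    """Rank gallery descriptors by L1 distance to the probe, best match first."""
    dists = [np.abs(probe - g).sum() for g in gallery]
    return np.argsort(dists)

rng = np.random.default_rng(0)
# Synthetic stand-ins: one person in darker clothing, one in lighter clothing.
person_a = rng.integers(0, 128, (64, 32, 3), dtype=np.uint8)
person_b = rng.integers(128, 256, (64, 32, 3), dtype=np.uint8)
probe = colour_histogram(person_a)
gallery = [colour_histogram(person_b), colour_histogram(person_a)]
ranking = rank_gallery(probe, gallery)   # gallery index 1 should rank first
```

In practice such fixed distances perform poorly across camera views, which is precisely why the work below learns a comparison function from data instead.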

Selected publications:

1.     W. Zheng, S. Gong and T. Xiang. Re-identification by Relative Distance Comparison. IEEE Transactions on Pattern Analysis and Machine Intelligence, PrePrint ISSN: 0162-8828, June 2012.

2.     T. Hospedales, S. Gong and T. Xiang. Video Behaviour Mining Using a Dynamic Topic Model. International Journal of Computer Vision, Vol. 98, No. 3, pp. 303-323, July 2012.

3.     W. Zheng, S. Gong and T. Xiang. Quantifying and Transferring Contextual Information in Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 4, pp. 762-777, April 2012.

4.     C.C. Loy, T. Xiang and S. Gong. Incremental Activity Modelling in Multiple Disjoint Cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 9, pp. 1799-1813, September 2012.

5.     T. Hospedales, S. Gong and T. Xiang. A Unifying Theory of Active Discovery and Learning. In Proc. European Conference on Computer Vision, Firenze, Italy, October 2012.

6.     Y. Fu, T. Hospedales, T. Xiang and S. Gong. Attribute Learning for Understanding Unstructured Social Activity. In Proc. European Conference on Computer Vision, Firenze, Italy, October 2012.

7.     C. Liu, S. Gong, C.C. Loy and X. Lin. Person Re-Identification: What Features Are Important? In Proc. First International Workshop on Re-Identification, Firenze, Italy, October 2012.

8.     R. Layne, T. Hospedales and S. Gong. Towards Person Identification and Re-Identification with Attributes. In Proc. First International Workshop on Re-Identification, Firenze, Italy, October 2012.

9.     W. Zheng, S. Gong and T. Xiang. Transfer Re-identification: From Person to Set-based Verification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, June 2012.

10.  C.C. Loy, T. Hospedales, T. Xiang and S. Gong. Stream-based Joint Exploration-Exploitation Active Learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, June 2012.

11.  M. Bregonzio, T. Xiang and S. Gong. Fusing Appearance and Distribution Information of Interest Points for Action Recognition. Pattern Recognition, Vol. 45, No. 3, pp. 1220-1234, March 2012.

12.  S. Gong and T. Xiang. Visual Analysis of Behaviour: From Pixels to Semantics, 376 pages, Springer, May 2011.

13.  S. Gong, C.C. Loy and T. Xiang. Security and Surveillance. In Moeslund, Hilton, Kruger and Sigal (Eds.), Visual Analysis of Humans: Looking at People, pp. 455-472, Springer, September 2011.

14.  J. Li, S. Gong and T. Xiang. Learning Behavioural Context. International Journal of Computer Vision, Vol. 97, No. 3, pp. 276-304, May 2012.

15.  T. Hospedales, J. Li, S. Gong and T. Xiang. Identifying Rare and Subtle Behaviours: A Weakly Supervised Joint Topic Model. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 12, pp. 2451-2464, December 2011.

16.  T. Hospedales, S. Gong and T. Xiang. Learning Tags from Unsegmented Videos of Multiple Human Actions. In Proc. IEEE International Conference on Data Mining, Vancouver, Canada, December 2011.

17.  T. Hospedales, S. Gong and T. Xiang. Finding Rare Classes: Active Learning with Generative and Discriminative Models. IEEE Transactions on Knowledge and Data Engineering, PrePrint ISSN: 1041-4347, November 2011.

18.  W. Zheng, S. Gong and T. Xiang. Person Re-identification by Probabilistic Relative Distance Comparison. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, USA, June 2011.

19.  C.C. Loy, T. Xiang and S. Gong. Time-Delayed Correlation Analysis for Multi-Camera Activity Understanding. International Journal of Computer Vision, Vol. 90, No. 1, pp. 106-129, October 2010.

20.  J. Zhang and S. Gong. Action Categorisation by Structural Probabilistic Latent Semantic Analysis. Computer Vision and Image Understanding, Vol. 114, No. 8, pp. 857-864, August 2010.

21.  J. Zhang and S. Gong. Action Categorisation with Modified Hidden Conditional Random Field. Pattern Recognition, Vol. 43, No. 1, pp. 197-203, January 2010.

22.  W. Zheng, S. Gong and T. Xiang. Quantifying Contextual Information for Object Detection. In Proc. IEEE International Conference on Computer Vision, Kyoto, Japan, October 2009.

23.  T. Hospedales, S. Gong and T. Xiang. A Markov Clustering Topic Model for Mining Behaviour in Video. In Proc. International Conference on Computer Vision, Kyoto, Japan, October 2009.

24.  C.C. Loy, T. Xiang and S. Gong. Modelling Activity Global Temporal Dependencies using Time Delayed Probabilistic Graphical Model. In Proc. International Conference on Computer Vision, Kyoto, Japan, October 2009.

25.    M. Bregonzio, S. Gong and T. Xiang. Recognising Action as Clouds of Space-Time Interest Points. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, June 2009.

26.  C.C. Loy, T. Xiang and S. Gong. Multi-Camera Activity Correlation Analysis. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, June 2009.

27.  Y. Wang, T. Mei, S. Gong and X. Hua. Combining Global, Regional and Contextual Features for Automatic Image Annotation. Pattern Recognition, Vol. 42, No. 2, pp. 259-266, February 2009.

28. T. Xiang and S. Gong. Optimising Dynamic Graphical Models for Video Content Analysis. Computer Vision and Image Understanding, Vol. 112, No. 3, pp. 310-323, December 2008.

29.  J. Li, S. Gong and T. Xiang. Scene Segmentation for Behaviour Correlation. In Proc. European Conference on Computer Vision, Marseille, France, October 2008.

30.  D. Russell and S. Gong. Multi-Layered Decomposition of Recurrent Scenes. In Proc. European Conference on Computer Vision, Marseille, France, October 2008.

31.  T. Xiang and S. Gong. Incremental and Adaptive Abnormal Behaviour Detection. Computer Vision and Image Understanding, Vol. 111, No. 1, pp. 59-73, July 2008.

32.  T. Xiang and S. Gong. Activity based Surveillance Video Content Modelling. Pattern Recognition, Vol. 41, No. 7, pp. 2309-2326, July 2008.

33.  K. Jia and S. Gong. Generalised Face Super-Resolution. IEEE Transactions on Image Processing, Vol. 17, No. 6, pp. 873-886, June 2008.

34.  T. Mei, Y. Wang, X. Hua and S. Gong. Coherent image annotation by semantic distance learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, Anchorage, Alaska, USA, June 2008.

35.  T. Xiang and S. Gong. Video Behaviour Profiling for Anomaly Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 5, pp. 893-908, May 2008.

36. T. Xiang and S. Gong. Model selection for unsupervised learning of visual context. International Journal of Computer Vision, Vol. 69, No. 2, pp. 181-201, 2006.

37.  T. Xiang and S. Gong. Beyond tracking: Modelling activity and understanding behaviour. International Journal of Computer Vision, Vol. 67, No. 1,  pp. 21-51, 2006.

38.  K. Jia and S. Gong. Hallucinating multiple occluded face images of different resolutions. Pattern Recognition Letters, Vol. 27, No. 15, pp. 1768-1775, November 2006.

39.  K. Jia and S. Gong. Multi-resolution patch tensor for facial expression hallucination. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 395-402, New York, June 2006.

40.  T. Xiang and S. Gong. Visual learning given sparse data of unknown complexity. In Proc. IEEE International Conference on Computer Vision, Beijing, October 2005.

41.  K. Jia and S. Gong. Multi-modal tensor face for simultaneous super-resolution and recognition. In Proc. IEEE International Conference on Computer Vision, Beijing, October 2005.

42.  T. Xiang and S. Gong. Video behaviour profiling and abnormality detection without manual labelling. In Proc. IEEE International Conference on Computer Vision, Beijing, October 2005.

43.  T. Xiang and S. Gong. Online video behaviour abnormality detection using reliability measure. In Proc. British Machine Vision Conference, Oxford, September 2005.

44.  S. Gong. Finding a needle in haystacks: Towards behaviour recognition based video surveillance. In Security and Defence 2004, invited article, London, October 2004.

45.  A. Graves and S. Gong. Wavelet-based holistic sequence descriptor for generating video summaries. In Proc. British Machine Vision Conference, pp. 167-176, Kingston-upon-Thames, England, September 2004.

 

Face Recognition:

 

Recognising the faces of moving people requires not only labelling novel face images with known identities, but also detecting and tracking faces over time. We are interested in recognising moving faces captured in image sequences. Compared with the more typical scenario, in which a single or a few isolated frontal or near-frontal face images are the subject of interest, recognising the faces of moving people in natural scenes is notoriously more difficult. It requires not only correct recognition of the continuously changing face images of the same person, but also consistent detection and tracking of faces in a dynamic scene to enable any recognition to take place. In this scenario, face recognition must cope not only with changes in face images caused by variations in illumination, scale, translation and in-plane rotation, but must also associate face images of the same person across significantly different poses caused by head rotations in depth (the facial identity surface).
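A much-simplified sketch of pose-tolerant matching (not the identity-surface model developed in the publications below): each gallery identity is stored as several per-pose descriptors, and a probe is assigned to the identity whose closest stored view is nearest. The identities and descriptors below are hypothetical synthetic data.

```python
import numpy as np

def recognise(probe, gallery):
    """Assign the probe descriptor to the identity whose nearest stored
    view (min-over-views Euclidean distance) is closest."""
    best_id, best_d = None, np.inf
    for identity, views in gallery.items():
        d = min(np.linalg.norm(probe - v) for v in views)
        if d < best_d:
            best_id, best_d = identity, d
    return best_id

rng = np.random.default_rng(3)
# Hypothetical identities, each with three noisy per-pose descriptors
# (e.g. frontal / left / right views).
centres = {"alice": rng.normal(0, 1, 8), "bob": rng.normal(5, 1, 8)}
gallery = {name: [c + 0.1 * rng.normal(size=8) for _ in range(3)]
           for name, c in centres.items()}
probe = centres["bob"] + 0.1 * rng.normal(size=8)
who = recognise(probe, gallery)   # expected to match "bob"
```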

Selected publications:

S. Gong, S. McKenna and A. Psarrou. Dynamic Vision: From Images to Face Recognition. Imperial College Press, World Scientific Publishing, May 2000. (download QMUL Multiview Face Dataset).

Y. Li, S. Gong, J. Sherrah and H. Liddell. Support vector machine based multi-view face detection and recognition. Image and Vision Computing, Vol. 22, No. 5, pp. 413-427, May 2004.

Y. Li, S. Gong and H. Liddell. Constructing facial identity surfaces for recognition. International Journal of Computer Vision, Vol. 53, No. 1, June 2003.

Y. Li, S. Gong and H. Liddell. Recognising trajectories of facial identities using Kernel Discriminant Analysis. Image and Vision Computing, Vol. 21, No. 13-14, pages 1077-1086, December 2003.

Y. Li, S. Gong and H. Liddell. Recognising trajectories of facial identities using Kernel Discriminant Analysis. In Proc. British Machine Vision Conference, pages 613-622, Manchester, UK, 2001. Best Scientific Paper Award.

Y. Li, S. Gong and H. Liddell. Modelling faces dynamically across views and over time. In Proc. IEEE International Conference on Computer Vision, Vancouver, Canada, July 2001.

Y. Li, S. Gong and H. Liddell. Video-based online face recognition using identity surfaces. In Proc. IEEE ICCV Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, pp. 40-47, Vancouver, Canada, July 2001. Best Paper Prize.

S. McKenna and S. Gong. Recognising moving faces. In Wechsler, Philips, Bruce, Fogelman-Soulie, and Huang (Eds.) Face Recognition: From Theory to Applications, NATO ASI Series F, Springer-Verlag, July 1998.

S. McKenna, S. Gong and Y. Raja. Modelling facial colour and identity with Gaussian mixtures. Pattern Recognition. Vol. 31, No. 12, pp. 1883-1892, 1998.

Y. Raja, S. McKenna and S. Gong. Tracking and segmenting people in varying lighting conditions using colour. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14-16 April 1998.

S. McKenna and S. Gong. Non-intrusive person authentication for access control by visual tracking and face recognition. In Proc. IAPR International Conference on Audio-Video Based Biometric Person Authentication, pp. 177-184, Crans-Montana, Switzerland, March 1997.

S. Gong, A. Psarrou, I. Katsoulis and P. Palavouzis. Tracking and recognition of face sequences. In Proc. European Workshop on Combined Real and Synthetic Image Processing for Broadcast and Video Production, pp. 97-112, Hamburg, Germany, November 1994.

 

Face and Gesture Analysis (funded by EPSRC and the British Council):

 

Human body and facial movements are powerful channels for personal communication. We investigate methods for building robust 2D view-based models of human gestures and 3D head pose, together with a mechanism for switching attentional focus. Spatio-temporal information is extracted from video sequences and matched against models previously learned from training examples.


Selected publications:

1.     A.P. Leung and S. Gong. Online feature selection using mutual information for real-time multi-view object tracking. In Proc. IEEE International Workshop on Analysis and Modelling of Faces and Gestures, Beijing, October 2005.

2.     C. Shan, S. Gong and P. McOwan. Appearance manifold of facial expression. In Proc. IEEE International Workshop on Human-Computer Interaction, Beijing, October 2005.

3.     C. Shan, S. Gong and P. McOwan. Recognising facial expression at low resolution. In Proc. IEEE International Conference on Advanced Video and Signal based Surveillance, Como, September 2005.

4.     C. Shan, S. Gong and P. McOwan. Robust facial expression recognition using local binary patterns. In IEEE International Conference on Image Processing, Genoa, September 2005.

5.     A.P. Leung and S. Gong. An optimization framework for real-time appearance-based tracking under weak perspective. In Proc. British Machine Vision Conference, Oxford, September 2005.

6.     C. Shan, S. Gong and P. McOwan. Conditional mutual information based boosting for facial expression recognition. In Proc. British Machine Vision Conference, Oxford, September 2005.

7.     L. Zalewski and S. Gong. 2D statistical models of facial expressions for realistic 3D avatar animation. In IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, June 2005.

8.     A.P. Leung and S. Gong. Multi-view temporal tracking under weak perspective in real-time. In IEE International Conference on Visual Information Engineering, Glasgow, April 2005.

9.     L. Zalewski and S. Gong. A statistical virtual head animator. In IEE International Conference on Visual Information Engineering, Glasgow, April 2005.

10.  L. Zalewski and S. Gong. A probabilistic hierarchical framework for expression classification. In Proc. Artificial Intelligence and the Simulation of Behaviour Symposium on Language, Speech and Gesture for Expressive Characters, pp. 12-20, Leeds, UK, March 2004.

11.  L. Zalewski and S. Gong. Synthesizing and recognition of facial expressions in virtual 3D views. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, Seoul, Korea, May 2004.

12.  S. Gong, A. Psarrou and S. Romdhani. Corresponding dynamic appearances. Image and Vision Computing, Vol. 20, No. 4, pages 307-318, 2002.

13.  J. Sherrah, S. Gong and E-J. Ong. Face distribution in similarity space under varying head pose. Image and Vision Computing, Vol.19, No.11, 2001.

14.  S. Romdhani, A. Psarrou and S. Gong. On utilising template and feature-based correspondence in multi-view appearance models. In Proc. European Conference on Computer Vision, Vol. 1, pp. 799-813, Dublin, Ireland, 26 June - 1 July 2000.

15.  S. Romdhani, S. Gong and A. Psarrou. Multi-view nonlinear active shape model using kernel PCA. In Proc. British Machine Vision Conference, Nottingham, England, 13-16 September 1999. Best Scientific Paper Award.

16.  S. Gong, Eng-Jon Ong and S. McKenna. Learning to associate faces across views in vector space of similarities to prototypes. In Proc. British Machine Vision Conference, Southampton, England, September 1998.

17.  S. McKenna and S. Gong. Gesture recognition for visually mediated interaction using probabilistic event trajectories. In Proc. British Machine Vision Conference, Southampton, England, September 1998.

18.  S. McKenna and S. Gong. Real-time face pose estimation. International Journal on Real Time Imaging, Special Issue on Real-time Visual Monitoring and Inspection. Vol. 4, pp. 333-347, 1998.

19.  S. Gong, S. McKenna, and J.J. Collins. An investigation into face pose distributions. Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 265-270, Vermont, USA, October 1996.

 

Modelling Temporal Structures for Recognition (funded by EPSRC and HEFCE):

 

A human subject is almost always in motion, whether through the relative motion of the observer or through the movement of the person's head and body. Perceiving a moving face, for instance, involves more than perceiving a static picture of a face. Likewise, human body gestures are essentially temporal in nature. The underlying spatial, and in particular temporal, trajectories in a high-dimensional space (which we refer to as temporal structures) are important for modelling dynamic visual phenomena arising from activities such as gestures and human actions. Our current work focuses on learning probabilistic models of the temporal structures of human behaviour and actions.
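A core operation behind such probabilistic temporal models (e.g. the hidden Markov state models in the publications below) is scoring an observation sequence under a learned model. As a minimal sketch with a toy two-state discrete HMM, the forward algorithm computes a sequence's log-likelihood; the model parameters here are invented for illustration.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM via
    the forward algorithm with per-step scaling.
    pi: initial state probs (S,), A: transitions (S, S), B: emissions (S, V)."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        log_lik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_lik

# Toy model: state 0 mostly emits symbol 0, state 1 mostly emits symbol 1,
# and states persist (self-transition probability 0.9).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.1, 0.9]])
smooth = forward_log_likelihood([0, 0, 0, 1, 1, 1], pi, A, B)
jumpy = forward_log_likelihood([0, 1, 0, 1, 0, 1], pi, A, B)
```

The temporally coherent sequence scores higher than the rapidly alternating one, which is how such models distinguish plausible temporal structures from implausible ones.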

Selected publications:

J. Ng and S. Gong. On the binding mechanism of synchronised visual events. In Proc. IEEE Workshop on Motion and Video Computing, Orlando, FL, USA, December 2002.

J. Ng and S. Gong. Learning intrinsic video content using Levenshtein distance in graph partitioning. In Proc. European Conference of Computer Vision, Copenhagen, Denmark, May 2002.

A. Psarrou, S. Gong and M. Walter. Recognition of human gestures and behaviour. Image and Vision Computing, Vol. 20, No. 5-6, pages 349-358, 2002.

M. Walter, A. Psarrou and S. Gong. Data driven model acquisition using Minimum Description Length. In Proc. British Machine Vision Conference, Manchester, UK, September 2001.

M. Walter, A. Psarrou and S. Gong. Auto-clustering for unsupervised learning of atomic gesture components using Minimum Description Length. In Proc. IEEE ICCV Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, pp. 157-163, Vancouver, Canada, July 2001.

M. Walter, A. Psarrou, and S. Gong. Incremental gesture recognition. In Proc. IEEE International Workshop on Human Motion, Austin, Texas, USA, December 2000.

S. Gong, M. Walter and A. Psarrou. Recognition of temporal structures: Learning prior and propagating observation augmented densities via hidden Markov states. In Proc. IEEE International Conference on Computer Vision, pp.157-162, Corfu, Greece, September 1999.


Perceptual Fusion (funded by EPSRC and industry):

 

The paradigm of perceptual fusion provides robust solutions to computer vision problems. By combining the outputs of multiple inexpensive vision modules, the assumptions and constraints of each individual module are factored out, yielding a more robust system overall. Perceptual fusion of visual cues such as motion and colour with perceptual constraints such as the epipolar line and homography has been adopted for robust and consistent object detection and tracking. Fusion of multiple uncalibrated cameras has been exploited to track objects across non-overlapping views. The method has also been applied to a multi-view moving-face detection and tracking system.
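As a minimal sketch of cue fusion (a generic log-linear combination, not the specific Bayesian Modality Fusion formulation in the publications below): each cue produces a per-location likelihood, and treating the cues as independent, their weighted product gives a fused posterior that can be sharper than any single cue. The one-dimensional cue values below are invented for illustration.

```python
import numpy as np

def fuse_cues(likelihood_maps, weights=None):
    """Fuse per-location likelihoods from independent cues by a weighted
    log-linear (product) combination, then normalise to a distribution."""
    if weights is None:
        weights = [1.0] * len(likelihood_maps)
    log_post = sum(w * np.log(m + 1e-9)
                   for w, m in zip(weights, likelihood_maps))
    post = np.exp(log_post - log_post.max())   # subtract max for stability
    return post / post.sum()

# Toy 1-D example: each cue alone is ambiguous, but their fusion is not.
motion = np.array([0.4, 0.4, 0.1, 0.1])   # motion cue favours locations 0-1
colour = np.array([0.1, 0.4, 0.4, 0.1])   # colour cue favours locations 1-2
fused = fuse_cues([motion, colour])
best = int(np.argmax(fused))   # only location 1 is supported by both cues
```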

Selected publications:

  1. J. Sherrah and S. Gong. Continuous global evidence-based Bayesian Modality Fusion for simultaneous tracking of multiple objects. In Proc. IEEE International Conference on Computer Vision, Vancouver, Canada, 2001.
  2. T. Chang and S. Gong. Bayesian Modality Fusion for tracking multiple people with a multi-camera system. In Proc. European Workshop on Advanced Video-based Surveillance Systems, Kingston, UK, September 2001.
  3. T. Chang and S. Gong. Tracking multiple people with a multi-camera system. In Proc. IEEE ICCV Workshop on Multi-Object Tracking, Vancouver, Canada, July 2001.
  4. J. Sherrah and S. Gong. Fusion of perceptual cues for robust tracking of head pose and position. Pattern Recognition, special issue on Fusion in Image Processing and Computer Vision, Vol.34, No.8, 2001.
  5. T. Chang, S. Gong and E-J. Ong. Tracking multiple people under occlusion using multiple cameras. In Proc. British Machine Vision Conference, Bristol, UK, 2000.
  6. J. Sherrah and S. Gong. Fusion of 2D face alignment and 3D head pose estimation for robust and real-time performance. In Proc. IEEE International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Corfu, Greece, 26-27 September 1999.

 

Real-Time Object Tracking and Segmentation (funded by EPSRC, EU and industry):

 

Models for robust, real-time visual tracking have been developed based on learning view-based appearance and template models, computing qualitative visual motion from image sequences, and learning adaptive colour models for real-time object tracking and for segmenting multi-coloured foregrounds from backgrounds.
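A much-reduced sketch of an adaptive colour model (the published work uses adaptive Gaussian mixtures; here a single Gaussian per class stands in): pixels are labelled foreground or background by likelihood ratio, and each class mean is then nudged towards the pixels assigned to it, so the model tracks gradual lighting change. The colours and pixels below are toy values.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Isotropic Gaussian density for colour vectors x of shape (N, 3)."""
    d = x - mean
    return np.exp(-0.5 * (d * d).sum(axis=1) / var) / (2 * np.pi * var) ** 1.5

def classify_and_adapt(pixels, fg, bg, lr=0.05):
    """Label pixels by foreground/background likelihood ratio, then move
    each class mean towards its assigned pixels (simple online update)."""
    is_fg = (gaussian_pdf(pixels, fg["mean"], fg["var"]) >
             gaussian_pdf(pixels, bg["mean"], bg["var"]))
    for model, mask in ((fg, is_fg), (bg, ~is_fg)):
        if mask.any():
            model["mean"] = ((1 - lr) * model["mean"]
                             + lr * pixels[mask].mean(axis=0))
    return is_fg

fg = {"mean": np.array([200.0, 50.0, 50.0]), "var": 400.0}   # reddish target
bg = {"mean": np.array([50.0, 50.0, 200.0]), "var": 400.0}   # bluish background
pixels = np.array([[210.0, 60.0, 40.0], [40.0, 60.0, 210.0]])
labels = classify_and_adapt(pixels, fg, bg)   # first pixel foreground
```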

Selected publications:

S. McKenna, Y. Raja and S. Gong. Tracking colour objects using adaptive mixture models. Image and Vision Computing, Vol. 17, pp. 225-231, 1999.

F. de la Torre, S. Gong and S. McKenna. View alignment with dynamically updated affine tracking. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14-16 April 1998.

F. de la Torre, S. Gong and S. McKenna. View-based Adaptive Affine Alignment. In Proc. European Conference on Computer Vision, Freiburg, Germany, 1998.

Y. Raja, S. McKenna and S. Gong. Tracking and segmenting people in varying lighting conditions using colour. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14-16 April 1998.

S. McKenna, S. Gong, R. Würtz, J. Tanner and D. Banin. Tracking facial motion using Gabor wavelets and flexible shape models. In Proc. IAPR International Conference on Audio-Video Based Biometric Person Authentication, pp. 35-43, Crans-Montana, Switzerland, March 1997.

S. McKenna and S. Gong. Tracking faces. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 271-277, Vermont, USA, 1996.

S. McKenna and S. Gong. Combined motion and model-based face tracking. In Proc. British Machine Vision Conference, pp. 755-765, Edinburgh, Scotland, 1996.

 

Modelling Human Body Dynamics (funded by the British Council and EPSRC):

 

We investigate a computationally efficient and robust representation of a human subject for behavioural prediction and virtual-reality interaction. To this end, we exploit both 2D view-based image feature models and 3D structural virtual body models. In particular, a virtual 3D skeleton model of a human body is learned from linear combinations of example views. Probabilistic learning methods based on Hierarchical Principal Component Analysis are exploited to learn how 2D image features correlate with the corresponding 3D skeleton models.
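The published approach learns the 2D-to-3D correlation with hierarchical PCA; as a deliberately simplified stand-in, the sketch fits a single linear least-squares mapping from 2D image features to 3D joint coordinates and applies it to a new observation. All data below are synthetic, with a hypothetical ground-truth linear relation.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical training set: 100 frames of 2D features (3 joints x (x, y))
# related to 3D joint coordinates (3 joints x (x, y, z)) by an unknown
# linear map, plus a little observation noise.
W_true = rng.normal(size=(6, 9))
X2d = rng.normal(size=(100, 6))
Y3d = X2d @ W_true + 0.01 * rng.normal(size=(100, 9))

# Least-squares estimate of the 2D-to-3D mapping.
W_hat, *_ = np.linalg.lstsq(X2d, Y3d, rcond=None)

# Infer the 3D pose for a new 2D observation and measure how far the
# prediction is from what the true relation would give.
x_new = rng.normal(size=(1, 6))
y_pred = x_new @ W_hat
err = np.abs(y_pred - x_new @ W_true).max()   # small, noise-limited error
```

A single global linear map cannot capture articulation ambiguities (several 3D poses project to the same 2D features), which is the motivation for the hierarchical, probabilistic treatment in the publications below.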

Selected publications:

E-J. Ong and S. Gong. The dynamics of linear combinations. Image and Vision Computing, Vol. 20, No. 5-6, pages 397-414, 2002.

E-J. Ong and S. Gong. Quantifying ambiguities in inferring vector-based 3D models. In Proc. British Machine Vision Conference, Bristol, UK, September 2000.

E-J. Ong and S. Gong. Tracking hybrid 2D-3D human models through multiple views. In Proc. IEEE International Workshop on Modelling People, Corfu, Greece, 20 September 1999.

E-J. Ong and S. Gong. A dynamic human model using hybrid 2D-3D representations in hierarchical PCA space. In Proc. British Machine Vision Conference, Nottingham, England, 13-16 September 1999.

 

Understanding Visual Behaviour (funded by the DTI and EPSRC):

Computational understanding of human behaviour from visual data is critical for computer vision systems that aim to provide a more natural human-computer interface, to create realistic visually augmented virtual sets in interactive 3D TV production, to facilitate visually mediated telecommunication, and to process surveillance data without supervision. Interpreting the behaviours of humans and their activities is challenging because behaviour cannot be directly observed or measured from images alone. It is often hard even for human subjects to describe their own behaviour adequately. We wish to infer the semantics of human behaviour patterns for autonomous visual event recognition in dynamic scenes.

Selected publications:

  1. J. Ng and S. Gong. Learning pixel-wise signal energy for understanding semantics. Accepted to appear in Image and Vision Computing, 2003.
  2. S. Gong and T. Xiang. Recognition of group activities using a dynamic probabilistic network. In Proc. IEEE International Conference on Computer Vision, pages 742-749, Nice, France, October 2003.
  3. T. Xiang and S. Gong. On the structure of dynamic Bayesian networks for complex scene modelling. In Proc. Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Nice, France, October 2003.
  4. T. Xiang and S. Gong. Outdoor activity recognition using multi-linked temporal processes. In Proc. British Machine Vision Conference, Norwich, UK, September 2003.
  5. A. Graves and S. Gong. Spotting scene change for indexing surveillance video. In Proc. British Machine Vision Conference, Norwich, UK, September 2003.
  6. T. Xiang and S. Gong. Discovering Bayesian causality among visual events in a complex outdoor scene. In Proc. IEEE International Conference on Advanced Video- and Signal-based Surveillance, Miami, USA, July 2003.
  7. S. Gong and T. Xiang. Scene event recognition without tracking. Special issue on visual surveillance, Acta Automatica Sinica (Chinese Journal of Automation), Chinese Academy of Sciences, Vol. 29, No. 3, pages 321-331, May 2003.
  8. J. Ng and S. Gong. On the binding mechanism of synchronised visual events. In Proc. IEEE Workshop on Motion and Video Computing, Orlando, FL, USA, December 2002.
  9. S. Gong and H. Buxton. Editorial: Understanding visual behaviour. Image and Vision Computing, Vol. 20, No. 12, pages 825-826, October 2002.
  10. S. Gong, J. Ng and J. Sherrah. On the semantics of visual behaviour, structured events and trajectories of human action. Image and Vision Computing, Vol. 20, No. 12, pages 873-888, October 2002.
  11. T. Xiang, S. Gong and D. Parkinson. Autonomous visual events detection and classification without explicit object-centred segmentation and tracking. In Proc. British Machine Vision Conference, Cardiff, September 2002.
  12. J. Ng and S. Gong. Learning intrinsic video content using Levenshtein distance in graph partitioning. In Proc. European Conference on Computer Vision, Part-IV, pages 670-684, Copenhagen, Denmark, May 2002.
  13. J. Ng and S. Gong. Learning pixel-wise signal energy for understanding semantics. In Proc. British Machine Vision Conference, Manchester, UK, 2001.
  14. J. Sherrah and S. Gong. Automated detection of localised visual events over varying temporal scales. In Proc. European Workshop on Advanced Video-based Surveillance Systems, Kingston, UK, 2001.
  15. H. Buxton and S. Gong. Visual surveillance in a dynamic and uncertain world. Artificial Intelligence, Special volume on computer vision, Vol. 78, No. 1-2, 1995.
  16. S. Gong and H. Buxton. Bayesian nets for mapping contextual knowledge to computational constraints in motion segmentation and tracking. In Proc. British Machine Vision Conference, pp. 229-239, Guildford, UK, September 1993.
  17. S. Gong. Visual observation as reactive learning. In Proc. International Conference on Adaptive and Learning Systems, pp. 175-187, Orlando, USA, 1992.
  18. S. Gong and H. Buxton. On the expectations of moving objects: A probabilistic approach with visually augmented hidden Markov model. In Proc. European Conference on Artificial Intelligence, pp. 781-785, Vienna, Austria, August 1992.
  19. S. Gong. Visual behaviour: modelling hidden purposes in motion. In Proc. International Conference on Neural and Stochastic Methods in Image and Signal Processing, pp. 45-57, San Diego, USA, November 1992.

Visual Learning (funded by Queen Mary Research Studentships):

Statistical learning methods, and in particular Support Vector Machines (SVMs), can effectively exploit the principle of Structural Risk Minimisation (SRM). We are currently investigating Support Vector classification and regression functions as a general approach to visual learning. Applications include multi-view face detection, tracking, and the modelling of 3D human head pose.
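A minimal sketch of the SVM objective behind this line of work (a toy linear SVM trained by sub-gradient descent, not a production solver or the multi-view detectors published below): minimising the regularised hinge loss trades empirical error against model capacity, which is the SRM principle in action. The two Gaussian point clouds below are synthetic stand-ins for "face" and "non-face" feature vectors.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by sub-gradient descent on the regularised hinge
    loss: lam/2 * ||w||^2 + mean(max(0, 1 - y * (X w + b)))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                     # margin violators
        grad_w = lam * w
        if viol.any():
            grad_w -= (y[viol, None] * X[viol]).sum(axis=0) / n
            b += lr * y[viol].sum() / n
        w -= lr * grad_w
    return w, b

rng = np.random.default_rng(2)
# Two well-separated Gaussian classes in 2-D feature space.
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
w, b = train_linear_svm(X, y)
acc = (np.sign(X @ w + b) == y).mean()   # near-perfect on this toy problem
```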

Selected publications:

  1. J. Ng and S. Gong. Composite support vector machines for detection of faces across views and pose estimation. Image and Vision Computing, Vol. 20, No. 5-6, pages 359-368, 2002.
  2. Y. Li, S. Gong and H. Liddell. Constructing facial identity surfaces in a nonlinear discriminating space. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, USA, December, 2001.
  3. Y. Li, S. Gong and H. Liddell. Support vector regression and classification based multi-view face detection and recognition. In Proc. IEEE International Conference on Face and Gesture Recognition, Grenoble, France, March 2000.
  4. J. Ng and S. Gong. Performing multi-view face detection and pose estimation using a composite support vector machine across the view sphere. In Proc. IEEE International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Corfu, Greece, 26-27 September 1999.
  5. J. Ng and S. Gong. A multi-view face model using support vector machines. In Proc. British Machine Vision Conference, Nottingham, England, 13-16 September 1999.

Visually Mediated Interaction (funded by EPSRC):

This work investigates vision functions for intentional tracking and active camera switching (control). We developed active computer vision systems that perform dynamic scene interpretation in terms of subjects' behaviour and intention, in order to mediate human-machine interaction. A prototype system, VIGOUR, has been built to perform real-time tracking of multiple people and behavioural analysis of several individuals (at most three simultaneously) within typical indoor office or home environments.

Selected publications:

  1. J. Sherrah, S. Gong, J. Howell and H. Buxton. Interpretation of group behaviour in visually mediated interaction. In Proc. International Conference on Pattern Recognition, Vol. 1 (Computer Vision and Image Analysis), pp. 266-275, Barcelona, Spain, September 2000.
  2. J. Sherrah and S. Gong. VIGOUR: A system for tracking and recognition of multiple people and their activities. In Proc. International Conference on Pattern Recognition, Vol. 1 (Computer Vision and Image Analysis), pp. 179-183, Barcelona, Spain, September 2000.
  3. J. Sherrah and S. Gong. Tracking discontinuous motion using Bayesian inference. In Proc. European Conference on Computer Vision, Vol. 2, pp. 150-166, Dublin, Ireland, 26 June - 1 July 2000.
  4. J. Sherrah and S. Gong. Exploiting context in gesture recognition. In Proc. 2nd International Interdisciplinary Conference on Modelling and Using Context, Trento, Italy, 9-11 September 1999.

 


 

s.gong AT qmul.ac.uk