Vu Hai's homepage


Research Topics: Human Computer Interaction

Video Capsule Endoscopy Analysis

Computer Vision in Agricultural Engineering and Biodiversity

Vision-based system supporting visually impaired people
Human in a Surveillance Camera Network

Recognition of hand gestures from cyclic hand movements using spatial-temporal features
Dynamic hand gesture recognition remains a challenging field, even though it has been studied for a long time, because feasible techniques for deployment in Human-Computer Interaction (HCI) applications are still lacking. In this paper, we propose a new type of gesture that presents a cyclic pattern of hand shapes during a movement. By mapping the output of a gesture recognition system to commands (e.g., turning devices on/off, increasing the volume or channel), the proposed gestures aim to provide a natural and feasible way to control appliances in a smart home, such as a television, light, fan, or door. The proposed gestures are represented by both hand shapes and directions. Thanks to the cyclic pattern of the hand shapes while a command is performed, hand gestures are more easily segmented from the video stream. We then focus on several challenges of the proposed gestures: phase non-synchronization between gestures, changes of hand shape along the temporal dimension, and the direction of hand movements. These issues are addressed using combinations of spatial and temporal features extracted from consecutive frames of a gesture. The proposed algorithms are evaluated on several subjects. Evaluation results confirm that the proposed method obtains average accuracy rates of 96% for segmenting a dynamic hand gesture and 95% for recognizing a command.
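As a rough illustration of the phase non-synchronization issue, the sketch below puts gestures of different durations on a common time base by linearly resampling a per-frame feature sequence to a fixed length. This is only one simple option and not necessarily the alignment used in the paper; the function name `resample_sequence` and the list-of-lists feature layout are assumptions made for this example.

```python
from typing import List

Sequence = List[List[float]]  # one feature vector per frame (assumed layout)


def resample_sequence(features: Sequence, target_len: int) -> Sequence:
    """Linearly resample a per-frame feature sequence to target_len frames.

    Hypothetical helper: it only illustrates temporal normalization and is
    not the method described in the paper.
    """
    n = len(features)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two input frames and two output frames")

    resampled = []
    for k in range(target_len):
        # Fractional position of output frame k on the input time axis.
        pos = k * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        # Blend the two neighbouring input frames.
        frame = [(1 - w) * a + w * b for a, b in zip(features[lo], features[hi])]
        resampled.append(frame)
    return resampled


if __name__ == "__main__":
    short_gesture = [[0.0], [2.0], [4.0]]
    print(resample_sequence(short_gesture, 5))
```

After resampling, two gestures performed at different speeds have the same number of frames, so frame-wise spatial and temporal features can be compared directly.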

A Combination of user-guide scheme and kernel descriptor on RGB-D data for robust and realtime hand posture recognition
This paper presents a robust, real-time hand posture recognition system. The key elements of the proposed system are a user-guide scheme and a kernel-based hand posture representation. We first describe a three-stage scheme to train an end-user. This scheme aims to adapt to environmental conditions (e.g., background images, distance from the device to the hand or body) as well as to learn appearance-based features such as hand-skin color. Thanks to the proposed user-guide scheme, we can precisely estimate the heuristic parameters that play an important role in detecting and segmenting hand regions. Based on the segmented hand regions, we use a kernel-based hand representation in which three levels of features are extracted. Whereas the pixel-level and patch-level features are conventional extractions, we construct an image-level feature that presents a hand pyramid structure. These representations are fed to a multi-class support vector machine classifier. We evaluate the proposed system in terms of learning time versus robustness and real-time performance. On average, the proposed system requires 14 seconds in advance to guide an end-user; in return, the hand posture recognition rate reaches 91.2% accuracy. The performance of the proposed system is comparable with state-of-the-art methods, with the added advantage that it runs in real time.
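To make the idea of an image-level feature built on a pyramid structure more concrete, here is a minimal sketch of generic spatial-pyramid pooling over a grid of patch-level feature vectors. It assumes average pooling and a regular 1x1 plus 2x2 grid; the paper's actual hand pyramid and kernel descriptors may be constructed differently, and the names `pyramid_feature` and `mean_vector` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def pyramid_feature(patch_grid: List[List[Vector]], levels=(1, 2)) -> Vector:
    """Concatenate average-pooled patch features over a spatial pyramid.

    patch_grid[r][c] is the feature vector of the patch at row r, column c.
    Each level splits the grid into level x level cells (an assumption made
    for this sketch, not a detail taken from the paper).
    """
    rows, cols = len(patch_grid), len(patch_grid[0])
    feature: Vector = []
    for cells in levels:
        if cells > rows or cells > cols:
            raise ValueError("pyramid level is finer than the patch grid")
        for cr in range(cells):
            for cc in range(cells):
                r0, r1 = cr * rows // cells, (cr + 1) * rows // cells
                c0, c1 = cc * cols // cells, (cc + 1) * cols // cells
                block = [patch_grid[r][c]
                         for r in range(r0, r1) for c in range(c0, c1)]
                feature.extend(mean_vector(block))
    return feature


if __name__ == "__main__":
    grid = [[[1.0], [2.0]], [[3.0], [4.0]]]
    print(pyramid_feature(grid))
```

The resulting fixed-length vector is the kind of input a multi-class classifier such as a support vector machine expects, regardless of how many patches the segmented hand region contains.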

An Efficient Combination of RGB and Depth for Background Subtraction
This paper describes an efficient combination of KINECT data (RGB and depth) for background subtraction. To this end, we use a statistical model of background pixels, such as a Gaussian Mixture Model, for both color and depth features. However, beyond the segmentation results obtained from each data source separately, our combination strategy takes into account the pixels at which the depth feature (or the color feature) is more valuable than the other. Within the valid range of the depth measurement, the results of foreground segmentation from depth features are favored, whereas outside the valid range, the results of foreground segmentation from color features are used. Following this combination scheme, depth pixels are filtered through a proposed noise model of depth and validated against the range of the depth measurements. The proposed method is evaluated on a public dataset that suffers from common background subtraction problems such as shadows, reflections, and camouflage. The experiments show segmentation results that are comparable with recent reports. Furthermore, the proposed method succeeds on a challenging task: extracting a human fall in an RGB-D image sequence. The foreground segmentation results are suitable for recognition tasks.
  • [1] Nguyen Van Toi, Hai Vu, Tran Thi Thanh Hai, Le Thi Lan, "An Efficient Combination of RGB and Depth for Background Subtraction", to appear in the series "Some Current Advanced Researches on Information and Computer Science in Vietnam", Springer, 2014.
  • [2] Nguyen Van Toi, Vu Hai, Tran Thi Thanh Hai, "Background Subtraction with KINECT data: An Efficient Combination RGB and Depth", The First NAFOSTED Conference on Information and Computer Science (NICS 2014), pp. 160-169, Hanoi, March 2014.
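The combination strategy in the abstract above can be sketched as a per-pixel selection between two foreground masks. This is a simplified illustration that assumes the color and depth masks have already been computed (for example, by per-channel background models) and that the preference for depth is a hard switch; the function name `fuse_foreground` and the range limits of 800 mm and 4000 mm are hypothetical values chosen for the example, not figures from the paper.

```python
from typing import List

Mask = List[List[bool]]


def fuse_foreground(color_fg: Mask,
                    depth_fg: Mask,
                    depth_mm: List[List[int]],
                    valid_min_mm: int = 800,    # assumed lower limit
                    valid_max_mm: int = 4000    # assumed upper limit
                    ) -> Mask:
    """Pick the depth-based decision where depth is valid, else the color one.

    Simplification: the source says depth results are favored inside the
    valid range; here that preference is modeled as a hard selection.
    """
    fused: Mask = []
    for color_row, depth_row, range_row in zip(color_fg, depth_fg, depth_mm):
        fused_row = []
        for color_pixel, depth_pixel, z in zip(color_row, depth_row, range_row):
            depth_is_valid = valid_min_mm <= z <= valid_max_mm
            fused_row.append(depth_pixel if depth_is_valid else color_pixel)
        fused.append(fused_row)
    return fused


if __name__ == "__main__":
    color_fg = [[True, False, True]]
    depth_fg = [[False, True, False]]
    depth_mm = [[1500, 0, 5000]]  # valid, missing reading, too far
    print(fuse_foreground(color_fg, depth_fg, depth_mm))
```

In this toy input, only the first pixel has a depth reading inside the assumed valid range, so it takes the depth decision, while the other two fall back to the color decision.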