Thi Thanh Hai - TRAN


Temporal Gesture Segmentation
This paper presents a method for temporal gesture segmentation based on the total activity of the video sequence. The novelty of the method is that filters are applied both to the sequence and to the total activity plot, which makes it more robust to noise. The method has proved very efficient on the large dataset of the recent CHALEARN contest on hand gesture recognition, and it can serve as a good reference for contest participants. Because the method is generic, it could be applied to any shot boundary detection problem.
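
The idea of segmenting on total activity can be sketched as follows. This is only an illustration under stated assumptions: the page does not say which filters are used, so the moving-average filter, the fixed threshold and the function names `total_activity`, `smooth` and `gesture_segments` are all hypothetical choices, not the published method.

```python
def total_activity(frames):
    """Sum of absolute pixel differences between consecutive frames.

    `frames` is a list of grayscale images, each a list of rows of numbers.
    """
    activity = []
    for prev, curr in zip(frames, frames[1:]):
        diff = sum(
            abs(a - b)
            for row_prev, row_curr in zip(prev, curr)
            for a, b in zip(row_prev, row_curr)
        )
        activity.append(diff)
    return activity


def smooth(signal, radius=1):
    """Moving-average filter (an assumed filter); windows shrink at the borders."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def gesture_segments(activity, threshold):
    """Return (start, end) index pairs, end exclusive, where activity > threshold."""
    segments = []
    start = None
    for i, value in enumerate(activity):
        if value > threshold and start is None:
            start = i                      # activity rises: a gesture begins
        elif value <= threshold and start is not None:
            segments.append((start, i))    # activity falls: the gesture ends
            start = None
    if start is not None:
        segments.append((start, len(activity)))
    return segments
```

A typical call chain would be `gesture_segments(smooth(total_activity(frames)), threshold)`, where `threshold` would have to be tuned on the data.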

Hand Posture Recognition for HRI
Hand gestures provide an attractive alternative to cumbersome interface devices for human-machine interaction (HMI). However, recognizing hand gestures is not a simple problem. In this paper, we propose to decompose the hand gesture recognition problem into two steps. In the first step, we detect skin regions using a very fast color segmentation algorithm. In the second step, each skin region is classified into one of the hand posture classes using a cascaded AdaBoost classifier and shape analysis techniques. The contribution of this paper is twofold. First, we propose using both techniques for hand gesture recognition, which significantly reduces the computation time compared with the traditional use of a cascaded AdaBoost classifier. Second, we successfully integrated the method on a robot and validated it in the context of interaction between a human and a robot guide in a museum.

Face Recognition
This work concerns face recognition from video under uncontrolled lighting conditions. To cope with illumination changes, we draw on the idea of pre-processing input images so that their representation is robust to illumination change [1]. We then use the embedded Hidden Markov Model (EHMM), a well-known model for face recognition, to identify faces [2]. The main reason for studying and experimenting with this model is that it lets us represent the structure of face images, which gives a more explicit face representation than numeric face descriptors. The traditional EHMM is applied to face recognition from still images. In our paper, we deal with face recognition from video. We therefore propose combining the recognition results obtained from several frames to make the decision more confident. This significantly improves the recognition rate. We trained and tested our model on two face databases: the Yale-B database and a database created by MICA Institute.
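
Combining per-frame results can be illustrated with a small sketch. The page does not state the fusion rule, so summing per-identity log-likelihoods over frames is an assumption here, and `identify_from_frames` is a hypothetical name; the scores would come from the per-identity models evaluated on each frame.

```python
def identify_from_frames(frame_scores):
    """Fuse per-frame scores and return (best_identity, totals).

    `frame_scores` is a list with one dict per frame, mapping each identity
    to the log-likelihood its model assigns to that frame. Every frame is
    assumed to score the same set of identities.
    """
    totals = {}
    for scores in frame_scores:
        for identity, loglik in scores.items():
            # Adding log-likelihoods treats frames as independent observations.
            totals[identity] = totals.get(identity, 0.0) + loglik
    best = max(totals, key=totals.get)
    return best, totals


# Hypothetical scores: the first frame alone favors "B",
# but the evidence accumulated over three frames favors "A".
frames = [
    {"A": -10.0, "B": -9.0},
    {"A": -8.0, "B": -12.0},
    {"A": -9.0, "B": -11.0},
]
best, totals = identify_from_frames(frames)
```

The example shows why pooling several frames can make the decision more stable than trusting any single frame.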

Ridge Extraction
We developed a simple but efficient algorithm to detect multiscale visual features in an image: ridges and peaks. The method is based on the Laplacian of Gaussian of the image and on differential geometry properties of the surface associated with the image. Feature detection experiments carried out on many types of images (e.g. CT and MR images, fingerprint images, real-world images) showed that the method is very good at detecting features that represent object shapes. These features enrich the set of classical features (e.g. region, contour line, interest point) and significantly improve the representation of objects in images.
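
The role of the Laplacian of Gaussian can be illustrated on a one-dimensional intensity profile taken across a ridge. This is a deliberately simplified sketch, not the algorithm described above: it works in 1D, at a single scale, and ignores the differential geometry analysis. The function names and the border handling are assumptions.

```python
import math


def log_kernel_1d(sigma):
    """Sampled second derivative of a Gaussian (the 1D Laplacian of Gaussian)."""
    radius = int(math.ceil(4 * sigma))
    kernel = []
    for x in range(-radius, radius + 1):
        gauss = math.exp(-x * x / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
        kernel.append((x * x / sigma ** 4 - 1 / sigma ** 2) * gauss)
    return kernel


def filter_same(signal, kernel):
    """Filter with a symmetric kernel, replicating the border samples."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), len(signal) - 1)
            acc += weight * signal[j]
        out.append(acc)
    return out


def ridge_positions(profile, sigma):
    """Indices where the LoG response is negative and a strict local minimum.

    A bright ridge whose width is comparable to sigma gives a strongly
    negative response at its center.
    """
    response = filter_same(profile, log_kernel_1d(sigma))
    return [
        i
        for i in range(1, len(response) - 1)
        if response[i] < 0
        and response[i] < response[i - 1]
        and response[i] < response[i + 1]
    ]
```

For a profile with a three-sample bright bar, `ridge_positions(profile, 1.5)` returns the index of the bar's center. Repeating the detection for several values of `sigma` is one simple way to make it multiscale.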
 
Text Detection
Text is a special kind of structural object. At a small scale, we see the individual strokes of characters. At a larger scale, we see only a long band corresponding to the text line. Using these properties, we developed a method to model text in an image using ridges. More specifically, a text is characterized by a long ridge line at a coarse scale and several shorter ridges perpendicular to that ridge line at a small scale. This representation is generic to many types of text (alphabetic or ideographic, scene text or artificial text) and is independent of text orientation. The detection was evaluated on images of different kinds and gave surprisingly good precision and recall (93%).
 
 
Hierarchical Object Representation
 
We developed a new method for object representation based on ridges and peaks detected at several scales. Each object is represented as a graph in which each node is a feature (ridge or peak) and each arc is built from the covering relation between the spatial extensions of two features. This graph describes the global shape as well as the details of the object and allows many efficient strategies for graph matching.
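
The graph construction can be sketched as below. The page does not define the spatial extension of a feature or the exact covering test, so modeling each extension as a disc and requiring full containment are assumptions made for illustration; `Feature`, `covers` and `build_graph` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    kind: str      # "ridge" or "peak"
    x: float
    y: float
    radius: float  # assumed: spatial extension modeled as a disc


def covers(a, b):
    """True if the disc of feature `a` fully contains the disc of feature `b`."""
    if a is b:
        return False
    return math.hypot(a.x - b.x, a.y - b.y) + b.radius <= a.radius


def build_graph(features):
    """Adjacency list: feature name -> names of the features it covers."""
    return {a.name: [b.name for b in features if covers(a, b)] for a in features}


# Hypothetical features: a coarse-scale ridge and two fine-scale features.
features = [
    Feature("body", "ridge", 0.0, 0.0, 10.0),
    Feature("head", "peak", 0.0, 6.0, 3.0),
    Feature("arm", "ridge", 12.0, 0.0, 3.0),
]
graph = build_graph(features)
```

Here only `head` lies entirely inside the extension of `body`, so the graph has a single arc from `body` to `head`.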
 
Human Modeling
 
Each person is represented by significant ridges and peaks detected at appropriate scales, which correspond to important body parts such as the head, torso and legs. Geometric relations between features are exploited to build a human model: a vector of 10 components describing a configuration of a human. This representation was applied to person detection in surveillance sequences. It was evaluated on 26 video sequences provided by the CAVIAR project and obtained a good detection rate (85%).
 
Tracking based on Keypoints
We developed a real-time method for detecting keypoints in images. These keypoints are described by a descriptor vector based on gradient magnitude, which is projected onto an eigenspace to reduce its dimension and speed up the search for nearest neighbors. The method was evaluated in the context of visual servoing: moving a robot from a given position to a desired position. It ran in real time (10-14 fps) and proved very robust to occlusion. In addition, it is able to reinitialize whenever objects of interest are lost.
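
Projecting descriptors onto an eigenspace before matching can be sketched with NumPy. This is a generic PCA-plus-nearest-neighbor illustration under assumptions: the descriptor dimension, the number of retained components and the brute-force search are not taken from the page, and the function names are hypothetical.

```python
import numpy as np


def fit_eigenspace(descriptors, n_components):
    """Return the mean and the top principal directions of the descriptors.

    `descriptors` has shape (n_keypoints, descriptor_dim).
    """
    mean = descriptors.mean(axis=0)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(descriptors - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(descriptors, mean, basis):
    """Project descriptors onto the eigenspace spanned by `basis`."""
    return (descriptors - mean) @ basis.T


def nearest_neighbor(query, database):
    """Index of the database row closest to `query` (Euclidean distance)."""
    distances = np.linalg.norm(database - query, axis=1)
    return int(np.argmin(distances))


# Synthetic stand-in data: 50 descriptors of dimension 32, reduced to 8.
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 32))
mean, basis = fit_eigenspace(reference, n_components=8)
reference_low = project(reference, mean, basis)

# A slightly perturbed copy of descriptor 7 should still match descriptor 7.
query = reference[7] + 0.01 * rng.normal(size=32)
query_low = project(query[None, :], mean, basis)[0]
match = nearest_neighbor(query_low, reference_low)
```

Matching in the reduced space compares 8 numbers per keypoint instead of 32, which is where the speed-up in the nearest-neighbor search comes from.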
 
People Tracking
The LOVe project aims to develop software for the observation of vulnerable road users. The project has 12 partners, and our laboratory at CEA takes part in it. We developed candidate modules: pedestrian detection from a moving stereo camera, and pedestrian tracking based on a Kalman filter with the help of lidar measurements.
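
Kalman-filter tracking can be illustrated with a minimal one-dimensional, constant-velocity filter. This is a textbook sketch, not the project's module: the state layout, the noise values and the use of a single position measurement (which could for instance be a lidar range) are assumptions.

```python
def kalman_predict(state, cov, dt, process_noise):
    """Predict step for the state (position, velocity) with F = [[1, dt], [0, 1]]."""
    p, v = state
    (a, b), (c, d) = cov
    new_state = (p + v * dt, v)
    # F P F^T + q * I, written out for the 2x2 case.
    new_cov = (
        (a + dt * (b + c) + dt * dt * d + process_noise, b + dt * d),
        (c + dt * d, d + process_noise),
    )
    return new_state, new_cov


def kalman_update(state, cov, measurement, measurement_noise):
    """Update step for a position-only measurement, H = [1, 0]."""
    p, v = state
    (a, b), (c, d) = cov
    s = a + measurement_noise            # innovation variance
    k0, k1 = a / s, c / s                # Kalman gain
    residual = measurement - p
    new_state = (p + k0 * residual, v + k1 * residual)
    new_cov = (((1 - k0) * a, (1 - k0) * b), (c - k1 * a, d - k1 * b))
    return new_state, new_cov


# Hypothetical pedestrian walking at 1 m/s, measured once per second.
state, cov = (0.0, 0.0), ((10.0, 0.0), (0.0, 10.0))
for t in range(1, 21):
    state, cov = kalman_predict(state, cov, dt=1.0, process_noise=0.01)
    state, cov = kalman_update(state, cov, measurement=float(t), measurement_noise=0.1)
```

After a few steps the filter has recovered both the position and the velocity, even though only positions are measured.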
 
Image Processing
To convert a raw image into a viewable image (JPEG), we have to apply an image processing pipeline that includes demosaicing, white balance, color rendering and tone curve. My work focuses on developing a method for color rendering, a very important step of this pipeline.
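
The order of the later pipeline stages can be sketched on already demosaiced RGB pixels. Everything specific here is an assumption for illustration: per-channel gains for white balance, a 3x3 matrix for color rendering and a gamma curve as the tone curve are common generic choices, not the method developed in this work, and the numeric values are made up.

```python
def white_balance(pixels, gains):
    """Scale each channel by its gain; pixels are (r, g, b) tuples in [0, 1]."""
    return [tuple(c * g for c, g in zip(px, gains)) for px in pixels]


def color_render(pixels, matrix):
    """Apply a 3x3 matrix that maps camera RGB to output RGB."""
    return [
        tuple(sum(m * c for m, c in zip(row, px)) for row in matrix)
        for px in pixels
    ]


def tone_curve(pixels, gamma=2.2):
    """Clip to [0, 1], then apply a gamma curve."""
    return [
        tuple(min(max(c, 0.0), 1.0) ** (1.0 / gamma) for c in px)
        for px in pixels
    ]


# Hypothetical values: a reddish-deficient pixel and an identity color matrix.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
raw_pixels = [(0.125, 0.25, 0.25)]
balanced = white_balance(raw_pixels, gains=(2.0, 1.0, 1.0))
rendered = color_render(balanced, identity)
output = tone_curve(rendered, gamma=2.0)
```

In a real pipeline the gains and the matrix would be estimated per camera and per scene, and the result would then be quantized and encoded as JPEG.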