Current Research Projects

1. Invariant structural representation for automatic speech recognition

Speech recognition has to deal with inevitable acoustic variations caused by non-linguistic factors. Recently, N. Minematsu proposed an invariant structural representation of speech, in which non-linguistic variations are effectively removed through modeling the dynamic aspects of speech signals. I work on both the theoretical and practical aspects of the invariant structural representation.

Theoretically, I prove that f-divergence yields a general family of invariant measures, and that every invariant measure has to be written in the form of an f-divergence.
  • Y. Qiao and N. Minematsu, "f-divergence is a generalized invariant measure between distributions," Proc. INTERSPEECH, 2008
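As a small numerical illustration of the invariance property (a sketch under stated assumptions, not the construction used in the paper): the KL divergence and the squared Hellinger distance are both f-divergences, and for one-dimensional Gaussians they have closed forms. The helper names below are hypothetical. Applying the same invertible affine map x -> a*x + b to both distributions should leave both values unchanged.

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL divergence KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def hellinger2_gauss(mu1, s1, mu2, s2):
    """Closed-form squared Hellinger distance between two 1-D Gaussians."""
    v = s1**2 + s2**2
    bc = math.sqrt(2 * s1 * s2 / v) * math.exp(-(mu1 - mu2)**2 / (4 * v))
    return 1.0 - bc

def affine(mu, s, a, b):
    """Gaussian parameters (mean, std) after the map x -> a*x + b, with a != 0."""
    return a * mu + b, abs(a) * s

# Two example distributions (mean, std) and an arbitrary invertible affine map.
p, q = (0.0, 1.0), (1.5, 2.0)
a, b = -3.0, 7.0
p2, q2 = affine(*p, a, b), affine(*q, a, b)

kl_before, kl_after = kl_gauss(*p, *q), kl_gauss(*p2, *q2)
h_before, h_after = hellinger2_gauss(*p, *q), hellinger2_gauss(*p2, *q2)
print(kl_before, kl_after)  # the two values agree up to rounding
print(h_before, h_after)    # likewise
```

The affine map is only the simplest case to check by hand; the result stated above concerns invertible transformations in general.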

I also develop practical techniques to address two problems of structural representations: high dimensionality and overly strong invariance.
  • Yu Qiao, Satoshi Asakawa and Nobuaki Minematsu, "Random Discriminant Structure Analysis for Automatic Recognition of Connected Vowels", IEEE workshop on Automatic Speech Recognition and Understanding (ASRU), 2007.
  • Yu Qiao, S. Asakawa, N. Minematsu, and K. Hirose, "Dimension reduction and discriminant analysis for Japanese connected vowel recognition," Proc. Autumn Meeting of Acoust. Soc. Jpn., 2-P-2, (2008-9)

2. Unsupervised Phoneme Segmentation

Phoneme segmentation is a fundamental problem in many speech recognition and synthesis studies. Unsupervised phoneme segmentation assumes no knowledge of the linguistic content or of acoustic models, and thus poses a challenging problem. In this work, we formulate the optimal segmentation problem within a probabilistic framework. Using statistical and information-theoretic analysis, we develop three different objective functions, namely, Summation of Square Error (SSE), Log Determinant (LD) and Rate Distortion (RD). We introduce a time-constrained agglomerative clustering algorithm to find the optimal segmentations. The proposed method outperforms recently published unsupervised segmentation methods.
  • Y. Qiao, N. Shimomura, N. Minematsu, "Unsupervised optimal phoneme segmentation: objectives, algorithm and comparisons," Proc. Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP), 2008
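The idea of time-constrained agglomerative clustering with the SSE objective can be sketched as follows. This is a minimal greedy illustration, assuming each frame starts as its own segment and that only temporally adjacent segments may merge; the function name and the Ward-style merge cost are assumptions for this sketch, not the exact procedure from the paper, and a greedy pass like this does not by itself guarantee a globally optimal segmentation.

```python
def segment_sse(frames, num_segments):
    """Greedy time-constrained agglomerative clustering (illustrative sketch).

    frames: list of feature vectors (lists of floats), in time order.
    Returns the start index of each of the num_segments segments.
    """
    # Each segment is stored as [start index, frame count, per-dimension sum].
    segs = [[i, 1, list(f)] for i, f in enumerate(frames)]

    def merge_cost(a, b):
        # Increase in total squared error caused by merging segments a and b.
        na, nb = a[1], b[1]
        d2 = sum((sa / na - sb / nb) ** 2 for sa, sb in zip(a[2], b[2]))
        return na * nb / (na + nb) * d2

    while len(segs) > num_segments:
        # Time constraint: only neighbouring segments are merge candidates.
        k = min(range(len(segs) - 1),
                key=lambda i: merge_cost(segs[i], segs[i + 1]))
        a, b = segs[k], segs[k + 1]
        merged = [a[0], a[1] + b[1], [x + y for x, y in zip(a[2], b[2])]]
        segs[k:k + 2] = [merged]
    return [s[0] for s in segs]

# Toy one-dimensional "features" with three visibly different regions.
frames = [[0.0], [0.1], [0.0], [5.0], [5.1], [4.9], [10.0], [10.1]]
print(segment_sse(frames, 3))  # [0, 3, 6]
```

In practice the frames would be acoustic feature vectors, and the LD or RD objectives would replace the merge cost used here.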

We further show that metrics learned by Minimum of Summation Variance (MSV) and Maximum of Discrimination Variance (MDV) can significantly improve the segmentation results.
  • Y. Qiao and N. Minematsu, "Metric learning for unsupervised phoneme segmentation," Proc. INTERSPEECH, 2008

3. Phase Singularities for Image Representation and Object Matching

Phase has been widely used in signal and image processing due to its stability under transformation, deformation, and added noise. However, phase singularities, the points where the complex signal vanishes, are generally regarded as harmful and unreliable. In this work, on the contrary, we show that phase singularities calculated with the Laguerre-Gauss filter contain important information and can provide a reliable representation for images. We prove that phase singularities are invariant to translation and rotation, and show how to reconstruct an image, up to a scale factor, from the positions of its phase singularities alone. We develop two applications of phase singularities: object tracking and image matching. In object tracking, we use the iterative closest point algorithm to determine the correspondences between phase singularities in two adjacent frames.
  • Y. Qiao, W. Wang, N. Minematsu, J. Liu, X. Tang, "Phase singularities for image representation and matching," Proc. Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP), 2008
  • Demo video on tracking fugu (6 M)
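The correspondence step used for tracking can be illustrated with a minimal 2-D iterative closest point (ICP) sketch. It assumes the singularities in each frame are given as (x, y) positions and that the motion between adjacent frames is a small rotation plus translation; all function names are hypothetical, and the brute-force nearest-neighbour search is chosen for brevity rather than taken from the paper.

```python
import math

def nearest(p, pts):
    """Index of the point in pts closest to p (brute-force search)."""
    return min(range(len(pts)),
               key=lambda j: (pts[j][0] - p[0]) ** 2 + (pts[j][1] - p[1]) ** 2)

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    cxs, cys = sum(x for x, _ in src) / n, sum(y for _, y in src) / n
    cxd, cyd = sum(x for x, _ in dst) / n, sum(y for _, y in dst) / n
    s_dot = s_cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs, ys, xd, yd = xs - cxs, ys - cys, xd - cxd, yd - cyd
        s_dot += xs * xd + ys * yd
        s_cross += xs * yd - ys * xd
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, cxd - (c * cxs - s * cys), cyd - (s * cxs + c * cys)

def transform(pts, theta, tx, ty):
    """Rotate pts by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def icp_2d(src, dst, iterations=20):
    """Alternate nearest-neighbour matching and rigid re-alignment."""
    cur, matches = list(src), []
    for _ in range(iterations):
        matches = [nearest(p, dst) for p in cur]
        theta, tx, ty = best_rigid_2d(cur, [dst[j] for j in matches])
        cur = transform(cur, theta, tx, ty)
    return cur, matches

# Hypothetical singularity positions in one frame, and the same points after
# a small rigid motion standing in for the next frame.
prev_pts = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 5.0)]
next_pts = transform(prev_pts, 0.05, 0.2, -0.1)
aligned, matches = icp_2d(prev_pts, next_pts)
print(matches)  # index in next_pts matched to each point of prev_pts
```

ICP of this kind relies on the inter-frame motion being small compared with the spacing between points; with larger motion the nearest-neighbour matches can be wrong.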

Old Research Projects

1. Offline Signature Verification Using Online Handwriting Registration

This research proposes a novel framework for offline signature verification. Unlike previous methods, our approach uses online handwriting instead of handwritten images for registration. The online registrations enable robust recovery of the writing trajectory from an input offline signature and thus allow effective shape matching between registration and verification signatures. In addition, we propose several new techniques to improve the performance of the new signature verification system: (1) we formulate and solve the recovery of the writing trajectory within the framework of Conditional Random Fields; (2) we propose a new shape descriptor, online context, for aligning signatures; (3) we develop a verification criterion that combines the duration and amplitude variances of handwriting. Experiments on a benchmark database show that the proposed method significantly outperforms well-known offline signature verification methods and achieves performance comparable to online signature verification methods.

  • Yu Qiao, Jianzhuang Liu and Xiaoou Tang, "Offline Signature Verification using Online Registration", International Conference on Computer Vision and Pattern Recognition (CVPR), 2007.

2. Recover drawing order from static handwritten images

The objective of this research is to recover the temporal (online) information from a static handwriting image, which is generally regarded as an important yet hard problem in the handwriting recognition field. We formulate the recovery problem as finding the smoothest path in the image's graph representation. A 3-phase approach to recovering the writing order is proposed within the framework of Edge Continuity Relation (ECR). Experiments on 708,988 static images show that our method achieves a restoration rate of 96.0%. To the best of our knowledge, this is the highest result reported on a large database.

  • Yu Qiao, M. Nishiara and M. Yasuhara, "A Framework toward Restoration of Writing Order from Single-Stroked Handwriting Image", (Proof of Theorems) IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1724-1737, Vol. 28, No. 11, November 2006

  • Some experimental results can be found here.
  • Yu Qiao, Mikihiko Nishiara and M. Yasuhara, "A Novel Approach to Recover Writing Order From Single Stroke Offline Handwritten Images", Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR), pp. 227-231, 2005, Seoul, Korea
  • Yu Qiao and M. Yasuhara, "Recover Writing Trajectory from Multiple Stroked Image Using Bidirectional Dynamic Search", International Conference on Pattern Recognition (ICPR), 2006, Hong Kong, China

3. Optimal Euler Circuit/Path

This research introduces and solves a new graph problem: finding an Optimal Euler Circuit (OEC) in an Euler graph. I prove that the OEC problem is NP-complete. I develop a polynomial-time algorithm to find an OEC in an Euler graph whose vertices all have degree 4, and propose a 1/4-approximation algorithm for general Euler graphs.
The source code of some of the proposed algorithms for finding an optimal Euler circuit can be found here.

  • Yu Qiao and M. Yasuhara, "Optimal Euler Circuit of Maximum Contiguous Cost", IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E90-A, No. 1, pp. 274-280, Jan. 2007
  • Yu Qiao and M. Yasuhara, "Recovering Drawing Order From Offline Handwritten Image Using Direction Context and Optimal Euler Path," 31st International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2006, Toulouse, France
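For context, an ordinary Euler circuit (with no optimality criterion) can be found in linear time with Hierholzer's algorithm. The sketch below shows only that standard baseline; it is not the OEC algorithm from the papers above and ignores any cost attached to consecutive edges. The function name and the edge-list input format are assumptions for illustration.

```python
def euler_circuit(edges, start):
    """Hierholzer's algorithm on an undirected multigraph.

    edges: list of (u, v) pairs. Assumes the graph is connected and every
    vertex has even degree, so that an Euler circuit exists.
    Returns the circuit as a list of vertices beginning and ending at start.
    """
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        u = stack[-1]
        # Drop edges that were already traversed from their other endpoint.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)
        else:
            # No unused edge left at u: it is final in the circuit.
            circuit.append(stack.pop())
    return circuit[::-1]

# Two triangles sharing vertex 0, so vertex 0 has degree 4.
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
circuit = euler_circuit(edges, 0)
print(circuit)
```

At a degree-4 vertex such as vertex 0 here, the circuit can pair incoming and outgoing edges in more than one way; the OEC problem concerns which of these choices is best under the cost defined in the paper.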

4. Face Detection, Tracking and Recognition

This research was mainly done at Microsoft Research Asia, where my mentor was Dr. Stan Li. We used principal angles between subspaces to analyse the face manifolds. I was involved in the following tasks: 1) training a face detector using AdaBoost; 2) developing a multiple face tracking system based on a boosted filter.

5. Safety Control based on Expert Network

6. Intelligent Frequency Estimation using Adaptive Filter