Deep learning applications

Lane detection for self-driving cars

In this project we created a state-of-the-art road-lane detector for autonomous vehicles. To develop a robust and accurate system, we combined a broad set of techniques: deep learning, robust statistics, and convex optimization. This system is the only one on the market that does not require any human-labeled data for training, which matters because manually labeling a dataset of millions of frames from a video stream is extremely expensive. The semi-supervised machine learning approach that we developed relies heavily on the spatial and temporal statistical properties of the video stream from the vehicle’s camera.

Region segmentation

In this project, we develop a method to segment regions in three-dimensional point clouds. For example, in the image above one can see a cube with rounded edges. The goal is to automatically segment the cube into a set of faces. The problem is difficult in this particular case because the edges of the cube are rounded. More generally, we assume that (i) the shape and the number of regions in the point cloud are not known and (ii) the point cloud may be noisy.

The method consists of two steps. In the first step, we use a deep neural network to predict the probability that two small patches from the point cloud belong to the same region. In the second step, we use a convex-optimization-based method to improve the network's predictions by enforcing consistency constraints.
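To make the idea of a consistency constraint concrete, here is a minimal sketch under stated assumptions. The patch indices, the `pair_prob` dictionary, and the 0.5 threshold are hypothetical, and union-find merging is a simple stand-in for the convex-optimization step, not the method used in the project. It only illustrates transitivity: if patches 0 and 1 belong together and patches 1 and 2 belong together, then 0 and 2 must share a region, even when the direct prediction for that pair disagrees.

```python
from typing import Dict, List, Tuple


def consistent_regions(
    n_patches: int,
    pair_prob: Dict[Tuple[int, int], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the project
) -> List[int]:
    """Turn pairwise same-region probabilities into one label per patch.

    Stand-in for the consistency step: confident pairs are merged with
    union-find, so the result is always a valid (transitive) partition.
    """
    parent = list(range(n_patches))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), prob in pair_prob.items():
        if prob > threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[max(root_i, root_j)] = min(root_i, root_j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    root_to_label: Dict[int, int] = {}
    return [root_to_label.setdefault(find(i), len(root_to_label))
            for i in range(n_patches)]


# Hypothetical network outputs for five patches. The pair (0, 2) is
# predicted as "different", which contradicts (0, 1) and (1, 2).
example_probs = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.3, (2, 3): 0.1, (3, 4): 0.95}
labels = consistent_regions(5, example_probs)
print(labels)
```

In this toy input the contradictory prediction for the pair (0, 2) is overruled by the two confident pairs around it. A convex formulation can weigh such conflicts against each other instead of merging greedily.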

The method can be seen as a robust and flexible alternative to the well-known region-growing segmentation algorithm.

Fricative phoneme detection with zero delay

People with high-frequency hearing loss rely on hearing aids that employ frequency-lowering algorithms. These algorithms shift some sounds from the high-frequency band to a lower-frequency band, where they become more perceptible to people with this condition. A substantial part of the content of fricative phonemes is concentrated in high-frequency bands. The frequency-lowering algorithm should therefore be active exactly for the duration of a fricative phoneme and switched off at all other times. This makes timely (zero-delay) and accurate fricative phoneme detection a key problem for high-quality hearing aids.

In this project we develop a deep-learning-based fricative phoneme detection algorithm that has zero detection delay and achieves state-of-the-art fricative phoneme detection accuracy on the TIMIT Speech Corpus.
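The causality constraint behind the zero-delay requirement can be sketched as follows. This is an illustration under stated assumptions, not the project's detector: the zero-crossing-rate feature is a classical stand-in for the neural network, and the function name, `window`, and `threshold` are hypothetical. The point is only that each decision is computed from the current and past samples, so no lookahead buffer is needed.

```python
import math
from collections import deque
from typing import Iterable, Iterator


def causal_detector(
    samples: Iterable[float],
    window: int = 32,        # hypothetical history length, in samples
    threshold: float = 0.3,  # hypothetical zero-crossing-rate cut-off
) -> Iterator[bool]:
    """Yield one decision per input sample, using no future samples.

    Stand-in feature: the fraction of sign changes in the trailing
    window, which is high for noise-like, high-frequency sounds.
    """
    history: deque = deque(maxlen=window)
    for x in samples:
        history.append(x)
        h = list(history)
        crossings = sum(1 for a, b in zip(h, h[1:]) if (a >= 0) != (b >= 0))
        rate = crossings / max(len(h) - 1, 1)
        yield rate > threshold


# Toy signal: 200 samples of a slow sine wave followed by 200 samples
# that alternate in sign (a crude high-frequency segment).
low = [math.sin(2 * math.pi * n / 100) for n in range(200)]
high = [(-1.0) ** n for n in range(200)]
decisions = list(causal_detector(low + high))
```

Even though this stand-in never looks ahead, it only fires once enough high-frequency evidence has accumulated in its window, so a causal structure alone does not guarantee a timely decision.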

Materials:

Fricative phoneme detection with zero delay
M. Yurt, A. N. Escalante B., and V. I. Morgenshtern
2019, submitted