Computer Vision Research Center, National Yang-Ming Chiao-Tung University
Summary | Development of AI Platform for Smart Drone - Intelligent Flight: Thanks to their high mobility and ability to fly, drones have inspired a growing number of innovative applications and services in recent years. The goal of this project is to resolve the problem of flying an unmanned aerial vehicle (UAV, which is a drone in our case) blind when it is out of human sight or beyond the range of wireless communication. The project pursues three research and development directions, each built on an artificial intelligence (AI) technology: smart sensing, smart control, and smart simulation. Smart sensing - a flight system is developed that can avoid obstacles, complete a flight mission, and land safely. Smart control - an intelligent flight control system and a lightweight somatosensory vest are developed. Smart simulation - a cost-effective training system and a 3D model simplification method are designed. Smart UAV technologies list: Based on these AI technologies, the project's major innovations and benefits include at least the following. In smart sensing, we develop: - A tiny object detection system for vehicles and humans. - A parking lot detection system from the UAV camera. - A building detection and recognition system. - A real-time stereo distance estimation technology on an embedded system. - A single-camera distance estimation technique. - An object tracking system. - A precision landing technology that lets the UAV land on an A4-size area. In smart control, we develop: - An obstacle avoidance system. - A lightweight, wireless somatosensory vest. - An autonomous flight control system. - A UAV object delivery system. 
In smart simulation, we develop: - A VR environment control simulator. - A third-person and first-person view simulator. - A simplified 3D model. For more details, please visit our YouTube channel: https://www.youtube.com/channel/UCWRAGW2BPOx7PkfB0qebNhw/video Depth Estimation via Spatiotemporal Correspondence: Stereo matching and flow estimation are two essential tasks for scene understanding, spatially in 3D and temporally in motion. Existing approaches have focused on the unsupervised setting because the resources for obtaining large-scale ground-truth data are limited. To construct a self-learnable objective, correlated tasks are often linked together in a joint framework. However, prior work usually uses an independent network for each task, which prevents learning feature representations shared across models. In this paper, we propose a single, principled network that jointly learns spatiotemporal correspondence for stereo matching and flow estimation, with a newly designed geometric connection serving as the unsupervised signal for temporally adjacent stereo pairs. We show that our method performs favorably against several state-of-the-art baselines for both unsupervised depth and flow estimation on the KITTI benchmark dataset. This technique was published at IEEE CVPR 2019. Company Description: Intelligent video surveillance is a cutting-edge technology that has attracted much attention in recent years. To raise Taiwan's standing in the visual surveillance industry, the CVRC of NYCU, which brings together top professors from domestic universities and research institutes, has been developing key technologies ready for transfer to manufacturers since 2004. The CVRC has achieved fruitful results in the past decade and has developed nearly 200 core technologies and nearly 100 transferable technologies. 
It has become Taiwan's largest, and a world-leading, "visual surveillance technology collection" and "computer vision talent pool". |
||
---|---|---|---|
Technical Film | |||
Keyword | Development of AI Platform for Smart Drone - Intelligent Flight; Smart UAV technologies list; Depth Estimation via Spatiotemporal Correspondence | ||
Research Project | |||
Research Team |
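The summary above lists a real-time stereo distance estimation technology without giving its details. As a rough illustration only, the sketch below shows the standard pinhole stereo relation Z = f * B / d, which turns a matched disparity into a metric distance. It is not the CVRC implementation: the function name, the focal length, and the baseline are all hypothetical values chosen for the example.

```python
import math


def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Return depth in metres from a stereo disparity using Z = f * B / d."""
    if disparity_px <= 0:
        # Zero (or negative) disparity: no valid match, or a point
        # effectively at infinity.
        return math.inf
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera parameters, assumed for illustration only.
FOCAL_LENGTH_PX = 800.0  # focal length expressed in pixels
BASELINE_M = 0.1         # distance between the two cameras in metres

# Larger disparities correspond to closer objects.
depths = [
    disparity_to_depth(d, FOCAL_LENGTH_PX, BASELINE_M)
    for d in (64.0, 8.0, 0.0)
]
```

With these assumed parameters, a disparity of 64 pixels maps to a much shorter distance than a disparity of 8 pixels, which is why disparity accuracy matters most for far-away objects.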
More like this
Providing the latest information on AI research centers and applied industries
-
Embedding multimodal machine intelligence in the digital life of AI technology
This project collaborates with an international team to collect a very large-scale Chinese emotional corpus. On the technical side, it also examines fairness in speech emotion recognition, to address social issues that may affect the usability of emotion recognition. One finding is that the database annotations all reflect gender-biased labeling perspectives, which leads to biases in the trained model. To address this problem, preliminary results have been achieved in developing fairness techniques, and they will be submitted in the near future.
-
Deep Reinforcement Learning in Autonomous Miniature Car Racing
This project develops a high-performance end-to-end reinforcement learning training platform for autonomous miniature car racing. With this platform, our team won the championship of Amazon DeepRacer, a worldwide autonomous racing competition. In addition, by combining various reinforcement learning algorithms and frameworks, our self-developed autonomous racing platform can operate at a much higher speed, surpassing the performance of Amazon DeepRacer.
-
A comprehensive evaluation of self-supervised speech models - SUPERB
Machines need annotations to learn, but human babies learn human languages with almost no annotations. Can machines do the same thing? To allow machines to learn human languages from observation alone, as human babies do, a research team in Taiwan has partnered with the speech research groups at Meta, CMU, MIT, and JHU to develop a brand-new evaluation framework for self-supervised speech processing, the Speech Processing Universal PERformance Benchmark (SUPERB).
-
Advanced Technologies for Designing Trustable AI Services
This integrated research project follows Taiwan's 2030 Science & Technology Vision and takes the LOHAS community and inclusive technology as its major research direction. We aim to develop trustable AI technologies and introduce them into future smart services. This will realize the development of human-centric smart technology and strengthen the governance and application of emerging technologies. The integrated project consists of 7 sub-projects led by PIs from National Taiwan University, National Tsing Hua University, and Academia Sinica, and is composed of top AI technology teams. These sub-projects are divided into 3 clusters: machine learning (sub-projects 1 and 2), computer vision (sub-projects 3 and 4), and human-centric computing (sub-projects 5, 6, and 7). We will deal with issues such as bias, fairness, transparency, explainability, and traceability from the aspects of data collection, technology, and real-world application. Each sub-project will implement specific smart services to reflect the benefits and practical applications of the developed technologies. The NTU Joint Research Center for AI Technology and All Vista Healthcare, an AI Innovation Research Center supported by MOST, is responsible for the management, planning, and execution of the integrated research project. We will propose a plan that can be generalized and applied to the intelligent service industry.
-
Ckip Lab
Textual Advertisement Generator: Given only limited specifics of a product, the AI Advertisement Producer can automatically generate a large number of high-quality descriptions and advertisements for it in just one second. It does not produce just one copy: with deep learning and natural language processing technologies trained on millions of existing samples, our AI model can produce advertisements in various styles at the same time for users to choose from. It can serve as a helper or virtual brainstorming partner for any brand or advertiser creating advertisements.
-
Stepped Respiratory Care Platform based on Zero-Contact Physiological Monitoring System
The platform combines millimeter-wave radar detection of chest-movement breathing patterns and heart rate, continuous blood oxygen monitoring, active recording of disease status by a chatbot, and mobile phone analysis of activity frequency patterns during 30 seconds of alternating sitting and standing. Through AI modeling, it establishes a personalized respiratory capacity benchmark, which can be applied to zero-contact respiratory physiological monitoring and is useful for infectious disease wards, epidemic prevention hotels, and centralized quarantine centers.
-
Deep Learning Based Anomaly Detection
For video anomaly detection, we apply pretrained models to obtain the foreground and the optical flow as ground truth. Our model then estimates this information from only a single input frame. For human behaviors, we take human poses as input and use a GCN-based model to predict future poses. In both cases, the anomaly score is given by the estimation error. For defect detection, our model takes image patches as input and learns to extract features. The anomaly score of each patch is given by the distance between the patch and all the training patches.
-
Visually Impaired Navigation Dialogue System with Multiple AI Models
The dialogue system is the main subsystem of the navigation system for the visually impaired; it supplies destinations to the navigation system through multi-turn dialogue. We use a knowledge graph as the basis for reasoning. For close-range navigation, deep learning is used to develop an RGB-camera depth estimation algorithm, an indoor semantic segmentation algorithm, and indoor obstacle avoidance that integrates depth estimation with indoor semantic segmentation, among others. The whole system uses the CellS software design framework to integrate distributed AIoT systems.
-
A deep learning based outdoor walking assistive system for the visually impaired
We provide a wearable device that helps the visually impaired walk outdoors. Using a deep learning network, the system recognizes safe areas such as sidewalks and crosswalks and guides the user to walk on them. In addition, it recognizes common types of obstacles and guides the user to avoid them in advance. Finally, it converts a Google Maps route into easy-to-understand voice prompts that guide the user in the right direction.
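The Deep Learning Based Anomaly Detection entry above says that, for defect detection, each patch is scored by its distance to the training patches. The entry does not say how the distances are aggregated, so the sketch below assumes one common choice: the Euclidean distance to the nearest training patch feature. The function name and the toy 2-D features are hypothetical stand-ins for learned patch embeddings.

```python
import math


def patch_anomaly_score(patch_feature, training_features):
    """Score a patch by its distance to the nearest training patch feature.

    Assumption: the minimum Euclidean distance is used as the aggregate;
    the source only says the score comes from distances to training patches.
    """
    return min(math.dist(patch_feature, ref) for ref in training_features)


# Toy 2-D features standing in for learned embeddings of defect-free patches.
training_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# A patch that matches the training data scores low; an outlier scores high.
normal_score = patch_anomaly_score((0.0, 0.0), training_features)
defect_score = patch_anomaly_score((4.0, 4.0), training_features)
```

In practice a threshold on this score, chosen on held-out defect-free images, would separate normal patches from suspected defects.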