Embedding multimodal machine intelligence in the digital life of AI technology
Summary

**Practical Verification.** For robust processing of speech signals, 8 kHz audio files are converted into high-quality 16 kHz audio, on top of which anti-noise technology is developed. The speech technology is applied to real-time mode conversion, as well as to personal stress estimation based on speech and physiological signals collected in the field. These technologies will be verified in practice through industry-university cooperation; in particular, the robustness of the speech processing will be validated in customer-service scenarios.

**Trustworthy AI.** Through international collaboration with the team that established MSP-PODCAST, a large-scale Chinese corpus is being constructed. All construction details, applied technologies, and versions are recorded, including data analysis, voice activity detection (VAD), speaker recognition, de-noising, automatic detection of speechless segments, emotion retrieval, and annotation strategies. Because the collection process keeps every record accumulated during construction, the fairness and reliability of the speech-emotion corpus can be examined from multiple perspectives.

**Coping with the impact of humanities and legal/governance issues on sharing data and AI models.** The relevant issues go beyond database-collection procedures. Data-collection policies and procedures have been established, and the collected public data sources comply with the terms of the Creative Commons (CC) license. According to the data sources and data types, relevant laws, regulations, orders, authors' authorizations, and contracts are taken into consideration, and a standardized form has been formulated to ensure that the project team processes and uses the data on a legal and authorized basis.

| Keyword | Emotional Corpus, Fairness Algorithm, Speech Emotion Recognition |
|---|---|
| Research Project | Advanced Technologies for Designing Trustable AI Services |
| Research Team | Led by PI: Prof. Hsin-Hsi Chen, National Taiwan University; Co-PI: Prof. Chi-Chun Lee, National Tsing Hua University |
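The 8 kHz-to-16 kHz conversion mentioned above presumably relies on learned bandwidth extension; as a minimal illustration of the sample-rate side of that step only, doubling the rate can be sketched with plain linear interpolation (the function name and signal values here are hypothetical, not the project's pipeline):

```python
def upsample_8k_to_16k(samples):
    """Double the sample rate by linear interpolation (8 kHz -> 16 kHz).

    A real system would use a proper low-pass polyphase filter or a learned
    bandwidth-extension model; this is only a minimal sketch.
    """
    out = []
    for i, s in enumerate(samples):
        out.append(s)
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append((s + nxt) / 2.0)  # midpoint between consecutive samples
    return out

pcm_8k = [0.0, 0.5, 1.0, 0.5, 0.0]      # toy 8 kHz waveform
pcm_16k = upsample_8k_to_16k(pcm_8k)
print(len(pcm_16k))   # 10: twice as many samples
print(pcm_16k[:4])    # [0.0, 0.25, 0.5, 0.75]
```

Interpolation alone cannot recreate the high-frequency content missing from narrowband telephone audio, which is why "high-quality 16k" conversion is a research problem rather than a resampling call.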
More like this
- Deep Reinforcement Learning in Autonomous Miniature Car Racing
This project develops a high-performance end-to-end reinforcement learning training platform for autonomous miniature car racing. With this platform, our team won the championship of Amazon DeepRacer, a worldwide autonomous racing competition. In addition, by combining various reinforcement learning algorithms and frameworks, our self-developed autonomous racing platform can operate at much higher speeds, surpassing the performance of Amazon DeepRacer.
- A comprehensive evaluation of self-supervised speech models - SUPERB
Machines need annotations to learn, but human babies learn human languages with almost no annotations. Can machines do the same thing? To allow machines to learn human languages from observations alone, like human babies, a research team in Taiwan has partnered with speech research groups at Meta, CMU, MIT, and JHU to develop a brand-new self-supervised speech processing evaluation framework, the Speech Processing Universal PERformance Benchmark (SUPERB).
- Advanced Technologies for Designing Trustable AI Services
This integrated research project follows Taiwan's 2030 Science & Technology Vision and takes the LOHAS community and inclusive technology as its major research directions. We aim to develop trustable AI technologies and introduce them into future smart services, realizing the development of human-centric smart technology and strengthening the governance and application of emerging technologies. The integrated project consists of 7 sub-projects led by PIs from National Taiwan University, National Tsing Hua University, and Academia Sinica, and is composed of top AI technological teams. These sub-projects are divided into 3 clusters: machine learning (sub-projects 1 and 2), computer vision (sub-projects 3 and 4), and human-centric computing (sub-projects 5, 6, and 7). We will address the issues of bias, fairness, transparency, explainability, traceability, and so on, from the aspects of data collection, technology, and application deployment. Each sub-project will implement specific smart services to demonstrate the benefits and practical applications of the developed technologies. The NTU Joint Research Center for AI Technology and All Vista Healthcare, an AI Innovation Research Center supported by MOST, is responsible for the management, planning, and execution of the integrated research project. We will propose a plan that can be generalized and applied to the intelligent service industry.
- Computer Vision Research Center, National Yang Ming Chiao Tung University
Development of AI Platform for Smart Drone - Intelligent Flight: Thanks to its high mobility and ability to fly, the drone has inspired more and more innovative applications and services in recent years. The goal of this project is to solve the problem of blindly flying an unmanned aerial vehicle (UAV; a drone in our case) when it is out of human sight or beyond the range of wireless communication, and three major research and development directions are considered. Three artificial intelligence (AI) technologies are applied in this project: smart sensing, smart control, and smart simulation. Smart sensing - a flight system is developed that can avoid obstacles, complete a flight mission, and land safely. Smart control - an intelligent flight control system and a lightweight somatosensory vest are developed. Smart simulation - a cost-effective training system and a 3D model simplification method are designed.
- CKIP Lab
Textual Advertisement Generator: Given even limited specifics of a product, the AI Advertisement Producer can automatically generate large numbers of top-quality descriptions and advertisements for that product in just one second, and not just a single copy. With deep learning and natural language processing technologies trained on millions of existing samples, our AI model can produce advertisements in various styles at the same time for users to select from. It will be a great helper, or a virtual brainstorming partner, for any brand or advertiser creating advertisements.
- Stepped Respiratory Care Platform based on Zero-Contact Physiological Monitoring System
Combining millimeter-wave radar detection of chest-motion breathing patterns and heart rate, continuous blood-oxygen monitoring, active disease records from a chatbot, and mobile-phone analysis of activity frequency during 30-second sit-to-stand alternation, a personalized respiratory-capacity benchmark is established through AI modeling. It can be applied to zero-contact respiratory physiological monitoring and is useful for infectious-disease wards, epidemic-prevention hotels, and centralized quarantine centers.
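The core idea of a personalized benchmark is that each user is compared against their own history rather than a population norm. A minimal sketch, assuming the benchmark is a per-user mean and standard deviation of respiration rate (the function names, threshold, and sample values are illustrative, not the project's actual model):

```python
import statistics

def fit_baseline(breaths_per_min):
    """Personalized respiratory baseline: mean and std of the user's own history."""
    return statistics.mean(breaths_per_min), statistics.stdev(breaths_per_min)

def is_abnormal(value, mean, std, z=3.0):
    """Flag readings more than z standard deviations from the personal baseline."""
    return abs(value - mean) > z * std

history = [14.2, 15.1, 14.8, 15.5, 14.9, 15.0, 14.6]  # resting respiration rates
mu, sigma = fit_baseline(history)
print(is_abnormal(15.2, mu, sigma))  # within the normal range -> False
print(is_abnormal(26.0, mu, sigma))  # far above baseline -> True
```

A deployed system would combine several signals (heart rate, SpO2, activity patterns) in one model, but the per-user baseline comparison works the same way.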
- Deep Learning Based Anomaly Detection
For video anomaly detection, we apply pretrained models to obtain the foreground and the optical flow as ground truth; our model then estimates this information from only a single frame. For human behaviors, we take human poses as input and use a GCN-based model to predict future poses. In both works, the anomaly score is given by the estimation error. For defect detection, our model takes patches of the image as input and learns to extract features; the anomaly score of each patch is given by the distance between the patch and all the training patches.
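The patch-distance scoring described above can be sketched as a nearest-neighbor search in feature space: patches far from every normal training patch score high. This is a minimal sketch with toy random features, not the project's learned feature extractor:

```python
import math
import random

def patch_anomaly_scores(test_feats, train_feats, k=3):
    """Score each test patch by its mean Euclidean distance
    to the k nearest normal training-patch features."""
    scores = []
    for t in test_feats:
        dists = sorted(math.dist(t, tr) for tr in train_feats)
        scores.append(sum(dists[:k]) / k)  # high score = far from every normal patch
    return scores

random.seed(0)
dim = 16
# Features of normal training patches, clustered around the origin.
normal_train = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
# Five normal test patches plus one defective patch shifted far away.
test_patches = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
test_patches.append([random.gauss(6, 1) for _ in range(dim)])
scores = patch_anomaly_scores(test_patches, normal_train)
print(scores.index(max(scores)))  # 5: the defective patch scores highest
```

Averaging over the k nearest neighbors rather than using the single closest patch makes the score less sensitive to one accidentally similar training patch.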
- Visually Impaired Navigation Dialogue System with Multiple AI Models
The dialogue system is the main subsystem of the visually impaired navigation system; it provides destinations for the navigation system through multi-turn dialogues. We use a knowledge graph as the basis for reasoning. For close-range navigation, deep learning technology is used to develop depth estimation from an RGB camera, indoor semantic segmentation, and their integration for indoor obstacle avoidance. The whole system uses the CellS software design framework to integrate distributed AIoT systems.
- A deep learning based outdoor walking assistive system for the visually impaired
We provide a wearable device for the visually impaired to walk outdoors. Using a deep learning network, the system can recognize safe areas such as sidewalks and crosswalks and guide the visually impaired to walk on them. In addition, it can recognize the types of common obstacles and guide the visually impaired to avoid them in advance. Finally, we convert a Google Maps route into easy-to-understand voice prompts that guide the visually impaired in the right direction.
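Turning a route into voice prompts amounts to mapping each navigation step to a short spoken phrase. A minimal sketch, assuming a simplified step format (the `maneuver`/`distance_m` schema here is hypothetical, not the actual Google Maps API response):

```python
def route_to_prompts(steps):
    """Turn simplified route steps into short spoken instructions.

    Each step is a dict with 'maneuver' ('left' / 'right' / 'straight')
    and 'distance_m' (distance until the maneuver, in meters).
    """
    phrases = {
        "left": "turn left",
        "right": "turn right",
        "straight": "continue straight",
    }
    return [
        f"In {step['distance_m']} meters, {phrases[step['maneuver']]}."
        for step in steps
    ]

route = [
    {"maneuver": "straight", "distance_m": 120},
    {"maneuver": "left", "distance_m": 40},
    {"maneuver": "right", "distance_m": 15},
]
for prompt in route_to_prompts(route):
    print(prompt)  # e.g. "In 40 meters, turn left."
```

A real assistive system would also re-time the prompts from live GPS position and feed the text to a speech synthesizer, but the route-to-phrase mapping is the core of the "easy-to-understand voice prompts" step.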