Topic: Speech Recognition and Synthesis

This cluster of papers covers advances in speech recognition technology, including acoustic modeling with deep neural networks, speaker verification, convolutional neural networks for speech recognition, end-to-end speech recognition systems, hidden Markov models, sequence-to-sequence models, automatic speech recognition, speaker diarization, and statistical language modeling.
Latest Literature
The Sampling-Assisted Pathloss Radio Map Prediction Competition

article

Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

article

Cyrus: A DRL-based Puncturing Solution to URLLC/eMBB Multiplexing in O-RAN

article

Disentanglement in a GAN for Unconditional Speech Synthesis

article

Validation of an ECAPA-TDNN system for Forensic Automatic Speaker Recognition under case work conditions

article

Speech recognition sensors and artificial intelligence automatic evaluation application in English oral correction system

article

LightCodec: A High Fidelity Neural Audio Codec with Low Computation Complexity

article

Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification

article

TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

article

FeatAug-DETR: Enriching One-to-Many Matching for DETRs With Feature Augmentation

article

Highly Cited Literature from the Past 5 Years
EMBI

preprint 44944 FWCI 0

LLaMA: Open and Efficient Foundation Language Models

preprint 3778 FWCI 0

TIMIT Acoustic-Phonetic Continuous Speech Corpus

dataset 2544 FWCI 0

LoRA: Low-Rank Adaptation of Large Language Models

preprint 2331 FWCI 0

Prefix-Tuning: Optimizing Continuous Prompts for Generation

article 1940 FWCI 211.09094843

Xlnet: Generalized Autoregressive Pretraining for Language Understanding

preprint 1856 FWCI 0

Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

article 1659 FWCI 1800.35051355

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

article 1301 FWCI 254.7342652

Robust Speech Recognition via Large-Scale Weak Supervision

preprint 1112 FWCI 0

Google Speech Commands-Musan test set

preprint 1071 FWCI 0