Topic: Speech Recognition and Synthesis

This cluster of papers focuses on advances in speech recognition technology. Topics include acoustic modeling with deep neural networks, speaker verification, convolutional neural networks for speech recognition, end-to-end speech recognition systems, hidden Markov models, sequence-to-sequence models, automatic speech recognition, speaker diarization, and statistical language modeling.
Latest Literature
Text to band gap: Pre-trained language models as encoders for semiconductor band gap prediction

article

Compositional Phoneme Approximation for L1-Grounded L2 Pronunciation Training

article

A Practical Synthesis of Detecting AI-Generated Textual, Visual, and Audio Content

article

Parameter-Efficient Fine-Tuning with Differential Privacy for Robust Instruction Adaptation in Large Language Models

article

Foundation Models for Multimodal MRI Synthesis with Language Guidance

article

EDNet: A Versatile Speech Enhancement Framework With Gating Mamba Mechanism and Phase Shift-Invariant Training

article

A Ground-Truth-Free Framework for Validating Emotions in Generative AI Speech Synthesis

article

The dataset for extending EMNIST evaluation

article

Supervised and semi-supervised domain generalization via consistency learning and maximum coding rate reduction

article

A Survey on Small Language Models

article

Highly Cited Literature from the Past 5 Years
Kaldi Speech Recognition Toolkit

article, 4896 citations, FWCI 0

LLaMA: Open and Efficient Foundation Language Models

preprint, 3855 citations, FWCI 0

Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

article, 1663 citations, FWCI 867.2702

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

article, 1519 citations, FWCI 201.2868

Robust Speech Recognition via Large-Scale Weak Supervision

preprint, 1139 citations, FWCI 0

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

paratext, 925 citations, FWCI 0

Spoken Language Processing

book-chapter, 803 citations, FWCI 1.5588

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

article, 623 citations, FWCI 82.1688

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

article, 482 citations, FWCI 48.6404

Autoencoders and their applications in machine learning: a survey

article, 450 citations, FWCI 152.3163