Topic: Speech Recognition and Synthesis

This cluster of papers covers advances in speech recognition technology, including acoustic modeling with deep neural networks, speaker verification, convolutional neural networks for speech recognition, end-to-end speech recognition systems, hidden Markov models, sequence-to-sequence models, automatic speech recognition, speaker diarization, and statistical language modeling.
Latest Papers
Applications of Machine Learning in Speech Recognition

article Full Text OpenAlex

Tokenization Matters: Improving Zero-Shot NER for Indic Languages

article Full Text OpenAlex

DEIM: DETR with Improved Matching for Fast Convergence

article Full Text OpenAlex

SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer

article Full Text OpenAlex

One Year On: Assessing Progress of Multimodal Large Language Model Performance on RSNA 2024 Case of the Day Questions

article Full Text OpenAlex

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

article Full Text OpenAlex

Autoregressive Speech Synthesis without Vector Quantization

article Full Text OpenAlex

Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?

article Full Text OpenAlex

ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models

article Full Text OpenAlex

Knowledge-Aligned Domain Shift Tuning for Efficient Adaptation in Large Language Models

article Full Text OpenAlex

Highly Cited Papers from the Past 5 Years
LLaMA: Open and Efficient Foundation Language Models

preprint Full Text OpenAlex 2981 FWCI 0

LoRA: Low-Rank Adaptation of Large Language Models

preprint Full Text OpenAlex 1852 FWCI 0

The Power of Scale for Parameter-Efficient Prompt Tuning

article Full Text OpenAlex 1806 FWCI 147.593

Prefix-Tuning: Optimizing Continuous Prompts for Generation

article Full Text OpenAlex 1697 FWCI 142.987

Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

article Full Text OpenAlex 1662 FWCI 1375.44

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

article Full Text OpenAlex 1300 FWCI 124.633

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

article Full Text OpenAlex 856 FWCI 105.711

Spoken Language Processing

book-chapter Full Text OpenAlex 833 FWCI 3.287

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

paratext Full Text OpenAlex 792 FWCI 0

Robust Speech Recognition via Large-Scale Weak Supervision

preprint Full Text OpenAlex 735 FWCI 0