Paper Digest: Recent Papers on AI for Music
The Paper Digest Team extracted all recent papers related to AI for Music on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is continuously updated to include the most recent work on this topic.
Since 2018, Paper Digest has built a foundation of data spanning decades of conferences, journals, and research topics. The platform features a daily digest service that sifts through tens of thousands of new papers, clinical trials, news articles, and community posts, filtering the noise to highlight what matters most to specific interests. Beyond daily updates, dozens of built-in research tools streamline the academic workflow, supporting efficient reading and writing, comprehensive literature reviews, and automated research report generation.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on AI for Music
| # | Paper | Author(s) | Source | Date |
|---|---|---|---|---|
| 1 | MusicDET: Zero-Shot AI-Generated Music Detection Highlight: To address this issue, we formulate a zero-shot setting for AI-generated music detection, where the detector is trained exclusively on real music without access to any generated samples. Under this setting, we propose MusicDET, a generator-agnostic detection framework based on frequency-guided normalizing flows that probabilistically models the distribution of real music features. | Chaolei Han; Hongsong Wang; Jie Gui | arxiv-cs.SD | 2026-05-18 |
| 2 | Publisher Correction: Research on Song Dynasty Copper Mirror Pattern Recognition Based on MOEAD | Qing Feng; Kexin Yu; Yaxuan Li; Tian Ma | npj Heritage Science | 2026-05-18 |
| 3 | A Dataset for The Recognition of Historical and Handwritten Music Scores in Western Notation Abstract: A large amount of musical heritage has been digitised by memory institutions: libraries, museums, and archives. Nevertheless, the field of Optical Music Recognition (OMR) has … | Pau Torras et al. | arxiv-cs.CV | 2026-05-18 |
| 4 | MusicSynth: An Automated Pipeline for Generating Violin Fingerboard Animations from Sheet Music Using Optical Music Recognition Highlight: The only part built from scratch is the lookup table that maps each musical note to a string and finger position on the violin. | Abhimanyu Kaushik | arxiv-cs.SD | 2026-05-16 |
| 5 | Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music Highlight: This work introduces a new resource for generative music research and illustrates the adaptability of music generation models to underrepresented cultural and linguistic contexts. | Mohammad Hossein Sameti; Diba Hadi Esfangereh; Sepehr Harfi Moridani; Leili Javidpour; Mahdieh Soleymani Baghshah | arxiv-cs.SD | 2026-05-14 |
| 6 | Cross-Linguistic Complexity and Language-Specific Sentiment: Multifractal Structure and Emotional Valence in Popular Music Lyrics Across Three Languages Highlight: We investigate the linguistic complexity and emotional valence of popular song lyrics across English (n=1491), Spanish (n=307), and German (n=225), using an analytical corpus of 2023 tracks drawn from 2113 deduplicated tracks on Spotify’s weekly Top 200 charts (2019–2021). | Fateme Khanipour; Zeinab Shahbazi; Sara Behnamian; Fatemeh Fogh; Nathan Blood | Computers | 2026-05-14 |
| 7 | Dance Motion-Guided Music Generation Via Residual Vector Quantization IF:3 Highlight: We propose a novel deep learning-based method for generating music from human dance movements. | Shuhong Lin; Moshe Zukerman; Hong Yan | Electronics | 2026-05-14 |
| 8 | Text2Score: Generating Sheet Music From Textual Prompts Highlight: We present Text2Score, a two-stage framework comprising a planning stage and an execution stage for generating sheet music from natural language prompts. | Keshav Bhandari et al. | arxiv-cs.SD | 2026-05-13 |
| 9 | Poly-SVC: Polyphony-Aware Singing Voice Conversion with Harmonic Modeling Highlight: In this paper, we innovatively propose Poly-SVC, a zero-shot, cross-lingual singing voice conversion system designed to process residual harmonies. | Chen Geng; Meng Chen; Ruohua Zhou; Ruolan Liu; Weifeng Zhao | arxiv-cs.SD | 2026-05-12 |
| 10 | Quantitative Analysis of Audio Descriptors for Emotion-Based Music Classification Highlight: This research tries to fix that problem by creating a system that can classify music based on how it sounds. | Dr. Prabha Nair; Sarvagya Dubey | International Journal of Science, Strategic Management and … | 2026-05-12 |
| 11 | Emotionally Aware Music Playback: MoodSync Highlight: Music selection currently requires intervention by the user. To bridge this gap, this paper introduces MoodSync. | Diya Goyal | International Journal for Research in Applied Science and … | 2026-05-12 |
| 12 | A Deep Learning Framework for Emotion-Conditioned Personalized Music Recommendation Highlight: While music acts as a powerful emotional regulator, traditional recommendation systems often fail to account for a user’s immediate affective state, relying instead on static historical logs. We present EmotionMuse, a modular deep learning framework that bridges this gap by integrating real-time facial expression analysis with history-conditioned music suggestions. | Dr. Prabha Nair; Sagar Bargoti | International Journal of Science, Strategic Management and … | 2026-05-11 |
| 13 | Music Emotion Classification Based on Deep Neural Network and Its Application in Context Aware Recommendation System Highlight: The classification of music emotion has emerged as a powerful tool to improve personalized music recommendation systems by aligning musical content with users’ emotional states. This research presents a novel deep learning-based model for classifying music emotions and integrating these classifications into a context-aware recommendation framework. | Yu Liu | Discover Artificial Intelligence | 2026-05-11 |
| 14 | Investigation of Part-level Perceptual Music Similarity By Large-scale Listening Test Highlight: This study investigates perceptual similarity at two levels: music tracks (track-level) and the individual instrumental parts that compose them (part-level). | Yuka Hashizume; Tomoki Toda | APSIPA Transactions on Signal and Information Processing | 2026-05-08 |
| 15 | Exploratory Meta-analysis of The Effect of Music Intervention on Arousal Promotion in Patients with Disorders of Consciousness: Evidence from Controlled Studies Highlight: This study systematically reviewed controlled trials investigating the effects of music intervention on the level of consciousness in patients with DoC and performed an exploratory meta-analysis. | Jiayi Gu; Chengjuan Li; Wei Long; Siqin Zeng; Xiaoying Zhang | Frontiers in Neuroscience | 2026-05-08 |
| 16 | Do Melody and Rhythm Coevolve? Highlight: We introduce a novel computational pipeline to extract vocal melodic pitch-interval and percussive inter-onset timing distributions from 27,628 popular songs across 59 countries, enabling large-scale cross-cultural comparison that bypasses traditional music annotations. | Harin Lee et al. | arxiv-cs.SD | 2026-05-07 |
| 17 | Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation Highlight: Most large-scale symbolic music systems target melody, multi-track arrangement, or audio synthesis, and chord-only models tend to be relegated to conditioning components inside larger pipelines. | Jinju Lee | arxiv-cs.SD | 2026-05-06 |
| 18 | VocalParse: Towards Unified and Scalable Singing Voice Transcription with Large Audio Language Models Highlight: Despite their utility, current automatic transcription systems face significant challenges: they often rely on complex multi-stage pipelines, struggle to recover text-note alignments, and exhibit poor generalization to out-of-distribution (OOD) singing data. To alleviate these issues, we present VocalParse, a unified singing voice transcription (SVT) model built upon a Large Audio Language Model (LALM). | Yukun Chen; Tianrui Wang; Zhaoxi Mu; Xinyu Yang; EngSiong Chng | arxiv-cs.SD | 2026-05-06 |
| 19 | APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music Highlight: We propose APEX, the first large-scale multi-task learning framework for AI-generated music, trained on over 211k songs (10k hours of audio) from Suno and Udio, that jointly predicts engagement-based popularity signals – streams and likes scores – alongside five perceptual aesthetic quality dimensions from frozen audio embeddings extracted from MERT, a self-supervised music understanding model. | Jaavid Aktar Husain; Dorien Herremans | arxiv-cs.SD | 2026-05-05 |
| 20 | CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction Highlight: While music generation models have evolved to handle complex multimodal inputs mixing text, lyrics, and reference audio, evaluation mechanisms have lagged behind, remaining fragmented and narrowly focused. In this paper, we bridge this critical gap by establishing a comprehensive ecosystem for Compositional Music Instruction (CMI) reward modeling, where the generated music may be conditioned on text descriptions, lyrics, and/or audio prompts. | Yinghao Ma et al. | icml | 2026-05-05 |
| 21 | TMD-Bench: A Multi-Level Evaluation Paradigm for Music–Dance Co-Generation Highlight: We introduce TMD-Bench, a benchmark for text-driven music–dance co-generation that assesses systems across unimodal generation quality, instruction adherence, and cross-modal rhythmic alignment. | Xiaoda Yang et al. | icml | 2026-05-05 |
| 22 | Cosmodoit: A Python Package for Adaptive, Efficient Pipelining of Feature Extraction from Performed Music Highlight: Although many algorithms and tools exist for tasks such as performance-to-score alignment and symbolic or audio feature extraction, they are spread across different programming languages and data formats, making them difficult to combine efficiently. To address this problem, we present Cosmodoit, a novel Python package designed to streamline feature extraction from performed music. | Corentin Guichaoua; Daniel Bedoya; Elaine Chew | arxiv-cs.SD | 2026-05-05 |
| 23 | MoEvalSong: A Mixture-of-Experts Model for Automatic Song Aesthetics Evaluation Highlight: Aesthetic evaluation is essential for judging the quality of generated music, yet it remains largely underexplored. To address this challenge, this paper introduces a Mixture-of-Experts (MoE) model, named MoEvalSong, for song evaluation. | H. Zhang | icassp | 2026-05-04 |
| 24 | The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge Highlight: This paper summarizes the ICASSP 2026 Automatic Song Aesthetics Evaluation (ASAE) Challenge, which focuses on predicting the subjective aesthetic scores of AI-generated songs. | G. Ma | icassp | 2026-05-04 |
| 25 | Poly-SVC: Polyphony-Aware Singing Voice Conversion with Harmonic Modeling Highlight: In this paper, we innovatively propose Poly-SVC, a zero-shot, cross-lingual singing voice conversion system designed to process residual harmonies. | C. Geng; M. Chen; R. Zhou; R. Liu; W. Zhao | icassp | 2026-05-04 |
| 26 | ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following Highlight: While symbolic music has been widely used in generation tasks, LLM capabilities in understanding and reasoning about symbolic music remain largely underexplored. To address this gap, we propose ABC-Eval, the first open-source benchmark dedicated to the understanding and instruction-following capabilities in text-based ABC notation scores. | J. Zhao; Y. Li; W. Li; K. Yoshii | icassp | 2026-05-04 |
| 27 | ALMA-Chor: Leveraging Audio-Lyric Alignment with Mamba for Chorus Detection Highlight: In this paper, we propose ALMA-Chor, an end-to-end framework that jointly models audio and lyrics for chorus detection. | R. Bao | icassp | 2026-05-04 |
| 28 | StylePitcher: Generating Style-Following and Expressive Pitch Curves for Versatile Singing Tasks Highlight: We propose StylePitcher, a general-purpose pitch curve generator that learns singer style from reference audio while preserving alignment with the intended melody. | J. Huang | icassp | 2026-05-04 |
| 29 | Synergizing Large-Scale Music Representations and Metric-Based Meta-Learning For Few-Shot Song Aesthetics Evaluation Highlight: With only 2k+ labeled songs, the task reduces to a few-shot regime, imposing a severe generalization burden. To tackle these challenges, we leverage tens of millions of tracks from Netease Cloud Music to pre-train two music representation models from different perspectives, yielding holistic representations for accurate aesthetics scoring. | J. Chen; X. Bai; X. Pan; X. Mu | icassp | 2026-05-04 |
| 30 | TinyMU: A Compact Audio-Language Model for Music Understanding Highlight: To train TinyMU, we introduce MusicSkills-3.5M, a carefully curated, music-grounded question-answering dataset with 3.5M samples. | X. Li; A. Quelennec; S. Essid | icassp | 2026-05-04 |
| 31 | Via Score to Performance: Efficient Human-Controllable Long Song Generation with Bar-Level Symbolic Notation Highlight: We argue that bar-level scores are better suited for song generation than audio. We present Bar-level AI Composing Helper (BACH), the first framework that generates editable symbolic scores before rendering them into audible songs. | T. Wang; Y. Yu; Q. Wang; J. Qian | icassp | 2026-05-04 |
| 32 | FUSEMOS: Perceptual Evaluation of Text-to-Music Generation with Dual-Encoder Fusion and Ranking-Aware Composite Loss Highlight: However, CLAP is limited in capturing fine-grained musical attributes such as timbre and expressiveness. To address this, we propose FUSEMOS, a dual-encoder fusion framework that integrates CLAP with MERT, a self-supervised model pretrained on large-scale music audio, to better capture both high-level semantic alignment and detailed musical characteristics for more accurate OMI and TA estimation. | J. Yang et al. | icassp | 2026-05-04 |
| 33 | Hierarchical Tokenization of Multimodal Music Data for Generative Music Retrieval Highlight: We introduce a method to generate multimodal music tokens (3MToken) that transforms rich metadata from a music database—including audio, credits, semantic tags, song and artist descriptions, musical characteristics, release dates, and consumption patterns—into discrete tokens using a Residual-Quantized Variational Autoencoder (RQ-VAE). | W. J. Lee; R. Joyee; Z. Luo; S. Mukherjee; E. Coviello | icassp | 2026-05-04 |
| 34 | Etude: Piano Cover Generation with A Three-Stage Approach — Extract, Structuralize, and Decode Highlight: In this paper, we introduce Etude, a three-stage architecture consisting of Extract, strucTUralize, and DEcode stages. | T.-Y. Chen; Y.-J. Joung | icassp | 2026-05-04 |
| 35 | SingMOS-Pro: An Comprehensive Benchmark For Singing Quality Assessment Highlight: In this work, we introduce SingMOS-Pro, a dataset for automatic singing quality assessment. | Y. Tang | icassp | 2026-05-04 |
| 36 | Source Separation For A Cappella Music Highlight: In this work, we study the task of multi-singer separation in a cappella music, where the number of active singers varies across mixtures. | L. A. Lanzendorfer; C. Pinkl; F. Grötschla | icassp | 2026-05-04 |
| 37 | A Hybrid Convolution-Mamba Network with Tone-Octave Contrastive Learning for Stratified Semi-Supervised Singing Melody Extraction Highlight: Moreover, prior semi-supervised methods treat all unlabeled data uniformly, which lowers the accuracy of the pseudo labels. To address these issues, we propose a unified framework with three modifications. | K. Dong; S. Ding; S. Yu; W. Li | icassp | 2026-05-04 |
| 38 | Motionbeat: Motion-Aligned Music Representation Via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding Highlight: We propose MotionBeat, a framework for motion-aligned music representation learning. | X. Wang; H. Wang; W. Cai | icassp | 2026-05-04 |
| 39 | Multiple Self-Supervised Representations Fusion Network for Automatic Song Aesthetics Evaluation Highlight: This work presents our submission to the ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge, which predicts human ratings across five perceptual dimensions. | Z. Yang | icassp | 2026-05-04 |
| 40 | LG-STAFNet: Emotion Recognition in AI-Generated Music Via Local-Global Spatio-Temporal EEG Feature Fusion Highlight: Inspired by this, we construct a novel AI-generated music-EEG dataset that records brain activities of participants while listening to various AI-generated music. Based on this dataset, we propose a local-global spatio-temporal attentional fusion network (LG-STAFNet) to compute music-induced emotions from EEG signals. | Y. Liu | icassp | 2026-05-04 |
| 41 | AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation Highlight: We present AudioGen-Omni, a unified approach based on multimodal diffusion transformers (MM-DiT) that generates high-fidelity audio, speech, and songs with coherent video synchronization. | L. Wang et al. | icassp | 2026-05-04 |
| 42 | DiTSinger: Scaling Singing Voice Synthesis with Diffusion Transformer and Implicit Alignment Highlight: We introduce a two-stage pipeline: a compact seed set of human-sung recordings is constructed by pairing fixed melodies with diverse LLM-generated lyrics, and melody-specific models are trained to synthesize over 500 hours of high-quality Chinese singing data. | Z. Du | icassp | 2026-05-04 |
| 43 | HarmoNet: Music Grounding By Short Video Via Harmonic Resample and Dynamic Sparse Alignment Highlight: Although current studies have established benchmarks and models, they ignore modeling the different melodic scales between videos and music and dynamic changes in sequence importance caused by tempo variations. To address these issues, we propose HarmoNet. | Y. Shen | icassp | 2026-05-04 |
| 44 | Melos: Sentence-To-Section Training with Multi-Task Learning for LLM-Driven Song Generation Highlight: To address these, we propose a large language model (LLM)-based framework with a novel two-stage training strategy that progresses from sentence-level to section-level. | D. Wu et al. | icassp | 2026-05-04 |
| 45 | Leveraging Whisper Embeddings For Audio-Based Lyrics Matching Highlight: In this work, we introduce WEALY, a fully reproducible pipeline that leverages Whisper decoder embeddings for lyrics matching tasks. | E. Mancini; J. Serrà; P. Torroni; Y. Mitsufuji | icassp | 2026-05-04 |
| 46 | The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion to Singing Style Conversion Highlight: We present the findings of the latest iteration of the Singing Voice Conversion Challenge, a scientific event aiming to compare and understand different voice conversion systems in a controlled environment. | L. P. Violeta | icassp | 2026-05-04 |
| 47 | SAUNA: Song-Level Audio & User-Listening Data Neural Alignment Highlight: We introduce SAUNA (Song-level Audio and User-listening data Neural Alignment), a deep learning model that predicts listener engagement directly from audio using large-scale implicit user feedback. | M. Buisson; J. J. Bosch; D. Stoller | icassp | 2026-05-04 |
| 48 | VMSP: Video-to-Music Generation with Two-Stage Alignment and Synthesis Highlight: We propose VMSP, a two-stage generation framework based on hierarchical conditional mapping. | X. Gu; W. Jiang; Y. Jiang; Z. Su; M. Yan | icassp | 2026-05-04 |
| 49 | MR-FlowDPO: Multi-Reward Direct Preference Optimization for Flow-Matching Text-to-Music Generation Highlight: We introduce MR-FlowDPO, a novel approach that enhances Flow Matching music generation models, a major class of modern music generation systems, using Direct Preference Optimization (DPO) with multiple musical rewards. | A. Ziv et al. | icassp | 2026-05-04 |
| 50 | Towards Effective Negation Modeling in Joint Audio-Text Models for Music Highlight: Negation is fundamental for distinguishing the absence (or presence) of musical elements (e.g., with vocals vs. without vocals), but current systems fail to represent this reliably. In this work, we investigate and mitigate this limitation by training CLAP models from scratch on the Million Song Dataset with LP-MusicCaps-MSD captions. | Y. Vasilakis; R. Bittner; J. Pauwels | icassp | 2026-05-04 |
| 51 | Subsequence SDTW: Differentiable Alignment with Flexible Boundary Conditions Highlight: In this work, we propose subsequence SDTW (subSDTW), an extension of classical SDTW that includes flexible boundary conditions, to allow for the differentiable and soft alignment of sequences with potential boundary mismatch. | J. Zeitler; M. Müller | icassp | 2026-05-04 |
| 52 | Improving Active Learning for Melody Estimation By Disentangling Uncertainties Highlight: In this work, we follow a framework that disentangles aleatoric and epistemic uncertainties to guide active learning for melody estimation. | A. Jaiswal; P. Singh; V. Arora | icassp | 2026-05-04 |
| 53 | AnyAccomp: Generalizable Accompaniment Generation Via Quantized Melodic Bottleneck Highlight: This creates a critical train-test mismatch, leading to failure on clean, real-world vocal inputs. We introduce AnyAccomp, a framework that resolves this by decoupling accompaniment generation from source-dependent artifacts. | J. Zhang; Y. Zhang; X. Zhang; Z. Wu | icassp | 2026-05-04 |
| 54 | Sing What You Fit: A Perception-Based Dataset and Benchmark for Vocal-Song Suitability Analysis Highlight: This paper introduces Vocal-Song Suitability Analysis (VSSA), a novel task designed to address the subjective suitability between vocal performances and songs, serving as a complementary and initial exploration of user singing demand analysis within personalized Music Recommendation Systems (MRS). | A. Y. Zhao et al. | icassp | 2026-05-04 |
| 55 | HEAR: Hierarchically Enhanced Aesthetic Representations for Multidimensional Music Evaluation Highlight: We propose HEAR, a robust music aesthetic evaluation framework that combines: (1) a multi-source multi-scale representations module to obtain complementary segment- and track-level features, (2) a hierarchical augmentation strategy to mitigate overfitting, and (3) a hybrid training objective that integrates regression and ranking losses for accurate scoring and reliable top-tier song identification. | S. Liu et al. | icassp | 2026-05-04 |
| 56 | MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding Highlight: Discrete representation learning has shown promising results across various domains, including generation and understanding in image, speech and language. Inspired by these advances, we propose MuseTok, a tokenization method for symbolic music, and investigate its effectiveness in both music generation and understanding tasks. | J. Huang | icassp | 2026-05-04 |
| 57 | D3PIA: A Discrete Denoising Diffusion Model for Piano Accompaniment Generation from Lead Sheet Highlight: In this paper, we propose a discrete diffusion-based piano accompaniment generation model, D3PIA, leveraging local alignment between lead sheet and accompaniment in piano-roll representation. | E. Choi; H. Kim; H. Bang; T. Kwon; J. Nam | icassp | 2026-05-04 |
| 58 | A Unsupervised Domain Adaptation Framework For Semi-Supervised Melody Extraction Using Confidence Matrix Replace and Nearest Neighbour Supervision Highlight: Existing unsupervised domain adaptation frameworks for semi-supervised melody extraction often adopt a passive adaptation paradigm, attempting to fit uncertainties in unlabeled data, which is frequently constrained by pseudo-label noise and large domain discrepancies. To overcome these limitations, we propose an innovative active repair paradigm, introducing a novel unsupervised domain adaptation framework for semi-supervised melody extraction that incorporates Confidence Matrix Replace (CMR) and Nearest Neighbor Supervision (NNS). | S. Wang; S. Yu; W. Li | icassp | 2026-05-04 |
| 59 | MIDI-LLaMA: An Instruction-Following Multimodal LLM for Symbolic Music Understanding Highlight: In this work, we introduce MIDI-LLaMA, the first instruction-following MLLM for symbolic music understanding. | M. Yang; J. McCormack; M. T. Llano; W. Su; C. Lei | icassp | 2026-05-04 |
| 60 | Ailive Mixer: A Deep Learning Based Zero Latency Automatic Music Mixer for Live Music Performances Highlight: In this work, we present a deep learning-based automatic multitrack music mixing system catered towards live performances. | D. Zurale; I. Lorente; M. Lester; A. Mitchell | icassp | 2026-05-04 |
| 61 | Do Foundational Audio Encoders Understand Music Structure? Highlight: However, their use for music structure analysis (MSA) remains underexplored: only a small subset of FAEs has been examined for MSA, and the impact of factors such as learning methods, training data, and model context length on MSA performance remains unclear. In this study, we conduct comprehensive experiments on 11 types of FAEs to investigate how these factors affect MSA performance. | K. Toyama; Z. Zhong; A. Takahashi; S. Takahashi; Y. Mitsufuji | icassp | 2026-05-04 |
| 62 | AI-Generated Music Detection in Broadcast Monitoring Highlight: In this work, we introduce AI-OpenBMAT, the first dataset tailored to AI-generated music detection in a broadcast setting. | D. López-Ayala; A. Cabello; P. Zinemanas; E. Molina; M. Rocamora | icassp | 2026-05-04 |
| 63 | Multi-Stream Music Transformer for Multi-Dimension Automatic Song Aesthetics Evaluation Highlight: We present a unified multi-stream music transformer for the ICASSP 2026 SongEval Challenge. | X. Fan; G. Niu | icassp | 2026-05-04 |
| 64 | Rethinking Music Captioning with Music Metadata LLMS Highlight: As a more direct approach, we propose metadata-based captioning. | I. Bukey; Z. Wang; C. Donahue; N. J. Bryan | icassp | 2026-05-04 |
| 65 | TMD-Bench: A Multi-Level Evaluation Paradigm for Music-Dance Co-Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce TMD-Bench, a benchmark for text-driven music-dance co-generation that assesses systems across unimodal generation quality, instruction adherence, and cross-modal rhythmic alignment. |
Xiaoda Yang et al. | arxiv-cs.SD | 2026-05-03 |
| 66 | MindMelody: A Closed-Loop EEG-Driven System for Personalized Music Intervention Highlight: Furthermore, directly mapping electroencephalography (EEG) to music generation remains challenging due to severe paired-data scarcity and a lack of interpretability. To address these limitations, we propose MindMelody, a fully functional, closed-loop real-time system for EEG-driven personalized music intervention. |
Yimeng Zhang; Yueru Sun; Haoyu Gu; | arxiv-cs.SD | 2026-05-02 |
| 67 | MG-Former: A Transformer-Based Framework for Music-Driven 3D Conducting Gesture Generation Highlight: This paper presents TransConductor, a Transformer-based framework for music-driven conducting gesture generation. |
Ke Qiu et al. | arxiv-cs.SD | 2026-05-01 |
| 68 | GaMMA: Towards Joint Global-Temporal Music Understanding in Large Multimodal Models Highlight: In this paper, we propose GaMMA, a state-of-the-art (SoTA) large multimodal model (LMM) designed to achieve comprehensive musical content understanding. |
Zuyao You et al. | arxiv-cs.SD | 2026-04-30 |
| 69 | Generative AI and The Future of Musical Diversity Highlight: I argue that the current proliferation of generative artificial intelligence (AI) represents a new stage in a longer historical process of distancing humans from their unique individual psyches and of reducing participation and cultural diversity in music. |
Dor Shilton; | Topics in Cognitive Science | 2026-04-27 |
| 70 | Emotion Classification in Music: Leveraging Machine Learning for Music Therapy and Emotional Response Analysis |
Jing Wu; | International Journal of Computational Intelligence Systems | 2026-04-26 |
| 71 | Adopting State-of-the-Art Pretrained Audio Representations for Music Recommender Systems Highlight: This study establishes a foundation for further exploration of pretrained audio representations to enhance music recommendation systems. |
Yan-Martin Tamm; Anna Aljanaki; | arxiv-cs.IR | 2026-04-24 |
| 72 | A Focused Survey of Generative AI-Based Music Therapy Systems: Recent Progress and Open Challenges Highlight: Based on this survey, we discuss open research challenges at the intersection of generative music, adaptive systems, and digital health, and outline future research directions toward scalable and personalized generative AI-based music therapy. |
Jin S. Seo; | Applied Sciences | 2026-04-23 |
| 73 | Application of Ethnic Music Retrieval System Integrating Multimodal Attention Mechanism in Humanoid Robot Education Interaction Highlight: With the rapid development of Internet technology and multimedia applications, and the continuous expansion of interactive scenes of humanoid robot education, the number of digital resources of ethnic music has reached an unprecedented scale and continues to grow. This study mainly focuses on the construction of a national music retrieval system that integrates multimodal attention mechanisms and its application implementation in humanoid robot education interaction. |
Chuanshan Peng; Laifa Sun; Zhongci Zhou; | International Journal of Humanoid Robotics | 2026-04-22 |
| 74 | A Novel LSTM Music Generator Based on The Fractional Time-frequency Feature Extraction Highlight: In this paper, we propose a novel approach for generating music based on an artificial intelligence (AI) system. |
Li Ya et al. | arxiv-cs.SD | 2026-04-20 |
| 75 | Video-Robin: Autoregressive Diffusion Planning for Intent-Grounded Video-to-Music Generation Highlight: In this paper, we present Video-Robin, a novel text-conditioned video-to-music generation model that enables fast, high-quality, semantically aligned music generation for video content. |
Vaibhavi Lokegaonkar et al. | arxiv-cs.SD | 2026-04-19 |
| 76 | TeMuDance: Contrastive Alignment-Based Textual Control for Music-Driven Dance Generation Highlight: This limitation primarily stems from the absence of large-scale datasets that jointly align music, text, and motion for supervised learning of text-conditioned control. To address this challenge, we propose TeMuDance, a framework that enables text-based control for music-conditioned dance generation without requiring any manually annotated music-text-motion triplet dataset. |
Xinran Liu; Diptesh Kanojia; Wenwu Wang; Zhenhua Feng; | arxiv-cs.CV | 2026-04-18 |
| 77 | TinyMU: A Compact Audio-Language Model for Music Understanding Highlight: In this work, we present TinyMU, a lightweight (229M) Music-Language Model (MLM) that achieves performance comparable to much larger LALMs while remaining efficient and compact. |
Xiquan Li; Aurian Quelennec; Slim Essid; | arxiv-cs.SD | 2026-04-17 |
| 78 | Interpreting Musical Mood Through Multimodal Feature Integration: A Scalable Framework for Intelligent Music Analysis Highlight: In the given paper, a multimodal deep neural framework that is scaled to identify music emotion in a context-aware manner is introduced to combine acoustic and semantic modalities and achieve improved performance. |
Shital Shankar Gujar; Ali Yawar Reha; | ShodhKosh: Journal of Visual and Performing Arts | 2026-04-17 |
| 79 | Retraction Note: An Ensemble Model of CNN with Bi-LSTM for Automatic Singer Identification |
Mukkamala S. N. V. Jitendra; Y. Radhika; | Multimedia Tools and Applications | 2026-04-17 |
| 80 | Virtual Music Therapy During The COVID-19 Pandemic-an Updated Scoping Review Highlight: An update to the previous scoping review was necessary to explore the current state of this emerging music therapy discipline and provide essential information for health care practitioners, scholars, and researchers. Eligibility criteria: This scoping review included studies examining how music therapists (population) delivered virtual, remote, or online music therapy (concept) across all client groups during the COVID-19 pandemic (context). |
Monika Bucharová; Barbora Hořejší; Jiří Kantor; Lua Perimal-Lewis; Miloslav Klugar; | JBI Evidence Synthesis | 2026-04-16 |
| 81 | Ontology-Guided Multimodal Framework for Explainable Music Similarity and Recommendation Highlight: This article introduces a multimodal framework that uses an ontology to make music similarity and recommendation more explainable. |
Mikhail Rumiantcev; | Big Data and Cognitive Computing | 2026-04-15 |
| 82 | A Multidimensional MIR Analysis of Acoustic, Linguistic and Cultural Gaps Between Maskandi and Western Music Genres Highlight: This study examines whether commonly used MIR features and multilingual NLP models adequately represent the acoustic, linguistic, and cultural structures of Maskandi music in comparison to Western music and identifies where representational gaps and biases arise. |
Absolom Muzambi; Tebatso Gorgina Moape; Bester Chimbo; | Applied Sciences | 2026-04-14 |
| 83 | Mediation in Carnatic Diasporas: Rendering Sri Ganapathini in Western Notation Systems Highlight: This paper offers a preliminary examination of how notation functions as a mediating tool in diasporic Carnatic pedagogy through analysis of notation sheets of Tyagaraja’s Sri Ganapathini, a canonical 18th century kriti (song). |
Anisha Srinivasan; | Indialogs | 2026-04-13 |
| 84 | A Study on Facial Emotion Based Song Recommendations Systems Highlight: This project presents a Facial Emotion-Based Song Recommendation System that automatically detects a user’s emotional state through facial expressions and recommends music accordingly. |
Prof. Asha Gaikar; | International Journal for Research in Applied Science and … | 2026-04-13 |
| 85 | The Possibilities of Personalized Music Education Through Exercise Generation Highlight: In this article we review the literature to assess the current standing of musical technique education personalization, and what is possible given the current available technology. |
Filippo Carnovalini; Sean Scofield; Louis Verstraeten; Geraint A. Wiggins; | Open Research Europe | 2026-04-13 |
| 86 | MAGE: Modality-Agnostic Music Generation and Editing Highlight: We propose MAGE, a modality-agnostic framework that unifies multimodal music generation and mixture-grounded editing within a single continuous latent formulation. |
Muhammad Usama Saleem et al. | arxiv-cs.SD | 2026-04-10 |
| 87 | A Multimodal Graph-based Music Auto-tagging Framework: Integrating Social and Content Intelligence Highlight: Moreover, most studies ignore the co-occurrence dependencies among tags, which are essential for multi-label prediction. To address these challenges, we propose MuCoGraph, a novel multimodal graph-based hybrid learning framework for music auto-tagging. |
Yang Huang; Duen-Ren Liu; Yi-Hsuan Chen; | Information Technology and Management | 2026-04-10 |
| 88 | Adaptive Feature Fusion Gate and Gated Channel-Spatial Attention in CNN-Transformer Models for Music Genre Classification Highlight: This paper proposes the Convolutional Neural Network-Gated Transformer Network (CT-GateNet), a hybrid architecture that integrates a gated channel-spatial attention mechanism with an adaptive feature fusion gating mechanism to achieve discriminative feature learning and efficient feature integration. |
Yunyan Ma; Zhenwu Ding; Shuang Wan; Hui Li; Yuan Xu; | PLOS One | 2026-04-09 |
| 89 | Let Me Introduce You: Stimulating Taste-Broadening Serendipity Through Song Introductions Highlight: This paper presents a user study examining the efficacy of immersive and informative introductions in stimulating interest in songs that are beyond one’s usual preferences, an experience called Taste-Broadening Serendipity. |
Brett Binst; Ulysse Maes; Martijn C. Willemsen; Annelien Smets; | arxiv-cs.HC | 2026-04-09 |
| 90 | Explaining Cultural Emotion in Chinese Pop Music with Multimodal AI: Educational and Socio-emotional Implications Highlight: We propose a multimodal emotion recognition model (M2ER) that integrates lyric and audio features to achieve fine-grained, segment-level annotation. |
Qilong Shi; Yan Zhou; | Frontiers in Psychology | 2026-04-09 |
| 91 | Influence of Music on Cortisol Levels in Mechanically Ventilated Critically Ill Patients: A Systematic Review Highlight: Relevance to Clinical Practice: This review highlights a major gap in evidence regarding the biological impact of music in IMV patients. |
Carmen Fernández‐Álvarez; David Zuazua‐Rico; M. Pilar Mosteiro‐Díaz; Alba Maestro‐González; | Nursing in Critical Care | 2026-04-08 |
| 92 | YMIR: A New Benchmark Dataset and Model for Arabic Yemeni Music Genre Classification Using Convolutional Neural Networks Highlight: In this paper, we introduce the Yemeni Music Information Retrieval (YMIR) dataset, which contains 1,475 carefully selected audio clips covering five traditional Yemeni genres: Sanaani, Hadhrami, Lahji, Tihami, and Adeni. |
Moeen AL-Makhlafi; Abdulrahman A. AlKannad; Eiad Almekhlafi; Nawaf Q. Othman Ahmed Mohammed; Saher Qaid; | arxiv-cs.SD | 2026-04-06 |
| 93 | Anchored Cyclic Generation: A Novel Paradigm for Long-Sequence Symbolic Music Generation Highlight: In this paper, we propose the Anchored Cyclic Generation (ACG) paradigm, which relies on anchor features from already identified music to guide subsequent generation during the autoregressive process, effectively mitigating error accumulation in autoregressive methods. |
Boyu Cao et al. | arxiv-cs.SD | 2026-04-06 |
| 94 | Effects of Music Choice on Performance and Psychophysiological Responses to Exercise—A Scoping Review Highlight: Listening to music is a well-established strategy to enhance exercise capacity, yet the specific mechanisms linking music choice to performance enhancement remain fragmented. This scoping review systematically summarizes the existing literature on the effects of music choice (i.e., self-selected, preferred music) on performance and psychophysiological determinants of exercise capacity to establish an updated rationale for the use of personalized music interventions in training. |
Emily S. Pounds; Scott W. Snyder; Rebecca R. Billings; Haley M. Nguyen; Christopher G. Ballmann; | Journal of Functional Morphology and Kinesiology | 2026-03-31 |
| 95 | Audience Reception and Social Impact of Hybrid African American Music: A Scoping Review Highlight: This review examines how audiences engage with hybrid African American music and the social outcomes it produces. |
Chinenye Okoro Modesta; Esther Shardey; | EPRA International Journal of Socio-Economic and … | 2026-03-31 |
| 96 | Analysis of The Impact of Machine Learning Algorithms on The Quality of Generated Sounds Highlight: Music generation using broadly understood AI is an evolving field with many challenges and opportunities. This thesis explores the use of generative adversarial networks for this endeavour, focusing on and comparing a variety of different solutions that are already developed. |
Krzysztof Pedrycz; Mateusz Pikula; | Journal of Computer Sciences Institute | 2026-03-30 |
| 97 | HumMusQA: A Human-written Music Understanding QA Benchmark Dataset Highlight: The evaluation of music understanding in Large Audio-Language Models (LALMs) requires a rigorously defined benchmark that truly tests whether models can perceive and interpret music, a standard that current data methodologies frequently fail to meet. This paper introduces a meticulously structured approach to music evaluation, proposing a new dataset of 320 hand-written questions curated and validated by experts with musical training, arguing that such focused, manual curation is superior for probing complex audio comprehension. |
Benno Weck; Pablo Puentes; Andrea Poltronieri; Satyajeet Prabhu; Dmitry Bogdanov; | arxiv-cs.CL | 2026-03-29 |
| 98 | TokenDance: Token-to-Token Music-to-Dance Generation with Bidirectional Mamba Highlight: Consequently, generated dances often become overly simplistic and repetitive, substantially degrading expressiveness and realism. To tackle this problem, we present TokenDance, a two-stage music-to-dance generation framework that explicitly addresses this limitation through dual-modality tokenization and efficient token-level generation. |
Ziyue Yang; Kaixing Yang; Xulong Tang; | arxiv-cs.AI | 2026-03-28 |
| 99 | Commercialization of Intellectual Property Assets As Intangible Cultural Heritage in The Performing Arts By Developing The National Database on Cultural Heritage in Vietnam (A Case Study on The Art of Đờn Ca Tài Tử Music and Song in Southern Vietnam) Highlight: Adopting a systemic approach and intellectual property management, this study identifies opportunities and solutions for the commercialization of digital intangible heritage in the performing arts, using the case of the Art of Đờn ca tài tử music and song, based on developing a national database on cultural heritage. |
Nguyen Do Duy Quan; | VNU Journal of Science: Policy and Management Studies | 2026-03-25 |
| 100 | The HuMus Project: Corpus Compilation and Multi-experiment Study of Hungarian Popular Music Lyrics Highlight: The findings are compared with prior studies on English-language lyrics, offering insights into cross-linguistic and cultural patterns in musical expression, and we hope that this work will pave the way for further similar research. The primary objective of the present study is to demonstrate the feasibility of constructing corpora of lyrical texts for languages that are not widely utilized, and of concomitantly accumulating metadata for these corpora. |
Zalán Bodó; Annamária Szenkovits; Csaba Pătcaş; | Acta Universitatis Sapientiae, Informatica | 2026-03-25 |
| 101 | Echoes: A Semantically-aligned Music Deepfake Detection Dataset Highlight: We introduce Echoes, a new dataset for music deepfake detection designed for training and benchmarking detectors under realistic and provider-diverse conditions. |
Octavian Pascu; Dan Oneata; Horia Cucu; Nicolas M. Muller; | arxiv-cs.SD | 2026-03-24 |
| 102 | DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming Highlight: To further test the capabilities of dexterity, we propose drumming as a testbed for dexterous manipulation. |
Hung-Chieh Fang; Amber Xie; Jennifer Grannen; Kenneth Llontop; Dorsa Sadigh; | arxiv-cs.RO | 2026-03-23 |
| 103 | Visual Lyrics: Generating Animated Text for Music Lyric Videos with An Augmented Text Editor Abstract: Animated lyric videos transform song lyrics into dynamic visual experiences, offering a powerful medium for artistic expression and audience engagement. However, creating these … |
David Chuan-En Lin; Cuong Nguyen; Hijung Valentina Shin; Nikolas Martelaro; | Proceedings of the 31st International Conference on … | 2026-03-22 |
| 104 | The Impact of Music Therapy on Cognitive Abilities in Patients with Alzheimer’s Disease and Related Dementias: A Systematic Review Highlight: As such, this review aimed to systematically evaluate the impact of music therapy on cognitive function and any parameters that maximise efficacy. |
Emily Su; Leanne Kenway; | British Journal of Music Therapy | 2026-03-19 |
| 105 | Probing Cultural Signals in Large Language Models Through Author Profiling Highlight: This finding emerges from both the models’ prediction distributions and an analysis of their generated rationales. To quantify these disparities, we introduce two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), and show that Ministral-8B displays the strongest ethnicity bias among the evaluated models, whereas Gemma-12B shows the most balanced behavior. |
Valentin Lafargue; Ariel Guerra-Adames; Emmanuelle Claeys; Elouan Vuichard; Jean-Michel Loubes; | arxiv-cs.CL | 2026-03-17 |
| 106 | Tarab: A Multi-Dialect Corpus of Arabic Lyrics and Poetry Highlight: We describe the data collection, normalisation, and validation pipeline and present baseline analyses for variety identification and genre differentiation. |
Mo El-Haj; | arxiv-cs.CL | 2026-03-17 |
| 107 | Music Genre Classification: A Comparative Analysis of Classical Machine Learning and Deep Learning Approaches Highlight: In this paper, we construct a novel dataset of approximately 8,000 labeled 30-second audio clips spanning eight Nepali music genres and conduct a systematic comparison of nine classification models across two paradigms. |
Sachin Prajuli; Abhishek Karna; OmPrakash Dhakl; | arxiv-cs.SD | 2026-03-16 |
| 108 | An Intelligent Music Recommendation System Using Machine Learning and User Preference Analysis Highlight: This research presents a comprehensive hybrid music recommendation system that integrates Content-Based Filtering, Collaborative Filtering, and Machine Learning techniques to enhance personalization, scalability, and accuracy. |
Kandula Manikanta; | International Journal for Research in Applied Science and … | 2026-03-16 |
| 109 | Research on Song Dynasty Copper Mirror Pattern Recognition Based on MOEAD |
Qing Feng; Kexin Yu; Yaxuan Li; Tian Ma; | npj Heritage Science | 2026-03-16 |
| 110 | Music Emotion Classification with Neural Network Architecture and Librosa Highlight: Traditional models often rely on raw audio or textual features, which may not fully capture the rich emotional content embedded in music. To address this, we propose a Convolutional Neural Network (CNN)-based model combined with Librosa for feature extraction to classify musical emotions effectively. |
K. Navaneetha; Dr.M.V.A Naidu; D. Raghu; K. Ruchika; | International Journal of Scientific Research in Engineering … | 2026-03-15 |
| 111 | Music Source Restoration with Ensemble Separation and Targeted Reconstruction Highlight: To address MSR, we propose a two-stage system. |
Xinlong Deng; Yu Xia; Jie Jiang; | arxiv-cs.SD | 2026-03-13 |
| 112 | Enhancing Music Recommendation with User Mood Input Highlight: In this study, we explore a mood-assisted recommendation system that suggests songs based on the desired mood using the energy-valence spectrum. |
Terence Zeng; | arxiv-cs.IR | 2026-03-12 |
| 113 | V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation Highlight: We introduce V2M-Zero, a zero-pair video-to-music generation approach that outputs time-aligned music for video. |
Yan-Bo Lin et al. | arxiv-cs.CV | 2026-03-11 |
| 114 | Bayesian-Optimized Ensemble Learning for Music Popularity Prediction with Shapley-Based Interpretability Highlight: In this study, music popularity prediction is formulated as a supervised regression problem, and six widely-used tree ensemble models (Random Forest, XGBoost, CatBoost, LightGBM, Extra Trees, and Decision Tree) are systematically evaluated using large-scale Spotify data. |
Liang Qiu; Penghui Wang; Jing Zhao; Hong Zhang; Mujiangshan Wang; | Mathematics | 2026-03-11 |
| 115 | The Costs of Reproducibility in Music Separation Research: A Replication of Band-Split RNN Highlight: Unfortunately, it is not straightforward to reproduce since its full code is not available. In this paper, we attempt to replicate BSRNN as closely as possible to the original paper through extensive experiments, which allows us to conduct a critical reflection on this reproducibility issue. |
Paul Magron; Romain Serizel; Constance Douwes; | arxiv-cs.SD | 2026-03-10 |
| 116 | From Daily Song to Daily Self: Supporting Reflective Songwriting of Deaf and Hard-of-Hearing Individuals Through Generative Music AI Highlight: We introduce SoulNote, a GenAI system enabling DHH to engage in iterative songwriting. |
Youjin Choi et al. | arxiv-cs.HC | 2026-03-09 |
| 117 | Designing A Generative AI-Assisted Music Psychotherapy Tool for Deaf and Hard-of-Hearing Individuals Highlight: In response, this study presents a music psychotherapy tool co-designed with therapists, integrating conversational agents (CAs) and music generative AI as symbolic and therapeutic media. |
Youjin Choi; Jaeyoung Moon; Jinyoung Yoo; Jennifer G. Kim; Jin-Hyuk Hong; | arxiv-cs.HC | 2026-03-09 |
| 118 | Multi-source Music Melody Extraction Based on GCN and Time Frequency Analysis |
Ruixuan Shu; Bin Huang; | Discover Artificial Intelligence | 2026-03-09 |
| 119 | EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation Highlight: We introduce EDMFormer, a transformer model that combines self-supervised audio embeddings using an EDM-specific dataset and taxonomy. |
Sahal Sajeer; Krish Patel; Oscar Chung; Joel Song Bae; | arxiv-cs.SD | 2026-03-08 |
| 120 | Retraction Note: Using Deep Learning and Genetic Algorithms for Melody Generation and Optimization in Music |
Ling Dong; | Soft Computing | 2026-03-03 |
| 121 | Analysis of Mental Health Treatment with AI Music Creation Highlight: This paper mainly uses AI music to alleviate and prevent psychological problems, talks about the way how AI analyzes and understands the melodies present, and what kind of emotions the problems will face in a technical way or even in ethics. |
Mengfei Wu; | Applied and Computational Engineering | 2026-03-02 |
| 122 | False But Phonologically Plausible Linguistic Priors Induce Cross-linguistic Auditory Illusions and Attenuate Electrophysiological Markers of Surprise Highlight: Time-frequency analysis revealed an early gamma response followed by beta-band dominance in the illusory condition, consistent with initial sensory mismatch followed by top-down assimilation. |
Enrico Giraldi et al. | Imaging Neuroscience | 2026-03-02 |
| 123 | Effects of The Single and Combined Effect of Music and Other Strategies on Combat Sport Performance: A Systematic Review and Meta-analysis Highlight: Introduction: There is a lack of systematic mechanism regarding the single and combined effect of listening to music with other strategies on the physical and psychophysiological performance of combat sport athletes. This systematic review and meta-analysis examined the single and combined effects of musical interventions on the technical, physical, physiological, and psychological performance of combat sports athletes, while identifying possible synergistic ergogenic strategies with music. |
Nidhal Jebabli et al. | Frontiers in Sports and Active Living | 2026-03-02 |
| 124 | SyncTrack: Rhythmic Stability and Synchronization in Multi-Track Music Generation Highlight: In this paper, we introduce SyncTrack, a synchronous multi-track waveform music generation model designed to capture the unique characteristics of multi-track music. |
Hongrui Wang et al. | arxiv-cs.SD | 2026-03-01 |
| 125 | CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction Highlight: While music generation models have evolved to handle complex multimodal inputs mixing text, lyrics, and reference audio, evaluation mechanisms have lagged behind. In this paper, we bridge this critical gap by establishing a comprehensive ecosystem for music reward modeling under Compositional Multimodal Instruction (CMI), where the generated music may be conditioned on text descriptions, lyrics, and audio prompts. |
Yinghao Ma et al. | arxiv-cs.SD | 2026-02-28 |
| 126 | Efficient Long-Sequence Diffusion Modeling for Symbolic Music Generation Highlight: Though recent diffusion-based models produce high quality generations, they tend to suffer from high training and inference costs with long symbolic sequences due to iterative denoising and sequence-length-related costs. To deal with such problem, we put forth a diffusing strategy named SMDIM to combine efficient global structure construction and light local refinement. |
Jinhan Xu et al. | arxiv-cs.SD | 2026-02-28 |
| 127 | Voices of Civilizations: A Multilingual QA Benchmark for Global Music Understanding Highlight: We introduce Voices of Civilizations, the first multilingual QA benchmark for evaluating audio LLMs’ cultural comprehension on full-length music recordings. |
Shangda Wu et al. | arxiv-cs.SD | 2026-02-28 |
| 128 | Artificial Intelligence Approaches in Music Audio Analysis Highlight: In recent years, end-to-end deep learning methods have shown stronger adaptability and accuracy by automatically extracting features for classification and recognition, promoting the advancement of pure music audio analysis technology. This article aims to provide theoretical support and practical guidance for researchers in related fields by organizing the application of artificial intelligence methods in pure music audio analysis, comparing and analyzing the advantages and disadvantages of various methods, and promoting the sustainable development and technological innovation of this field. |
Weihan Wang; | Science and Technology of Engineering, Chemistry and … | 2026-02-28 |
| 129 | A Personalized Emotional Therapy System Driven By Generative Artificial Intelligence and Transformer Technology Highlight: The study takes 214 adult users aged 18-45 as the object, and collects their music behavior data and text evaluation in the past half year for experimental verification. |
Dongyun Chang; | Journal of Mechanics in Medicine and Biology | 2026-02-27 |
| 130 | SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away Highlight: In this paper, we introduce SongSong, the first music generation model capable of restoring Chinese SongCi to our knowledge. |
JIAJIA LI et al. | arxiv-cs.SD | 2026-02-27 |
| 131 | Effect of Music Intervention on Heart Rate Variability: A Systematic Review and Meta-analysis of Randomized Controlled Trials Abstract: Objective: To evaluate the effects of music intervention on heart rate variability (HRV). Methods: The protocol of this systematic review has been submitted for registration in the … |
ENYUAN ZHANG et al. | Frontiers in Psychology | 2026-02-25 |
| 132 | Audio-Based Song Identification System Using Python and Cloud Music Recognition API Highlight: This research presents the development of an audio-based song identification system using Python and a cloud-based music service. |
Sagar A. Gavade; Mayuresh D. Kadam; Aditya S. Mane; Prajyot R. Mohite; Madhuri M. Kamble; | International Journal of Scientific Research in Engineering … | 2026-02-25 |
| 133 | CoLyricist: Enhancing Lyric Writing with AI Through Workflow-Aligned Support Highlight: We propose CoLyricist, an AI-assisted lyric writing tool designed to support the typical workflows of experienced lyricists and enhance their creative efficiency. |
MASAHIRO YOSHIDA et al. | arxiv-cs.HC | 2026-02-25 |
| 134 | MIDI-Informed Singing Accompaniment Generation in A Compositional Song Pipeline Highlight: We advocate a compositional alternative that decomposes the task into melody composition, singing voice synthesis, and singing accompaniment generation. |
FANG-DUO TSAI et al. | arxiv-cs.SD | 2026-02-24 |
| 135 | Voices of The Mountains: Deep Learning-Based Vocal Error Detection System for Kurdish Maqams Highlight: We developed a two-headed CNN-BiLSTM with attention mode to decide whether a window contains an error and to classify it based on the chosen errors. |
Darvan Shvan Khairaldeen; Hossein Hassani; | arxiv-cs.SD | 2026-02-24 |
| 136 | A Bibliometric Analysis of Sustainable Music Education Research in Malaysia: Trends, Themes, and Scholarly Influence Highlight: In this paper, the thematic focus and development of research in music education in Malaysia has been examined by a bibliometric analysis of the articles using VOSviewer. |
AH HUAI AH CHAN et al. | Architecture Image Studies | 2026-02-24 |
| 137 | Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling Highlight: We introduce Depth-Structured Music Recurrence (DSMR), a recurrent long-context Transformer for full-piece symbolic music modeling that extends context beyond fixed-length excerpts via segment-level recurrence with detached cross-segment states, featuring a layer-wise memory-horizon schedule that budgets recurrent KV states across depth. |
Yungang Yi; | arxiv-cs.SD | 2026-02-23 |
| 138 | SongEcho: Towards Cover Song Generation Via Instance-Adaptive Element-wise Linear Modulation Highlight: In this work, we reformulate our cover song generation as a conditional generation, which simultaneously generates new vocals and accompaniment conditioned on the original vocal melody and text prompts. |
SIFEI LI et al. | arxiv-cs.SD | 2026-02-23 |
| 139 | Methods for Pitch Analysis in Contemporary Popular Music: Multiphonic Tones Across Genres Highlight: This study argues that electronic tones routinely used in contemporary popular music – including 808-style bass and power chords – are structurally and perceptually equivalent to multiphonics in contemporary classical music. |
EMMANUEL DERUTY et al. | arxiv-cs.SD | 2026-02-20 |
| 140 | THE ERGOGENIC POTENTIAL OF RHYTHMIC AUDITORY STIMULATION ON EXERCISE KINETICS AND PSYCHOPHYSIOLOGICAL RESPONSES Highlight: This review evaluates the impact of music on aerobic and anaerobic performance and RPE, focusing on the roles of synchronization, attentional dissociation, and individual preference. |
Klaudia Brzoza; Filip Matusiak; | International Journal of Innovative Technologies in Social … | 2026-02-20 |
| 141 | Enhancing STEAM Education Through A “Know-and-Create” Learning Loop: A Case Study of A Generative AI Workshop on Art at Osaka Metropolitan University College of Technology Highlight: This study not only demonstrates that STEAM education incorporating generative AI is an effective approach to fostering students’ creativity and motivation to learn, but also addresses practical challenges such as ethical considerations and the need for supportive educational environments. |
Sachiko Nakajima; Takeshi Suzuka; Kohei Moroi; Tomoharu Doi; Suguru Higashida; | Journal of Robotics and Mechatronics | 2026-02-19 |
| 142 | Art2Mus: Artwork-to-Music Generation Via Visual Conditioning and Large-Scale Cross-Modal Alignment Highlight: However, existing image-conditioned systems suffer from two fundamental limitations: (i) they are typically trained on natural photographs, limiting their ability to capture the richer semantic, stylistic, and cultural content of artworks; and (ii) most rely on an image-to-text conversion stage, using language as a semantic shortcut that simplifies conditioning but prevents direct visual-to-audio learning. Motivated by these gaps, we introduce ArtSound, a large-scale multimodal dataset of 105,884 artwork-music pairs enriched with dual-modality captions, obtained by extending ArtGraph and the Free Music Archive. |
IVAN RINALDI et al. | arxiv-cs.CV | 2026-02-19 |
| 143 | MusicSem: A Semantically Rich Language–Audio Dataset of Natural Music Descriptions Highlight: In this paper, we introduce MusicSem, a dataset of 32,493 language-audio pairs derived from organic music-related discussions on the social media platform Reddit. |
REBECCA SALGANIK et al. | arxiv-cs.MM | 2026-02-19 |
| 144 | YuE: Scaling Open Foundation Models for Long-Form Music Generation IF:3 Highlight: We tackle the task of long-form music generation, particularly the challenging lyrics-to-song problem, by introducing YuE (乐), a family of open-source music generation foundation models. |
RUIBIN YUAN et al. | iclr | 2026-02-17 |
| 145 | Music Flamingo: Scaling Music Understanding in Audio Language Models IF:3 Highlight: We introduce Music Flamingo, a novel large audio–language model, designed to advance music (including song) understanding in foundational audio models. We believe this work provides both a benchmark and a foundation for the community to build the next generation of models that engage with music as richly and meaningfully as humans do. |
SREYAN GHOSH et al. | iclr | 2026-02-17 |
| 146 | Can Music Therapy Improve Cognition in Dementia As Measured with Magnetoencephalography: A Hypothesis Study Abstract: Background/Objectives: The incidence of dementia and the concurrent burden on healthcare will increase with a population that continues to age. Pharmaceutical interventions for … |
BENJAMIN SLADE et al. | Biomedicines | 2026-02-17 |
| 147 | Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction Highlight: In this paper, we propose a novel adversarial training method on policy-generated trajectories to mitigate reward hacking in RL post-training for melody-to-chord accompaniment. |
YUSONG WU et al. | iclr | 2026-02-17 |
| 148 | Semantic Communities and Boundary-Spanning Lyrics in K-pop: A Graph-Based Unsupervised Analysis Highlight: We present a graph-based framework for unsupervised discovery and evaluation of semantic communities in K-pop lyrics using line-level semantic representations. |
Oktay Karakuş; | arxiv-cs.SI | 2026-02-13 |
| 149 | Artificial Intelligence in Music: A Bibliometric and Systematic Review of Creation, Performance, and Education Highlight: This paper introduces and frames the concept of music intelligence and employs bibliometric and systematic review methodologies to comprehensively analyze music intelligence. |
Fei Tong; Dongjing Jiang; Qingchong Jiao; Albina Isufi; Flynnwell Jianfei Zhang; | Journal of Artificial Intelligence and Soft Computing … | 2026-02-13 |
| 150 | Deep Beats, Deep Thoughts? Predicting General Cognitive Ability from Natural Music-Listening Behavior Highlight: Accordingly, natural music-listening patterns may reveal insights into individual differences in general cognitive ability (GCA). In this study (N = 185), we used real-world smartphone-based music-listening records collected over five months to explore this question. |
Larissa Sust; Maximilian Bergmann; Markus Bühner; Ramona Schoedel; | Journal of Intelligence | 2026-02-13 |
| 151 | Culturally-Responsive AI Assessment Systems for Ohangla Music: A Literature Report Highlight: The purpose of this article is to address the systematic bias in artificial intelligence (AI) music assessment systems that marginalise Indigenous African musical forms, specifically Ohangla music from Kenya’s Luo community. |
Brian Bichanga Nyandieka; | Journal of Music and Creative Arts (JMCA) | 2026-02-13 |
| 152 | Mobile Music Therapy Integrating AI-Driven Emotion Prediction and A Human-Computer Interaction Experience Model Highlight: Existing artificial intelligence (AI) emotion prediction methods also lack robust multimodal data fusion in mobile environments and alignment with clinical workflows. A closed-loop mobile music therapy system incorporating multimodal AI emotion prediction was developed in this study to address these limitations. |
Ruqi Bai; | International Journal of Interactive Mobile Technologies … | 2026-02-13 |
| 153 | Navigating The Intersection of Generative Artificial Intelligence and Democratic Pedagogy in Music Education in Macau Highlight: This study investigates the intersection of generative artificial intelligence (GenAI) and democratic pedagogy in K-12 music education in Macau, centring on the experiences of a single teacher, Adam, throughout a school year. |
Katy Ieong Cheng Ho Weatherly; | British Journal of Music Education | 2026-02-12 |
| 154 | IncompeBench: A Permissively Licensed, Fine-Grained Benchmark for Music Information Retrieval Highlight: However, there is a lack of high-quality benchmarks for evaluating music retrieval performance. To address this issue, we introduce IncompeBench, a carefully annotated benchmark comprising 1,574 permissively licensed, high-quality music snippets, 500 diverse queries, and over 125,000 individual relevance judgements. |
BENJAMIN CLAVIÉ et al. | arxiv-cs.IR | 2026-02-12 |
| 155 | Beyond Musical Descriptors: Extracting Preference-Bearing Intent in Music Queries Highlight: We introduce MusicRecoIntent, a manually annotated corpus of 2,291 Reddit music requests, labeling musical descriptors across seven categories with positive, negative, or referential preference-bearing roles. |
Marion Baranes; Romain Hennequin; Elena V. Epure; | arxiv-cs.SD | 2026-02-11 |
| 156 | Revisiting Content-Based Music Recommendation: Efficient Feature Aggregation from Large-Scale Music Models Highlight: Moreover, current recommender system evaluation frameworks remain inadequate, as they neither fully leverage multimodal information nor support a diverse range of algorithms, especially multimodal methods. To address these limitations, we propose TASTE, a comprehensive dataset and benchmarking framework designed to highlight the role of multimodal information in music recommendation. |
Yizhi Zhou; Jia-Qi Yang; De-Chuan Zhan; Da-Wei Zhou; | arxiv-cs.IR | 2026-02-10 |
| 157 | Sentiment Analysis of The Song Lost – Bring Me The Horizon Based on Reviews on YouTube Using The SVM Algorithm Highlight: This study was conducted to analyze listener sentiment towards the song Lost based on YouTube comments using the Support Vector Machine (SVM) algorithm and the Term Frequency–Inverse Document Frequency (TF-IDF) feature extraction technique. |
Efraim Moningkey; Wanki Dwi Warsun Sombo; | Journal La Multiapp | 2026-02-10 |
| 158 | Submodular Maximization Over A Matroid $k$-Intersection: Multiplicative Improvement Over Greedy Highlight: We study the problem of maximizing a non-negative monotone submodular objective $f$ subject to the intersection of $k$ arbitrary matroid constraints. |
Moran Feldman; Justin Ward; | arxiv-cs.DS | 2026-02-09 |
| 159 | Glow with The Flow: AI-Assisted Creation of Ambient Lightscapes for Music Videos Highlight: Despite the growing availability of consumer smart lighting, the creation of designed light for music visualization remains limited to professional contexts due to time and skill constraints. To address this, we present an AI-assisted system for generating ambient light sequences for music videos. |
Frederic Anthony Robinson; Vishnu Raj; David Cooper; Fan Du; David Gunawan; | arxiv-cs.HC | 2026-02-09 |
| 160 | Tutti: Expressive Multi-Singer Synthesis Via Structure-Level Timbre Control and Vocal Texture Modeling Highlight: While existing Singing Voice Synthesis systems achieve high-fidelity solo performances, they are constrained by global timbre control, failing to address dynamic multi-singer arrangement and vocal texture within a single song. To address this, we propose Tutti, a unified framework designed for structured multi-singer generation. |
JIATAO CHEN et al. | arxiv-cs.SD | 2026-02-08 |
| 161 | AI-Generated Music Detection in Broadcast Monitoring Highlight: In this work, we introduce AI-OpenBMAT, the first dataset tailored to broadcast-style AI-music detection. |
David Lopez-Ayala; Asier Cabello; Pablo Zinemanas; Emilio Molina; Martin Rocamora; | arxiv-cs.SD | 2026-02-06 |
| 162 | Video-based Music Generation Highlight: To enhance temporal synchronization, we introduce a novel temporal boundary conditioning method, called boundary offset encodings, aligning musical chords with scene changes. |
Serkan Sulun; | arxiv-cs.LG | 2026-02-05 |
| 163 | Fine-Tuning Large Language Models for Automatic Detection of Sexually Explicit Content in Spanish-Language Song Lyrics Highlight: This paper presents an approach to the automatic detection of sexually explicit content in Spanish-language song lyrics by fine-tuning a Generative Pre-trained Transformer (GPT) model on a curated corpus of 100 songs, evenly divided between expert-labeled explicit and non-explicit categories. |
Dolores Zamacola Sánchez de Lamadrid; Eduardo C. Garrido-Merchán; | arxiv-cs.CY | 2026-02-05 |
| 164 | Chatbot Song Recommender System Highlight: The swift progress of Artificial Intelligence (AI) and Machine Learning (ML) has made it possible to create intelligent conversation agents that not only understand but also react to human feelings. In this context, the project presents the Chatbot Song Recommender System which is a smart integration of Natural Language Processing (NLP), sentiment analysis, and music recommendation algorithms. |
N Advytha Reddy; Lahari T; Harshith Yepuri; Jai Ganesh; | International Journal For Multidisciplinary Research | 2026-02-05 |
| 165 | Effects of Combined Music and Exercise Therapy on Depression: A Systematic Review and Meta-Analysis Abstract: Background: Depression is a common psychiatric disorder that impacts millions globally. Non-pharmacological treatments, such as combining music and exercise therapy, have shown … |
Xiaoqin Luo; Xingyue Zhang; Carmelo Mario Vicario; Zhu Mei; Fengxue Qi; | Journal of Integrative and Complementary Medicine | 2026-02-04 |
| 166 | A Design Space for Live Music Agents Highlight: However, the interdisciplinary nature of music has led to fragmented development across research communities, hindering effective communication and collaborative progress. In this work, we bring together perspectives from these diverse fields to map the current landscape of live music agents. |
YEWON KIM et al. | arxiv-cs.HC | 2026-02-04 |
| 167 | Music Therapy for Anxiety Reduction in Non-acute Surgical Fracture Patients: A Systematic Review and Meta-analysis Abstract: Background: Anxiety is a common issue among non-acute surgical fracture patients. Non-pharmacological interventions are needed. This meta-analysis aims to synthesize evidence on … |
Yu-Lan Tang; Ying-Xin Zhao; Xiao Wang; | World Journal of Orthopedics | 2026-02-04 |
| 168 | D3PIA: A Discrete Denoising Diffusion Model for Piano Accompaniment Generation From Lead Sheet Highlight: In this paper, we propose a discrete diffusion-based piano accompaniment generation model, D3PIA, leveraging local alignment between lead sheet and accompaniment in piano-roll representation. |
Eunjin Choi; Hounsu Kim; Hayeon Bang; Taegyun Kwon; Juhan Nam; | arxiv-cs.SD | 2026-02-03 |
| 169 | Multitrack Music Transcription Based on Joint Learning of Onset and Frame Streams Highlight: In this paper, we propose a framework to jointly transcribe onsets and frames for multiple instruments by integrating a deep learning architecture based on U-Net with an architecture based on Perceiver, which is a variant of the Transformer architecture. |
Tomoki Matsunaga; Hiroaki Saito; | Signals | 2026-02-02 |
| 170 | Rethinking Music Captioning with Music Metadata LLMs Highlight: As a more direct approach, we propose metadata-based captioning. |
Irmak Bukey; Zhepei Wang; Chris Donahue; Nicholas J. Bryan; | arxiv-cs.SD | 2026-02-02 |
| 171 | GAN-Based Model for Multi-Instrument Collaborative Music Generation Using Deep Learning Highlight: In this study, we propose a multi-instrument music generation model based on a conditional Generative Adversarial Network (cGAN) that explicitly learns different instrument performance patterns and their coordination. |
Yuanhang Lin; | Informatica | 2026-02-02 |
| 172 | Continuous Valence-Arousal Regression for Music Emotion Estimation Using Machine Learning with OpenL3 Embeddings Highlight: Based on the public MediaEval Database for Emotional Analysis of Music (DEAM), this study employs OpenL3 pre-trained audio embeddings as a unified feature representation and performs frame-level feature extraction with temporal alignment to 2 Hz continuous Valence-Arousal (VA) annotations. |
Jingyi Lyu; | Communications in Humanities Research | 2026-02-02 |
| 173 | Investigating Common Cognitive Processes Between Music and Mathematics in Educational Contexts: A Systematic Review of The Literature Highlight: A qualitative systematic review, following the PRISMA 2020 guidelines, analyzed 15 studies published between 1998 and 2024 in the Scopus, Web of Science, and EBSCO databases. |
Karla Valdebenito; Sergio Sepulveda-Vallejos; Alejandro Almonacid-Fierro; Mirko Aguilar-Valdés; | Online Journal of Music Sciences | 2026-02-01 |
| 174 | AoA Estimation Using Sufficient Statistic-Based MUSIC Algorithms for Low SNR Abstract: This letter primarily focuses on efficient angular estimation in low signal-to-noise ratio (SNR) conditions, as well as small-sample scenarios, aiming to facilitate the … |
CHUNLEI SUN et al. | IEEE Transactions on Vehicular Technology | 2026-02-01 |
| 175 | ACE-Step 1.5: Pushing The Boundaries of Open-Source Music Generation Highlight: We present ACE-Step v1.5, a highly efficient open-source music foundation model that brings commercial-grade generation to consumer hardware. |
JUNMIN GONG et al. | arxiv-cs.SD | 2026-01-31 |
| 176 | How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation Highlight: We present a controlled comparative study of finetuning strategies for ABC-based generation and understanding, comparing an off-the-shelf instruction-tuned backbone to domain-adapted variants and a music-specialized LLM baseline. |
Deepak Kumar; Emmanouil Karystinaios; Gerhard Widmer; Markus Schedl; | arxiv-cs.SD | 2026-01-30 |
| 177 | Restoration of Phonograms from Collection of Rare Records of Department of Music Fonds of V.I. Vernadskyi National Library of Ukraine: Digitization, Digital Restoration, Creation of Metadata, Integration Into International Music Web Resources |
Andrii Kriuchyn; Yurii Kovtaniuk; Larysa Ivchenko; Igor Kosiak; Olena Tsybulska; | Rukopisna ta knižkova spadŝina Ukraïni | 2026-01-29 |
| 178 | Modeling Music Student Teachers’ Behavioral Intention of Using Artificial Intelligence in China Highlight: This study employed an online questionnaire based on an extended Unified Theory of Acceptance and Use of Technology (UTAUT) model. |
Yanlong Niu; | Frontiers in Psychology | 2026-01-29 |
| 179 | MIDI-LLaMA: An Instruction-Following Multimodal LLM for Symbolic Music Understanding Highlight: In this work, we introduce MIDI-LLaMA, the first instruction-following MLLM for symbolic music understanding. |
Meng Yang; Jon McCormack; Maria Teresa Llano; Wanchao Su; Chao Lei; | arxiv-cs.MM | 2026-01-29 |
| 180 | Music Plagiarism Detection: Problem Formulation and A Segment-based Solution Highlight: To fix this situation, we defined how Music Plagiarism Detection is different from other MIR tasks and explained what problems need to be solved. We introduce the Similar Music Pair dataset to support this newly defined task. |
Seonghyeon Go; Yumin Kim; | arxiv-cs.SD | 2026-01-28 |
| 181 | Enhancing Automatic Music Transcription of Thai Xylophone Music Performed with Hard Mallets: A Deep Learning Approach and Comparative Analysis |
Apichai Huaysrijan; Sunee Pongpinigpinyo; | Multimedia Tools and Applications | 2026-01-27 |
| 182 | Investigating How Music Affects Persuasion, Engagement, and Emotion in Data Videos Highlight: We conducted a preregistered study into the effect of music across three dimensions: persuasion, engagement, and emotion. |
SARMISTHA SARNA GOMASTA et al. | arxiv-cs.HC | 2026-01-25 |
| 183 | The Application of AI-assisted Music Therapy Tools in Mental Health Interventions Highlight: This review examines the application potential and implementation pathways of AI-assisted music therapy tools for mental health interventions. |
Qiuyan Wei; Wenting He; | Frontiers in Psychology | 2026-01-22 |
| 184 | PF-D2M: A Pose-free Diffusion Model for Universal Dance-to-Music Generation Highlight: In this paper, we propose PF-D2M, a universal diffusion-based dance-to-music generation model that incorporates visual features extracted from dance videos. |
Jaekwon Im; Natalia Polouliakh; Taketo Akama; | arxiv-cs.SD | 2026-01-22 |
| 185 | Pay (Cross) Attention to The Melody: Curriculum Masking for Single-Encoder Melodic Harmonization Highlight: In this work, we introduce a training curriculum, FF (full-to-full), which keeps all harmony tokens masked for several training steps before progressively unmasking entire sequences during training to strengthen melody-harmony interactions. |
MAXIMOS KALIAKATSOS-PAPAKOSTAS et al. | arxiv-cs.SD | 2026-01-22 |
| 186 | Bangla Music Genre Classification Using Bidirectional LSTMS Highlight: This work introduces a novel music dataset comprising ten distinct genres of Bangla music. |
Muntakimur Rahaman; Md Mahmudul Hoque; Md Mehedi Hassain; | arxiv-cs.SD | 2026-01-21 |
| 187 | Is Symbolic Music A Specific Language? Exploring Inspiration-to-Structure Machine Composition Via LLMs Highlight: To this end, we propose Inspiration-to-Structure (IoS), a cognitively inspired framework that enables LLMs to generate structured musical sections from melodic ideas. |
ZHEJING HU et al. | aaai | 2026-01-20 |
| 188 | Aligning Generative Music AI with Human Preferences: Methods and Challenges Highlight: We identify key research challenges including scalability to long-form compositions, reliability amongst others in preference modelling. |
Dorien Herremans; Abhinaba Roy; | aaai | 2026-01-20 |
| 189 | AutoSchA: Automatic Hierarchical Music Representations Via Multi-Relational Node Isolation Highlight: This paper thus introduces a novel approach, AutoSchA, which extends recent developments in graph neural networks (GNNs) for hierarchical music analysis. |
STEPHEN NI-HAHN et al. | aaai | 2026-01-20 |
| 190 | Let The Model Learn to Feel: Mode-Guided Tonality Injection for Symbolic Music Emotion Recognition Highlight: While these models effectively capture distributional musical semantics, they often overlook tonal structures, particularly musical modes, which play a critical role in emotional perception according to music psychology. In this paper, we investigate the representational capacity of MIDIBERT and identify its limitations in capturing mode-emotion associations. |
Haiying Xia; Zhongyi Huang; Yumei Tan; Shuxiang Song; | aaai | 2026-01-20 |
| 191 | Pengembangan Model Deep Learning Long Short-Term Memory Untuk Generasi Musik Berbasis Data MIDI Highlight: An important contribution of this study is the provision of a systematic methodological documentation of the LSTM-based music generation pipeline, which can serve as a practical reference for future development and research in deep learning–based music generation. |
Muhammad Djamaluddin; Imam Fathurrahman; M. Nurul Wathani; | Infotek: Jurnal Informatika dan Teknologi | 2026-01-20 |
| 192 | Video Echoed in Music: Semantic, Temporal, and Rhythmic Alignment for Video-to-Music Generation Highlight: To address the challenges, we propose Video Echoed in Music (VeM), a latent music diffusion that generates high-quality soundtracks with semantic, temporal, and rhythmic alignment for input videos. |
XINYI TONG et al. | aaai | 2026-01-20 |
| 193 | MPJudge: Towards Perceptual Assessment of Music-Induced Paintings Highlight: Existing methods primarily rely on emotion recognition models to assess the similarity between music and painting, but such models introduce considerable noise and overlook broader perceptual cues beyond emotion. To address these limitations, we propose a novel framework for music-induced painting assessment that directly models perceptual coherence between music and visual art. |
Shiqi Jiang; Tianyi Liang; Huayuan Ye; Changbo Wang; Chenhui Li; | aaai | 2026-01-20 |
| 194 | SteerMusic: Enhanced Musical Consistency for Zero-shot Text-Guided and Personalized Music Editing Highlight: In this paper, we propose two music editing methods that improve the consistency between the original and edited music by leveraging score distillation. |
XINLEI NIU et al. | aaai | 2026-01-20 |
| 195 | Abusive Music and Song Transformation Using GenAI and LLMs Highlight: In this study, we explore the use of generative artificial intelligence (GenAI) and Large Language Models (LLMs) to automatically transform abusive words (vocal delivery) and lyrical content in popular music. |
Jiyang Choi; Rohitash Chandra; | arxiv-cs.SD | 2026-01-20 |
| 196 | Every Little Bit Helps: Exploring Better Utilization of Unlabeled Data for Semi-supervised Singing Melody Extraction Using Multi-bands Diffusion Model Highlight: In this paper, we present ELH-SME, a novel framework that better utilizes unlabeled musical data for the SSME task. |
Shuai Yu; Xiaoliang He; Kangjie Dong; Yi Yu; | aaai | 2026-01-20 |
| 197 | Towards Effective Negation Modeling in Joint Audio-Text Models for Music Highlight: Negation is fundamental for distinguishing the absence (or presence) of musical elements (e.g., with vocals vs. without vocals), but current systems fail to represent this reliably. In this work, we investigate and mitigate this limitation by training CLAP models from scratch on the Million Song Dataset with LP-MusicCaps-MSD captions. |
Yannis Vasilakis; Rachel Bittner; Johan Pauwels; | arxiv-cs.SD | 2026-01-20 |
| 198 | Melodia: Training-Free Music Editing Guided By Attention Probing in Diffusion Models Highlight: In contrast, self-attention maps are essential for preserving the temporal structure of the source music during its conversion into the target music. Building upon this understanding, we present Melodia, a training-free technique that selectively manipulates self-attention maps in particular layers during the denoising process and leverages an attention repository to store source music information, achieving accurate modification of musical characteristics while preserving the original structure without requiring textual descriptions of the source music. |
YI YANG et. al. | aaai | 2026-01-20 |
| 199 | Supervised Learning for Game Music Segmentation Highlight: The aim of this study is to explore the performance of supervised learning methods in the task of structural segmentation, which is the initial step in music structure modelling. |
Shangxuan Luo; Joshua Reiss; | arxiv-cs.SD | 2026-01-19 |
| 200 | Democratizing Music Therapy: LLM-Based Automated EEG Analysis and Progress Tracking for Low-Cost Home Devices Highlight: While large language models (LLMs) have shown promise in various domains, their application to automated physiological report generation for music therapy represents an unexplored task. We present a prototype system that leverages LLMs to bridge this gap, transforming raw EEG and cardiovascular data into human-readable therapeutic reports and personalized music recommendations. |
HUIXIN XUE et. al. | arxiv-cs.HC | 2026-01-18 |
| 201 | A Similarity Network for Correlating Musical Structure to Military Strategy Highlight: In this paper, we explore the similarities between music structure and military strategy while creating the Music Clips Correlation Network (MCCN) based on Mel-frequency Cepstral Coefficients (MFCCs). |
Yiwen Zhang; Hui Zhang; Fanqin Meng; | arxiv-cs.SD | 2026-01-18 |
| 202 | MuseAgent-1: Interactive Grounded Multimodal Understanding of Music Scores and Performance Audio Highlight: We introduce MuseAgent, a music-centric multimodal agent that augments language models with structured symbolic representations derived from sheet music images and performance audio. |
QIHAO ZHAO et. al. | arxiv-cs.MM | 2026-01-17 |
| 203 | VidTune: Creating Video Soundtracks with Generative Music and Contextual Thumbnails Highlight: We present VidTune, a system that supports soundtrack creation by generating diverse music options from a creator’s prompt and producing contextual thumbnails for rapid review. |
Mina Huh; Ailie C. Fraser; Dingzeyu Li; Mira Dontcheva; Bryan Wang; | arxiv-cs.HC | 2026-01-17 |
| 204 | Song Aesthetics Evaluation with Multi-Stem Attention and Hierarchical Uncertainty Modeling Highlight: Moreover, conventional approaches often predict a precise Mean Opinion Score (MOS) value directly, which struggles to capture the nuances of human perception in song aesthetics evaluation. This paper proposes a song-oriented aesthetics evaluation framework, featuring two novel modules: 1) Multi-Stem Attention Fusion (MSAF) builds bidirectional cross-attention between mixture-vocal and mixture-accompaniment pairs, fusing them to capture complex musical features; 2) Hierarchical Granularity-Aware Interval Aggregation (HiGIA) learns multi-granularity score probability distributions, aggregates them into a score interval, and applies a regression within the interval to produce the final score. |
YISHAN LV et. al. | arxiv-cs.SD | 2026-01-17 |
| 205 | SISTEM REKOMENDASI METADATA LAGU BERDASARKAN DETEKSI EMOSI WAJAH MENGGUNAKAN VIT-B/16 Highlight: This study presents a music metadata recommendation system based on facial emotion detection using the Vision Transformer (ViT-B/16) model. |
Ainan Zaky Nurrofiq; Bambang Irawan; | Jurnal Informatika dan Teknik Elektro Terapan | 2026-01-17 |
| 206 | Scalable Music Cover Retrieval Using Lyrics-Aligned Audio Embeddings Highlight: We introduce LIVI (Lyrics-Informed Version Identification), an approach that seeks to balance retrieval accuracy with computational efficiency. |
Joanne Affolter; Benjamin Martin; Elena V. Epure; Gabriel Meseguer-Brocal; Frédéric Kaplan; | arxiv-cs.SD | 2026-01-16 |
| 207 | Emotional Based Music Recommendation System for Mental Wellness Highlight: This project presents an Emotion-Based Music Recommendation System designed to enhance mental wellness through intelligent and personalized music therapy. |
Aayush Verdhan; | International Journal for Research in Applied Science and … | 2026-01-15 |
| 208 | HeartMuLa: A Family of Open Sourced Music Foundation Models Highlight: We present a family of open-source Music Foundation Models designed to advance large-scale music understanding and generation across diverse tasks and modalities. |
DONGCHAO YANG et. al. | arxiv-cs.SD | 2026-01-15 |
| 209 | FusID: Modality-Fused Semantic IDs for Generative Music Recommendation Highlight: However, existing approaches that tokenize each modality independently face two critical limitations: (1) redundancy across modalities that reduces efficiency, and (2) failure to capture inter-modal interactions that limits item representation. We introduce FusID, a modality-fused semantic ID framework that addresses these limitations through three key components: (i) multimodal fusion that learns unified representations by jointly encoding information across modalities, (ii) representation learning that brings frequently co-occurring item embeddings closer while maintaining distinctiveness and preventing feature redundancy, and (iii) product quantization that converts the fused continuous embeddings into multiple discrete tokens to mitigate ID conflict. |
Haven Kim; Yupeng Hou; Julian McAuley; | arxiv-cs.IR | 2026-01-13 |
| 210 | Speech and Music Source Separation for Cochlear Implant Users: Front-end and End-to-end Approach Highlight: In the present study, we compare front-end and end-to-end DNN-based source separation approaches for two tasks: speech masked by competing speech and singing music. |
Sina Tahmasebi; Waldo Nogueira; | Frontiers in Neuroscience | 2026-01-13 |
| 211 | Effectiveness of Music Therapy for Delirium in Acute Hospital Settings: A Scoping Review Highlight: Although there are reviews on non-pharmacological approaches to delirium, few have focused specifically on music therapy within acute hospital environments. This scoping review examined the evidence relating to music-based interventions for older adults who are experiencing delirium or who are at risk of delirium in acute care settings. |
Stacey Leonard; Elizabeth Henderson; Gary Mitchell; | Nursing Reports | 2026-01-12 |
| 212 | End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music Highlight: In this paper, we present the first truly end-to-end approach for page-level OMR in complex layouts. |
Antonio Ríos-Vila; Jorge Calvo-Zaragoza; David Rizo; Thierry Paquet; | International Journal of Computer Vision | 2026-01-09 |
| 213 | Music Consumption: A Systematic Review Across The Lifespan Highlight: The present study aimed to systematically review research concerning changes in music consumption across the lifespan to better understand how adults of all ages consume music. |
Shannon J. Skeffington; Adam J. Lonsdale; Clare J. Rathbone; Mark Burgess; | Empirical Studies of the Arts | 2026-01-08 |
| 214 | Supervised Contrastive Models for Music Information Retrieval in Classical Persian Music |
Ali Ahmadi Katamjani; Seyed Abolghasem Mirroshandel; Mahdi Aminian; | Transactions of the International Society for Music … | 2026-01-07 |
| 215 | Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control Highlight: We train Muse via single-stage supervised finetuning of a Qwen-based language model extended with discrete audio tokens using MuCodec, without task-specific losses, auxiliary objectives, or additional architectural components. |
CHANGHAO JIANG et. al. | arxiv-cs.SD | 2026-01-07 |
| 216 | Predicting Hit Songs: An Exploratory and Comparative Study Highlight: The study explores the connection between audio features from an objective statistical standpoint and identifies that tree-based ensembles perform better when dealing with tabular datasets. |
Yanrui Jerry Wu; | Scholarly Review Journal | 2026-01-05 |
| 217 | Understanding Human Perception of Music Plagiarism Through A Computational Approach Highlight: Therefore, we aim to conduct a study to examine the key criteria of human perception of music plagiarism, focusing on the three commonly used musical features in similarity analysis: melody, rhythm, and chord progression. After identifying the key features and levels of variation humans use in perceiving musical similarity, we propose an LLM-as-a-judge framework that applies a systematic, step-by-step approach, drawing on modules that extract such high-level attributes. |
Daeun Hwang; Hyeonbin Hwang; | arxiv-cs.SD | 2026-01-05 |
| 218 | Abstracts of The Resonance Annual Conference 2026: The Next Decade of Music Abstract: This abstract collection records the conference abstracts and extended abstracts presented at the Resonance Annual Conference 2026, convened online under the theme “The Next … |
ANQI CHEN et. al. | Resonance: Journal of Global Music Studies | 2026-01-05 |
| 219 | SongSage: A Large Musical Language Model with Lyric Generative Pre-training Highlight: Comprehensive evaluations indicate that current general-purpose LLMs still have potential for improvement in playlist understanding. Inspired by this, we introduce SongSage, a large musical language model equipped with diverse lyric-centric intelligence through lyric generative pretraining. |
JIANI GUO et. al. | arxiv-cs.CL | 2026-01-03 |
| 220 | Timed Text Extraction from Taiwanese Kua-á-hì TV Series Highlight: These videos, while potentially valuable for in-depth studies of Taiwanese opera, often have low quality and require substantial manual effort during data preparation. To streamline this process, we developed an interactive system for real-time OCR correction and a two-step approach integrating OCR-driven segmentation with Speech and Music Activity Detection (SMAD) to efficiently identify vocal segments from archival episodes with high precision. |
TZU-HUNG HUANG et. al. | arxiv-cs.SD | 2026-01-01 |
| 221 | Distributed Cross‐Domain Music Style Transfer in The SAGIN Environment Highlight: However, these improvements face significant challenges when implemented in space‐air‐ground integrated network (SAGIN) environments, due to issues such as high latency, limited bandwidth, and privacy concerns. This research explores a distributed, cross‐domain music style transfer model based on SAGIN environments, proposing a federated learning (FL) approach to mitigate these challenges. |
Jinzi Huang; Chihhsiong Shih; | Transactions on Emerging Telecommunications Technologies | 2025-12-30 |
| 222 | Analisis Emosi Dalam Lirik Lagu Menggunakan Natural Language Processing Highlight: Music is a universal medium for expressing emotions, with song lyrics serving as a narrative component rich in affective content. This study aims to analyze the emotional landscape within popular English song lyrics collected from the Spotify platform and to examine the effectiveness of Natural Language Processing (NLP) approaches in classifying these emotions. |
Michael Sabda Husada; Sri Yulianto J.P; | Jurnal Indonesia : Manajemen Informatika dan Komunikasi | 2025-12-30 |
| 223 | Effects of Integrative Music Therapy in Children with Autism: A Multiple Case Study Highlight: The primary objective of this study is to examine the potential of integrating five approaches (Orff Schulwerk, Dalcroze Eurhythmics, traditional Chinese Five Elements music, the Mozart Effect, and Neurologic Music Therapy) within a comprehensive framework of seven elements, for the improvement of attention processes, emotional self-regulation and social interaction in children diagnosed with Autism Spectrum Disorder (ASD) (American Psychiatric Association, 2022). |
Tianjiao Ma; Inmaculada Chiva Sanchis; Genoveva Ramos Santana; Zhenhan Liu; | RELIEVE – Revista Electrónica de Investigación y Evaluación … | 2025-12-30 |
| 224 | Multi Agents Semantic Emotion Aligned Music to Image Generation with Music Derived Captions Highlight: When people listen to music, they often experience rich visual imagery. We aim to externalize this inner imagery by generating images conditioned on music. |
Junchang Shi; Gang Li; | arxiv-cs.MM | 2025-12-29 |
| 225 | Song Lyric Meaning Generator Using Transformer (Case Study: Drake’s Song Lyrics) Highlight: This study aims to develop a web-based lyric meaning generator capable of automatically interpreting Drake’s lyrics using the Transformer architecture. |
Tubagus Alwasi’i; Sam Farisa Chaerul Haviana; | Journal of Software Engineering and Multimedia (JASMED) | 2025-12-29 |
| 226 | MUSIC, SPIRITUALITY, AND MENTAL HEALTH Highlight: However, the scientific evidence on its impacts on mental health is dispersed. This systematic review, conducted under the PRISMA guidelines, seeks to consolidate the evidence on the relationship between religious/gospel music and outcomes in mental health, well-being, and resilience. |
Jessika Camila Souza Gonçalez; | Revista Gênero e Interdisciplinaridade | 2025-12-24 |
| 227 | Processing of Vocal Music Using Artificial Intelligence: Unveiling Creative Potential and Shaping Listener Perception Highlight: This study quantitatively assesses the impact of AI on vocal-music processing. |
Ning Wang; Zhenyao Cai; | Perceptual and Motor Skills | 2025-12-23 |
| 228 | Do Foundational Audio Encoders Understand Music Structure? Highlight: Although many open-source FAE models are available, only a small subset has been examined for MSA, and the impact of factors such as learning methods, training data, and model context length on MSA performance remains unclear. In this study, we conduct comprehensive experiments on 11 types of FAEs to investigate how these factors affect MSA performance. |
Keisuke Toyama; Zhi Zhong; Akira Takahashi; Shusuke Takahashi; Yuki Mitsufuji; | arxiv-cs.SD | 2025-12-18 |
| 229 | EXPRESS: Lyrics of Gender Well-Being: A Definition and Exploration of Well-being Dimensions That Define Gendered Experiences Highlight: This research introduces Gender Well-Being (GWB), a novel framework that defines and structures gendered experiences of well-being through six core dimensions spanning individual and societal levels: safety, access, belonging, empowerment, agency, and health. |
Sheen Kachen; Paula C. Peter; Lorena García Ramón; Ranny Liu; Michelle van Solt; | Journal of Public Policy & Marketing | 2025-12-18 |
| 230 | Teaching Critical Thinking Using Public Music Theory Highlight: This article presents a practical methodology for incorporating critical thinking and writing skills in music theory courses using online videos of public music theory. |
Jeremy Robins; | Engaging Students: Essays in Music Pedagogy | 2025-12-18 |
| 231 | WeMusic-Agent: Efficient Conversational Music Recommendation Via Knowledge Internalization and Agentic Boundary Learning Highlight: This paper proposes WeMusic-Agent, a training framework for efficient LLM-based conversational music recommendation. |
WENDONG BI et. al. | arxiv-cs.AI | 2025-12-17 |
| 232 | MuseCPBench: An Empirical Study of Music Editing Methods Through Music Context Preservation Highlight: While some studies do consider MCP, they adopt inconsistent evaluation protocols and metrics, leading to unreliable and unfair comparisons. To address this gap, we introduce the first MCP evaluation benchmark, MuseCPBench, which covers four categories of musical facets and enables comprehensive comparisons across five representative music editing baselines. |
YASH VISHE et. al. | arxiv-cs.SD | 2025-12-16 |
| 233 | Web-Based Collaborative Music Lessons: Approaches, Challenges and Perspectives Highlight: Through a survey of instructor needs and platform evaluations, the study identifies key development priorities and highlights challenges such as latency, synchronization, and user experience. |
Chrisoula Alexandraki; Konstantinos Tsioutas; | Journal of the Audio Engineering Society | 2025-12-16 |
| 234 | Survey Paper on Music Recommendation System Using Facial Recognition and Deep Learning Highlight: This study introduces an emotion-aware music recommendation system that adapts song suggestions in real time based on a user’s facial expressions. |
Namarta Gawande; | International Journal for Research in Applied Science and … | 2025-12-15 |
| 235 | The Renaissance of Expert Systems: Optical Recognition of Printed Chinese Jianpu Musical Scores with Lyrics Highlight: We present a modular expert-system pipeline that converts printed Jianpu scores with lyrics into machine-readable MusicXML and MIDI, without requiring massive annotated training data. |
FAN BU et. al. | arxiv-cs.CV | 2025-12-15 |
| 236 | AIoT-Driven Harmonic Lifecycle Assessment and Predictive Analytics for Green Music Production Supply Chain Systems Abstract: Traditional carbon footprint measurement approaches fail to address the unique requirements of green music production, where conventional lifecycle assessment methods cannot … |
Yan Fang; Dandan Peng; Imran Memon; Li Li; | IEEE Internet of Things Journal | 2025-12-15 |
| 237 | JoruriPuppet: Learning Tempo-Changing Mechanisms Beyond The Beat for Music-to-Motion Generation with Expressive Metrics Abstract: In music-to-motion generation, the interplay between movements and music tempo variations significantly influences the emotional expressiveness and realism of performances. … |
Ran Dong; Shaowen Ni; Xi Yang; | Proceedings of the SIGGRAPH Asia 2025 Conference Papers | 2025-12-14 |
| 238 | Video2Song: Video-to-Song Generation Via Retrieval-Augmented Prompt Generation Abstract: In this paper, we introduce an end-to-end framework for generating video soundtracks that enhances content alignment and controllability compared to existing approaches. … |
Zijiao Yin; Sifei Li; Weiming Dong; | Proceedings of the SIGGRAPH Asia 2025 Posters | 2025-12-14 |
| 239 | AutoMV: An Automatic Multi-Agent System for Music Video Generation Highlight: We propose AutoMV, a multi-agent system that generates full music videos (MVs) directly from a song. |
XIAOXUAN TANG et. al. | arxiv-cs.MM | 2025-12-13 |
| 240 | MUSIGAIN: Adaptive Graph Attention Network for Multi-Relationship Mining in Music Knowledge Graphs Highlight: The framework introduces three key innovations: (1) a layer-wise dynamic skipping mechanism that adaptively controls propagation depth based on third-order embedding stability, reducing computation by 30–40% while preventing over-smoothing; (2) the DiGRAF adaptive activation function that enables node-specific nonlinear transformations to capture semantic heterogeneity across different entity types; and (3) ranking-based optimization supervised by graph robustness metrics, focusing on relative importance ordering rather than absolute value prediction. |
Mian Chen; Tinghao Wang; Chunhao Li; Yuheng Li; | Electronics | 2025-12-12 |
| 241 | Reframing Music-Driven 2D Dance Pose Generation As Multi-Channel Image Generation Highlight: We address this by reframing music-to-dance generation as a music-token-conditioned multi-channel image synthesis problem: 2D pose sequences are encoded as one-hot images, compressed by a pretrained image VAE, and modeled with a DiT-style backbone, allowing us to inherit architectural and training advances from modern text-to-image models and better capture high-variance 2D pose distributions. On top of this formulation, we introduce (i) a time-shared temporal indexing scheme that explicitly synchronizes music tokens and pose latents over time and (ii) a reference-pose conditioning strategy that preserves subject-specific body proportions and on-screen scale while enabling long-horizon segment-and-stitch generation. |
YAN ZHANG et. al. | arxiv-cs.CV | 2025-12-12 |
| 242 | PhraseVAE and PhraseLDM: Latent Diffusion for Full-Song Multitrack Symbolic Music Generation Highlight: This technical report presents a new paradigm for full-song symbolic music generation. |
Longshen Ou; Ye Wang; | arxiv-cs.SD | 2025-12-12 |
| 243 | The Rise of Technology in Music Education: A Bibliometric Study of A Rapidly Growing Field Highlight: The aim of this study is to examine publications on the use of technology in music education between 2016 and 2025 using bibliometric methods. |
Festa Nevzati Thaçi; | Architecture Image Studies | 2025-12-11 |
| 244 | Application Of Random Forest Algorithm in Music Recommendation System Using Content-Based Filtering Highlight: However, the overwhelming number of available songs often confuses users, particularly new users who have no listening history. To address this, the study proposes a music recommendation system using a content-based filtering approach that recommends songs based on similarities in both textual and numerical features, such as genre, artist, lyrics, tempo, energy, and danceability. |
Rubby Malik Fajar; Indra Yustiana; Alun Sujjada; | bit-Tech | 2025-12-10 |
| 245 | MR-FlowDPO: Multi-Reward Direct Preference Optimization for Flow-Matching Text-to-Music Generation Highlight: We introduce MR-FlowDPO, a novel approach that enhances flow-matching-based music generation models, a major class of modern music generative models, using Direct Preference Optimization (DPO) with multiple musical rewards. |
ALON ZIV et. al. | arxiv-cs.SD | 2025-12-10 |
| 246 | Text-to-Music Generation Using AI: Theoretical Foundations and Practical Applications Highlight: This research explores the theoretical foundations linking language and music through semantic, emotional, and structural analysis, and demonstrates practical integration of AI music generation into software via APIs. |
Quang Minh Trinh; Thi Lan Ngo; Hue Trinh; Xuan Tung Bui; | Journal of Science and Technology | 2025-12-08 |
| 247 | PolyLingua: Margin-based Inter-class Transformer for Robust Cross-domain Language Detection Highlight: We introduce PolyLingua, a lightweight Transformer-based model for in-domain language detection and fine-grained language classification. |
ALI LOTFI REZAABAD et. al. | arxiv-cs.LG | 2025-12-08 |
| 248 | Incorporating Structure and Chord Constraints in Symbolic Transformer-based Melodic Harmonization Highlight: This paper studies the inclusion of predefined chord constraints in melodic harmonization, i.e., where a desired chord at a specific location is provided along with the melody as inputs and the autoregressive transformer model needs to incorporate the chord in the harmonization that it generates. The peculiarities of involving such constraints are discussed and an algorithm is proposed for tackling this task. |
MAXIMOS KALIAKATSOS-PAPAKOSTAS et. al. | arxiv-cs.SD | 2025-12-08 |
| 249 | VibeMus: Proactive Agentic System for Music Personalization Abstract: Large language models (LLMs) enable diverse forms of AI-assisted creation, yet they often struggle to bridge the preference-articulation gap: users may provide incomplete or vague … |
Zhiliang Guo; Teng Tu; Yunshan Ma; Xun Yang; | Proceedings of the 7th ACM International Conference on … | 2025-12-06 |
| 250 | Morphologically-Informed Tokenizers for Languages with Non-Concatenative Morphology: A Case Study of Yoloxóchtil Mixtec ASR Highlight: We present two novel tokenization schemes that separate words in a nonlinear manner, preserving information about tonal morphology as much as possible. |
Chris Crawford; | arxiv-cs.CL | 2025-12-05 |
| 251 | Who Will Top The Charts? Multimodal Music Popularity Prediction Via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling Highlight: Existing methods suffer from four limitations: (i) temporal dynamics in audio and lyrics are averaged away; (ii) lyrics are represented as a bag of words, disregarding compositional structure and affective semantics; (iii) artist- and song-level historical performance is ignored; and (iv) multimodal fusion approaches rely on simple feature concatenation, resulting in poorly aligned shared representations. To address these limitations, we introduce GAMENet, an end-to-end multimodal deep learning architecture for music popularity prediction. |
Yash Choudhary; Preeti Rao; Pushpak Bhattacharyya; | arxiv-cs.SD | 2025-12-05 |
| 252 | ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering Highlight: We introduce MusWikiDB, a vector database of 3.2M passages from 144K music-related Wikipedia pages, and ArtistMus, a benchmark of 1,000 questions on 500 diverse artists with metadata such as genre, debut year, and topic. |
Daeyong Kwon; SeungHeon Doh; Juhan Nam; | arxiv-cs.CL | 2025-12-05 |
| 253 | Lyrics Matter: Exploiting The Power of Learnt Representations for Music Popularity Prediction Highlight: We present an automated pipeline that uses an LLM to extract high-dimensional lyric embeddings, capturing semantic, syntactic, and sequential information. |
Yash Choudhary; Preeti Rao; Pushpak Bhattacharyya; | arxiv-cs.SD | 2025-12-05 |
| 254 | YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance Highlight: Singing Voice Synthesis (SVS) remains constrained in practical deployment due to its strong dependence on accurate phoneme-level alignment and manually annotated melody contours, requirements that are resource-intensive and hinder scalability. To overcome these limitations, we propose a melody-driven SVS framework capable of synthesizing arbitrary lyrics following any reference melody, without relying on phoneme-level alignment. |
JUNJIE ZHENG et. al. | arxiv-cs.SD | 2025-12-04 |
| 255 | YingMusic-SVC: Real-World Robust Zero-Shot Singing Voice Conversion with Flow-GRPO and Singing-Specific Inductive Biases Highlight: We propose YingMusic-SVC, a robust zero-shot framework that unifies continuous pre-training, robust supervised fine-tuning, and Flow-GRPO reinforcement learning. |
GONGYU CHEN et. al. | arxiv-cs.SD | 2025-12-04 |
| 256 | Outward Threads – Intuitive Computers / Rational Composers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The project ‘Outward Threads’ is an artistic investigation rooted in music composition, integrating computational frameworks from machine learning and artificial intelligence to create new music. |
Juan Sebastián Vassallo; | KMD Artistic Research | 2025-12-03 |
| 257 | Wastewater Analyses for Psychoactive Substances at Music Festivals: A Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music festivals have emerged as venues for the consumption of recreational drugs and novel psychoactive substances. This systematic review provides the first critical evaluation and synthesis of published wastewater analyses for detecting recreational drug use at music festivals worldwide. |
Ringala Cainamisir; Xiao Zeng; Samuel B. Himmerich; Hubertus Himmerich; | Behavioral Sciences | 2025-12-03 |
| 258 | Interpretable Content-Based Music Genre Classification Utilizing A Modified Artificial Immune System with Binary Similarity Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work bridges computational intelligence and music analysis, offering a novel perspective on immune-inspired learning for content classification. |
Noor Azilah Muda; Choo Yun Huoy; Azah Kamilah Muda; | International Journal of Research and Innovation in Social … | 2025-12-03 |
| 259 | Elements of Music That Work to Improve Sleep, A Narrative Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Overall, this review highlights the elements at work that make music a safe, scalable, and culturally adaptable adjunct to traditional sleep therapies. |
Ethan Y. Pan; Wei Wang; | Frontiers in Sleep | 2025-12-02 |
| 260 | Perception of AI-Generated Music – The Role of Composer Identity, Personality Traits, Music Preferences, and Perceived Humanness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates how composer information and listener characteristics shape the perception of AI-generated music, adopting a mixed-method approach. |
David Stammer; Hannah Strauss; Peter Knees; | arxiv-cs.HC | 2025-12-02 |
| 261 | YingVideo-MV: Music-Driven Multi-Stage Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present YingVideo-MV, the first cascaded framework for music-driven long-video generation. |
JIAHUI CHEN et. al. | arxiv-cs.CV | 2025-12-02 |
| 262 | Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current methods like reward systems often rely on single numerical scores, struggle to capture various dimensions such as phrasing or expressiveness, and require costly annotations, limiting interpretability and generalization. To address these issues, we propose a generative feedback (i.e., reward model) framework that provides multi-dimensional language and audio feedback for SVS assessment. |
XUEYAN LI et. al. | arxiv-cs.SD | 2025-12-02 |
| 263 | Story2MIDI: Emotionally Aligned Music Generation from Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Story2MIDI, a sequence-to-sequence Transformer-based model for generating emotion-aligned music from a given piece of text. |
Mohammad Shokri; Alexandra C. Salem; Gabriel Levine; Johanna Devaney; Sarah Ita Levitan; | arxiv-cs.SD | 2025-12-01 |
| 264 | Sensing Mobile Targets in Near-Field ISAC Systems: A Low Complexity Modified MUSIC Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we investigate the mobile target sensing problem within the near-field integrated sensing and communication (ISAC) system, an area that has received limited … |
Hongjia Huang; Weijie Yuan; Arumugam Nallanathan; | IEEE Transactions on Vehicular Technology | 2025-12-01 |
| 265 | Melody or Machine: Detecting Synthetic Music with Dual-Stream Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: MoM is the most diverse dataset to date, built with a mix of open and closed-source models and a curated OOD test set designed specifically to foster the development of truly generalizable detectors. Alongside this benchmark, we introduce CLAM, a novel dual-stream detection architecture. |
ARNESH BATRA et. al. | arxiv-cs.SD | 2025-11-29 |
| 266 | Synthesizing Nostalgia: How An AI-Generated ‘Ejaje 1981’ Polish Hit Rewired Memory, Virality, and Copyrights Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using an interdisciplinary approach, we conducted (i) a musicological analysis of Ejaje 1981, (ii) an analysis from a legal and phonographic market perspective, and (iii) a sociological analysis of ~1200 social-media comments and sharing patterns across platforms. |
Andrzej Buda; Andrzej Jarynowski; | E-methodology | 2025-11-29 |
| 267 | An Intelligent Approach Toward Lyrics Text Classification Using Multilevel Cross Attention‐Based Adaptive BiLSTM With Relevant Feature Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, it is important to replace the standard text classification approach with other advanced methods to classify the different language lyrics effectively. To solve these issues, an efficient deep learning‐based lyrics text classification is developed in this work to classify the lyrics text. |
Jasmine Raja Lawrence; Saswati Mukherjee; C. R. Rene Robin; David Raj Gnanamuthu; | Computational Intelligence | 2025-11-28 |
| 268 | A Multimodal Deep Learning Model for Optimizing Music Emotion Recognition Through Temporal and Semantic Feature Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper uses deep learning (DL) to study the accuracy of optimizing MER. |
Peng Yan; Weina Chu; Hongyuan Wu; | Journal of Circuits, Systems and Computers | 2025-11-28 |
| 269 | A Comparative Study of LLM Prompting and Fine-Tuning for Cross-genre Authorship Attribution on Chinese Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel study on authorship attribution for Chinese lyrics, a domain where clean, public datasets are sorely lacking. |
Yuxin Li; Lorraine Xu; Meng Fan Wang; | arxiv-cs.CL | 2025-11-26 |
| 270 | Artificial Intelligence in Music Generation: Models, Evaluation, Applications, and Future Prospects Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a comprehensive review of the current state of AI music generation, covering the historical development of computer-assisted music production and AI-assisted music from early analog and digital tools to modern neural network architectures, and highlighting key developments such as MIDI, DAWs, plugins, and early algorithmic composition systems. |
Tiffany Chiu; | Theoretical and Natural Science | 2025-11-26 |
| 271 | DUO-TOK: Dual-Track Semantic Music Tokenizer for Vocal-Accompaniment Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Duo-Tok is a source-aware dual-codebook tokenizer for vocal-accompaniment music that targets the growing tension between reconstruction quality and language-model (LM) … |
RUI LIN et. al. | arxiv-cs.SD | 2025-11-25 |
| 272 | The Use of Tele-Music Interventions in Supportive Cancer Care: A Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Objectives: This systematic review seeks to provide an in-depth overview of current research on tele-music interventions in supportive cancer care and identifies key areas where further research is warranted. |
LORE MERTENS et. al. | Brain Sciences | 2025-11-25 |
| 273 | The Effect of The Typicality of Song Lyrics on Song Popularity: A Natural Language Processing Analysis of The British Top Singles Chart Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the relationship between the lexical typicality of song lyrics and song popularity in the UK Official Singles Chart from 1999 to 2013. |
Khaoula Chehbouni; Florian Carichon; Adrien Simonnot-Lanciaux; Gilles Caporossi; Danilo C. Dantas; | Psychology of Music | 2025-11-24 |
| 274 | Music-induced Physiological Markers for Detecting Alzheimer’s Disease Using Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Random Forest classifier distinguished AD patients from healthy controls with 70.5% accuracy, while the Naïve Bayes model predicted severity with 65.6% accuracy, demonstrating that ML models can detect subtle music-evoked physiological differences even in individuals with AD. |
Rodrigo Lima; Gonçalo Barradas; Sergi Bermúdez i Badia; | Frontiers in Aging Neuroscience | 2025-11-24 |
| 275 | Multidimensional Music Aesthetic Evaluation Via Semantically Consistent C-Mixup Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a robust music aesthetic evaluation framework that combines (1) multi-source multi-scale feature extraction to obtain complementary segment- and track-level representations, (2) a hierarchical audio augmentation strategy to enrich training data, and (3) a hybrid training objective that integrates regression and ranking losses for accurate scoring and reliable top-song identification. |
SHUYANG LIU et. al. | arxiv-cs.SD | 2025-11-24 |
| 276 | GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research contributes a valuable annotated dataset and provides insights for future work in Sinhala Natural Language Processing and music emotion recognition. |
Yomal De Mel; Nisansa de Silva; | arxiv-cs.CL | 2025-11-22 |
| 277 | MusicAIR: A Multimodal AI Music Generation Framework Powered By An Algorithm-Driven Core Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast, we propose MusicAIR, an innovative multimodal AI music generation framework powered by a novel algorithm-driven symbolic music core, effectively mitigating copyright infringement risks. |
Callie C. Liao; Duoduo Liao; Ellie L. Zhang; | arxiv-cs.SD | 2025-11-21 |
| 278 | AI System for Music Generation Based on User Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a lightweight, transparent, rule-based framework for affective melody generation coupled with a deterministic validation engine. |
MR. LANKE RAVI KUMAR et. al. | International Scientific Journal of Engineering and … | 2025-11-21 |
| 279 | Difficulty-Controlled Simplification of Piano Scores with Synthetic Data for Inclusive Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces a transformer-based method for adjusting the difficulty of MusicXML piano scores. |
Pedro Ramoneda; Emilia Parada-Cabaleiro; Dasaem Jeong; Xavier Serra; | arxiv-cs.SD | 2025-11-20 |
| 280 | LargeSHS: A Large-scale Dataset of Music Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent advances in AI-based music generation have focused heavily on text-conditioned models, with less attention given to reference-based generation such as song adaptation. To support this line of research, we introduce LargeSHS, a large-scale dataset derived from SecondHandSongs, containing over 1.7 million metadata entries and approximately 900k publicly accessible audio links. |
Chih-Pin Tan; Hsuan-Kai Kao; Li Su; Yi-Hsuan Yang; | arxiv-cs.SD | 2025-11-19 |
| 281 | Music Genre Classification Using Prosodic, Stylistic, Syntactic and Sentiment-Based Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Incorporating different influences at different times, today it boasts a wide range of both autochthonous and imported genres, such as traditional folk music, rock, rap, pop, and manele, to name a few. We aim to trace the linguistic differences between the lyrics of these genres using natural language processing and a computational linguistics approach by studying the prosodic, stylistic, syntactic, and sentiment-based features of each genre. |
Erik-Robert Kovacs; Stefan Baghiu; | Big Data and Cognitive Computing | 2025-11-19 |
| 282 | HOW AI SHAPES MUSIC RECOMMENDATIONS AND CONSUMER PREFERENCES Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The paper is analysed using machine learning algorithms on different parameters like precision, recall and accuracy scores. |
Ashutosh Rastogi; | EPRA International Journal of Research & Development … | 2025-11-19 |
| 283 | MuCPT: Music-related Natural Language Model Continued Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the training side, we introduce reference-model (RM)-based token-level soft scoring for quality control: a unified loss-ratio criterion is used both for data selection and for dynamic down-weighting during optimization, reducing noise gradients and amplifying task-aligned signals, thereby enabling more effective music-domain continued pretraining and alignment. |
Kai Tian; Yirong Mao; Wendong Bi; Hanjie Wang; Que Wenhui; | arxiv-cs.CL | 2025-11-18 |
| 284 | A Controllable Perceptual Feature Generative Model for Melody Harmonization Via Conditional Variational Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Many studies have introduced emotion models to guide the generative process. |
Dengyun Huang; Yonghua Zhu; | arxiv-cs.SD | 2025-11-18 |
| 285 | A Deep Learning Approach for The Analysis of Birdsong Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We developed a novel deep learning-based behavior analysis pipeline, Avian Vocalization Network (AVN), for zebra finch learned vocalizations – the most widely studied vocal learning model species. |
Therese MI Koch; Ethan S Marks; Todd F Roberts; | eLife | 2025-11-18 |
| 286 | Applications of The MUSIC Model of Motivation and Its Associated Inventories: A Systematic Review and Meta-Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Searches were conducted across five databases (Education Research Complete, ERIC, PsycInfo, Web of Science, and ProQuest Dissertations & Theses Global), supplemented by Google Scholar and the MUSIC Model Research Lab website. Peer-reviewed journal articles and dissertations published between 2009 and 2022 were included if they employed all five components of the MUSIC model (eMpowerment, Usefulness, Success, Interest, and Caring) as part of their study framework or assessment procedure. |
Brett D. Jones; Zeynep Ambarkutuk; Dale H. Schunk; | International Journal of Applied Positive Psychology | 2025-11-18 |
| 287 | A Descriptive Literature Review of Zahara’s Plea for Africa’s Healing, Hope, and Answers in ‘Phendula: A Cry for God’s Intervention’ Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores Zahara’s song Phendula as a powerful musical plea for Africa’s healing, hope, and divine intervention. |
Sakhiseni Joseph Yende; | International Journal of Research in Business and Social … | 2025-11-16 |
| 288 | Decoding Nature’s Melody: Significance and Challenges of Machine Learning in Assessing Bird Diversity Via Soundscape Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIANGJIAN XIE et. al. | Artificial Intelligence Review | 2025-11-14 |
| 289 | PISA: Combining Transformers and ACT-R for Repeat-Aware Sequential Listening Session Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This presents a key challenge as repeatedly listening to the same song over time is a common habit that can shape how users experience and interpret this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system designed to overcome this challenge. |
Viet Anh TRAN; Guillaume SALHA-GALVAN; Bruno SGUERRA; Romain HENNEQUIN; | ACM Transactions on Recommender Systems | 2025-11-14 |
| 290 | Video Echoed in Music: Semantic, Temporal, and Rhythmic Alignment for Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the challenges, we propose Video Echoed in Music (VeM), a latent music diffusion that generates high-quality soundtracks with semantic, temporal, and rhythmic alignment for input videos. |
XINYI TONG et. al. | arxiv-cs.SD | 2025-11-12 |
| 291 | Chord-conditioned Melody and Bass Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluate five Transformer-based strategies for chord-conditioned melody and bass generation using a set of music theory-motivated metrics capturing pitch content, pitch interval size, and chord tone usage. |
Alexandra C Salem; Mohammad Shokri; Johanna Devaney; | arxiv-cs.SD | 2025-11-11 |
| 292 | Multisensory Interactive Music in Extended Reality: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the emerging field of multisensory interactive music within Extended Reality (XR), focusing on the convergence of spatialized audio, embodied interaction, reactive visuals, narrative coherence, as well as emotional resonance. |
Liying Huang; | Communications in Humanities Research | 2025-11-11 |
| 293 | Progressive Semantic Residual Quantization for Multimodal-Joint Interest Modeling in Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods suffer from two critical limitations: 1) intra-modal semantic degradation, where residual-based quantization processes gradually decouple discrete IDs from original content semantics, leading to semantic drift; and 2) inter-modal modeling gaps, where traditional fusion strategies either overlook modal-specific details or fail to capture cross-modal correlations, hindering comprehensive user interest modeling. To address these challenges, we propose a novel multimodal recommendation framework with two stages. |
SHIJIA WANG et. al. | cikm | 2025-11-10 |
| 294 | MPJudge: Towards Perceptual Assessment of Music-Induced Paintings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods primarily rely on emotion recognition models to assess the similarity between music and painting, but such models introduce considerable noise and overlook broader perceptual cues beyond emotion. To address these limitations, we propose a novel framework for music induced painting assessment that directly models perceptual coherence between music and visual art. |
Shiqi Jiang; Tianyi Liang; Changbo Wang; Chenhui Li; | arxiv-cs.CV | 2025-11-10 |
| 295 | Who Gets Heard? Rethinking Fairness in AI for Music Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such harms risk reinforcing biases, limiting creativity, and contributing to cultural erasure. To address this, we offer recommendations at dataset, model and interface level in music-AI systems. |
ATHARVA MEHTA et. al. | arxiv-cs.CY | 2025-11-08 |
| 296 | BNMusic: Blending Environmental Noises Into Personalized Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, misalignment between the dominant sound and the noise—such as mismatched downbeats—often requires an excessive volume increase to achieve effective masking. Motivated by recent advances in cross-modal generation, in this work, we introduce an alternative method to acoustic masking, aiming to reduce the noticeability of environmental noises by blending them into personalized music generated based on user-provided text prompts. |
CHI ZUO et. al. | nips | 2025-11-07 |
| 297 | SongBloom: Coherent Song Generation Via Interleaved Autoregressive Sketching and Diffusion Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces SongBloom, a novel framework for full-length song generation that leverages an interleaved paradigm of autoregressive sketching and diffusion-based refinement. |
CHENYU YANG et. al. | nips | 2025-11-07 |
| 298 | Persian Musical Instruments Classification Using Polyphonic Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a culturally informed data augmentation strategy that generates realistic polyphonic mixtures from monophonic samples. Using the MERT model (Music undERstanding with large-scale self-supervised Training) with a classification head, we evaluate our approach with out-of-distribution data which was obtained by manually labeling segments of traditional songs. |
Diba Hadi Esfangereh; Mohammad Hossein Sameti; Sepehr Harfi Moridani; Leili Javidpour; Mahdieh Soleymani Baghshah; | arxiv-cs.SD | 2025-11-07 |
| 299 | Audio Super-Resolution with Latent Bridge Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Towards high-quality audio super-resolution, we present a new system with latent bridge models (LBMs), where we compress the audio waveform into a continuous latent space and design an LBM to enable a latent-to-latent generation process that naturally matches the LR-to-HR upsampling process, thereby fully exploiting the instructive prior information contained in the LR waveform. |
Chang Li; Zehua Chen; Liyuan Wang; Jun Zhu; | nips | 2025-11-07 |
| 300 | Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. |
LONGSHEN OU et. al. | nips | 2025-11-07 |
| 301 | LeVo: High-Quality Song Generation with Multi-Preference Alignment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches still struggle with the complex composition of songs and the scarcity of high-quality data, leading to limitations in audio quality, musicality, instruction following, and vocal-instrument harmony. To address these challenges, we introduce LeVo, a language model based framework consisting of LeLM and Music Codec. |
SHUN LEI et. al. | nips | 2025-11-07 |
| 302 | EMO100DB: An Open Dataset of Improvised Songs with Emotion Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce Emo100DB: a dataset consisting of improvised songs that were recorded and transcribed with emotion data based on Russell’s circumplex model of emotion. |
Daeun Hwang; Saebyul Park; | arxiv-cs.SD | 2025-11-06 |
| 303 | Sing & Spell with AI: Enhancing Vocabulary Acquisition Through Song-Based Learning in ESL Classrooms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Sing & Spell with AI is an innovation that combines Artificial Intelligence with music-based learning to support vocabulary instruction in ESL primary classrooms. The tool … |
Mohammad Radzi bin Manap; Nor Fazlin binti Mohd Ramli; Siti Nur Badriah binti Mohd Tahir; | International Journal of Research and Innovation in Social … | 2025-11-05 |
| 304 | Harmony of Language and Technology: AI-Supported Music Education Through Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the integration of artificial intelligence (AI) in music education as an innovative approach to enhance language learning through song lyrics. |
Juriani Jamaludin; Nurulhamimi Abdul Rahman; Enrieka Ervina Ugama; Raudhatul Jannah Mohd Shahril; Brenda Louisa Bede; | International Journal of Research and Innovation in Social … | 2025-11-05 |
| 305 | A Systematic Review of Music Therapy for Symptoms of Traumatic Brain Injury and Posttraumatic Stress Disorder in Adults Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music therapy (MT) is implemented in healthcare settings for a range of symptoms and conditions. This systematic review provides an update of the available evidence by searching a combination of keywords related to MT, traumatic brain injury (TBI), and posttraumatic stress disorder (PTSD) across four databases (PubMed, PsycINFO, PTSDpubs, Scopus) from inception to February 2025. |
JAY M. UOMOTO et. al. | NeuroRehabilitation | 2025-11-04 |
| 306 | Effect of Music Therapy on Blood Pressure and Quality of Life Among Individuals with Essential Hypertension: A Systematic Review and Meta-analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To investigate the effects of music therapy on blood pressure levels, negative emotions, and quality of life in patients with essential hypertension, a systematic review and meta-analysis based on Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines were performed in the present study. |
Zewen Li; Yi Zhang; | Well-Being Sciences Review | 2025-11-03 |
| 307 | Options for Dubbing English Movie Lyrics Into Arabic Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the Arabic dubs of English movie lyrics and explores the options available to the translator for rendering both the content and the form. |
Sundus Hassan; Ahmad S Haider; | An-Najah University Journal for Research – B (Humanities) | 2025-11-03 |
| 308 | MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: By integrating text, audio, and video, MAVL enables richer and more expressive translations than text-only approaches. Building on this, we propose Syllable-Constrained Audio-Video LLM with Chain-of-Thought (SylAVL-CoT), which leverages audio-video cues and enforces syllabic constraints to produce natural-sounding lyrics. |
Woohyun Cho; Youngmin Kim; Sunghyun Lee; Youngjae Yu; | emnlp | 2025-11-02 |
| 309 | WildScore: Benchmarking MLLMs In-the-Wild Symbolic Music Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To facilitate a comprehensive evaluation, we propose a systematic taxonomy, comprising both high-level and fine-grained musicological ontologies. |
GAGAN MUNDADA et. al. | emnlp | 2025-11-02 |
| 310 | ToneCraft: Cantonese Lyrics Generation with Harmony of Tones and Pitches Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current research has not yet addressed the challenge of generating lyrics that adhere to Cantonese harmony rules. To tackle this issue, we propose ToneCraft, a novel framework for generating Cantonese lyrics that ensures tonal and melodic harmony. |
Junyu Cheng; Chang Pan; Shuangyin Li; | emnlp | 2025-11-02 |
| 311 | Factual and Musical Evaluation Metrics for Music Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To measure the true performance of Music LMs, we propose (1) a better general-purpose evaluation metric for Music LMs adapted to the music domain and (2) a factual evaluation framework to quantify the correctness of a Music LM’s responses. |
Daniel Chenyu Lin; Michael Freeman; John Thickstun; | arxiv-cs.SD | 2025-11-02 |
| 312 | DeepResonance: Enhancing Multimodal Music Understanding Via Music-centric Multi-way Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the potential of incorporating additional modalities such as images, videos and textual music features to enhance music understanding remains unexplored. To bridge this gap, we propose DeepResonance, a multimodal music understanding LLM fine-tuned via multi-way instruction tuning with multi-way aligned music, text, image, and video data. |
Zhuoyuan Mao; Mengjie Zhao; Qiyu Wu; Hiromi Wakaki; Yuki Mitsufuji; | emnlp | 2025-11-02 |
| 313 | Exploring The Correlation Between The Type of Music and The Emotions Evoked: A Study Using Subjective Questionnaires and EEG Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The subject of this work is to check how different types of music affect human emotions. |
Jelizaveta Jankowska; Bożena Kostek; Fernando Alonso-Fernandez; Prayag Tiwari; | arxiv-cs.CV | 2025-10-30 |
| 314 | Whose Intelligence? Whose Music? Critical Reflections on AI in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This commentary critically examines the integration of artificial intelligence (AI) into music education through two guiding questions: Whose intelligence is encoded within these … |
Jincheng Ma; Qiang Wan; | Education as Change | 2025-10-30 |
| 315 | Artificial Intelligence and Healing Education: Bibliotherapy and Musicotherapy in Primary Schooling – An Innovative Theoretical Model of Bibliotherapy and Musicotherapy Questionnaire (BMQ) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study examines the use of the Donna AI Song Generator within healing education, aiming to identify optimal strategies for both teachers and pupils. |
Mirzana Pašić Kodrić; Merima Čaušević; | Online Journal of Music Sciences | 2025-10-29 |
| 316 | The IEEE-IS2 2025 Music Packet Loss Concealment Challenge Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We present the IEEE-IS2 2025 Music Packet Loss Concealment Challenge. Building on the foundations laid in the inaugural edition, this second installment of the challenge … |
A. Mezza; Alberto Bernardini; C. Rinaldi; | 2025 IEEE 6th International Symposium on the Internet of … | 2025-10-29 |
| 317 | Utilising Generative AI to Assist in The Creation and Production of Chinese Popular Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Objectives: This study introduces the GenAI Melody-LSTM algorithm. |
Xinhao Li; Hyuntai Kim; | Online Journal of Music Sciences | 2025-10-29 |
| 318 | Artificial Intelligence in Music: Applications, Challenges, and Future Prospects Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In particular, further research may focus on developing more transparent algorithms to improve user trust, exploring hybrid systems that integrate human creativity with machine intelligence, and establishing clearer frameworks for copyright and ownership of AI-generated works. By providing this structured overview, the review seeks to promote a deeper understanding of AI’s potential as a collaborative tool in reshaping the future of music. |
Yichen Zhu; | Communications in Humanities Research | 2025-10-28 |
| 319 | Music2Palette: Emotion-aligned Color Palette Generation Via Cross-Modal Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Others rely on indirect mappings through text or images, resulting in the loss of crucial emotion details. To address these challenges, we present Music2Palette, a novel method for emotion-aligned color palette generation via cross-modal representation learning. |
Jiayun Hu; Yueyi He; Tianyi Liang; Changbo Wang; Chenhui Li; | mm | 2025-10-27 |
| 320 | Temporal-Conditioned Symbolic Alignment for Controllable Text-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While existing models have made notable progress in sound quality, instrument identification, and stylistic alignment, they still exhibit clear limitations in modeling musical structure and musicality, particularly in terms of harmonic coherence and rhythmic alignment. To address these issues, we propose a Temporal-Conditioned Symbolic Alignment for Controllable Text-to-Music Generation (TCSA), which introduces explicit local condition controls to enhance structural fidelity in music generation. |
ZIHAO ZHANG et. al. | mm | 2025-10-27 |
| 321 | DUDA: A Two-stage Decoupling Unsupervised Domain Adaptation Framework for Semi-supervised Singing Melody Extraction from Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: There is a lack of a consistency regularization method that utilizes fine-grained information to validate the availability of unlabeled data. To address these issues, in this paper, we propose a novel two-stage decoupling unsupervised domain adaptation framework for semi-supervised singing melody extraction, termed as DUDA. |
Shuai Yu; Xiaoliang He; Kangjie Dong; Yi Yu; | mm | 2025-10-27 |
| 322 | Cross-Modal Metrics for Capturing Correspondences Between Music Audio and Stage Lighting Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This also applies to stage lighting and its correspondence with the music audio in live music performances. In this paper, we aim for measuring cross-modal correlations between music audio signals and stage lighting signals and formalize these correlations by introducing specific metrics. |
Michael Kohl; Tobias Wursthorn; Christof Weiß; | mm | 2025-10-27 |
| 323 | Wearable Music2Emotion : Assessing Emotions Induced By AI-Generated Music Through Portable EEG-fNIRS Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (3) Portability Limitation : Cumbersome setups (e.g., 64+ channel gel-based EEG caps) hinder real-world applicability due to procedural complexity and portability barriers. To address these limitations, we propose MEEtBrain, a portable and multimodal framework for emotion analysis (valence/arousal), integrating AI-generated music stimuli with synchronized EEG-fNIRS acquisition via a wireless headband. |
SHA ZHAO et. al. | mm | 2025-10-27 |
| 324 | MusFlow: Multimodal Music Generation Via Conditional Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite advancements in generating music from specific textual descriptions (e.g., style, genre, instruments), the practical application is still hindered by ordinary users’ limited expertise to write accurate prompts. To bridge this application gap, this paper introduces MusFlow, a novel multimodal music generation model using Conditional Flow Matching (CFM). |
Jiahao Song; Yu-Zhao Wang; | mm | 2025-10-27 |
| 325 | Granular Music Attribute Transformation with Proximal Policy Optimization Adapters for Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods still lack continuous strength regulation over stylistic attributes; specifically, they cannot achieve scalable adjustment of intensity (e.g., smooth transitions between “gentle” and “intense” jazz) while preserving spectral-temporal coherence. To address this, we propose RLScale-LoRA, a two-stage finetuning framework built on a structurally modified low-rank adaptation (LoRA) architecture with scale layers. |
Kunsheng Ma; Fan Qi; Changsheng Xu; | mm | 2025-10-27 |
| 326 | MuCodec: Ultra Low-Bitrate Music Codec for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This highlights the need for high-compression, high-fidelity music codecs that can reconstruct both vocals and accompaniment with high quality at low frame rates and bitrates, thereby better assisting music generation. To address this, we introduce MuCodec, designed for high-quality music reconstruction at ultra-low bitrates, facilitating more efficient music generation. |
YAOXUN XU et. al. | mm | 2025-10-27 |
| 327 | Context-aware Image-to-Music Generation Via Bridging Modalities Through Musical Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Therefore, if musical captions effectively convey the intended message of the image, they serve as an excellent intermediary between the images and music. The proposed method connects these two different modalities through the medium of musical captions that describe the specialized content of the music. |
Shilin Liu; Kyohei Kamikawa; Keisuke Maeda; Takahiro Ogawa; Miki Haseyama; | mm | 2025-10-27 |
| 328 | MelodyEdit: Zero-shot Music Editing with Disentangled Inversion Control Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, existing editing methods struggle with conducting complex non-rigid music edits while maintaining content integrity and high fidelity. To address these challenges, we propose MelodyEdit, a novel zero-shot music editing system based on innovative Disentangled Inversion Control (DIC) technique, which comprises Harmonized Attention Control and Disentangled Inversion. |
HUADAI LIU et. al. | mm | 2025-10-27 |
| 329 | EgoMusic: An Egocentric Augmented Reality Glasses Dataset for Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces EgoMusic, a multimodal dataset featuring synchronised egocentric audio-visual data captured with AR glasses during live performances, alongside studio-quality audio references. |
ALESSANDRO RAGANO et. al. | mm | 2025-10-27 |
| 330 | Controllable Video-to-Music Generation with Multiple Time-Varying Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing V2M methods relying solely on visual features or supplementary textual inputs generate music in a black-box manner, often failing to meet user expectations. To address this challenge, we propose a novel multi-condition guided V2M generation framework that incorporates multiple time-varying conditions for enhanced control over music generation. |
JUNXIAN WU et. al. | mm | 2025-10-27 |
| 331 | FG-Midiformer: A Symbolic Music Understanding Model Towards Fine-Grained Learning of Multi-Attributes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For instance, recurring melodies often appear in musical pieces with subtle variations, which requires methods that are aware of both local musical details and overall repetitive patterns, yet current methods can not fit the request. To address this issue, we introduce FG-Midiformer, a modified transformer framework that incorporates a multi-scale-aware feature learning (MSAFL) module and a local feature enhanced classification (LFEC) module for fine-grained understanding of multi-attributes. |
Haonan Cheng; Junwei Zhang; Hengyan Huang; Long Ye; | mm | 2025-10-27 |
| 332 | Spatial-Temporal Decomposition and Alignment in Controllable Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we delve into the spatial-temporal decomposition and alignment in controllable video-to-music generation. |
WEITAO YOU et. al. | mm | 2025-10-27 |
| 333 | VietLyrics: A Large-Scale Dataset and Models for Vietnamese Automatic Lyrics Transcription Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We publicly release VietLyrics and our models, aiming to advance Vietnamese music computing research while demonstrating the potential of this approach for ALT in low-resource language and music. |
Quoc Anh Nguyen; Bernard Cheng; Kelvin Soh; | arxiv-cs.AI | 2025-10-25 |
| 334 | StylePitcher: Generating Style-Following and Expressive Pitch Curves for Versatile Singing Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Existing pitch curve generators face two main challenges: they often neglect singer-specific expressiveness, reducing their ability to capture individual singing styles. And they … |
JINGYUE HUANG et. al. | arxiv-cs.SD | 2025-10-24 |
| 335 | Effects of Music Intervention in Patients with Burn Injury: A Systematic Review and Meta-analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Hao Wang; Riti Qiu; Hua Huang; Bin Chen; | Medicine | 2025-10-24 |
| 336 | Streaming Generation for Music Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a model design considering inevitable system delays in practical deployment with two design variables: future visibility $t_f$, the offset between the output playback time and the latest input time used for conditioning, and output chunk duration $k$, the number of frames emitted per call. |
YUSONG WU et. al. | arxiv-cs.SD | 2025-10-24 |
| 337 | A Backpropagation Neural Network Model with Adaptive Feature Extraction for Music Emotion Recognition in Online Music Appreciation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the relationship between students’ emotions and music categories in online music appreciation courses. |
Yang Chen; Chang Gao; Sahin Akdag; | PeerJ Computer Science | 2025-10-23 |
| 338 | XRMusic4VIP: Enabling Simultaneous Sheet Music Reading and Playing for Visually Impaired Musicians Through Extended Reality Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The use of assistive technologies for visually impaired musicians has been well documented, but the ability to visually read music and play simultaneously is still largely … |
Julia K. Anken; Delia Blaess; Karin Müller; | Proceedings of the 27th International ACM SIGACCESS … | 2025-10-22 |
| 339 | Evaluation of Music Generated By Artificial Intelligence from A Compositional Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study compares AI-generated music and human composition in terms of aesthetic value, originality, and coherence. |
Selin Oyan Küpeli; | ARTS: Artuklu Sanat ve Beşeri Bilimler Dergisi | 2025-10-22 |
| 340 | Sonic Agency: A Group Autoethnography of Technology-mediated Performance Practice By Deaf and Hard of Hearing Musicians Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While many assistive devices and accessible technologies aim to increase access and participation of d/Deaf and Hard of Hearing (DHH) audiences, DHH people’s active participation … |
D. Cavdir; Dillion Simone; Myles de Bastion; Shawn Trail; Nate Hergert; | Proceedings of the 27th International ACM SIGACCESS … | 2025-10-22 |
| 341 | Access Beyond The Score: Understanding Notation Needs and Workflows of Low Vision Musicians Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While music performance can be an engaging hobby or career path for blind and low vision people, music notation is typically shared in print or PDF formats that are rarely … |
W. Payne; Yu Lee An; | Proceedings of the 27th International ACM SIGACCESS … | 2025-10-22 |
| 342 | Emotion Recognition in Javanese Music: A Comparative Study of Classifier Models with A Human-Annotated Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the effectiveness of three well-established machine learning models, 1D Convolutional Neural Networks (1D-CNNs), Support Vector Machines (SVMs), and XGBoost, in classifying emotions in Javanese music using a manually annotated dataset. |
Moh Erwin Septianto; Ariana Tulus Purnomo; Ding Bing Lin; Chang Soo Kim; | Indonesian Journal of Computing, Engineering, and Design … | 2025-10-22 |
| 343 | Tug-of-War of Emotion: Measuring and Modeling Sentiment Cycles in Chinese-Language Pop Song Lyrics, 1967-2023 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For example, the detected monotone downward trend in the sentiment of English-language pop lyrics is typically interpreted as “reflecting” the deteriorating emotional and mental state in listener populations and/or the increasing demand for more negative (or less positive) lyric sentiment. This study challenges this “mirror interpretation” with an alternative “equilibration interpretation,” which posits that the average listener sentiment preference may remain largely stable across decades, and it is the equilibrating process that either brings the sentiment of pop lyrics closer to the listener preference or makes the lyric sentiment oscillate around the listener preference. |
Xiaolu Wang; | Journal of Cultural Analytics | 2025-10-21 |
| 344 | SegTune: Structured and Fine-Grained Control for Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SegTune, a non-autoregressive framework for structured and controllable song generation. |
PENGFEI CAI et. al. | arxiv-cs.SD | 2025-10-21 |
| 345 | MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Multimodal DuetDance (MDD), a diverse multimodal benchmark dataset designed for text-controlled and music-conditioned 3D duet dance motion generation. |
Prerit Gupta; Jason Alexander Fotso-Puepi; Zhengyuan Li; Jay Mehta; Aniket Bera; | iccv | 2025-10-20 |
| 346 | RapVerse: Coherent Vocals and Whole-Body Motion Generation from Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a challenging task for simultaneously generating 3D holistic body motions and singing vocals directly from textual lyrics inputs, advancing beyond existing works that typically address these two modalities in isolation. |
JIABEN CHEN et. al. | iccv | 2025-10-20 |
| 347 | Music Grounding By Short Video Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to bridge the gap between the practical need for music moment localization and V2MR, we propose a new task termed Music Grounding by Short Video (MGSV). |
ZIJIE XIN et. al. | iccv | 2025-10-20 |
| 348 | Music-Aligned Holistic 3D Dance Generation Via Hierarchical Motion Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these challenges, we introduce SoulDance, a high-precision music-dance paired dataset captured via professional motion capture systems, featuring meticulously annotated holistic dance movements. Building on this dataset, we propose SoulNet, a framework designed to generate music-aligned, kinematically coordinated holistic dance sequences. |
XIAOJIE LI et. al. | iccv | 2025-10-20 |
| 349 | Not All Deepfakes Are Created Equal: Triaging Audio Forgeries for Robust Deepfake Singer Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on the premise that the most harmful deepfakes are those of the highest quality, we introduce a two-stage pipeline to identify a singer’s vocal likeness. |
Davide Salvi; Hendrik Vincent Koops; Elio Quinton; | arxiv-cs.SD | 2025-10-20 |
| 350 | MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Discrete representation learning has shown promising results across various domains, including generation and understanding in image, speech and language. Inspired by these advances, we propose MuseTok, a tokenization method for symbolic music, and investigate its effectiveness in both music generation and understanding tasks. |
JINGYUE HUANG et. al. | arxiv-cs.SD | 2025-10-17 |
| 351 | Application of Time GAN-LSTM Algorithm in Constructing Music Aesthetic Classification Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes an algorithm that fuses temporal generative adversarial network (Time GAN) and long short-term memory network (LSTM), which is applied to construct a music aesthetic classification model in order to more accurately identify and classify music works. |
Feifei Li; Changyue Yu; | Journal of Computational Methods in Sciences and Engineering | 2025-10-17 |
| 352 | The Virtual Concert-goer: Audience Perspectives on Remote Music Performances Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our work explores how audiences perceive and engage with remote music events. |
SOPHIA PPALI et. al. | Proceedings of the ACM on Human-Computer Interaction | 2025-10-16 |
| 353 | MotionBeat: Motion-Aligned Music Representation Via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MotionBeat, a framework for motion-aligned music representation learning. MotionBeat is trained with two newly proposed objectives: the Embodied Contrastive Loss (ECL), an enhanced InfoNCE formulation with tempo-aware and beat-jitter negatives to achieve fine-grained rhythmic discrimination, and the Structural Rhythm Alignment Loss (SRAL), which ensures rhythm consistency by aligning music accents with corresponding motion events. |
Xuanchen Wang; Heng Wang; Weidong Cai; | arxiv-cs.SD | 2025-10-15 |
| 354 | Phrase-Oriented Generative Rhythmic Patterns for Jazz Solos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a novel generative approach for crafting phrase-oriented rhythmic patterns in jazz solos, leveraging statistical analyses of a comprehensive corpus, the Weimar Jazz Database. |
Adriano N. Raposo; Vasco N. G. J. Soares; | Applied Sciences | 2025-10-15 |
| 355 | Automatic Generation of Music Elements Based on Artificial Intelligence Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The designed method can effectively achieve music production, meet high precision design requirements, and achieve good design results. This indicates that the music element generation method based on recurrent gradient frequency proposed in the study has good performance. |
Sha Li; | Journal of Computational Methods in Sciences and Engineering | 2025-10-14 |
| 356 | Music Genre Classification with Modified Residual Learning and Dual Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes an approach to improve the music genre classification tasks with modified residual learning and hybrid convolutional neural networks. |
MOHSIN ASHRAF et. al. | PLOS One | 2025-10-14 |
| 357 | Uncovering Hidden Themes in Indie Music: Crisp-Dm Guided LDA Topic Modeling on A Kaggle-Based Lyric Generation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For future research, it is recommended to use larger datasets and more diverse interpretations and apply more machine learning models. |
Thoyyibah T; Yan Mitha Djaksana; | JURNAL TEKNIK INFORMATIKA | 2025-10-13 |
| 358 | Enhanced Television Broadcast Monitoring with Source Separation-assisted Audio Fingerprinting: A Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the first extensive study comprising 13 source separation algorithms and five AFP models. |
Guillem Cortès-Sebastià; Marius Miron; Emilio Molina; Alex Ciurana; Xavier Serra; | Multimedia Tools and Applications | 2025-10-13 |
| 359 | Exploring Singing Breath: Physiological Insights and Directions for Breath-Aware Augmentation in Mixed Reality Design Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper investigates singing breath, the distinctive respiratory pattern that occurs while singing a melody, as a structured and expressive form of breathing that can support … |
Kanyu Chen; Zhuang Chang; Qianyuan Zou; Kai Kunze; | Companion of the 2025 ACM International Joint Conference on … | 2025-10-12 |
| 360 | The Human Record Needle: A Novel Interface for Embodied Music Interaction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Many people listen to music as a background activity during various tasks, such as driving, walking their dog, or showering. The availability of streaming services has made music … |
Brandon Waylan Ables; | Proceedings of the 27th International Conference on … | 2025-10-12 |
| 361 | MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the critical role of spatial audio in immersive technologies such as VR/AR, most existing multimodal datasets provide only monaural audio, which limits the development of spatial audio generation and understanding. To address these challenges, we introduce MRSAudio, a large-scale multimodal spatial audio dataset designed to advance research in spatial audio understanding and generation. |
WENXIANG GUO et. al. | arxiv-cs.SD | 2025-10-11 |
| 362 | DiTSinger: Scaling Singing Voice Synthesis with Diffusion Transformer and Implicit Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a two-stage pipeline: a compact seed set of human-sung recordings is constructed by pairing fixed melodies with diverse LLM-generated lyrics, and melody-specific models are trained to synthesize over 500 hours of high-quality Chinese singing data. |
ZONGCAI DU et. al. | arxiv-cs.SD | 2025-10-10 |
| 363 | Leveraging Whisper Embeddings for Audio-based Lyrics Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce WEALY, a fully reproducible pipeline that leverages Whisper decoder embeddings for lyrics matching tasks. |
Eleonora Mancini; Joan Serrà; Paolo Torroni; Yuki Mitsufuji; | arxiv-cs.SD | 2025-10-09 |
| 364 | Content-based Music Recommender System Based on Music Emotion Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Mohammad Ali Talaghat; Elham Parvinnia; Reza Boostani; | Iran Journal of Computer Science | 2025-10-09 |
| 365 | Writing The Future: The Impact of Artificial Intelligence and Knowledge Graphs on The Music Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The objective of the study is to conceptualize and justify a hybrid model that unites the deterministic precision of knowledge graph structures with the timbral and textural richness of neural-network generative approaches. |
Popova Anastasiia; | Universal Library of Arts and Humanities | 2025-10-09 |
| 366 | LARA-Gen: Enabling Continuous Emotion Control for Music Generation Models Via Latent Affective Representation Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce LARA-Gen, a framework for continuous emotion control that aligns the internal hidden states with an external music understanding model through Latent Affective Representation Alignment (LARA), enabling effective training. |
JIAHAO MEI et. al. | arxiv-cs.SD | 2025-10-07 |
| 367 | Segment-Factorized Full-Song Generation on Symbolic Piano Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the Segmented Full-Song Model (SFS) for symbolic full-song generation. |
Ping-Yi Chen; Chih-Pin Tan; Yi-Hsuan Yang; | arxiv-cs.SD | 2025-10-07 |
| 368 | Transcribing Rhythmic Patterns of The Guitar Track in Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To transcribe the strums and their corresponding rhythmic patterns, we propose a three-step framework. |
Aleksandr Lukoianov; Anssi Klapuri; | arxiv-cs.SD | 2025-10-07 |
| 369 | Language Models for Longitudinal Analysis of Abusive Content in Billboard Music Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we utilise deep learning methods to analyse songs (lyrics) from Billboard Charts of the United States in the last seven decades. We provide a longitudinal study using deep learning and language models and review the evolution of content using sentiment analysis and abuse detection, including sexually explicit content. |
Rohitash Chandra; Yathin Suresh; Divyansh Raj Sinha; Sanchit Jindal; | arxiv-cs.CL | 2025-10-05 |
| 370 | Large Model‐Enhanced CNN–Transformer Architecture for Adaptive Music Quality Classification in Wireless Communication Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This letter presents an enhanced convolutional neural network (CNN)–transformer architecture integrated with large model (LM) capabilities for adaptive music quality classification in wireless communication networks (WCNs). |
Tianyu Chen; | Internet Technology Letters | 2025-10-04 |
| 371 | Detecting Notational Errors in Digital Music Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, data quality is a major issue when dealing with musical information extraction and retrieval. We present an automated approach to detect notational errors, aiming at precisely localizing defects in scores. |
Géré Léo; Nicolas Audebert; Florent Jacquemard; | arxiv-cs.MM | 2025-10-03 |
| 372 | Go WitheFlow: Real-time Emotion Driven Audio Effects Modulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the witheFlow system, designed to enhance real-time music performance by automatically modulating audio effects based on features extracted from both biosignals and the audio itself. |
Edmund Dervakos; Spyridon Kantarelis; Vassilis Lyberatos; Jason Liartis; Giorgos Stamou; | arxiv-cs.SD | 2025-10-02 |
| 373 | SingMOS-Pro: An Comprehensive Benchmark for Singing Quality Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce SingMOS-Pro, a dataset for automatic singing quality assessment. |
YUXUN TANG et. al. | arxiv-cs.SD | 2025-10-02 |
| 374 | Bias Beyond Borders: Global Inequalities in AI-Generated Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these challenges, we introduce GlobalDISCO, a large-scale dataset consisting of 73k music tracks generated by state-of-the-art commercial generative music models, along with paired links to 93k reference tracks in LAION-DISCO-12M. |
Ahmet Solak; Florian Grötschla; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.SD | 2025-10-02 |
| 375 | The Impact of Listening to Music on Stress Level for Anxiety, Depression, and PTSD: Mixed-Effect Models and Propensity Score Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The intersection of music and mental health has gained increasing attention, with previous studies highlighting music’s potential to reduce stress and anxiety. Despite these … |
M. ABDALLA et. al. | IEEE Transactions on Computational Social Systems | 2025-10-01 |
| 376 | Do Listeners Devalue AI-generated Pop Music? Exploring Negative Biases in Listeners’ Responses to AI-labelled Vs Human-labelled Pop Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
S. Chia; A. Hartanto; Eddie M. W. Tong; | Comput. Hum. Behav. Artif. Humans | 2025-10-01 |
| 377 | VMM: Video-Music Mamba for Generating Background Music from Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jiajun Xu; Zixiang Lu; Ping Gao; Qiguang Miao; Kun Xie; | Comput. Vis. Image Underst. | 2025-10-01 |
| 378 | Feeling The Music: Exploring Emotional Effects of Auditory-tactile Musical Experiences Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Tactile pairing with auditory stimulation has been shown to enhance various capabilities, including the intensity of the stimulus, its location, and its comprehensibility in … |
Naama Schwartz; Adi Snir; Amir Amedi; | Frontiers Virtual Real. | 2025-10-01 |
| 379 | Source Separation for A Cappella Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the task of multi-singer separation in a cappella music, where the number of active singers varies across mixtures. |
Luca A. Lanzendörfer; Constantin Pinkl; Florian Grötschla; | arxiv-cs.SD | 2025-09-30 |
| 380 | Data Melodification FM: Where Musical Rhetoric Meets Sonification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a design space for data melodification, where standard visualization idioms and fundamental data characteristics map to rhetorical devices of music for a more affective experience of data. |
Ke Er Amy Zhang; David Grellscheid; Laura Garrison; | arxiv-cs.HC | 2025-09-30 |
| 381 | SAGE-Music: Low-Latency Symbolic Music Generation Via Attribute-Specialized Key-Value Head Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our main contributions are (1) the first systematic study of BPE’s generalizability in multi-track symbolic music, and (2) the introduction of AS-KVHS for low-latency symbolic music generation. Beyond these, we also release SAGE-Music, an open-source benchmark that matches or surpasses state-of-the-art models in generation quality. |
JIAYE TAN et. al. | arxiv-cs.SD | 2025-09-30 |
| 382 | Beyond Genre: Diagnosing Bias in Music Embeddings Using Concept Activation Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we apply Concept Activation Vectors (CAVs) to investigate whether non-musical singer attributes – such as gender and language – influence genre representations in unintended ways. |
Roman B. Gebhardt; Arne Kuhle; Eylül Bektur; | arxiv-cs.SD | 2025-09-29 |
| 383 | The Shape of Surprise: Structured Uncertainty and Co-Creativity in AI Music Tools Highlight: This paper presents a thematic review of contemporary AI music systems, examining how designers incorporate randomness and uncertainty into creative practice. |
Eric Browne; | arxiv-cs.SD | 2025-09-29 |
| 384 | Discovering Words in Music: Unsupervised Learning of Compositional Sparse Code for Symbolic Music Highlight: This paper presents an unsupervised machine learning algorithm that identifies recurring patterns — referred to as “music-words” — from symbolic music data. |
TIANLE WANG et al. | arxiv-cs.SD | 2025-09-29 |
| 385 | Echoes of Humanity: Exploring The Perceived Humanness of AI Music Highlight: We present results from a listener-focused experiment aimed at understanding how humans perceive AIM. |
FLAVIO FIGUEIREDO et al. | arxiv-cs.AI | 2025-09-29 |
| 386 | ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following Highlight: While symbolic music has been widely used in generation tasks, LLM capabilities in understanding and reasoning about symbolic music remain largely underexplored. To address this gap, we propose ABC-Eval, the first open-source benchmark dedicated to the understanding and instruction-following capabilities in text-based ABC notation scores. |
Jiahao Zhao; Yunjia Li; Wei Li; Kazuyoshi Yoshii; | arxiv-cs.SD | 2025-09-27 |
| 387 | Global Beats, Local Tongue: Studying Code Switching in K-pop Hits on Billboard Charts Abstract: Code switching, particularly between Korean and English, has become a defining feature of modern K-pop, reflecting both aesthetic choices and global market strategies. This paper is … |
Aditya Narayan Sankaran; Reza Farahbakhsh; Noel Crespi; | arxiv-cs.CL | 2025-09-27 |
| 388 | Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach Highlight: Even methods based on emotion mapping face controversy, as emotion represents only a singular aspect of art. Additionally, most learning-based methods require substantial computational resources and large datasets for training, hindering accessibility for common users. To address these challenges, we propose the first Vision Language Model (VLM)-based I2M framework that offers high interpretability and low computational cost. |
Zijian Zhao; Dian Jin; Zijing Zhou; | arxiv-cs.SD | 2025-09-26 |
| 389 | A Multimodal Dataset of Greek Folk Music Abstract: This paper presents a multimodal dataset of Greek folk dance music, focusing on syrtos and balos. Developed to support research in computational musicology, the dataset improves … |
Anna-Maria Christodoulou; Olivier Lartillot; | Proceedings of the 12th International Conference on Digital … | 2025-09-25 |
| 390 | MusicWeaver: Coherent Long-Range and Editable Music Generation from A Beat-Aligned Structural Plan Highlight: We present MusicWeaver, a music generation model conditioned on a beat-aligned structural plan. |
Xuanchen Wang; Heng Wang; Weidong Cai; | arxiv-cs.SD | 2025-09-25 |
| 391 | SiNGER: A Clearer Voice Distills Vision Transformers Further Highlight: Prior work attempted to remove artifacts but encountered an inherent trade-off between artifact suppression and preserving informative signals from teachers. To address this, we introduce Singular Nullspace-Guided Energy Reallocation (SiNGER), a novel distillation framework that suppresses artifacts while preserving informative signals. |
Geunhyeok Yu; Sunjae Jeong; Yoonyoung Choi; Jaeseung Kim; Hyoseok Hwang; | arxiv-cs.CV | 2025-09-25 |
| 392 | Smashcima: Full-Page Handwritten Music Document Synthesizer Abstract: Despite massive progress made in Optical Music Recognition (OMR) with deep learning, data scarcity remains an issue, especially for manuscripts. Synthetic data has been shown to … |
Jiří Mayer; Pavel Pecina; Jan Hajič; | Proceedings of the 12th International Conference on Digital … | 2025-09-25 |
| 393 | Muse-it: A Tool for Analyzing Music Discourse on Reddit Highlight: We present Muse-it, a platform that retrieves comprehensive Reddit data centered on user-defined queries. |
Jatin Agarwala; George Paul; Nemani Harsha Vardhan; Vinoo Alluri; | arxiv-cs.IR | 2025-09-24 |
| 394 | CoMelSinger: Discrete Token-Based Zero-Shot Singing Synthesis With Structured Melody Control and Guidance Highlight: We present CoMelSinger, a zero-shot SVS framework that enables structured and disentangled melody control within a discrete codec modeling paradigm. |
Junchuan Zhao; Wei Zeng; Tianle Lyu; Ye Wang; | arxiv-cs.SD | 2025-09-24 |
| 395 | SINGER: An Onboard Generalist Vision-Language Navigation Policy for Drones Highlight: In this work, we present SINGER for language-guided autonomous drone navigation in the open world using only onboard sensing and compute. |
Maximilian Adang; JunEn Low; Ola Shorinwa; Mac Schwager; | arxiv-cs.RO | 2025-09-22 |
| 396 | Dorabella Cipher As Musical Inspiration Highlight: We weigh the evidence for and against the hypothesis, devise a simplified music notation, and attempt to reconstruct a melody from the cipher. |
Bradley Hauer; Colin Choi; Abram Hindle; Scott Smallwood; Grzegorz Kondrak; | arxiv-cs.CL | 2025-09-22 |
| 397 | RISE: Adaptive Music Playback for Realtime Intensity Synchronization with Exercise Highlight: We propose a system to adapt a user’s music to their exercise by aligning high-energy music segments with intense intervals of the workout. |
Alexander Wang; Chris Donahue; Dhruv Jain; | arxiv-cs.SD | 2025-09-21 |
| 398 | Etude: Piano Cover Generation with A Three-Stage Approach — Extract, StrucTUralize, and DEcode Highlight: In this paper, we introduce Etude, a three-stage architecture consisting of Extract, strucTUralize, and DEcode stages. |
Tse-Yang Che; Yuh-Jzer Joung; | arxiv-cs.SD | 2025-09-20 |
| 399 | The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion To Singing Style Conversion Highlight: We present the findings of the latest iteration of the Singing Voice Conversion Challenge, a scientific event aiming to compare and understand different voice conversion systems in a controlled environment. |
LESTER PHILLIP VIOLETA et al. | arxiv-cs.SD | 2025-09-19 |
| 400 | Jamendo-QA: A Large-Scale Music Question Answering Dataset Highlight: We introduce Jamendo-QA, a large-scale dataset for Music Question Answering (Music-QA). |
Junyoung Koh; Soo Yong Kim; Yongwon Choi; Gyu Hyeong Choi; | arxiv-cs.MM | 2025-09-19 |
| 401 | Music4All A+A: A Multimodal Dataset for Music Information Retrieval Tasks Highlight: However, most datasets for multimodal MIR neglect this aspect and provide data at the level of individual music tracks. We aim to fill this gap by providing Music4All Artist and Album (Music4All A+A), a dataset for multimodal MIR tasks based on music artists and albums. |
Jonas Geiger; Marta Moscati; Shah Nawaz; Markus Schedl; | arxiv-cs.MM | 2025-09-18 |
| 402 | RETRACTED: Research on Intelligent Music Personalized Recommendation Algorithm Based on MLP-Mixer Efficient Feature Extraction Abstract: With the rapid development of the digital music market, users’ demand for personalized music recommendations is increasing daily. Traditional recommendation algorithms often face … |
Ying Li; Hao Chen; | J. Comput. Methods Sci. Eng. | 2025-09-17 |
| 403 | AnyAccomp: Generalizable Accompaniment Generation Via Quantized Melodic Bottleneck Highlight: This creates a critical train-test mismatch, leading to failure on clean, real-world vocal inputs. We introduce AnyAccomp, a framework that resolves this by decoupling accompaniment generation from source-dependent artifacts. |
Junan Zhang; Yunjia Zhang; Xueyao Zhang; Zhizheng Wu; | arxiv-cs.SD | 2025-09-17 |
| 404 | Osu2MIR: Beat Tracking Dataset Derived From Osu! Data Highlight: In this work, we explore the use of Osu! |
Ziyun Liu; Chris Donahue; | arxiv-cs.SD | 2025-09-16 |
| 405 | The Role of Fairness and Diversity in User Choices and Perceptions of Music Playlists Abstract: Nowadays, most listeners access music through streaming platforms, which has transformed how recommendations are delivered and received. Notable work has been done on improving … |
Karlijn Dinnissen; Shah Noor Khan; H. Hauptmann; Eelco Herder; Judith Masthoff; | Proceedings of the 36th ACM Conference on Hypertext and … | 2025-09-15 |
| 406 | Data-Driven Analysis of Text-Conditioned AI-Generated Music: A Case Study with Suno and Udio Highlight: Using a combination of state-of-the-art text embedding models, dimensionality reduction and clustering methods, we analyze the prompts, tags and lyrics, and automatically annotate and display the processed data in interactive plots. |
Luca Casini; Laura Cros Vila; David Dalmazzo; Anna-Kaisa Kaila; Bob L. T. Sturm; | arxiv-cs.IR | 2025-09-15 |
| 407 | Socially Aware Music Recommendation: A Multi-Modal Graph Neural Networks for Collaborative Music Consumption and Community-Based Engagement Highlight: This study presents a novel Multi-Modal Graph Neural Network (MM-GNN) framework for socially aware music recommendation, designed to enhance personalization and foster community-based engagement. |
Kajwan Ziaoddini; | arxiv-cs.IR | 2025-09-12 |
| 408 | An Adaptive CMSA for Solving The Longest Filled Common Subsequence Problem with An Application in Audio Querying Highlight: In this work, we introduce a new benchmark dataset with significantly larger instances and demonstrate that existing datasets lack the discriminative power needed to meaningfully assess algorithm performance at scale. |
Marko Djukanovic; Christian Blum; Aleksandar Kartelj; Ana Nikolikj; Guenther Raidl; | arxiv-cs.SD | 2025-09-12 |
| 409 | Segment Transformer: AI-Generated Music Detection Via Music Structural Analysis Highlight: Also, it can be difficult to determine whether a piece was generated by AI or composed by humans clearly. To address these challenges, we aim to improve the accuracy of AIGM detection by analyzing the structural patterns of music segments. |
Yumin Kim; Seonghyeon Go; | arxiv-cs.SD | 2025-09-10 |
| 410 | Real-world Music Plagiarism Detection With Music Segment Transcription System Highlight: In this work, we propose a system for detecting music plagiarism by combining various MIR technologies. |
Seonghyeon Go; | arxiv-cs.AI | 2025-09-10 |
| 411 | No Encore: Unlearning As Opt-Out in Music Generation Highlight: In this paper, we present preliminary results on the first application of machine unlearning techniques from an ongoing research to prevent inadvertent usage of creative content. |
Jinju Kim; Taehan Kim; Abdul Waheed; Jong Hwan; Rita Singh; | arxiv-cs.CL | 2025-09-07 |
| 412 | From Joy to Fear: A Benchmark of Emotion Estimation in Pop Song Lyrics Highlight: A manually labeled dataset is constructed using a mean opinion score (MOS) approach, which aggregates annotations from multiple human raters to ensure reliable ground-truth labels. Leveraging this dataset, we conduct a comprehensive evaluation of several publicly available large language models (LLMs) under zero-shot scenarios. |
Shay Dahary; Avi Edana; Alexander Apartsin; Yehudit Aperstein; | arxiv-cs.CL | 2025-09-06 |
| 413 | Training A Perceptual Model for Evaluating Auditory Similarity in Music Adversarial Attack Highlight: Music Information Retrieval (MIR) systems are highly vulnerable to adversarial attacks that are often imperceptible to humans, primarily due to a misalignment between model feature spaces and human auditory perception. Existing defenses and perceptual metrics frequently fail to adequately capture these auditory nuances, a limitation supported by our initial listening tests showing low correlation between common metrics and human judgments. To bridge this gap, we introduce Perceptually-Aligned MERT Transformer (PAMT), a novel framework for learning robust, perceptually-aligned music representations. |
Yuxuan Liu; Rui Sang; Peihong Zhang; Zhixin Li; Shengchen Li; | arxiv-cs.SD | 2025-09-05 |
| 414 | Learning and Composing of Classical Music Using Restricted Boltzmann Machines Highlight: In this study, we adopted J. S. Bach’s music for training of a restricted Boltzmann machine (RBM). |
Mutsumi Kobayashi; Hiroshi Watanabe; | arxiv-cs.SD | 2025-09-05 |
| 415 | Towards An AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning Highlight: This approach allows us to introduce a data synthesis framework that generates verifiable sheet music questions in both textual and visual modalities, leading to the Synthetic Sheet Music Reasoning Benchmark (SSMR-Bench) and a complementary training set. |
ZHILIN WANG et al. | arxiv-cs.CL | 2025-09-04 |
| 416 | PianoBind: A Multimodal Joint Embedding Model for Pop-piano Music Highlight: To address these limitations, we propose PianoBind, a piano-specific multimodal joint embedding model. |
Hayeon Bang; Eunjin Choi; Seungheon Doh; Juhan Nam; | arxiv-cs.SD | 2025-09-04 |
| 417 | HarmonyTok: Comparing Methods for Harmony Tokenization for Machine Learning Abstract: This paper explores different approaches to harmony tokenization in symbolic music for transformer-based models, focusing on two tasks: masked language modeling (MLM) and melodic … |
MAXIMOS A. KALIAKATSOS-PAPAKOSTAS et al. | Inf. | 2025-09-01 |
| 418 | CoComposer: LLM Multi-agent Collaborative Music Composition Highlight: We introduce CoComposer, a multi-agent system that consists of five collaborating agents, each with a task based on the traditional music composition workflow. |
Peiwen Xing; Aske Plaat; Niki van Stein; | arxiv-cs.SD | 2025-08-29 |
| 419 | Amadeus: Autoregressive Model with Bidirectional Attribute Modelling for Symbolic Music Highlight: Based on this insight, we introduce Amadeus, a novel symbolic music generation framework. Amadeus adopts a two-level architecture: an autoregressive model for note sequences and a bidirectional discrete diffusion model for attributes. |
Hongju Su; Ke Li; Lan Yang; Honggang Zhang; Yi-Zhe Song; | arxiv-cs.SD | 2025-08-28 |
| 420 | CompLex: Music Theory Lexicon Constructed By Autonomous Agents for Automatic Music Generation Highlight: We introduce a novel automatic music lexicon construction model that generates a lexicon, named CompLex, comprising 37,432 items derived from just 9 manually input category keywords and 5 sentence prompt templates. |
Zhejing Hu; Yan Liu; Gong Chen; Bruce X. B. Yu; | arxiv-cs.SD | 2025-08-27 |
| 421 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Highlight: To compile MQAD, our methodology leverages specialized Music Information Retrieval (MIR) models to extract higher-level musical features and Large Language Models (LLMs) to generate natural language QA pairs. |
ZHIHAO OUYANG et al. | arxiv-cs.SD | 2025-08-26 |
| 422 | A Survey on Evaluation Metrics for Music Generation Highlight: In this paper, we shed light on this research gap, introducing a detailed taxonomy for evaluation metrics for both audio and symbolic music representations. |
Faria Binte Kader; Santu Karmaker; | arxiv-cs.SD | 2025-08-24 |
| 423 | Vevo2: A Unified and Controllable Framework for Speech and Singing Voice Generation Abstract: Controllable human voice generation, particularly for expressive domains like singing, remains a significant challenge. This paper introduces Vevo2, a unified framework for … |
XUEYAO ZHANG et al. | IEEE Transactions on Audio, Speech and Language Processing | 2025-08-22 |
| 424 | From Sound to Sight: Towards AI-authored Music Videos Highlight: Conventional music visualisation systems rely on handcrafted ad hoc transformations of shapes and colours that offer only limited expressiveness. We propose two novel pipelines for automatically generating music videos from any user-specified, vocal or instrumental song using off-the-shelf deep learning models. |
LEO VITASOVIC et al. | arxiv-cs.SD | 2025-08-20 |
| 425 | Exploring The Feasibility of LLMs for Automated Music Emotion Annotation Highlight: In this study, we annotated GiantMIDI-Piano, a classical MIDI piano music dataset, in a four-quadrant valence-arousal framework using GPT-4o, and compared against annotations provided by three human experts. |
Meng Yang; Jon McCormack; Maria Teresa Llano; Wanchao Su; | arxiv-cs.SD | 2025-08-18 |
| 426 | FFD: Fine-Finger Diffusion Model for Music to Fine-grained Finger Dance Generation Abstract: Finger dance is an emerging social media trend using finger gesture motions for expression. Music to finger dance generation is challenging due to its fine-grained movements. … |
Boyan Dong; Wen-Ling Lei; Li Liu; | Interspeech 2025 | 2025-08-17 |
| 427 | Towards A Practical Tool for Music Composition: Using Constraint Programming to Model Chord Progressions and Modulations Highlight: The Harmoniser project aims to provide a practical tool to aid music composers in creating complete musical works. |
Damien Sprockeels; Peter Van Roy; | ijcai | 2025-08-16 |
| 428 | GETMusic: Generating Music Tracks with A Unified Representation and Diffusion Framework Highlight: In this paper, we introduce a framework known as GETMusic, with “GET” standing for “GEnerate music Tracks.” |
ANG LV et al. | ijcai | 2025-08-16 |
| 429 | Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora Highlight: In order to analyze music compositions holistically and at multiple granularities, we propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG). |
ILANA SHAPIRO et al. | ijcai | 2025-08-16 |
| 430 | NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms IF:3 Highlight: We introduce NotaGen, a symbolic music generation model aiming to explore the potential of producing high-quality classical sheet music. |
YASHAN WANG et al. | ijcai | 2025-08-16 |
| 431 | LivePoem: Improving The Learning Experience of Classical Chinese Poetry with AI-Generated Musical Storyboards Highlight: This paper aims to improve the experience of classical Chinese poetry learning by introducing LivePoem—a system that generates musical storyboards (storyboards with background music) as audiovisual aids to support poetry comprehension. |
Qihao Liang; Xichu Ma; Torin Hopkins; Ye Wang; | ijcai | 2025-08-16 |
| 432 | Motive-level Analysis of Form-functions Association in Korean Folk Song Highlight: We propose a method for automatic motive segmentation in Korean folk songs by fine-tuning a speech transcription model on audio lyric with motif boundary annotation. |
Danbinaerin Han; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2025-08-14 |
| 433 | BeatFM: Improving Beat Tracking with Pre-trained Music Foundation Model Highlight: Beat tracking is a widely researched topic in music information retrieval. However, current beat tracking methods face challenges due to the scarcity of labeled data, which limits their ability to generalize across diverse musical styles and accurately capture complex rhythmic structures. To overcome these challenges, we propose a novel beat tracking paradigm BeatFM, which introduces a pre-trained music foundation model and leverages its rich semantic knowledge to improve beat tracking performance. |
GANGHUI RU et al. | arxiv-cs.SD | 2025-08-13 |
| 434 | Opening Musical Creativity? Embedded Ideologies in Generative-AI Music Systems Highlight: Our aim is to investigate ideologies that are driving the early-stage development and adoption of generative-AI in music making, with a particular focus on democratization. |
Liam Pram; Fabio Morreale; | arxiv-cs.SD | 2025-08-12 |
| 435 | Music and Artificial Intelligence: Artistic Trends Highlight: We study how musicians use artificial intelligence (AI) across formats like singles, albums, performances, installations, voices, ballets, operas, or soundtracks. |
JORDI PONS et al. | arxiv-cs.CY | 2025-08-12 |
| 436 | DAFMSVC: One-Shot Singing Voice Conversion with Dual Attention Mechanism and Flow Matching Highlight: To address these challenges, we propose DAFMSVC, where the self-supervised learning (SSL) features from the source audio are replaced with the most similar SSL features from the target audio to prevent timbre leakage. |
WEI CHEN et al. | arxiv-cs.SD | 2025-08-07 |
| 437 | Live Music Models Highlight: We introduce a new class of generative models for music called live music models that produce a continuous stream of music in real-time with synchronized user control. |
LYRIA TEAM et al. | arxiv-cs.SD | 2025-08-06 |
| 438 | Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation Highlight: Current supervised fine-tuning (SFT) approaches, limited by passive label-fitting, exhibit constrained self-improvement and poor hallucination mitigation. To address this core challenge, we propose a novel reinforcement learning (RL) framework leveraging preference optimization for hallucination control. |
HUAICHENG ZHANG et al. | arxiv-cs.SD | 2025-08-06 |
| 439 | Wearable Music2Emotion: Assessing Emotions Induced By AI-Generated Music Through Portable EEG-fNIRS Fusion Highlight: To address these limitations, we propose MEEtBrain, a portable and multimodal framework for emotion analysis (valence/arousal), integrating AI-generated music stimuli with synchronized EEG-fNIRS acquisition via a wireless headband. |
SHA ZHAO et al. | arxiv-cs.SD | 2025-08-05 |
| 440 | Enhancing Typhlo Music Therapy with Personalized Action Rules: A Data-Driven Approach Abstract: In the context of typhlo music therapy, personalized interventions can significantly enhance the therapeutic experience for visually impaired children. Leveraging a data-driven … |
Aileen Benedict; Zbigniew W. Ras; Pawel Cylulko; Joanna Gładyszewska-Cylulko; | Inf. | 2025-08-04 |
| 441 | Via Score to Performance: Efficient Human-Controllable Long Song Generation with Bar-Level Symbolic Notation Highlight: Song generation is regarded as the most challenging problem in music AIGC; nonetheless, existing approaches have yet to fully overcome four persistent limitations: controllability, generalizability, perceptual quality, and duration. We argue that these shortcomings stem primarily from the prevailing paradigm of attempting to learn music theory directly from raw audio, a task that remains prohibitively difficult for current models. |
Tongxi Wang; Yang Yu; Qing Wang; Junlang Qian; | arxiv-cs.SD | 2025-08-02 |
| 442 | ArzEn-MultiGenre: An Aligned Parallel Dataset of Egyptian Arabic Song Lyrics, Novels, and Subtitles, with English Translations Abstract: ArzEn-MultiGenre is a parallel dataset of Egyptian Arabic song lyrics, novels, and TV show subtitles that are manually translated and aligned with their English counterparts. The … |
Rania Al-Sabbagh; | arxiv-cs.CL | 2025-08-02 |
| 443 | Automatic Melody Reduction Via Shortest Path Finding Highlight: In this paper, we propose a novel and conceptually simple computational method for melody reduction using a graph-based representation inspired by principles from computational music theories, where the reduction process is formulated as finding the shortest path. |
Ziyu Wang; Yuxuan Wu; Roger B. Dannenberg; Gus Xia; | arxiv-cs.SD | 2025-08-02 |
| 444 | Uncovering Brain Network Modules in Major Depression During Naturalistic Music Perception With Block Term Decomposition Abstract: Major depressive disorder (MDD) has been linked to altered brain networks and might be relieved by music therapy. Yet, the neurophysiological basis, especially the functional … |
Yongjie Zhu; Yuxing Hao; Jia Liu; Fengyu Cong; | IEEE Transactions on Cognitive and Developmental Systems | 2025-08-01 |
| 445 | Advancing The Foundation Model for Music Understanding Highlight: In this work, we challenge this paradigm by introducing a unified foundation model named MuFun for holistic music understanding. |
YI JIANG et al. | arxiv-cs.SD | 2025-08-01 |
| 446 | A First Look at Generative Artificial Intelligence-Based Music Therapy for Mental Disorders IF:3 Abstract: Mental disorders show a rapid increase and cause considerable harm to individuals as well as the society in recent decade. Hence, mental disorders have become a serious public … |
LIN SHEN et al. | IEEE Transactions on Consumer Electronics | 2025-08-01 |
| 447 | MusiQAl: A Dataset for Music Question-Answering Through Audio-Video Fusion Abstract: Music question–answering (MQA) is a machine learning task where a computational system analyzes and answers questions about music‑related data. Traditional methods prioritize … |
Anna-Maria Christodoulou; K. Glette; Olivier Lartillot; Alexander Refsum Jensenius; | Trans. Int. Soc. Music. Inf. Retr. | 2025-07-31 |
| 448 | Balancing Information Preservation and Disentanglement in Self-Supervised Music Representation Learning Highlight: In this work, we propose a multi-view SSL framework for disentangling music audio representations that combines contrastive and reconstructive objectives. |
Julia Wilkins; Sivan Ding; Magdalena Fuentes; Juan Pablo Bello; | arxiv-cs.SD | 2025-07-30 |
| 449 | Music Arena: Live Evaluation for Text-to-Music Highlight: We present Music Arena, an open platform for scalable human preference evaluation of text-to-music (TTM) models. |
YONGHYUN KIM et al. | arxiv-cs.SD | 2025-07-28 |
| 450 | JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment Highlight: To the best of our knowledge, our flow-matching-based JAM is the first effort toward endowing word-level timing and duration control in song generation, allowing fine-grained vocal control. To enhance the quality of generated songs to better align with human preferences, we implement aesthetic alignment through Direct Preference Optimization, which iteratively refines the model using a synthetic dataset, eliminating the need for manual data annotations. |
RENHANG LIU et al. | arxiv-cs.SD | 2025-07-28 |
| 451 | Recommender Systems, Representativeness, and Online Music: A Psychosocial Analysis of Italian Listeners Highlight: Recommender systems shape music listening worldwide due to their widespread adoption in online platforms. Growing concerns about representational harms that these systems may cause are nowadays part of the scientific and public debate, wherein music listener perspectives are oftentimes reported and discussed from a cognitive-behaviorism perspective, but rarely contextualised under a psychosocial and cultural lens. |
Lorenzo Porcaro; Chiara Monaldi; | arxiv-cs.HC | 2025-07-24 |
| 452 | Bob’s Confetti: Phonetic Memorization Attacks in Music and Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Adversarial PhoneTic Prompting (APT), an attack that replaces iconic phrases with homophonic alternatives – e.g., mom’s spaghetti becomes Bob’s confetti – preserving the acoustic form while largely changing semantic content. |
JAECHUL ROH et. al. | arxiv-cs.SD | 2025-07-23 |
| 453 | SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce SongComposer, a pioneering step towards a unified song composition model that can readily create symbolic lyrics and melodies following instructions. |
SHUANGRUI DING et. al. | acl | 2025-07-21 |
| 454 | Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present JAM (Just Ask for Music), a lightweight and intuitive framework for natural language music recommendation. JAM models user-query-item interactions as vector translations in a shared latent space, inspired by knowledge graph embedding methods like TransE. |
ALESSANDRO B. MELCHIORRE et. al. | arxiv-cs.IR | 2025-07-21 |
| 455 | INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce INTERACT (INTERactive learning for Adaptive Concept Transfer), a framework in which a “student” LLM engages a “teacher” LLM through iterative inquiries to acquire knowledge across 1,347 contexts, including song lyrics, news articles, movie plots, academic papers, and images. |
Aum Kendapadi; Kerem Zaman; Rakesh R Menon; Shashank Srivastava; | acl | 2025-07-21 |
| 456 | Toward Music-based Stress Management: Contemporary Biosensing Systems for Affective Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These systems, including interactive music applications, brain-computer interfaces, and biofeedback devices, aim to provide engaging, personalized experiences that improve therapeutic outcomes. In this scoping and mapping review, we summarize and synthesize systematic reviews and empirical research on biosensing systems with potential applications in music-based affective regulation and stress management, identify gaps in the literature, and highlight promising areas for future research. |
Natasha Yamane; Varun Mishra; Matthew S. Goodwin; | arxiv-cs.HC | 2025-07-21 |
| 457 | Learning Sparsity for Effective and Efficient Music Performance Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing Music AVQA methods often rely on dense and unoptimized representations, leading to inefficiencies in the isolation of key information, the reduction of redundancy, and the prioritization of critical samples. To address these challenges, we introduce Sparsify, a sparse learning framework specifically designed for Music AVQA. |
XINGJIAN DIAO et. al. | acl | 2025-07-21 |
| 458 | Large Language Models’ Internal Perception of Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) excel at modeling relationships between strings in natural language and have shown promise in extending to other symbolic domains like coding or mathematics. |
Andrew Shin; Kunitake Kaneko; | arxiv-cs.CL | 2025-07-17 |
| 459 | EditGen: Harnessing Cross-Attention Control for Instruction-Based Auto-Regressive Audio Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate leveraging cross-attention control for efficient audio editing within auto-regressive models. |
Vassilis Sioros; Alexandros Potamianos; Giorgos Paraskevopoulos; | arxiv-cs.SD | 2025-07-15 |
| 460 | Grammatical Structure and Grammatical Variations in Non-Metric Iranian Classical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study we introduce a symbolic dataset composed of non-metric Iranian classical music, and algorithms for structural parsing of this music, and generation of variations. |
Maziar Kanani; Sean O Leary; James McDermott; | arxiv-cs.NE | 2025-07-14 |
| 461 | Radif Corpus: A Symbolic Dataset for Non-Metric Iranian Classical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce a digital corpus representing the complete non-metrical radif repertoire, covering all 13 existing components of this repertoire. |
Maziar Kanani; Sean O Leary; James McDermott; | arxiv-cs.SD | 2025-07-14 |
| 462 | MIDI-Zero: A MIDI-driven Self-Supervised Learning Approach for Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose MIDI-Zero, a novel self-supervised learning framework for CBMR that operates entirely on MIDI representations. |
Yuhang Su; Wei Hu; Hongfeng Gao; Fan Zhang; | sigir | 2025-07-13 |
| 463 | MusiScene: Leveraging MU-LLaMA for Scene Imagination and Enhanced Video Background Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, (1) we construct a large-scale video-audio caption dataset with 3,371 pairs, (2) we finetune Music Understanding LLaMA for the MSI task to create MusiScene, and (3) we conduct comprehensive evaluations and prove that our MusiScene is more capable of generating contextually relevant captions compared to MU-LLaMA. |
Fathinah Izzati; Xinyue Li; Yuxuan Wu; Gus Xia; | arxiv-cs.AI | 2025-07-08 |
| 464 | EXPOTION: Facial Expression and Motion Control for Multimodal Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Expotion (Facial Expression and Motion Control for Multimodal Music Generation), a generative model leveraging multimodal visual controls – specifically, human facial expressions and upper-body motion – as well as text prompts to produce expressive and temporally accurate music. |
Fathinah Izzati; Xinyue Li; Gus Xia; | arxiv-cs.SD | 2025-07-07 |
| 465 | Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Boomerang sampling, recently proposed for the image domain, allows generating output close to an existing example, using any pretrained diffusion model. In this work, we explore its application in the audio domain as a tool for data augmentation or content manipulation. Specifically, implementing Boomerang sampling for Stable Audio Open, we augment training data for a state-of-the-art beat tracker, and attempt to replace musical instruments in recordings. |
Alexander Fichtinger; Jan Schlüter; Gerhard Widmer; | arxiv-cs.SD | 2025-07-07 |
| 466 | Evaluating Fake Music Detection Performance Under Audio Augmentations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a response, models for detecting fake music have been proposed. In this work, we explore the robustness of such systems under audio augmentations. |
Tomasz Sroka; Tomasz Wężowicz; Dominik Sidorczuk; Mateusz Modrzejewski; | arxiv-cs.SD | 2025-07-07 |
| 467 | Toward More Inclusive Music Experience: Understanding Deaf and Hard-of-hearing Individuals’ Everyday Music Activities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music can play an important role in the lives of some Deaf and Hard-of-Hearing (DHH) individuals, facilitating emotional expression, storytelling, and social interaction despite … |
HYEONBEOM YI et. al. | Proceedings of the 2025 ACM Designing Interactive Systems … | 2025-07-04 |
| 468 | OMAR-RQ: Open Music Audio Representation Model Trained with Multi-Feature Masked Token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present OMAR-RQ, a model trained with self-supervision via masked token classification methodologies using a large-scale dataset with over 330,000 hours of music audio. |
Pablo Alonso-Jiménez; Pedro Ramoneda; R. Oguz Araz; Andrea Poltronieri; Dmitry Bogdanov; | arxiv-cs.SD | 2025-07-04 |
| 469 | “It’s More of A Vibe I’m Going For”: Designing Text-to-Music Generation Interfaces for Video Creators Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Background music plays a crucial role in social media videos, yet finding the right music remains a challenge for video creators. These creators, often not music experts, struggle … |
Noor Hammad; C. Fraser; Erik Harpstead; Jessica Hammer; Mira Dontcheva; | Proceedings of the 2025 ACM Designing Interactive Systems … | 2025-07-04 |
| 470 | MusGO: A Community-Driven Framework For Assessing Openness in Music-Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through this work, we aim to clarify the concept of openness in music-generative AI and promote its transparent and responsible development. |
Roser Batlle-Roca; Laura Ibáñez-Martínez; Xavier Serra; Emilia Gómez; Martín Rocamora; | arxiv-cs.SD | 2025-07-04 |
| 471 | Content Filtering Methods for Music Recommendation: A Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to this sparsity, there are several challenges that have to be addressed with other methods. This review examines the current state of research in addressing these challenges, with an emphasis on the role of content filtering in mitigating biases inherent in collaborative filtering approaches. |
Terence Zeng; Abhishek K. Umrawal; | arxiv-cs.IR | 2025-07-02 |
| 472 | User-guided Generative Source Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we propose GuideSep, a diffusion-based MSS model capable of instrument-agnostic separation beyond the four-stem setup. |
Yutong Wen; Minje Kim; Paris Smaragdis; | arxiv-cs.SD | 2025-07-01 |
| 473 | Gregorian Melody, Modality, and Memory: Segmenting Chant with Bayesian Nonparametrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The segmentation we find achieves state-of-the-art performance in mode classification. |
Vojtěch Lanz; Jan Hajič jr; | arxiv-cs.CL | 2025-06-30 |
| 474 | The Florence Price Art Song Dataset and Piano Accompaniment Generator Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Florence B. Price was a composer in the early 20th century whose music reflects her upbringing in the American South, her African heritage, and her Western classical training. She … |
Tao-Tao He; Martin E. Malandro; Douglas Shadle; | arxiv-cs.SD | 2025-06-29 |
| 475 | TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Aside from considering hierarchies in the temporal structure of music, this paper explores an even more important aspect: concept hierarchy, which involves generating music ideas, transforming them, and ultimately organizing them – across musical time and space – into a complete composition. To this end, we introduce TOMI (Transforming and Organizing Music Ideas) as a novel approach in deep music generation and develop a TOMI-based model via instruction-tuned foundation LLM. |
Qi He; Gus Xia; Ziyu Wang; | arxiv-cs.SD | 2025-06-29 |
| 476 | MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their ability to interpret music sheets remains underexplored. To bridge this gap, we introduce MusiXQA, the first comprehensive dataset for evaluating and advancing MLLMs in music sheet understanding. |
JIAN CHEN et. al. | arxiv-cs.CV | 2025-06-28 |
| 477 | MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. |
FANG-DUO TSAI et. al. | icml | 2025-06-25 |
| 478 | Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Developing generative models to create or conditionally create symbolic music presents unique challenges due to the combination of limited data availability and the need for high precision in note pitch. To address these challenges, we introduce an efficient Fine-Grained Guidance (FGG) approach within diffusion models. |
Tingyu Zhu; Haoyu Liu; Ziyu Wang; Zhimin Jiang; Zeyu Zheng; | icml | 2025-06-25 |
| 479 | The AI Music Arms Race: On The Detection of AI-Generated Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Several companies now offer platforms for users to create music at unprecedented scales by textual prompting. As the quality of this music rises, concern grows about how to … |
Laura Cros Vila; Bob L. T. Sturm; Luca Casini; David Dalmazzo; | Trans. Int. Soc. Music. Inf. Retr. | 2025-06-25 |
| 480 | SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose SongGen, a fully open-source, single-stage auto-regressive transformer designed for controllable song generation. To foster community engagement and future research, we will release our model weights, training code, annotated data, and preprocessing pipeline. |
ZIHAN LIU et. al. | icml | 2025-06-25 |
| 481 | AI-Generated Song Detection Via Lyrics Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in practice, such perfect lyrics are not available (only the audio is); this leaves a substantial gap in applicability in real-life use cases. In this work, we instead propose solving this gap by transcribing songs using general automatic speech recognition (ASR) models. |
Markus Frohmann; Elena V. Epure; Gabriel Meseguer-Brocal; Markus Schedl; Romain Hennequin; | arxiv-cs.SD | 2025-06-23 |
| 482 | Large-Scale Training Data Attribution for Music Generative Models Via Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To validate the method, we perform a grid search over different hyperparameter configurations and quantitatively evaluate the consistency of the unlearning approach. |
WOOSUNG CHOI et. al. | arxiv-cs.SD | 2025-06-23 |
| 483 | Let Your Video Listen to Your Music! — Beat-Aligned, Content-Preserving Video Editing with Arbitrary Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Aligning the rhythm of visual motion in a video with a given music track is a practical need in multimedia production, yet remains an underexplored task in autonomous video … |
Xinyu Zhang; Dong Gong; Zicheng Duan; Anton van den Hengel; Lingqiao Liu; | Proceedings of the 33rd ACM International Conference on … | 2025-06-23 |
| 484 | DuetGen: Music Driven Two-Person Dance Generation Via Hierarchical Masked Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present DuetGen, a novel framework for generating interactive two-person dances from music. |
ANINDITA GHOSH et. al. | arxiv-cs.GR | 2025-06-23 |
| 485 | AI Harmonizer: Expanding Vocal Expression with A Generative Neurosymbolic Music AI System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present our methods, explore potential applications in performance and composition, and discuss future directions for real-time implementations. |
Lancelot Blanchard; Cameron Holt; Joseph A. Paradiso; | arxiv-cs.HC | 2025-06-22 |
| 486 | From Generality to Mastery: Composer-Style Symbolic Music Generation Via Large-Scale Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate how general music knowledge learned from a broad corpus can enhance the mastery of specific composer styles, with a focus on piano piece generation. |
Mingyang Yao; Ke Chen; | arxiv-cs.SD | 2025-06-20 |
| 487 | Hallucination Level of Artificial Intelligence Whisperer: Case Speech Recognizing Pantterinousut Rap Song Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We will compare the Faster Whisperer algorithm and YouTube’s internal speech-to-text functionality. The reference truth will be Finnish rap lyrics, which the main author’s little brother, Mc Timo, has written. |
Ismo Horppu; Frederick Ayala; Erlin Gulbenkoglu; | arxiv-cs.LG | 2025-06-19 |
| 488 | Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For long-form, we propose an algorithm using source separation as a vocal activity detector to derive segment boundaries, which results in a consistent reduction in WER relative to Whisper’s native long-form algorithm. |
Jaza Syed; Ivan Meresman Higgs; Ondřej Cífka; Mark Sandler; | arxiv-cs.SD | 2025-06-18 |
| 489 | Versatile Symbolic Music-for-Music Modeling Via Function Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose parameter-efficient solutions for a variety of symbolic music-for-music tasks. |
Junyan Jiang; Daniel Chin; Liwei Lin; Xuanjie Liu; Gus Xia; | arxiv-cs.SD | 2025-06-18 |
| 490 | Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection Via Multi-View Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing detectors, relying on either audio or lyrics, face key practical limitations: audio-based detectors fail to generalize to new or unseen generators and are vulnerable to audio perturbations; lyrics-based methods require cleanly formatted and accurate lyrics, unavailable in practice. To overcome these limitations, we propose a novel, practically grounded approach: a multimodal, modular late-fusion pipeline that combines automatically transcribed sung lyrics and speech features capturing lyrics-related information within the audio. |
Markus Frohmann; Gabriel Meseguer-Brocal; Markus Schedl; Elena V. Epure; | arxiv-cs.CL | 2025-06-18 |
| 491 | SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a multi-task music captioning model, SonicVerse, that integrates caption generation with auxiliary music feature detection tasks such as key detection, vocals detection, and more, so as to directly capture both low-level acoustic details as well as high-level musical attributes. |
Anuradha Chopra; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2025-06-18 |
| 492 | Adaptive Accompaniment with ReaLchords Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose ReaLchords, an online generative model for improvising chord accompaniment to user melody. |
YUSONG WU et. al. | arxiv-cs.SD | 2025-06-17 |
| 493 | SLEEPING-DISCO 9M: A Large-scale Pre-training Dataset for Generative Music Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Sleeping-DISCO 9M, a large-scale pre-training dataset for music and song. |
Tawsif Ahmed; Andrej Radonjic; Gollam Rabby; | arxiv-cs.SD | 2025-06-17 |
| 494 | Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the extent to which national music preferences reflect underlying cultural values. |
Yongjae Kim; Seongchan Park; | arxiv-cs.CL | 2025-06-16 |
| 495 | Why Context Matters: Exploring How Musical Context Impacts User Behavior, Mood, and Musical Preferences Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music consumption is shaped by both internal factors (e.g., mood, motivation) and external factors (e.g., activity, social environment), which together influence listeners’ … |
Anna Hausberger; Emilia Parada-Cabaleiro; Markus Schedl; | Proceedings of the 33rd ACM Conference on User Modeling, … | 2025-06-13 |
| 496 | DanceChat: Large Language Model-Guided Music-to-Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce DanceChat, a Large Language Model (LLM)-guided music-to-dance generation approach. |
QING WANG et. al. | arxiv-cs.CV | 2025-06-12 |
| 497 | Let’s Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing is a vital form of human emotional expression and social interaction, distinguished from speech by its richer emotional nuances and freer expressive style. Thus, … |
XIUMEI XIE et. al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-10 |
| 498 | AffectMachine-Pop: A Controllable Expert System for Real-time Pop Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present AffectMachine-Pop, an expert system capable of generating retro-pop music according to arousal and valence values, which can either be pre-determined or based on a listener’s real-time emotion states. |
Kat R. Agres; Adyasha Dash; Phoebe Chua; Stefan K. Ehrlich; | arxiv-cs.HC | 2025-06-09 |
| 499 | LeVo: High-Quality Song Generation with Multi-Preference Alignment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches still struggle with the complex composition of songs and the scarcity of high-quality data, leading to limitations in audio quality, musicality, instruction following, and vocal-instrument harmony. To address these challenges, we introduce LeVo, a language model based framework consisting of LeLM and Music Codec. |
SHUN LEI et. al. | arxiv-cs.SD | 2025-06-09 |
| 500 | An Introduction to Pitch Strength in Contemporary Popular Music Analysis and Production Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music information retrieval distinguishes between low- and high-level descriptions of music. Current generative AI models rely on text descriptions that are higher level than the … |
Emmanuel Deruty; | arxiv-cs.SD | 2025-06-09 |
| 501 | Evolutionary Music Composition Using Tokenized Representation Based on Byte Pair Encoding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Natural language processing (NLP) research has achieved remarkable advancement in recent decades. Language data and music data share several common features, enabling NLP … |
Yu-Wei Wen; Yu-Liang Tang; Chuan-Kang Ting; | 2025 IEEE Congress on Evolutionary Computation (CEC) | 2025-06-08 |
| 502 | Insights on Harmonic Tones from A Generative Music Experiment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: During a studio-lab experiment involving researchers, music producers, and an AI model for music generating bass-like audio, it was observed that the producers used the model’s output to convey two or more pitches with a single harmonic complex tone, which in turn revealed that the model had learned to generate structured and coherent simultaneous melodic lines using monophonic sequences of harmonic complex tones. |
Emmanuel Deruty; Maarten Grachten; | arxiv-cs.SD | 2025-06-08 |
| 503 | FilmComposer: LLM-Driven Music Production for Silent Film Clips Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we implement music production for silent film clips using LLM-driven method. |
Zhifeng Xie; Qile He; Youjia Zhu; Qiwei He; Mengtian Li; | cvpr | 2025-06-07 |
| 504 | VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we systematically study music generation conditioned solely on the video. |
ZEYUE TIAN et. al. | cvpr | 2025-06-07 |
| 505 | HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces HarmonySet, a comprehensive dataset designed to advance video-music understanding. |
Zitang Zhou; Ke Mei; Yu Lu; Tianyi Wang; Fengyun Rao; | cvpr | 2025-06-07 |
| 506 | Enhancing Dance-to-Music Generation Via Negative Conditioning Latent Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on the problem of generating music synchronized with rhythmic visual cues of the given dance video. |
Changchang Sun; Gaowen Liu; Charles Fleming; Yan Yan; | cvpr | 2025-06-07 |
| 507 | Let’s Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, the absence of existing 3D singing datasets poses a considerable challenge. To address this, we collect a novel audiovisual dataset, ChorusHead which features synchronized mixed vocal audio and pseudo-3D flame motions for chorus singing. |
XIUMEI XIE et. al. | cvpr | 2025-06-07 |
| 508 | Exploring Listeners’ Perceptions of AI-generated and Human-composed Music for Functional Emotional Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work investigates how listeners perceive and evaluate AI-generated as compared to human-composed music in the context of emotional resonance and regulation. |
Kimaya Lecamwasam; Tishya Ray Chaudhuri; | arxiv-cs.HC | 2025-06-03 |
| 509 | MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing approaches exhibit critical limitations: motion graph methods rely on fixed template libraries, restricting creative generation; diffusion models, while capable of producing novel motions, often lack temporal coherence and musical alignment. To address these challenges, we propose MotionRAG-Diff, a hybrid framework that integrates Retrieval-Augmented Generation (RAG) with diffusion-based refinement to enable high-quality, musically coherent dance generation for arbitrary long-term music inputs. |
Mingyang Huang; Peng Zhang; Bang Zhang; | arxiv-cs.SD | 2025-06-03 |
| 510 | NoRe: Augmenting Journaling Experience with Generative AI for Music Creation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we explore how AI-generated music can augment the journaling experience. |
Joonyoung Park; Hyewon Cho; Hyehyun Chu; Yeeun Lee; Hajin Lim; | arxiv-cs.HC | 2025-06-02 |
| 511 | Ontological Modeling of Music and Musicological Claims. A Case Study in Early Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Emilio M. Sanfilippo; Richard Freedman; Alessandro Mosca; | Int. J. Digit. Libr. | 2025-06-01 |
| 512 | Iola Walker: A Mobile Footfall Detection System for Music Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The objective is to find a method for materially enhancing music using hardware and software. |
William B. James; | arxiv-cs.MM | 2025-06-01 |
| 513 | An AI-driven Music Visualization System for Generating Meaningful Audio-Responsive Visuals in Real-Time Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music visualizations are visual representations or interpretations of music that often dynamically respond to audio. They have the potential to enhance the immersive and engaging … |
Jenny Huang; Christoph Johannes Weber; Sylvia Rothe; | Proceedings of the 2025 ACM International Conference on … | 2025-05-31 |
| 514 | Bridging The Gap Between Semantic and User Preference Spaces for Multi-modal Music Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Hierarchical Two-stage Contrastive Learning (HTCL) method that models similarity from the semantic perspective to the user perspective hierarchically to learn a comprehensive music representation bridging the gap between semantic and user preference spaces. |
XIAOFENG PAN et. al. | arxiv-cs.SD | 2025-05-29 |
| 515 | MGPHot: A Dataset of Musicological Annotations for Popular Music (1958-2022) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Music Genome Project® is an extensive music annotation effort spanning two decades, during which a team of musicologists has been annotating a dataset of millions of songs … |
Sergio Oramas; Fabien Gouyon; Steve Hogan; Camilo Landau; Andreas Ehmann; | Trans. Int. Soc. Music. Inf. Retr. | 2025-05-28 |
| 516 | ACE-Step: A Step Towards Music Generation Foundation Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce ACE-Step, a novel open-source foundation model for music generation that overcomes key limitations of existing approaches and achieves state-of-the-art performance through a holistic architectural design. |
Junmin Gong; Sean Zhao; Sen Wang; Shengyuan Xu; Joe Guo; | arxiv-cs.SD | 2025-05-28 |
| 517 | MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MelodySim, a melody-aware music similarity model and dataset for plagiarism detection. |
Tongyu Lu; Charlotta-Marlena Geist; Jan Melechovsky; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2025-05-27 |
| 518 | Music Source Restoration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Music Source Restoration (MSR), a novel task addressing the gap between idealized source separation and real-world music production. |
Yongyi Zang; Zheqi Dai; Mark D. Plumbley; Qiuqiang Kong; | arxiv-cs.SD | 2025-05-27 |
| 519 | Semantic-Aware Interpretable Multimodal Music Auto-Tagging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present an interpretable framework for music auto-tagging that leverages groups of musically meaningful multimodal features, derived from signal processing, deep learning, ontology engineering, and natural language processing. |
Andreas Patakis; Vassilis Lyberatos; Spyridon Kantarelis; Edmund Dervakos; Giorgos Stamou; | arxiv-cs.LG | 2025-05-22 |
| 520 | Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Leveraging the pretrained Moonbeam, we propose 2 finetuning architectures with full anticipatory capabilities, targeting 2 categories of downstream tasks: symbolic music understanding and conditional music generation (including music infilling). |
Zixun Guo; Simon Dixon; | arxiv-cs.SD | 2025-05-21 |
| 521 | Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a unified approach, where we train a general-purpose model on many translation tasks simultaneously. |
Jongmin Jung et al. | arxiv-cs.SD | 2025-05-19 |
| 522 | Distilling A Speech and Music Encoder with Task Arithmetic Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Knowledge Distillation of teacher ensembles may be a natural solution, but we posit that decoupling the distillation of the speech and music SSL models allows for more flexibility. Thus, we propose to learn distilled task vectors and then linearly interpolate them to form a unified speech+music model. |
Fabian Ritter-Gutierrez et al. | arxiv-cs.SD | 2025-05-19 |
| 523 | U-MusT: A Unified Framework for Cross-Modal Translation of Score Images, Symbolic Music, and Performance Audio Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music exists in various modalities, such as score images, symbolic scores, MIDI, and audio. Translations between such modalities are established as core tasks of music information … |
Jongmin Jung et al. | IEEE Transactions on Audio, Speech and Language Processing | 2025-05-19 |
| 524 | Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present Text2midi-InferAlign, a novel technique for improving symbolic music generation at inference time. |
Abhinaba Roy; Geeta Puri; Dorien Herremans; | arxiv-cs.SD | 2025-05-18 |
| 525 | Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Particularly in pop music, where a single artist may use a variety of timbres and textures to achieve a desired quality, it can be difficult to identify what vocal register within the vocal range a singer is using. This paper presents two methods for classifying vocal registers in an audio signal of male pop music through the analysis of textural features of mel-spectrogram images. Additionally, we will discuss the practical integration of these models for vocal analysis tools, and introduce a concurrently developed software called AVRA, which stands for Automatic Vocal Register Analysis. |
Alexander Kim; Charlotte Botha; | arxiv-cs.SD | 2025-05-16 |
| 526 | Context-AI Tunes: Context-Aware AI-Generated Music for Stress Reduction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Choosing the right music can be challenging due to the overwhelming number of options and the time-consuming trial-and-error process. To address this, we propose Context-AI Tune (CAT), a system that generates personalized music based on environmental inputs and the user’s self-assessed stress level. |
Xiaoyan Wei; Zebang Zhang; Zijian Yue; Hsiang-Ting Chen; | arxiv-cs.HC | 2025-05-14 |
| 527 | A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thirdly, transformers typically require large amounts of labeled data to achieve optimal performance, but the SME task lacks sufficient annotated data. To address these issues, in this paper, we propose a mamba-based network, called SpectMamba, for semi-supervised singing melody extraction using confidence binary regularization. |
Xiaoliang He et al. | arxiv-cs.SD | 2025-05-13 |
| 528 | Predicting Music Track Popularity By Convolutional Neural Networks on Spotify Features and Spectrogram of Audio Waveform Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a pioneering methodology that uses Convolutional Neural Networks (CNNs) and Spotify data analysis to forecast the popularity of music tracks. |
Navid Falah; Behnam Yousefimehr; Mehdi Ghatee; | arxiv-cs.SD | 2025-05-12 |
| 529 | Harmonycloak: Making Music Unlearnable for Generative AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advances in generative AI have significantly expanded into the realms of art and music. This development has opened up a vast realm of possibilities, pushing the boundaries … |
Syed Irfan Ali Meerza; Lichao Sun; Jian Liu; | 2025 IEEE Symposium on Security and Privacy (SP) | 2025-05-12 |
| 530 | Not That Groove: Zero-Shot Symbolic Music Editing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Most work in AI music generation focused on audio, which has seen limited use in the music production industry due to its rigidity. To maximize flexibility while assuming only … |
Li Zhang; | arxiv-cs.SD | 2025-05-12 |
| 531 | CST: A Melody Generation Method Based on ChatGPT and Structure Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ruhan He; Ruixue Liu; Tao Peng; Xinrong Hu; | Multimedia Systems | 2025-05-11 |
| 532 | Learning Music Audio Representations With Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate the behavior of several music audio representation models under limited-data learning regimes. |
Christos Plachouras; Emmanouil Benetos; Johan Pauwels; | arxiv-cs.SD | 2025-05-09 |
| 533 | Mathematical AI-Driven Insights Into Societal Dynamics and Resilience Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The behavioral patterns of individuals within a society can serve as a reflection of its overall conditions and status. By examining these patterns, we can identify critical … |
Md. Sarwar Kamal; Md. Rafiqul Islam; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 534 | Automatic Music Transcription Using Convolutional Neural Networks and Constant-Q Transform Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we design a processing pipeline that can transform classical piano audio files in . |
Yohannis Telila; Tommaso Cucinotta; Davide Bacciu; | arxiv-cs.SD | 2025-05-07 |
| 535 | Flower Across Time and Media: Sentiment Analysis of Tang Song Poetry and Visual Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While previous scholarship has examined these domains independently, the systematic correlation between evolving literary emotions and visual culture remains underexplored. This study addresses that gap by employing BERT-based sentiment analysis to quantify emotional patterns in floral imagery across Tang Song poetry, then validating these patterns against contemporaneous developments in decorative arts. Our approach builds upon recent advances in computational humanities while remaining grounded in traditional sinological methods. |
Shuai Gong; Tiange Zhou; | arxiv-cs.CL | 2025-05-07 |
| 536 | Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We represent symbolic music as image-like pianorolls, facilitating the use of diffusion models for the generation of symbolic music. |
Jincheng Zhang; György Fazekas; Charalampos Saitis; | arxiv-cs.SD | 2025-05-06 |
| 537 | Familiarizing with Music: Discovery Patterns for Different Music Discovery Needs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, very little is known about how users discover and explore previously unknown music, and how this behavior differs for users of varying discovery needs. In this paper we bridge this gap by analyzing data from a survey answered by users of the major music streaming platform Deezer in combination with their streaming data. |
Marta Moscati; Darius Afchar; Markus Schedl; Bruno Sguerra; | arxiv-cs.IR | 2025-05-06 |
| 538 | REFFLY: Melody-Constrained Lyrics Editing Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces REFFLY (REvision Framework For LYrics), the first revision framework for editing and generating melody-aligned lyrics. |
Songyan Zhao; Bingxuan Li; Yufei Tian; Nanyun Peng; | naacl | 2025-05-04 |
| 539 | A Data-Driven Method for Analyzing and Quantifying Lyrics-Dance Motion Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this challenge, we hypothesize that lyrics and dance motions that co-occur across multiple songs are related. Based on this hypothesis, we propose a novel data-driven method to detect the parts of songs where meaningful relationships between lyrics and dance motions exist. |
Kento Watanabe; Masataka Goto; | naacl | 2025-05-04 |
| 540 | Exploring The Diversity of Music Experiences for Deaf and Hard of Hearing Individuals Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music plays an important role in the personal fulfillment and cognitive performance of deaf and hard of hearing (DHH) individuals. Since deafness is a spectrum — as are DHH … |
Kyrie Zhixuan Zhou; Weirui Peng; Yuhan Liu; Rachel F. Adler; | Proceedings of the ACM on Human-Computer Interaction | 2025-05-02 |
| 541 | Deep Learning Driven Secure Music Traffic Transmission in Consumer Internet of Things Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The popularization of Consumer Internet of Things (CIoT) has brought unprecedented convenience. However, rapid development has led to new challenges in the secure transmission of … |
Jiang Jiang; Fenglei Wang; Yao Lyu; Lingling Zhang; Mohammed Amoon; | IEEE Transactions on Consumer Electronics | 2025-05-01 |
| 542 | AI-Empowered Consumer Behavior Modeling Framework for Music Recommendation Over Heterogeneous Electronics Products Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Amidst the fast progress of technology and the widespread availability of music streaming platforms, there is a pressing need to provide precise and reliable music recommendations … |
Ke Zhang; Amin Yousefpour; Daohua Pan; Jiajia Li; Guangwu Hu; | IEEE Transactions on Consumer Electronics | 2025-05-01 |
| 543 | Linguistic Complexity and Socio-cultural Patterns in Hip-Hop Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a comprehensive computational framework for analyzing linguistic complexity and socio-cultural trends in hip-hop lyrics. |
Aayam Bansal; Raghav Agarwal; Kaashvi Jain; | arxiv-cs.CL | 2025-04-29 |
| 544 | MVPrompt: Building Music-Visual Prompts for AI Artists to Craft Music Video Mise-en-scène Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music videos have traditionally been the domain of experts, but with text-to-video generative AI models, AI artists can now create them more easily. However, accurately reflecting … |
ChungHa Lee; DaeHo Lee; Jin-Hyuk Hong; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 545 | Exploring The Potential of Music Generative AI for Music-Making By Deaf and Hard of Hearing People IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advancements in text-to-music generative AI (GenAI) have significantly expanded access to music creation. However, deaf and hard of hearing (DHH) individuals remain largely … |
Youjin Choi; JaeYoung Moon; JinYoung Yoo; Jin-Hyuk Hong; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 546 | EuterPen: Unleashing Creative Expression in Music Score Writing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music notation programs force composers to follow the many rules of the staff notation when writing music and constantly seek to optimize symbol placement, making numerous … |
Vincent Cavez; Catherine Letondal; Caroline Appert; Emmanuel Pietriga; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 547 | MV-Crafter: An Intelligent System for Music-guided Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we present MV-Crafter, a system capable of producing high-quality music videos with synchronized music-video rhythm and style. |
Chuer Chen et al. | arxiv-cs.HC | 2025-04-24 |
| 548 | Music Sequence Generation and Arrangement Based on Transformer Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Existing music sequence generation methods often struggle with long-range dependencies, leading to gradient vanishing or exploding, which compromises their ability to capture … |
X. Cui; Panwen Hu; Zheng Huang; | Journal of Computational Methods in Sciences and Engineering | 2025-04-24 |
| 549 | The Musical Mastermind: A Case Study on Fostering Music Theory and Computational Thinking Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: There is great potential in the introduction of gamified, interdisciplinary learning approaches in various education environments. This paper explores the effectiveness of Musical … |
Ioannis Sarlis; D. Kotsifakos; Christos Douligeris; | 2025 IEEE Global Engineering Education Conference (EDUCON) | 2025-04-22 |
| 550 | DRAGON: Distributional Rewards Optimize Diffusion Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Distributional RewArds for Generative OptimizatioN (DRAGON), a versatile framework for fine-tuning media generation models towards a desired outcome. |
Yatong Bai; Jonah Casebeer; Somayeh Sojoudi; Nicholas J. Bryan; | arxiv-cs.SD | 2025-04-21 |
| 551 | MusFlow: Multimodal Music Generation Via Conditional Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite advancements in generating music from specific textual descriptions (e.g., style, genre, instruments), the practical application is still hindered by ordinary users’ limited expertise or time to write accurate prompts. To bridge this application gap, this paper introduces MusFlow, a novel multimodal music generation model using Conditional Flow Matching. |
Jiahao Song; Yuzhao Wang; | arxiv-cs.SD | 2025-04-18 |
| 552 | Apollo: An Interactive Environment for Generating Symbolic Musical Phrases Using Corpus-based Style Imitation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Apollo, an interactive music application for generating symbolic phrases of conventional western music using corpus-based style imitation techniques. |
Renaud Bougueng Tchemeube; Jeff Ens; Philippe Pasquier; | arxiv-cs.HC | 2025-04-18 |
| 553 | A Survey on Cross-Modal Interaction Between Music and Multimodal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey aims to provide a comprehensive review of multimodal tasks related to music, outlining how music contributes to multimodal learning and offering insights for researchers seeking to expand the boundaries of computational music. |
Sifei Li et al. | arxiv-cs.MM | 2025-04-17 |
| 554 | Effects of Structural Reflection-Promoting Mechanism-Based Peer Assessment on Students’ Vocal Music Learning Performance and Perceptions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Vocal music education is a skill-oriented course. Students not only need to improve their skills through repeated practice, but also need to learn self-reflection on their singing … |
Chen-Chen Liu et al. | J. Comput. Assist. Learn. | 2025-04-16 |
| 555 | Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity of singing data. |
S. Dai; Y. Wang; R. B. Dannenberg; Z. Jin; | icassp | 2025-04-15 |
| 556 | Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the significant progress in controllable music generation and editing, challenges remain in the quality and length of generated music due to the use of Mel-spectrogram representations and UNet-based model structures. To address these limitations, we propose a novel approach using a Diffusion Transformer (DiT) augmented with an additional control branch using ControlNet. |
S. Hou; | icassp | 2025-04-15 |
| 557 | PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such issues highlight the need for publicly available, copyright-free musical data, in which there is a large shortage, particularly for symbolic music data. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it the largest available copyright-free symbolic music dataset to our knowledge. |
P. Long; Z. Novack; T. Berg-Kirkpatrick; J. McAuley; | icassp | 2025-04-15 |
| 558 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MQAD, a music QA dataset built on the Million Song Dataset (MSD), encompassing a rich array of musical features (including beat, chord, key, structure, instrument, and genre) across 270,000 tracks, featuring nearly 3 million diverse questions and captions. |
Z. Ouyang et al. | icassp | 2025-04-15 |
| 559 | CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Contrastive Long-form Language-Audio Pretraining (CoLLAP) to significantly extend the perception window for both the input audio (up to 5 minutes) and the language descriptions (exceeding 250 words), while enabling contrastive learning across modalities and temporal dynamics. |
J. Wu et al. | icassp | 2025-04-15 |
| 560 | A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a singing melody extraction network consisting of five stacked multi-scale feature time-frequency aggregation (MF-TFA) modules. |
Y. Hu et al. | icassp | 2025-04-15 |
| 561 | Contrastive Lyrics Alignment with A Timestamp-Informed Loss Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our approach introduces a box loss that directly incorporates timestamp information into the loss function, enabling precise alignment and competitive results even with limited training data. |
T. Kick; F. Grötschla; L. A. Lanzendörfer; R. Wattenhofer; | icassp | 2025-04-15 |
| 562 | Benchmarking Music Generation Models and Metrics Via Human Preference Studies IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we generate 6k songs using 12 state-of-the-art models and conduct a survey of 15k pairwise audio comparisons with 2.5k human participants to evaluate the correlation between human preferences and widely used metrics. |
F. Grötschla; A. Solak; L. A. Lanzendörfer; R. Wattenhofer; | icassp | 2025-04-15 |
| 563 | SPSinger: Multi-Singer Singing Voice Synthesis with Short Reference Prompt Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome the challenge of requiring long audio prompts during inference, we introduce the Latent Prompt Adaptation Model (LPAM), a Transformer-based module that derives timbre features from global embeddings. |
J. Zhao; C. Low; Y. Wang; | icassp | 2025-04-15 |
| 564 | Generating Gezi Opera Scores with A Large Language Model and A High-Quality Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we collect pictures of the jianpu of the Chinese Gezi opera and manually construct a high-quality standard data set of Chinese Gezi opera scores, which can be read by music notation software. |
Z. Lei; K. Gu; P. Bai; X. Shi; | icassp | 2025-04-15 |
| 565 | SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present SONIQUE, a model for generating background music tailored to video content. |
L. Zhang; M. Fuentes; | icassp | 2025-04-15 |
| 566 | Bootstrapping Language-Audio Pre-training for Music Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce BLAP, a model capable of generating high-quality captions for music. |
L. A. Lanzendörfer; C. Pinkl; N. Perraudin; R. Wattenhofer; | icassp | 2025-04-15 |
| 567 | FUTGA-MIR: Enhancing Fine-grained and Temporally-aware Music Understanding with Music Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While some existing music LLMs have been augmented with temporally-aware music captions, music information retrieval (MIR) features conventionally do not exist in music caption datasets, thus neglected by music-LLMs. To bridge the gap between recent music LLMs and conventional music information retrieval tasks, we propose FUTGA-MIR (Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation with Music Information Retrieval) to enhance the existing music LLMs by augmenting them with MIR features and aligning with human feedback. |
J. Wu; | icassp | 2025-04-15 |
| 568 | Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. |
T. Karchkhadze; M. R. Izadi; S. Dubnov; | icassp | 2025-04-15 |
| 569 | Multimodal Fusion for EEG Emotion Recognition in Music with A Multi-Task Learning Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel EEG-based emotion recognition approach for music, employing a two-stage training framework that integrates emotion representations from music, lyrics, and EEG. |
S. Huang; Z. Jin; D. Li; J. Han; X. Tao; | icassp | 2025-04-15 |
| 570 | SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and An Open-Source Professional Testset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a high-fidelity singing voice conversion system. |
Y. Zhou; | icassp | 2025-04-15 |
| 571 | Music Tagging with Classifier Group Chains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose music tagging with classifier chains that model the interplay of music tags. |
T. Hasumi; T. Komatsu; Y. Fujita; | icassp | 2025-04-15 |
| 572 | Classifying Music-Induced Emotion Using Multi-Modal Ensembles of EEG and Audio Feature Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present our submission to the EEG-Music Emotion Recognition Challenge at ICASSP 2025. |
P. Paukner et al. | icassp | 2025-04-15 |
| 573 | Subtractive Training for Music Stem Insertion Using Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. |
I. Villa-Renteria; | icassp | 2025-04-15 |
| 574 | ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to obtain high quality synthetic singing voice more efficiently, we propose a singing voice synthesis method based on the consistency model, ConSinger, to achieve high-fidelity singing voice synthesis with minimal steps. |
Y. Song; G. Sang; J. Yu; C. Xiao; | icassp | 2025-04-15 |
| 575 | Perceptual Noise-Masking with Music Through Deep Spectral Envelope Shaping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Indeed, a music signal can mask some of the noise’s frequency components due to the effect of simultaneous masking. In this article, we propose a neural network based on a psychoacoustic masking model, designed to enhance the music’s ability to mask ambient noise by reshaping its spectral envelope with predicted filter frequency responses. |
C. Berger; R. Badeau; S. Essid; | icassp | 2025-04-15 |
| 576 | Investigation of Perceptual Music Similarity Focusing on Each Instrumental Part Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents an investigation of perceptual similarity between music tracks focusing on each individual instrumental part based on a large-scale listening test towards developing an instrumental-part-based music retrieval. |
Y. Hashizume; T. Toda; | icassp | 2025-04-15 |
| 577 | MotionComposer: Enhancing Rhythmic Music Generation with Adaptive Retrieval Reference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present MotionComposer, a novel retrieval-augmented, easy-to-hard training approach designed to enhance rhythmic music generation. |
J. Wang; L. Liu; J. Wang; | icassp | 2025-04-15 |
| 578 | MusicLIME: Explainable Multimodal Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce MusicLIME, a model-agnostic feature importance explanation method designed for multimodal music models. |
T. Sotirou; V. Lyberatos; O. M. Mastromichalakis; G. Stamou; | icassp | 2025-04-15 |
| 579 | Generating Vocals from Lyrics and Musical Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce AutoSing, a novel framework designed to generate diverse and high-quality singing voices from provided lyrics and musical accompaniment. |
G. Streich; L. A. Lanzendörfer; F. Grötschla; R. Wattenhofer; | icassp | 2025-04-15 |
| 580 | A Novel Compressive Compound Word Encoding and Independent Word Attention for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enable models to capture the dependencies between independent tokens in super tokens when using compound word encoding, we propose compressive compound word (CCP) encoding and independent word attention (IWA). |
L. Zhou; L. Yin; Y. Qian; | icassp | 2025-04-15 |
| 581 | Learning Music Audio Representations With Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Understanding how these models behave in limited-data scenarios could be crucial for developing techniques to tackle them. In this work, we investigate the behavior of several music audio representation models under limited-data learning regimes. We consider music models with various architectures, training paradigms, and input durations, and train them on data collections ranging from 5 to 8,000 minutes long. |
C. Plachouras; E. Benetos; J. Pauwels; | icassp | 2025-04-15 |
| 582 | DOSE: Drum One-Shot Extraction from Music Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces Drum One-Shot Extraction, a task in which the goal is to extract drum one-shots that are present in the music mixture. |
S. Hwang; S. Kang; K. Kim; S. Ahn; K. Lee; | icassp | 2025-04-15 |
| 583 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method based on dual diffusion bridges, trained using the CocoChorales Dataset, which consists of unpaired monophonic single-instrument audio data. |
M. Mancusi; | icassp | 2025-04-15 |
| 584 | Ultra Lightweight Singing Melody Extraction Via Combination of Convolution and MLP Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose the lightweight convolutional MLP (LcMLP), an ultra-lightweight model that does not sacrifice performance. |
J. Liu; K. Dong; Q. Huang; S. Yu; W. Li; | icassp | 2025-04-15 |
| 585 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, most research focuses on Western music, facing challenges in generating Chinese traditional melodies, particularly in capturing modal characteristics and emotional expression. To address this, we propose the Dual-Feature Modeling Module, which integrates the long-range modeling of the Mamba Block with the global structure capturing of the Transformer Block. |
J. CHEN et. al. | icassp | 2025-04-15 |
| 586 | MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an automatic assessment task for TTM models to align with human perception. |
C. Liu; | icassp | 2025-04-15 |
| 587 | Semi-Supervised Contrastive Learning for Controllable Video-to-Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, identifying the best music for a video can be a difficult and time-consuming task. To address this challenge, we propose a novel framework for automatically retrieving a matching music clip for a given video, and vice versa. |
S. Stewart; G. KV; L. Lu; A. Fanelli; | icassp | 2025-04-15 |
| 588 | DCD-MUSIC: Deep-Learning-Aided Cascaded Differentiable MUSIC Algorithm for Near-Field Localization of Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces deep-learning-aided cascaded differentiable MUSIC (DCD-MUSIC) that augments MUSIC near-field localization with dedicated deep neural networks (DNNs), allowing it to operate reliably and interpretably. |
A. Gast; L. Le Magoarou; N. Shlezinger; | icassp | 2025-04-15 |
| 589 | A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thirdly, transformers typically require large amounts of labeled data to achieve optimal performance, but the SME task lacks sufficient annotated data. To address these issues, in this paper, we propose a mamba-based network, called SpectMamba, for semi-supervised singing melody extraction using confidence binary regularization. |
X. HE et. al. | icassp | 2025-04-15 |
| 590 | Naturalistic Music Decoding from EEG Data Via Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we explore the potential of using latent diffusion models, a family of powerful generative models, for the task of reconstructing naturalistic music from electroencephalogram (EEG) recordings. |
E. Postolache; | icassp | 2025-04-15 |
| 591 | HANet: A Harmonic Attention-Based Network for Singing Melody Extraction from Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Harmonic relationships have been shown to be crucial in this task, but most existing models based on Convolutional Neural Networks (CNNs) struggle to capture long-range harmonic dependencies. To address this, we propose a Harmonic Attention-based Network (HANet) for singing melody extraction from polyphonic music, which includes multiple sampling layers. |
S. Wang; X. Kong; H. Huang; K. Wang; Y. Hu; | icassp | 2025-04-15 |
| 592 | Melody Structure Transfer Network: Generating Music with Separable Self-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an approach to transfer the structural characteristics of training samples for generating music. |
J. WU et. al. | icassp | 2025-04-15 |
| 593 | Investigating Factors Related to The Naturalness of Synthesized Unison Singing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on unison singing, which is to have several singers singing the same melody together. |
K. Nishizawa; R. Yamamoto; W. -C. Huang; T. Toda; | icassp | 2025-04-15 |
| 594 | F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. |
M. Agarwal; C. Wang; G. Richard; | icassp | 2025-04-15 |
| 595 | MusicGen-Stem: Multi-stem Music Generation and Edition Through Autoregressive Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To do so, we train one specialized compression algorithm per stem to tokenize the music into parallel streams of tokens. |
S. Rouard; R. S. Roman; Y. Adi; A. Roebel; | icassp | 2025-04-15 |
| 596 | Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Other approaches involve performing source separation before conversion, but this often introduces noticeable artifacts, leading to a significant drop in conversion quality and increasing the user’s operational costs. To address these issues, we introduce a novel SVC method that uses self-supervised representation-based melody features to improve melody modeling accuracy in the presence of BGM. |
W. CHEN et. al. | icassp | 2025-04-15 |
| 597 | Exploring Acoustic Similarity in Emotional Speech and Music Via Self-Supervised Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we revisit the acoustic similarity between emotional speech and music, starting with an analysis of the layerwise behavior of SSL models for Speech Emotion Recognition (SER) and Music Emotion Recognition (MER). |
Y. Sun; Z. Zhao; K. Richmond; Y. Li; | icassp | 2025-04-15 |
| 598 | Progressive Rock Music Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conducted a comparative analysis of various machine learning techniques. |
Arpan Nagar; Joseph Bensabat; Jokent Gaza; Moinak Dey; | arxiv-cs.SD | 2025-04-14 |
| 599 | Compose with Me: Collaborative Music Inpainter for Symbolic Music Infilling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The field of music generation has seen a surge of interest from both academia and industry, with innovative platforms such as Suno, Udio, and SkyMusic earning widespread … |
Zhejing Hu; Yan Liu; Gong Chen; Bruce X. B. Yu; | AAAI Conference on Artificial Intelligence | 2025-04-11 |
| 600 | Extending Visual Dynamics for Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose DyViM, a novel framework to enhance dynamics modeling for video-to-music generation. |
Xiaohao Liu; Teng Tu; Yunshan Ma; Tat-Seng Chua; | arxiv-cs.MM | 2025-04-10 |
| 601 | Optimality of Gradient-MUSIC for Spectral Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the Gradient-MUSIC algorithm for estimating the unknown frequencies and amplitudes of a nonharmonic signal from noisy time samples. While the classical MUSIC algorithm performs a computationally expensive search over a fine grid, Gradient-MUSIC is significantly more efficient and eliminates the need for discretization over a fine grid by using optimization techniques. It coarsely scans the 1D landscape to find initialization simultaneously for all frequencies followed by parallelizable local refinement via gradient descent. |
Albert Fannjiang; Weilin Li; Wenjing Liao; | arxiv-cs.IT | 2025-04-09 |
| 602 | Deconstructing Jazz Piano Style Using Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we focus on musical style, which benefits from a rich theoretical and mathematical analysis tradition. |
Huw Cheston; Reuben Bance; Peter M. C. Harrison; | arxiv-cs.SD | 2025-04-07 |
| 603 | Twitter-MusicPD: Melody of Minds – Navigating User-level Data on Multiple Mental Health Disorders and Music Preferences Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Social media platforms have become integral spaces for individuals to express emotions, seek advice, and disclose mental health conditions. While existing research primarily … |
Soroush Zamani Alavijeh; Xingwei Yang; Zeinab Noorian; Amira Ghenai; Fattane Zarrinkalam; | EPJ Data Science | 2025-04-07 |
| 604 | Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a unified framework based on kernel methods to analyze both families of efficient PEs. |
Manvi Agarwal; Changhong Wang; Gael Richard; | arxiv-cs.SD | 2025-04-07 |
| 605 | Confidence-Enhanced Models for Indian Art Music Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine learning models for music have facilitated advancements in core applications like music pedagogy, singer identification, Rāga recognition, transcription, and genre … |
Sumit Kumar; Parampreet Singh; Vipul Arora; | 2025 IEEE International Conference on Acoustics, Speech, … | 2025-04-06 |
| 606 | Mozualization: Crafting Music and Visual Representation with Multimodal AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this work, we introduce Mozualization, a music generation and editing tool that creates multi-style embedded music by integrating diverse inputs, such as keywords, images, and … |
WANFANG XU et. al. | Proceedings of the Extended Abstracts of the CHI Conference … | 2025-04-05 |
| 607 | Graphs Are Everywhere — Psst! In Music Recommendation Too Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the efficacy of Graph Convolutional Networks (GCN), GraphSAGE, and Graph Transformer (GT) models in learning embeddings that effectively capture intricate relationships between music items and genres represented within graph structures. |
Bharani Jayakumar; Orkun Özoğlu; | arxiv-cs.IR | 2025-04-03 |
| 608 | Test-driving Information Theory-based Compositional Distributional Semantics: A Case Study on Spanish Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Adrián Ghajari; Alejandro Benito-Santos; Salvador Ros; Víctor Fresno-Fernández; E. González-Blanco; | Knowl. Based Syst. | 2025-04-01 |
| 609 | Two-Stage Spatial Whitening and Normalized MUSIC for Robust DOA Estimation of GNSS Signals Under Jamming IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The estimation of the direction of arrival (DOA) of navigation signals is a critical function of a global navigation satellite system (GNSS) array receiver for applications such … |
Chuanrui Wang; X. Cui; Gang Liu; Mingquan Lu; | IEEE Transactions on Aerospace and Electronic Systems | 2025-04-01 |
| 610 | AE-AMT: Attribute-Enhanced Affective Music Generation With Compound Word Representation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Affective music generation is a challenge for symbolic music generation. Existing methods face the problem that the perceived emotion of the generated music is not evident because … |
Weiyi Yao; C. L. P. Chen; Zongyan Zhang; Tong Zhang; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 611 | Exploring The Impact of An LLM-Powered Teachable Agent on Learning Gains and Cognitive Load in Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the impact of an LLM-powered teachable agent, grounded in the Learning by Teaching (LBT) pedagogy, on students’ music theory learning and cognitive load. |
Lingxi Jin; Baicheng Lin; Mengze Hong; Kun Zhang; Hyo-Jeong So; | arxiv-cs.HC | 2025-04-01 |
| 612 | Hybrid Learning Module-Based Transformer for Multitrack Music Generation With Music Theory IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, multitrack music generation has garnered significant attention in both academic and industrial spheres for its versatile utilization of various instruments in … |
Y. TIE et. al. | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 613 | Counterfactual Music Recommendation for Mitigating Popularity Bias Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation systems aim to suggest tracks that users may enjoy. However, the accuracy of recommendation results is affected by popularity bias. Previous studies have … |
Jidong Yuan; Bingyu Gao; Xiaokang Wang; Haiyang Liu; Lingyin Zhang; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 614 | Semantic Communication for VR Music Live Streaming With Rate Splitting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Virtual reality (VR) live streaming has established a remarkable transformation of music performances that facilitates a unique interaction between artists and their audiences … |
Jiaqi Zou; Lvxin Xu; Songlin Sun; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 615 | Text2Tracks: Prompt-based Music Recommendation Via Generative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose to address the task of prompt-based music recommendation as a generative retrieval task. |
ENRICO PALUMBO et. al. | arxiv-cs.IR | 2025-03-31 |
| 616 | Music Information Retrieval on Representative Mexican Folk Vocal Melodies Through MIDI Feature Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study analyzes representative Mexican folk vocal melodies using MIDI feature extraction, examining ambitus, pitch-class entropy, and interval distribution. |
Mario Alberto Vallejo Reyes; | arxiv-cs.SD | 2025-03-31 |
| 617 | Systematic CXL Memory Characterization and Performance Analysis at Scale IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Compute Express Link (CXL) has emerged as a pivotal interconnect for memory expansion. Despite its potential, the performance implications of CXL across devices, latency regimes, … |
JINSHU LIU et. al. | Proceedings of the 30th ACM International Conference on … | 2025-03-30 |
| 618 | CrossMuSim: A Cross-Modal Framework for Music Similarity Retrieval with LLM-Powered Text Description Sourcing and Mining Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome the scarcity of high-quality text-music paired data, this paper introduces a dual-source data acquisition approach combining online scraping and LLM-based prompting, where carefully designed prompts leverage LLMs’ comprehensive music knowledge to generate contextually rich descriptions. |
TRISTAN TSOI et. al. | arxiv-cs.SD | 2025-03-29 |
| 619 | Teaching LLMs Music Theory with In-Context Learning and Chain-of-Thought Prompting: Pedagogical Strategies for Machines Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study evaluates the baseline capabilities of Large Language Models (LLMs) like ChatGPT, Claude, and Gemini to learn concepts in music theory through in-context learning and chain-of-thought prompting. |
Liam Pond; Ichiro Fujinaga; | arxiv-cs.SD | 2025-03-28 |
| 620 | Vision-to-Music Generation: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we systematically review the research progress in the field of vision-to-music generation. |
ZHAOKAI WANG et. al. | arxiv-cs.CV | 2025-03-27 |
| 621 | Tune It Up: Music Genre Transfer and Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we adapt and improve the CycleGAN model to perform music style transfer on Jazz and Classic genres. |
Fidan Samet; Oguz Bakir; Adnan Fidan; | arxiv-cs.SD | 2025-03-27 |
| 622 | Emotion Detection and Music Recommendation System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As artificial intelligence becomes more and more ingrained in daily life, we present a novel system that uses deep learning for music recommendation and emotion-based detection. |
Swetha Kambham; Hubert Jhonson; Sai Prathap Reddy Kambham; | arxiv-cs.CV | 2025-03-26 |
| 623 | Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the conventional next-token prediction paradigm in AR models does not align with the human creative process in music composition, potentially compromising the musicality of generated samples. To overcome this limitation, we introduce MusiCoT, a novel chain-of-thought (CoT) prompting technique tailored for music generation. |
MAX W. Y. LAM et. al. | arxiv-cs.SD | 2025-03-25 |
| 624 | Music Similarity Representation Learning Focusing on Individual Instruments with Source Separation and Human Preference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose three methods that effectively improve performance. |
Takehiro Imamura; Yuka Hashizume; Wen-Chin Huang; Tomoki Toda; | arxiv-cs.SD | 2025-03-24 |
| 625 | Eurovision Song Contest: Can Juries Assess The Quality of Songs Objectively? Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Nikola Kadoić; N. Z. Hrustek; Maja Gligora Markovic; | Central Eur. J. Oper. Res. | 2025-03-20 |
| 626 | Analysis The Effect of Listening Music on The Human Brain By Using Electroencephalogram (EEG) Technique Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research investigates the impact of music reflexology on brain activity, utilizing Electroencephalogram (EEG) features to analyze neurological responses. The study addresses … |
Wan Khairunizam; A. Harsono; W. Y. Choong; W. A. Mustafa; | 2025 17th International Conference on Computer and … | 2025-03-20 |
| 627 | A Bird Song Detector for Improving Bird Identification Through Deep Learning: A Case Study from Doñana Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A key challenge in bird species identification is that many recordings either lack target species or contain overlapping vocalizations, complicating automatic identification. To address these problems, we developed a multi-stage pipeline for automatic bird vocalization identification in Doñana National Park (SW Spain), a wetland of high conservation concern. |
ALBA MÁRQUEZ-RODRÍGUEZ et. al. | arxiv-cs.SD | 2025-03-19 |
| 628 | Development and Evaluation of A Mixed Reality Music Visualization for A Live Performance Based on Music Information Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The present study explores the development and evaluation of a mixed reality music visualization for a live music performance. Real-time audio analysis and crossmodal … |
Matthias Erdmann; Markus von Berg; Jochen Steffens; | Frontiers Virtual Real. | 2025-03-19 |
| 629 | Automatic Extraction Method for Humming-to-Guzheng Melody Based on Improved YIN Algorithm Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yang Liu; Xinyue Liu; Ling Zhao; Bo Mi; | Multimedia Systems | 2025-03-19 |
| 630 | A Bimodal Deep Model to Capture Emotions from Music Tracks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This work aims to develop a deep model for automatically labeling music tracks in terms of induced emotions. The machine learning architecture consists of two components: one … |
Jan Tobolewski; Michał Sakowicz; J. Turmo; Bożena Kostek; | Journal of Artificial Intelligence and Soft Computing … | 2025-03-18 |
| 631 | Musicolors: Bridging Sound and Visuals For Synesthetic Creative Musical Experience Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we developed musicolors, a web-based music visualization library available in real-time. |
ChungHa Lee; Jin-Hyuk Hong; | arxiv-cs.HC | 2025-03-18 |
| 632 | SONICS: Synthetic Or Not – Identifying Counterfeit Songs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah; | iclr | 2025-03-17 |
| 633 | SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a novel framework, StochastIc Network Graph Evolving operatoR (SINGER), for learning the evolution operator of high-dimensional partial differential equations (PDEs). |
MINGQUAN FENG et. al. | iclr | 2025-03-17 |
| 634 | Serenade: A Singing Style Conversion Framework Based On Audio Infilling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose Serenade, a novel framework for the singing style conversion (SSC) task. |
Lester Phillip Violeta; Wen-Chin Huang; Tomoki Toda; | arxiv-cs.SD | 2025-03-16 |
| 635 | Cross-Modal Learning for Music-to-Music-Video Description Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we focus on the MV description generation task and propose a comprehensive pipeline encompassing training data construction and multimodal model fine-tuning. |
ZHUOYUAN MAO et. al. | arxiv-cs.SD | 2025-03-14 |
| 636 | Cultivation of Innovative Ability of Vocal Music Education Based on Big Data Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the wide application of big data technology in education, exploring its role in vocal music education has become particularly important. The purpose of this study is to … |
Siming Lin; | Journal of Computational Methods in Sciences and Engineering | 2025-03-14 |
| 637 | Impact of Mobile Technology-Integrated Dynamic Assessment on Students’ Music Rhythm Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Rhythm is the fundamental element of music and one of the indispensable aspects of training in music education, making it highly valued. However, due to the limitations of … |
C. Koong; Chih-Hung Chen; Yu‐Tzu Chen; Gwo-Haur Hwang; | J. Comput. Assist. Learn. | 2025-03-05 |
| 638 | Be The Beat: AI-Powered Boombox for Music Suggestion from Freestyle Dance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Dance has traditionally been guided by music throughout history and across cultures, yet the concept of dancing to create music is rarely explored. In this paper, we introduce Be … |
Ethan Chang; Zhixing Chen; Jb Labrune; Marcelo Coelho; | Proceedings of the Nineteenth International Conference on … | 2025-03-04 |
| 639 | What Sounds Dangerous? Establishing Correlations Of Musical Features and Perceived Safety in HRI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study explores the potential of music-driven sonification as an effective method for improving safety in human-robot collaboration. Building on the rich expressive … |
Amit Rogel; Jack Hayley; Richard J. Savery; Gil Weinberg; | 2025 20th ACM/IEEE International Conference on Human-Robot … | 2025-03-04 |
| 640 | Augmenting Online Meetings with Context-Aware Real-time Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate the potential of generative artificial intelligence (GenAI) for real-time music generation to enrich online meetings. |
Haruki Suzawa; Ko Watanabe; Andreas Dengel; Shoya Ishimaru; | arxiv-cs.HC | 2025-03-03 |
| 641 | BGM2Pose: Active 3D Human Pose Estimation with Non-Stationary Sounds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose BGM2Pose, a non-invasive 3D human pose estimation method using arbitrary music (e.g., background music) as active sensing signals. |
YUTO SHIBATA et. al. | arxiv-cs.CV | 2025-03-01 |
| 642 | Melody Prediction of Vocal Performance Using LSTM and Attention Mechanism and Its Application in Folk Music Innovation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper combines Long Short-Term Memory (LSTM) and Self-Attention mechanisms to predict melody direction in vocal performances, exploring its application in folk music … |
Yumei Zhang; Changlong Liu; | Journal of Computational Methods in Sciences and Engineering | 2025-02-28 |
| 643 | DGFM: Full Body Dance Generation Driven By Music Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In music-driven dance motion generation, most existing methods use hand-crafted features and neglect that music foundation models have profoundly impacted cross-modal content generation. To bridge this gap, we propose a diffusion-based method that generates dance movements conditioned on text and music. |
Xinran Liu; Zhenhua Feng; Diptesh Kanojia; Wenwu Wang; | arxiv-cs.SD | 2025-02-27 |
| 644 | JEN-1 DreamStyler: Customized Musical Concept Learning Via Pivotal Parameters Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. |
Boyu Chen; Peike Li; Yao Yao; Alex Wang; | aaai | 2025-02-25 |
| 645 | SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies. |
JIAXING YU et. al. | aaai | 2025-02-25 |
| 646 | Detecting Music Performance Errors with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address (2), we present a novel data generation technique capable of creating large-scale synthetic music error datasets. |
BENJAMIN SHIUE-HAL CHOU et. al. | aaai | 2025-02-25 |
| 647 | CSL-L2M: Controllable Song-Level Lyric-to-Melody Generation Based on Conditional Transformer with Fine-Grained Lyric and Musical Controls Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to the difficulty of learning strict yet weak correlations between lyrics and melodies, previous methods have suffered from weak controllability, low-quality and poorly structured generation. To address these challenges, we propose CSL-L2M, a controllable song-level lyric-to-melody generation method based on an in-attention Transformer decoder with fine-grained lyric and musical controls, which is able to generate full-song melodies matched with the given lyrics and user-specified musical attributes. |
Li Chai; Donglin Wang; | aaai | 2025-02-25 |
| 648 | Text2midi: Generating Symbolic Music from Captions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces text2midi, an end-to-end model to generate MIDI files from textual descriptions. |
KESHAV BHANDARI et. al. | aaai | 2025-02-25 |
| 649 | JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This departure from the typical workflows of professional composers hinders the ability to refine details in specific tracks. To address this gap, we propose JEN-1 Composer, a unified framework designed to efficiently model marginal, conditional, and joint distributions over multi-track music using a single model. |
Yao Yao; Peike Li; Boyu Chen; Alex Wang; | aaai | 2025-02-25 |
| 650 | GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we present the General Video-to-Music Generation model (GVMGen), designed for generating music highly related to the video input. |
HEDA ZUO et. al. | aaai | 2025-02-25 |
| 651 | SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce SongSong, the first music generation model capable of restoring Chinese SongCi to our knowledge. |
JILIANG HU et. al. | aaai | 2025-02-25 |
| 652 | S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we bridge the structural and semantic gap between music and lyrics by proposing an end-to-end model for music-driven lyric generation. |
Mu You; Fang Zhang; Shuai Zhang; Linli Xu; | aaai | 2025-02-25 |
| 653 | SongEditor: Adapting Zero-Shot Song Generation Language Model As A Multi-Task Editor Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present SongEditor, the first song editing paradigm that introduces the editing capabilities into language-modeling song generation approaches, facilitating both segment-wise and track-wise modifications. |
CHENYU YANG et. al. | aaai | 2025-02-25 |
| 654 | Drop The Beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Freestyler, the first system that generates rapping vocals directly from lyrics and accompaniment inputs. |
ZIQIAN NING et. al. | aaai | 2025-02-25 |
| 655 | UniMuMo: Unified Text, Music, and Motion Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. |
HAN YANG et. al. | aaai | 2025-02-25 |
| 656 | Characterizations of Kadison–Singer Lattices and Lie Triple Derivations on Kadison–Singer Algebras Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Guangyu An; Tian Fang; Danni Zhao; | Periodica Mathematica Hungarica | 2025-02-24 |
| 657 | The GigaMIDI Dataset with Features for Expressive Music Performance Detection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Distinguishing between non-expressive and expressive MIDI tracks is challenging, as MIDI files do not inherently make this distinction. To address this issue, we introduce a set of innovative heuristics for detecting expressive music performance. |
KEON JU MAVERICK LEE et. al. | arxiv-cs.SD | 2025-02-24 |
| 658 | Perceptual Noise-Masking with Music Through Deep Spectral Envelope Shaping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Indeed, a music signal can mask some of the noise’s frequency components due to the effect of simultaneous masking. In this article, we propose a neural network based on a psychoacoustic masking model, designed to enhance the music’s ability to mask ambient noise by reshaping its spectral envelope with predicted filter frequency responses. |
Clémentine Berger; Roland Badeau; Slim Essid; | arxiv-cs.SD | 2025-02-24 |
| 659 | ComposeOn Academy: Transforming Melodic Ideas Into Complete Compositions Integrating Music Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing digital audio workstations and music production software often present high entry barriers for users lacking formal musical training. To address this, we introduce ComposeOn, a music theory-based tool designed for users with limited musical knowledge. |
Hongxi Pu; Futian Jiang; Zihao Chen; Xingyue Song; | arxiv-cs.HC | 2025-02-21 |
| 660 | Visual and Auditory Aesthetic Preferences Across Cultures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a large-scale cross-cultural study examining aesthetic preferences across five distinct modalities extensively explored in the literature: shape, curvature, colour, musical harmony and melody. |
HARIN LEE et. al. | arxiv-cs.MM | 2025-02-20 |
| 661 | Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to analyze music compositions holistically and at multiple granularities, we propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG). |
ILANA SHAPIRO et. al. | arxiv-cs.AI | 2025-02-20 |
| 662 | TALKPLAY: Multimodal Music Recommendation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present TALKPLAY, a novel multimodal music recommendation system that reformulates recommendation as a token generation problem using large language models (LLMs). |
Seungheon Doh; Keunwoo Choi; Juhan Nam; | arxiv-cs.IR | 2025-02-19 |
| 663 | Note-Level Singing Melody Transcription for Time-Aligned Musical Score Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider an extended version of the traditional note-level transcription task that recognizes onset, offset, and pitch, through including extraction of additional note value to generate a time-aligned score from an audio input. |
Leekyung Kim; Sungwook Jeon; Wan Heo; Jonghun Park; | arxiv-cs.SD | 2025-02-17 |
| 664 | NOTA: Multimodal Music Notation Understanding for Visual Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing general-domain visual language models still lack the ability of music notation understanding. Recognizing this gap, we propose NOTA, the first large-scale comprehensive multimodal music notation dataset. |
MINGNI TANG et. al. | arxiv-cs.CV | 2025-02-17 |
| 665 | CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To advance future research, we release WikiMT-X, a benchmark comprising 1,000 triplets of sheet music, audio, and richly varied text descriptions. |
SHANGDA WU et. al. | arxiv-cs.SD | 2025-02-14 |
| 666 | Beyond English: Unveiling Multilingual Bias in LLM Copyright Compliance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs) have raised significant concerns regarding the fair use of copyright-protected content. While prior studies have examined the extent to which LLMs … |
Yupeng Chen; Xiaoyu Zhang; Yixian Huang; Qian Xie; | ArXiv | 2025-02-14 |
| 667 | Music Style Transfer and Creation Method Based on Transfer Learning Algorithm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the growth of people’s demand for personalized music, how to use AI technology to achieve accurate understanding and creative transformation of music styles has become an … |
Shuiyi Chi; Hao Chen; | Journal of Computational Methods in Sciences and Engineering | 2025-02-14 |
| 668 | F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. |
Manvi Agarwal; Changhong Wang; Gael Richard; | arxiv-cs.SD | 2025-02-14 |
| 669 | Engaging K-12 Students with Flow-Based Music Programming: An Experience Report on Its Impact on Teaching and Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music and computer science (CS) have profound historical and structural connections, with programming music offering a promising avenue for engaging children in CS through … |
ZIFENG LIU et. al. | Proceedings of the 56th ACM Technical Symposium on Computer … | 2025-02-12 |
| 670 | Methods for Pitch Analysis in Contemporary Popular Music: Highlighting Pitch Uncertainty in Primaal’s Commercial Works Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The ultimate goal of the study is to introduce a set of methods suited to the analysis of pitch in contemporary popular music. |
Emmanuel Deruty; Luc Leroy; Yann Macé; David Meredith; | arxiv-cs.SD | 2025-02-12 |
| 671 | Hookpad Aria: A Copilot for Songwriters Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Hookpad Aria, a generative AI system designed to assist musicians in writing Western pop songs. |
CHRIS DONAHUE et. al. | arxiv-cs.SD | 2025-02-12 |
| 672 | Are Expressions for Music Emotions The Same Across Cultures? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A key challenge in cross-cultural research on music emotion is biased stimulus selection and manual curation of taxonomies, predominantly relying on Western music and languages. To address this, we propose a balanced experimental design with nine online experiments in Brazil, the US, and South Korea, involving N=672 participants. |
Elif Celen; Pol van Rijn; Harin Lee; Nori Jacoby; | arxiv-cs.CL | 2025-02-12 |
| 673 | YNote: A Novel Music Notation for Fine-Tuning LLMs in Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These formats are difficult for both machines and humans to interpret due to their variability and intricate structure. To address these challenges, we introduce YNote, a simplified music notation system that uses only four characters to represent a note and its pitch. |
SHAO-CHIEN LU et. al. | arxiv-cs.SD | 2025-02-12 |
| 674 | Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. |
ATHARVA MEHTA et. al. | arxiv-cs.SD | 2025-02-11 |
| 675 | JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce JamendoMaxCaps, a large-scale music-caption dataset featuring over 362,000 freely licensed instrumental tracks from the renowned Jamendo platform. |
Abhinaba Roy; Renhang Liu; Tongyu Lu; Dorien Herremans; | arxiv-cs.SD | 2025-02-11 |
| 676 | Learning Musical Representations for Music Performance Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Therefore, existing methods tend to answer questions regarding musical performances inaccurately. To bridge the above research gaps, (i) given the intricate multimodal interconnectivity inherent to music data, our primary backbone is designed to incorporate multimodal interactions within the context of music; (ii) to enable the model to learn music characteristics, we annotate and release rhythmic and music sources in the current music datasets; (iii) for time-aware audio-visual modeling, we align the model’s music predictions with the temporal dimension. |
XINGJIAN DIAO et. al. | arxiv-cs.CV | 2025-02-10 |
| 677 | Automatic Identification of Samples in Hip-Hop Music Via Multi-Loss Training and An Artificial Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we show that a convolutional neural network trained on an artificial dataset can identify real-world samples in commercial hip-hop music. We extract vocal, harmonic, and percussive elements from several databases of non-commercial music recordings using audio source separation, and train the model to fingerprint a subset of these elements in transformed versions of the original audio. |
Huw Cheston; Jan Van Balen; Simon Durand; | arxiv-cs.SD | 2025-02-10 |
| 678 | Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Other approaches involve performing source separation before conversion, but this often introduces noticeable artifacts, leading to a significant drop in conversion quality and increasing the user’s operational costs. To address these issues, we introduce a novel SVC method that uses self-supervised representation-based melody features to improve melody modeling accuracy in the presence of BGM. |
WEI CHEN et. al. | arxiv-cs.SD | 2025-02-07 |
| 679 | ImprovNet – Generating Controllable Musical Improvisations with Iterative Corruption Refinement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite deep learning’s remarkable advances in style transfer across various domains, generating controllable performance-level musical style transfer for complete symbolically … |
KESHAV BHANDARI et. al. | 2025 International Joint Conference on Neural Networks … | 2025-02-06 |
| 680 | ImprovNet — Generating Controllable Musical Improvisations with Iterative Corruption Refinement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents ImprovNet, a transformer-based architecture that generates expressive and controllable musical improvisations through a self-supervised corruption-refinement training strategy. |
KESHAV BHANDARI et. al. | arxiv-cs.SD | 2025-02-06 |
| 681 | Investigation of Perceptual Music Similarity Focusing on Each Instrumental Part Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents an investigation of perceptual similarity between music tracks focusing on each individual instrumental part based on a large-scale listening test towards developing an instrumental-part-based music retrieval. |
Yuka Hashizume; Tomoki Toda; | arxiv-cs.SD | 2025-02-04 |
| 682 | On Bob Dylan: A Computational Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, I extend Sunstein’s observations through a large-scale computational analysis of Dylan’s lyrics from 1962 to 2012. |
Prashant Garg; | arxiv-cs.CL | 2025-02-03 |
| 683 | Secure & Personalized Music-to-Video Generation Via CHARCHA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music is a deeply personal experience and our aim is to enhance this with a fully-automated pipeline for personalized music video generation. |
Mehul Agarwal; Gauri Agarwal; Santiago Benoit; Andrew Lippman; Jean Oh; | arxiv-cs.AI | 2025-02-02 |
| 684 | The Beatbots: A Musician-Informed Multi-Robot Percussion Quartet Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose design principles to guide the development of future robotic music systems and identify key robotic music affordances that our musician consultants considered particularly important for robotic music performance. |
Isabella Pu; Jeff Snyder; Naomi Ehrich Leonard; | arxiv-cs.RO | 2025-02-02 |
| 685 | Gamispotify: A Gamified Social Music Recommendation System Based on Users’ Personal Values Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we have introduced Gamispotify. For the first time, in a social network-based environment, and by benefiting from gamification and crowdsourcing, Gamispotify … |
Mohammad Hajarian; Miguel Herrera Carrillo; Paloma Díaz; I. Aedo; | Multimedia Tools and Applications | 2025-02-01 |
| 686 | Music Dynamics Visualization for Music Practice and Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Eun Ji Park; | Multimedia Tools and Applications | 2025-01-31 |
| 687 | Every Image Listens, Every Image Dances: Music-Driven Image Animation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce MuseDance, an innovative end-to-end model that animates reference images using both music and text inputs. |
Zhikang Dong; Weituo Hao; Ju-Chiang Wang; Peng Zhang; Pawel Polak; | arxiv-cs.CV | 2025-01-30 |
| 688 | Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The purpose of this study is to analyze the behavior of Sinhala comments on YouTube Sinhala song videos using social media comments as primary data sources. |
W. M. Yomal De Mel; Nisansa de Silva; | arxiv-cs.CL | 2025-01-28 |
| 689 | Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity of singing data. |
Shuqi Dai; Yunyun Wang; Roger B. Dannenberg; Zeyu Jin; | arxiv-cs.SD | 2025-01-23 |
| 690 | Exploring GPT’s Ability As A Judge in Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we use a systematic prompt engineering approach for LLMs to solve MIR problems. |
Kun Fang; Ziyu Wang; Gus Xia; Ichiro Fujinaga; | arxiv-cs.IR | 2025-01-22 |
| 691 | Chromagram Features Analysis for Learning-Based Query By Humming Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Query by Humming (QBH) system is a melody-based searching system that can retrieve the song without using the information of the title, the composer, or lyrics. To well … |
Kuan-Yu Chen; Jian-Jiun Ding; | 2025 International Conference on Electronics, Information, … | 2025-01-19 |
| 692 | MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an automatic assessment task for TTM models to align with human perception. |
CHENG LIU et. al. | arxiv-cs.SD | 2025-01-18 |
| 693 | Deep Learning for Music Genre Classification: A Case Study of Thai Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Classifying music genres plays an important role in music recommendation systems and information retrieval. Advanced deep learning model has provided promising results compared to … |
Pasin Sawaengsawangarom; Suparoek Phongoen; Papis Wongchaisuwat; | Proceedings of the 2025 9th International Conference on … | 2025-01-16 |
| 694 | XMusic: Towards A Generalized and Controllable Symbolic Music Generation Framework IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a generalized symbolic music generation framework, XMusic, which supports flexible prompts (i.e., images, videos, texts, tags, and humming) to generate emotionally controllable and high-quality symbolic music. |
Sida Tian; Can Zhang; Wei Yuan; Wei Tan; Wenjie Zhu; | arxiv-cs.SD | 2025-01-15 |
| 695 | Innovative Applications and Teaching Effectiveness Analysis of Interactive Mobile Technology in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid advancement of mobile Internet technology, the application of interactive mobile technology in education has emerged as a significant area of research, particularly … |
Na Sun; Yingran Zang; | Int. J. Interact. Mob. Technol. | 2025-01-13 |
| 696 | Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce ‘Sanidha’, the first open-source novel dataset for Carnatic music, offering studio-quality, multi-track recordings with minimal to no overlap or bleed. |
Venkatakrishnan Vaidyanathapuram Krishnan; Noel Alben; Anish Nair; Nathaniel Condit-Schultz; | arxiv-cs.SD | 2025-01-12 |
| 697 | AI-Based Melody Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the evolving realm of social media, music significantly enhances post appeal and viewer engagement. However, challenges such as copyright and royalties complicate its usage. … |
Roberto Cavicchioli; Jia-Cheng Hu; Marco Furini; | 2025 IEEE 22nd Consumer Communications & Networking … | 2025-01-10 |
| 698 | Music and Art: A Study in Cross-modal Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose guidelines for using music to enhance the experience of viewing art, and we propose directions for future research. |
Paul Warren; Paul Mulholland; Naomi Barker; | arxiv-cs.HC | 2025-01-09 |
| 699 | Music Tagging with Classifier Group Chains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose music tagging with classifier chains that model the interplay of music tags. |
Takuya Hasumi; Tatsuya Komatsu; Yusuke Fujita; | arxiv-cs.SD | 2025-01-09 |
| 700 | Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a general framework for building interval-based tokenizations. |
Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; | arxiv-cs.IR | 2025-01-08 |
| 701 | MAJL: A Model-Agnostic Joint Learning Framework for Music Source Separation and Pitch Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods still face two critical challenges that limit the improvement of both tasks: the lack of labeled data and joint learning optimization. To address these challenges, we propose a Model-Agnostic Joint Learning (MAJL) framework for both tasks. |
Haojie Wei; Jun Yuan; Rui Zhang; Quanyu Dai; Yueguo Chen; | arxiv-cs.SD | 2025-01-07 |
| 702 | Multi-label Cross-lingual Automatic Music Genre Classification from Lyrics with Sentence BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a multi-label, cross-lingual genre classification system based on multilingual sentence embeddings generated by sBERT. |
Tiago Fernandes Tavares; Fabio José Ayres; | arxiv-cs.IR | 2025-01-07 |
| 703 | Application of Blockchain Technology in Digital Music Copyright Management: A Case Study of VNT Chain Platform Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the design and development of a digital music copyright management system, built on the VNT Chain blockchain platform. The system aims to enhance copyright … |
Qilong Shi; Yan Zhou; | Frontiers Blockchain | 2025-01-07 |
| 704 | SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and An Open-Source Professional Testset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a high-fidelity singing voice conversion system. |
YIQUAN ZHOU et. al. | arxiv-cs.SD | 2025-01-06 |
| 705 | A System for Melodic Harmonization Using Schoenberg Regions, Giant Steps, and Church Modes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, I describe Harmonizer, a prototype system for melodic harmonization. |
Frederick Fernandes; | arxiv-cs.SD | 2025-01-05 |
| 706 | Can Impressions of Music Be Extracted from Thumbnail Images? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This type of information is underrepresented in existing music caption datasets due to the challenges associated with extracting it directly from music data. To address this issue, we propose a method for generating music caption data that incorporates non-musical aspects inferred from music thumbnail images, and validated the effectiveness of our approach through human evaluations. |
Takashi Harada; Takehiro Motomitsu; Katsuhiko Hayashi; Yusuke Sakai; Hidetaka Kamigaito; | arxiv-cs.CL | 2025-01-05 |
| 707 | MusicGen-Stem: Multi-stem Music Generation and Edition Through Autoregressive Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To do so, we train one specialized compression algorithm per stem to tokenize the music into parallel streams of tokens. |
Simon Rouard; Robin San Roman; Yossi Adi; Axel Roebel; | arxiv-cs.SD | 2025-01-03 |
| 708 | MMVA: Multimodal Matching Based on Valence and Arousal Across Images, Music, and Musical Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Multimodal Matching based on Valence and Arousal (MMVA), a tri-modal encoder framework designed to capture emotional content across images, music, and musical captions. |
Suhwan Choi; Kyu Won Kim; Myungjoo Kang; | arxiv-cs.SD | 2025-01-02 |
| 709 | MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a self-supervised music representation learning model for music understanding. |
HAINA ZHU et. al. | arxiv-cs.SD | 2025-01-02 |
| 710 | Aggregating Contextual Information for Multi-Criteria Online Music Recommendations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper introduces CAMCMusic, a novel context-aware multi-criteria music recommendation system designed to address these limitations without relying on user-specific … |
Jieqi Liu; | IEEE Access | 2025-01-01 |
| 711 | PIMG: Progressive Image-to-Music Generation With Contrastive Diffusion Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The goal of Image-to-Music Generation is to create pure music according to the given image. Unlike existing tasks such as text-to-image generation, there is no explicit connection … |
Mulin Chen; Yajie Wang; Xuelong Li; | IEEE Transactions on Multimedia | 2025-01-01 |
| 712 | Interacting with Annotated and Synchronized Music Corpora on The Dezrann Web Platform Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Open datasets with annotated corpora are crucial to foster research in music information retrieval (MIR) studies and to disseminate knowledge towards musicians and the general … |
CHARLES BALLESTER et. al. | Trans. Int. Soc. Music. Inf. Retr. | 2025-01-01 |
| 713 | Music Generation Using Deep Learning and Generative AI: A Systematic Review IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a systematic review of recent advances in music generation using deep learning techniques, categorizing the latest research in the field and identifying key … |
Rohan Mitra; Imran A. Zualkernan; | IEEE Access | 2025-01-01 |
| 714 | Personalized Music Education: A Systematic Review of AI Generation Methods Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite recent notable advances of AI-Education research, formal music education is still largely performed as it was at the beginning of the last century. Instrument technical … |
Filippo Carnovalini; Luís Espírito Santo; Geraint A. Wiggins; | IEEE Access | 2025-01-01 |
| 715 | Many-to-Many Singing Performance Style Transfer on Pitch and Energy Contours Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing voice conversion (SVC) aims to convert the singer identity of a singing voice to that of another singer. However, most existing SVC systems only perform the conversion of … |
Yu-Teng Hsu; J. Wang; Jyh-Shing Roger Jang; | IEEE Signal Processing Letters | 2025-01-01 |
| 716 | Music Emotion Classification Based on Heterogeneous Graph Neural Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The classification of musical emotions is crucial for the indexing, structuring, searching, and recommending of tracks and albums across various music platforms. Consequently, the … |
Jingying Guo; Peng Wang; | IEEE Access | 2025-01-01 |
| 717 | Exploring The Use of Virtual Reality and AI to Create Immersive, Interactive Music Experiences for Performance, Education, and Therapy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study investigates the synchronization of real-time music generation with visual elements in Virtual Reality (VR) environments, leveraging Artificial Intelligence (AI) to … |
Jing Zhao; Kun-Hung Cheng; | IEEE Access | 2025-01-01 |
| 718 | LYRICEL: Knowledge Graphs Combined With Large Language Models and Machine Learning for Cross-Cultural Analysis of Lyrics—The Case of Greek Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
DIMITRIOS P. PANAGOULIAS et. al. | IEEE Access | 2025-01-01 |
| 719 | An Exploration of Controllability in Symbolic Music Infilling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study uses a transformer model to enhance the controllability of generative symbolic music models, specifically related to the infilling task. We introduce a novel Symbolic … |
Rui Guo; Dorien Herremans; | IEEE Access | 2025-01-01 |
| 720 | A Multimodal Deep Network for Music Emotion Recognition Using Audio Chorus and Lyrics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music emotion recognition (MER) is an essential branch in music information retrieval, focusing on categorization of music based on emotional content. This study introduces a … |
Mohammad Ali Talaghat; Elham Parvinnia; M. Mehrabi; R. Boostani; | IEEE Access | 2025-01-01 |
| 721 | Improving Automatic Detection of Gender-Based Violence in Spanish Song Lyrics Using Deep Learning, Data Augmentation and Undersampling Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Richard Calbullanca Viluñir; A. Navarrete; Christian Vidal-Castro; C. Martínez-Araneda; | FICC | 2025-01-01 |
| 722 | The Integration of Artificial Intelligence and Ethnic Music Cultural Inheritance Under Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The traditional music education system faces numerous challenges in inheriting ethnic music culture. Especially in the modern educational environment, the protection and … |
Wenbo Chang; | Comput. Sci. Inf. Syst. | 2025-01-01 |
| 723 | MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
CHENG LIU et. al. | ArXiv | 2025-01-01 |
| 724 | Dancing to The #Challenge: The Effect of TikTok on Closing The Artist Gender Gap in The Music Industry Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study investigates how Hashtag Dance Challenges (HDCs), a phenomenon popularized on the short-video platform TikTok, are instrumental in helping music artists achieve … |
Yifei Wang; Jui Ramaprasad; Anand Gopal; | MIS Q. | 2025-01-01 |
| 725 | Application of Big Data Analysis in Optimizing Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid development of big data technology, its application in the field of education, especially music education, has become a new research field. This study is dedicated … |
Shaowei Min; | Journal of Computational Methods in Sciences and Engineering | 2025-01-01 |
| 726 | From Push Buttons to Notes: A Hardware/Software Ecosystem for Inclusive Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper explores several ways to drive a music-oriented computer system by push-button controls, with a particular focus on music education for young children and individuals … |
L. A. Ludovico; Vanessa Faschi; Federico Avanzini; Emanuele Parravicini; Manuele Maestri; | International Conference on Computer Supported Education | 2025-01-01 |
| 727 | A CNN-Based Approach for Classical Music Recognition and Style Emotion Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recognition refers to the process of automatically recognizing and classifying the musical content in audio signals using computer technology and algorithms. Music … |
Yawen Shi; | IEEE Access | 2025-01-01 |
| 728 | Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As artificial intelligence algorithms continue to advance, researchers have increasingly harnessed their capabilities to generate music that resonates with human emotions, … |
Hongtao Wang; Lingbin Gong; | IEEE Access | 2025-01-01 |
| 729 | Music for All: Exploring Multicultural Representations in Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save |
ATHARVA MEHTA et. al. | ArXiv | 2025-01-01 |
| 730 | AI and Music, How Do Listeners and Artists Perceive It? An Empirical Study Toward The Attitude of Humans to AI Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Kevin Laksana Iskandar; Ton A. M. Spil; F. Bukhsh; | Hawaii International Conference on System Sciences | 2025-01-01 |
| 731 | Multimodal Music Genre Classification of Sotho-Tswana Musical Videos Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music genre classification is a fundamental task in music information retrieval, aimed at discerning the categorical placement, or genre, of a given musical piece. Such … |
Osondu E. Oguike; Mpho Primus; | IEEE Access | 2025-01-01 |
| 732 | Emotion-Based Music Recommendation System Integrating Facial Expression Recognition and Lyrics Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Facial Expression Recognition (FER) has created widespread interest due to its potential uses in personalized technology and mental health, notably in systems that recommend music … |
V. S. G. S. P. Bottu; Krishnasamy Ragavan; | IEEE Access | 2025-01-01 |
| 733 | Unrolled Creative Adversarial Network For Generating Novel Musical Pieces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, a classical system was employed alongside a new system to generate creative music. |
Pratik Nag; | arxiv-cs.SD | 2024-12-31 |
| 734 | LyricScraper: A Dataset of Spanish Song Lyrics Created Via Web Scraping and Dual-Labeling for LLM Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Songs represent a powerful means of expressing emotions through melody and lyrics. This study focuses on understanding and classifying emotions present in songs, ranging from … |
Tania Alcántara; Omar García-Vázquez; Mayte Hernández; Hiram Calvo; Alan Desiderio; | Computación y Sistemas (CyS) | 2024-12-30 |
| 735 | Comparative Analysis of Document-Level Embedding Methods for Similarity Scoring on Shakespeare Sonnets and Taylor Swift Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study evaluates the performance of TF-IDF weighting, averaged Word2Vec embeddings, and BERT embeddings for document similarity scoring across two contrasting textual domains. |
Klara Kramer; | arxiv-cs.CL | 2024-12-23 |
| 736 | Music Genre Classification: Ensemble Learning with Subcomponents-level Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The letter introduces a novel approach by combining ensemble learning with attention to sub-components, aiming to enhance the accuracy of identifying music genres. |
Yichen Liu; Abhijit Dasgupta; Qiwei He; | arxiv-cs.SD | 2024-12-20 |
| 737 | Tuning Music Education: AI-Powered Personalization in Learning Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the second case study we prototype adaptive piano method books that use Automatic Music Transcription to generate exercises at different skill levels while retaining a close connection to musical interests. |
Mayank Sanganeria; Rohan Gala; | arxiv-cs.SD | 2024-12-18 |
| 738 | Detecting Machine-Generated Music with Explainability – A Challenge and Early Benchmarks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine-generated music (MGM) has become a groundbreaking innovation with wide-ranging applications, such as music therapy, personalised editing, and creative inspiration within … |
Yupei Li; Qiyang Sun; Hanqian Li; Lucia Specia; Bjorn W. Schuller; | ArXiv | 2024-12-18 |
| 739 | Detecting Machine-Generated Music with Explainability — A Challenge and Early Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By providing a comprehensive comparison of benchmark results and their interpretability, we propose several directions to inspire future research to develop more robust and effective detection methods for MGM. |
Yupei Li; Qiyang Sun; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-17 |
| 740 | Leveraging User-Generated Metadata of Online Videos for Cover Song Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a multi-modal approach for cover song identification on online video platforms. |
Simon Hachmeier; Robert Jäschke; | arxiv-cs.MM | 2024-12-16 |
| 741 | A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we provide a novel dataset of user-generated metadata and conduct a benchmark and a robustness study using recent LLMs with in-context-learning (ICL). |
Simon Hachmeier; Robert Jäschke; | arxiv-cs.CL | 2024-12-16 |
| 742 | INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce INTERACT (INTERactive learning for Adaptive Concept Transfer), a framework in which a student LLM engages a teacher LLM through iterative inquiries to acquire knowledge across 1,347 contexts, including song lyrics, news articles, movie plots, academic papers, and images. |
Aum Kendapadi; Kerem Zaman; Rakesh R. Menon; Shashank Srivastava; | arxiv-cs.CL | 2024-12-15 |
| 743 | Sparse Sounds: Exploring Low-Dimensionality in Music Generation Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We are the first to explore the intricacies of LLM compression techniques in the context of text-to-music generation, focusing on the MusicGen Transformer model. We implement and … |
Shu Wang; Shiwei Liu; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 744 | Ourmuse: Plot-Specific AI Music Generation for Video Advertising Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The integration of music and video is a pivotal aspect of creating impactful advertisements. This study explores the application of Artificial Intelligence (AI) in generating … |
Hyeseong Park; Myung Won Raymond Jung; Sanjarbek Rakhmonov; Sngon Kim; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 745 | Emotion Classification of Lyrics Through Summarization By Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We propose a method that utilizes a large language model in the task of lyrics emotion classification. We especially employ GPT-4o, which is expected to deliver high performance … |
Sho Miyakawa; T. Utsuro; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 746 | Graph Neural Network Guided Music Mashup Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music mashups integrate elements from different songs to create surprising and engaging listening experiences. Typically, a mashup combines the vocal track of a base song with the … |
Xinyang Wu; Andrew Horner; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 747 | Challenging Fairness: A Comprehensive Exploration of Bias in LLM-Based Recommendations IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Model (LLM)-based recommendation systems provide more comprehensive recommendations than traditional systems by deeply analyzing content and user behavior. However, … |
Shahnewaz Karim Sakib; Anindya Bijoy Das; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 748 | Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a novel method named Visuals Music Bridge (VMB). |
BAISEN WANG et. al. | arxiv-cs.CV | 2024-12-12 |
| 749 | The Emotional Bridge: Exploring The Association Between Color and Western and Chinese Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Two prominent hypotheses have been proposed to explain the intriguing connection between music and color. The Direct Link Hypothesis posits a direct correlation between the two … |
Kaihui Lin; Daixin Zhang; Rongrong Chen; | Proceedings of the 17th International Symposium on Visual … | 2024-12-11 |
| 750 | Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper we introduce the Frechet Music Distance (FMD), a novel evaluation metric for generative symbolic music models, inspired by the Frechet Inception Distance (FID) in computer vision and Frechet Audio Distance (FAD) in generative audio. |
Jan Retkowski; Jakub Stępniak; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-12-10 |
| 751 | MuMu-LLaMA: Multi-modal Music Understanding and Generation Via Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address this, we introduce a dataset with 167.69 hours of multi-modal data, including text, images, videos, and music annotations. Based on this dataset, we propose MuMu-LLaMA, a model that leverages pre-trained encoders for music, images, and videos. |
Shansong Liu; Atin Sakkeer Hussain; Qilong Wu; Chenshuo Sun; Ying Shan; | arxiv-cs.SD | 2024-12-09 |
| 752 | A Comprehensive Approach Integrating Spot Atmosphere, User Situations, and Moods for Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation plays a significant role in enhancing the daily experiences of individuals, especially during activities such as traveling. In this study, we propose an … |
Da Li; Fumina Maruoka; Tadahiko Kumamoto; Shintaro Ono; Yukiko Kawai; | 2024 IEEE/WIC International Conference on Web Intelligence … | 2024-12-09 |
| 753 | Converting Vocal Performances Into Sheet Music Leveraging Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Advanced natural language processing (NLP) models are increasingly applied in music composition and performance, particularly in generating vocal melodies and simulating singing … |
Jinjing Jiang; Nicole Anne Teo Huiying; Haibo Pen; Seng-Beng Ho; Zhaoxia Wang; | 2024 IEEE International Conference on Data Mining Workshops … | 2024-12-09 |
| 754 | VidMusician: Video-to-Music Generation with Semantic-Rhythmic Alignment Via Hierarchical Visual Features IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose VidMusician, a parameter-efficient video-to-music generation framework built upon text-to-music models. |
SIFEI LI et. al. | arxiv-cs.SD | 2024-12-09 |
| 755 | AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just Sounds Great! Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The rise of bedroom producers has democratized music creation, while challenging producers to objectively evaluate their work. To address this, we present AI TrackMate, an LLM-based music chatbot designed to provide constructive feedback on music productions. |
Yi-Lin Jiang; Chia-Ho Hsiung; Yen-Tung Yeh; Lu-Rong Chen; Bo-Yu Chen; | arxiv-cs.SD | 2024-12-09 |
| 756 | Advancing Music Therapy: Integrating Eastern Five-Element Music Theory and Western Techniques with AI in The Novel Five-Element Harmony System Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this article, we developed a music therapy system for the first time by applying the theory of the five elements in music therapy to practice. |
Yubo Zhou; Weizhen Bian; Kaitai Zhang; Xiaohan Gu; | arxiv-cs.HC | 2024-12-09 |
| 757 | Source Separation & Automatic Transcription for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Using spectrogram masking, deep neural networks, and the MuseScore API, we attempt to create an end-to-end pipeline that allows for an initial music audio mixture (e.g. … |
Bradford Derby; Lucas Dunker; Samarth Galchar; Shashank Jarmale; Akash Setti; | arxiv-cs.SD | 2024-12-09 |
| 758 | M6: Multi-generator, Multi-domain, Multi-lingual and Cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Detecting machine-generated music (MGMD) is, therefore, critical to safeguarding these domains, yet the field lacks comprehensive datasets to support meaningful progress. To address this gap, we introduce M6, a large-scale benchmark dataset tailored for MGMD research. |
Yupei Li; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-08 |
| 759 | Semi-Supervised Contrastive Learning for Controllable Video-to-Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, identifying the best music for a video can be a difficult and time-consuming task. To address this challenge, we propose a novel framework for automatically retrieving a matching music clip for a given video, and vice versa. |
Shanti Stewart; Gouthaman KV; Lie Lu; Andrea Fanelli; | arxiv-cs.MM | 2024-12-08 |
| 760 | Aligned Music Notation and Lyrics Transcription Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces and formalizes, for the first time, the Aligned Music Notation and Lyrics Transcription (AMNLT) challenge, which addresses the complete transcription of vocal scores by jointly considering music symbols, lyrics, and their synchronization. |
Eliseo Fuentes-Martínez; Antonio Ríos-Vila; Juan C. Martinez-Sevilla; David Rizo; Jorge Calvo-Zaragoza; | arxiv-cs.CV | 2024-12-05 |
| 761 | Missing Melodies: AI Music Generation and Its Nearly Complete Omission of The Global South Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conducted an extensive analysis of over one million hours of audio datasets used in AI music generation research and manually reviewed more than 200 papers from eleven prominent AI and music conferences and organizations (AAAI, ACM, EUSIPCO, EURASIP, ICASSP, ICML, IJCAI, ISMIR, NeurIPS, NIME, SMC) to identify a critical gap in the fair representation and inclusion of the musical genres of the Global South in AI research. |
Atharva Mehta; Shivam Chauhan; Monojit Choudhury; | arxiv-cs.SD | 2024-12-05 |
| 762 | Relationships Between Keywords and Strong Beats in Lyrical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Artificial Intelligence (AI) song generation has emerged as a popular topic, yet the focus on exploring the latent correlations between specific lyrical and rhythmic features remains limited. In contrast, this pilot study particularly investigates the relationships between keywords and rhythmically stressed features such as strong beats in songs. |
Callie C. Liao; Duoduo Liao; Ellie L. Zhang; | arxiv-cs.SD | 2024-12-05 |
| 763 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We observe that the differences between singing and talking audios manifest in terms of frequency and amplitude. |
YAN LI et. al. | arxiv-cs.CV | 2024-12-04 |
| 764 | Generation of Photo Slideshow with Song Based on Closeness Between Concept of Lyrics and That of Images IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper proposes a method that allows users to easily convert a large number of still images into movies by displaying photos in sync with memories or favorite songs, which … |
Mei Hashimoto; Michiharu Niimi; | 2024 Asia Pacific Signal and Information Processing … | 2024-12-03 |
| 765 | Advancing Music Emotion Recognition: A Transformer Encoder-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music Emotion Recognition (MER) involves identifying the emotional content conveyed by music. This field is becoming increasingly significant due to its broad range of … |
Yangyuan Chen; Zhizhong Ma; Mingjing Wang; Mingzhe Liu; | Proceedings of the 6th ACM International Conference on … | 2024-12-03 |
| 766 | ArtStory Beats: Highlighting Interactions Between Visual Arts and Music with Storytelling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we present the use-case of ArtStory Beats, a mobile application that aims to reflect the creative dialogue between visual arts and music, offering stories about … |
M. Vayanou; A. Katifori; A. Antoniou; G. Loumos; Yannis E. Ioannidis; | ACM Journal on Computing and Cultural Heritage | 2024-12-02 |
| 767 | Phantom Audition: Using The Visualization of Electromyography and Vocal Metrics As Tools in Singing Training Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Our approach aims to use electromyography (EMG) and vocal metrics to enhance vocal physiological feedback, comparing a professional singer and students to analyze muscle control … |
KANYU CHEN et. al. | SIGGRAPH Asia 2024 Posters | 2024-12-02 |
| 768 | An Investigation of The Effect of Smart Cockpit Layout on Distracted Driving Behavior Based on Real Road Experiments Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the context of vehicle intelligence, smart cockpits are widely used in modern vehicle design. However, with the popularity of smart cockpits, their impact on drivers’ driving … |
Lin Hu; Xinjiao Deng; Fang Wang; Xianhui Wu; | IEEE Transactions on Intelligent Transportation Systems | 2024-12-01 |
| 769 | Would You Tell Spotify How You’re Feeling? Exploring Acceptability and Ethics of Emotion-Regulation Plugins for Music Streaming Apps Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Over half a billion people around the globe listen to music via music streaming apps. Research shows that, for many users, these apps are tools for managing everyday moods and … |
Xanthe Lowe-Brown; Solange Glasser; P. Koval; Greg Wadley; | Proceedings of the 36th Australasian Conference on … | 2024-11-30 |
| 770 | MusicGen-Chord: Advancing Music Generation Through Chord Progressions and Interactive Web-UI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: MusicGen is a music generation language model (LM) that can be conditioned on textual descriptions and melodic features. We introduce MusicGen-Chord, which extends this capability by incorporating chord progression features. |
Jongmin Jung; Andreas Jansson; Dasaem Jeong; | arxiv-cs.SD | 2024-11-29 |
| 771 | Parameter-Efficient Transfer Learning for Music Foundation Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: More music foundation models are recently being released, promising a general, mostly task independent encoding of musical information. Common ways of adapting music foundation … |
Yiwei Ding; Alexander Lerch; | ArXiv | 2024-11-28 |
| 772 | MR Kabuki: Mixed Reality Enabled Performing Arts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We developed a system for viewing Kabuki, a traditional Japanese performing art form, using a Mixed Reality headset, which was experienced by approximately 100 visitors as a … |
Soko Aoki; A. Inada; Masashi Tomita; | Proceedings of the 2024 International Conference on … | 2024-11-28 |
| 773 | Music2Fail: Transfer Music to Failed Recorder Style Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate another style transfer scenario called “failed-music style transfer”. |
CHON IN LEONG et. al. | arxiv-cs.SD | 2024-11-27 |
| 774 | Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper addresses the data scarcity problem by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. |
Elona Shatri; Kalikidhar Palavala; George Fazekas; | arxiv-cs.CV | 2024-11-25 |
| 775 | Proceedings of The 6th International Workshop on Reading Music Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical … |
Jorge Calvo-Zaragoza; Alexander Pacha; Elona Shatri; | arxiv-cs.CV | 2024-11-24 |
| 776 | Mode-conditioned Music Learning and Composition: A Spiking Neural Network Inspired By Neuroscience and Psychology Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a spiking neural network inspired by brain mechanisms and psychological theories to represent musical modes and keys, ultimately generating musical pieces that incorporate tonality features. |
Qian Liang; Yi Zeng; Menghaoran Tang; | arxiv-cs.SD | 2024-11-22 |
| 777 | DAIRHuM: A Platform for Directly Aligning AI Representations with Human Musical Judgments Applied to Carnatic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a platform for exploring the Direct alignment between AI music model Representations and Human Musical judgments (DAIRHuM). |
Prashanth Thattai Ravikumar; | arxiv-cs.SD | 2024-11-22 |
| 778 | Generative AI for Music and Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this dissertation, I introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. |
Hao-Wen Dong; | arxiv-cs.SD | 2024-11-21 |
| 779 | Building Music with Lego Bricks and Raspberry Pi Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, a system to build music in an intuitive and accessible way, with Lego bricks, is presented. |
Ana M. Barbancho; Lorenzo J. Tardon; Isabel Barbancho; | arxiv-cs.HC | 2024-11-20 |
| 780 | Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Lyrics generation presents unique challenges, particularly in achieving precise syllable control while adhering to song form structures such as verses and choruses. Conventional … |
Yunkee Chae; Eunsik Shin; Suntae Hwang; Seungryeol Paik; Kyogu Lee; | arxiv-cs.CL | 2024-11-20 |
| 781 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These challenges include providing sufficient control over the generated content and allowing for flexible, precise edits. This thesis tackles these issues by introducing a series of advancements that progressively build upon each other, enhancing the controllability and editability of text-to-music generation models. |
Yixiao Zhang; | arxiv-cs.SD | 2024-11-19 |
| 782 | Attention-guided Spectrogram Sequence Modeling with CNNs for Music Genre Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music genre classification is a critical component of music recommendation systems, generation algorithms, and cultural analytics. In this work, we present an innovative model for … |
Aditya Sridhar; | ArXiv | 2024-11-18 |
| 783 | Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work we demonstrate an approach to discovering DJ tools in personal music collections. |
Iroro Orife; | arxiv-cs.SD | 2024-11-18 |
| 784 | Do Captioning Metrics Reflect Music Semantic Alignment? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present cases where traditional metrics are vulnerable to syntactic changes, and show they do not correlate well with human judgments. By addressing these issues, we aim to emphasize the need for a critical reevaluation of how music captions are assessed. |
Jinwoo Lee; Kyogu Lee; | arxiv-cs.SD | 2024-11-18 |
| 785 | A Computational Analysis of The Platformization of Music: Comparing Hit Songs on TikTok and Spotify Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The (re)creation and distribution of cultural products such as music are increasingly shaped by digital platforms. This study explores how TikTok and Spotify, situated in … |
Na Ta; Fang Jiao; Cong Lin; Cuihua Shen; | Information, Communication & Society | 2024-11-17 |
| 786 | Examining Platformization in Cultural Production: A Comparative Computational Analysis of Hit Songs on TikTok and Spotify Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores how TikTok and Spotify, situated in different governance and user contexts, could influence digital music production and reception within each platform and between each other. |
Na Ta; Fang Jiao; Cong Lin; Cuihua Shen; | arxiv-cs.SI | 2024-11-17 |
| 787 | Language Models for Music Medicine Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose fine-tuning MusicGen, a music-generating transformer model, to create short musical clips that assist patients in transitioning from negative to desired emotional states. |
EMMANOUIL NIKOLAKAKIS et. al. | arxiv-cs.SD | 2024-11-13 |
| 788 | PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: AI-based music generation has made significant progress in recent years. However, generating symbolic music that is both long-structured and expressive remains a significant challenge. In this paper, we propose PerceiverS (Segmentation and Scale), a novel architecture designed to address this issue by leveraging both Effective Segmentation and Multi-Scale attention mechanisms. Our approach enhances symbolic music generation by simultaneously learning long-term structural dependencies and short-term expressive details. |
Yungang Yi; Weihua Li; Matthew Kuo; Quan Bai; | arxiv-cs.AI | 2024-11-12 |
| 789 | Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a data generation framework for rich music discovery dialogue using a large language model (LLM) and user intents, system actions, and musical attributes. |
SeungHeon Doh; Keunwoo Choi; Daeyong Kwon; Taesu Kim; Juhan Nam; | arxiv-cs.SD | 2024-11-11 |
| 790 | Timing and Dynamics of The Rosanna Shuffle Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this analysis, we examine the timing and dynamics of the original drum track, focusing on rhythmic variations such as swing factor, microtiming deviations, tempo drift, and the overall dynamics of the hi-hat pattern. |
Esa Räsänen; Niko Gullsten; Otto Pulkkinen; Tuomas Virtanen; | arxiv-cs.SD | 2024-11-11 |
| 791 | Psychological Needs As Credible Song Signals: Testing Large Language Models to Annotate Lyrics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Our preliminary study presents a new perspective in music information retrieval by investigating how contemporary song-making and listening emulate our innate responses, similar … |
Eunsun Smith; Yinxuan Wang; Eric Matson; | Conference on Computer Science and Information Systems | 2024-11-09 |
| 792 | Evolutionary Music Synthesis: A Generative AI System with Interactive User Feedback Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this study, a music generation system that reflects user preferences by combining music generation AI and interactive evolutionary computation is proposed. In recent years, … |
Keishi Ohya; Emmanuel Ayedoun; Masataka Tokumaru; | 2024 Joint 13th International Conference on Soft Computing … | 2024-11-09 |
| 793 | Music-oriented Choreography Considering Hybrid Density Multimedia Network Algorithms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music-oriented choreography combines dance sequences with musical compositions to enhance artistic experience. With multimedia network complexity and data management, advanced … |
Songyuan Han; Yuanying Shi; | Journal of Computational Methods in Sciences and Engineering | 2024-11-09 |
| 794 | Harnessing High-Level Song Descriptors Towards Natural Language-Based Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we assess LMs effectiveness in recommending songs based on user natural language descriptions and items with descriptors like genres, moods, and listening contexts. |
Elena V. Epure; Gabriel Meseguer-Brocal; Darius Afchar; Romain Hennequin; | arxiv-cs.IR | 2024-11-08 |
| 795 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case Study in Tabletop Role-Playing Games Soundtracks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. |
Felipe Marra; Lucas N. Ferreira; | arxiv-cs.SD | 2024-11-06 |
| 796 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: This paper investigates the capabilities of text-to-audio music generation models in producing long-form music with prompts that change over time, focusing on soundtrack … |
Felipe Marra; Lucas N. Ferreira; | ArXiv | 2024-11-06 |
| 797 | MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, to date, there has been no work that considers them jointly to explore the modality alignment within. To bridge this gap, we propose a novel framework, termed MoMu-Diffusion, for long-term and synchronous motion-music generation. |
FUMING YOU et. al. | arxiv-cs.SD | 2024-11-04 |
| 798 | PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While piano music has become a significant area of study in Music Information Retrieval (MIR), there is a notable lack of datasets for piano solo music with text labels. To address this gap, we present PIAST (PIano dataset with Audio, Symbolic, and Text), a piano music dataset. |
HAYEON BANG et. al. | arxiv-cs.SD | 2024-11-04 |
| 799 | Generative AI and EEG-based Music Personalization for Work Stress Reduction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The escalating prevalence of work-related stress has led to a notable decline in work performance and the mental well-being of the workforce. Studies suggest that personally … |
VARSHA WIJETHUNGE et. al. | IECON 2024 – 50th Annual Conference of the IEEE Industrial … | 2024-11-03 |
| 800 | Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With advancements in deep learning, previous research has focused on generating suitable accompaniments but often lacks precise alignment with the desired instrumentation and genre. To address this, we propose a straightforward method that enables control over the accompaniment through text prompts, allowing the generation of music that complements the vocals and aligns with the song instrumental and genre requirements. |
Quoc-Huy Trinh; Minh-Van Nguyen; Trong-Hieu Nguyen Mau; Khoa Tran; Thanh Do; | arxiv-cs.SD | 2024-11-03 |
| 801 | Assessing The Impact of Sampling, Remixes, and Covers on Original Song Popularity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using Who Sampled data and Google Trends, we examine how the popularity of a borrowing song affects the original. |
Guilherme Soares S. dos Santos; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02 |
| 802 | I’ve Heard This Before: Initial Results on Tiktok’s Impact On The Re-Popularization of Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we analyze how TikTok helps to revitalize older songs. |
Breno Matos; Francisco Galuppo; Rennan Cordeiro; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02 |
| 803 | Music Foundation Model As Generic Booster for Music Downstream Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce SoniDo, a music foundation model (MFM) designed to extract hierarchical features from target music samples. |
WEIHSIANG LIAO et. al. | arxiv-cs.SD | 2024-11-02 |
| 804 | The Role of Artificial Intelligence in Personalized Music Teaching Quality Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the increasing emphasis on personalized learning in the field of education, music teaching has gradually turned to personalized methods to meet the diversified needs of … |
Shiwei Zhao; | Journal of Computational Methods in Sciences and Engineering | 2024-11-01 |
| 805 | MIRFLEX: Music Information Retrieval Feature Library for Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces an extendable modular system that compiles a range of music feature extraction models to aid music information retrieval research. |
Anuradha Chopra; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2024-11-01 |
| 806 | Machine Learning Framework for Audio-Based Content Evaluation Using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents a machine learning framework for assessing similarity between audio content and predicting sentiment score. |
Aris J. Aristorenas; | arxiv-cs.SD | 2024-10-31 |
| 807 | HKDSME: Heterogeneous Knowledge Distillation for Semi-supervised Singing Melody Extraction Using Harmonic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To solve the two issues, in this paper, we propose a heterogeneous knowledge distillation framework for semi-supervised singing melody extraction using harmonic supervision, termed as HKDSME. |
Shuai Yu; Xiaoliang He; Ke Chen; Yi Yu; | mm | 2024-10-30 |
| 808 | CNN-LSTM Based Multimodal Models for Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we tackle the dual challenges of efficiency and quality in music generation. We aim to create a model that produces high-quality music efficiently while keeping … |
Man Zhang; Dongning Liu; | 2024 IEEE International Symposium on Parallel and … | 2024-10-30 |
| 809 | Controllable Music Loops Generation with MIDI and Text Via Multi-Stage Cross Attention and Instrument-Aware Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they often inadequately produce the precise rendering of critical music loop attributes, including melody, rhythms, and instrumentation, which are essential for modern music loop production. To overcome this limitation, this paper proposed a Loops Transformer and a Multi-Stage Cross Attention mechanism that enable a cohesive integration of textual and MIDI input specifications. |
Guan-Yuan Chen; Von-Wun Soo; | mm | 2024-10-30 |
| 810 | MUSCAT: A Multimodal MUSic Collection for Automatic Transcription of Real Recordings and Image Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nevertheless, while proven to outperform single-modality recognition rates, this approach has been exclusively validated under controlled scenarios—monotimbral and monophonic synthetic data—mainly due to a lack of collections with symbolic score-level annotations for both recordings and graphical sheets. To promote research on this topic, this work presents the Multimodal mUSic Collection for Automatic Transcription (MUSCAT) assortment of acoustic recordings, image sheets, and their score-level annotations in several notation formats. |
ALEJANDRO GALAN-CUENCA et. al. | mm | 2024-10-30 |
| 811 | Zero-shot Controllable Music Generation from Videos Using Facial Expressions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper proposes a method to control a music generation process from videos using users’ facial expressions, aligning the music with their emotions. Unlike previous works, this … |
Shilin Liu; Kyohei Kamikawa; Keisuke Maeda; Takahiro Ogawa; M. Haseyama; | 2024 IEEE 13th Global Conference on Consumer Electronics … | 2024-10-29 |
| 812 | Sing It, Narrate It: Quality Musical Lyrics Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper aims to enhance translation quality while maintaining key singability features. |
Zhuorui Ye; Jinhan Li; Rongwu Xu; | arxiv-cs.CL | 2024-10-29 |
| 813 | Emotion-Guided Image to Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an emotion-guided image-to-music generation framework that leverages the Valence-Arousal (VA) emotional space to produce music that aligns with the emotional tone of a given image. |
Souraja Kundu; Saket Singh; Yuji Iwahori; | arxiv-cs.SD | 2024-10-29 |
| 814 | Semi-Supervised Self-Learning Enhanced Music Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To handle the noisy label issue, we propose a semi-supervised self-learning (SSSL) method, which can differentiate between samples with correct and incorrect labels in a self-learning manner, thus effectively utilizing the augmented segment-level data. |
Yifu Sun; Xulong Zhang; Monan Zhou; Wei Li; | arxiv-cs.SD | 2024-10-29 |
| 815 | ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing Voice Synthesis (SVS) has significantly advanced with deep generative models, achieving high audio quality but still struggling with musicality, mainly due to the lack of … |
Shuqi Dai; Ming-Yu Liu; Rafael Valle; Siddharth Gururani; | ACM Multimedia | 2024-10-28 |
| 816 | Symbotunes: Unified Hub for Symbolic Music Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Therefore, directly comparing the methods or becoming acquainted with them may present challenges. To mitigate this issue we introduce Symbotunes, an open-source unified hub for symbolic music generative models. |
Paweł Skierś; Maksymilian Łazarski; Michał Kopeć; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-10-27 |
| 817 | An Approach to Hummed-tune and Song Sequences Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper covers details about the pre-processed data from the original type (mp3) to usable form for training and inference. |
LOC BAO PHAM et. al. | arxiv-cs.SD | 2024-10-27 |
| 818 | MusicFlow: Cascaded Flow Matching for Text Guided Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce MusicFlow, a cascaded text-to-music generation model based on flow matching. |
K R PRAJWAL et. al. | arxiv-cs.SD | 2024-10-27 |
| 819 | We Musicians Know How to Divide and Conquer: Exploring Multimodal Interactions To Improve Music Reading and Memorization for Blind and Low Vision Learners Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the potential of multimodal assistive technologies (MATs) to convey visual information, such as music notation, to blind or low-vision (BLV) individuals, we do not fully … |
Leon Lu; Chase Crispin; Audrey Girouard; | Proceedings of the 26th International ACM SIGACCESS … | 2024-10-27 |
| 820 | Arabic Music Classification and Generation Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The dataset used in this project consists of new and classical Egyptian music pieces composed by different composers. |
MOHAMED ELSHAARAWY et. al. | arxiv-cs.SD | 2024-10-25 |
| 821 | Melody Construction for Persian Lyrics Using LSTM Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The present paper investigated automatic melody construction for Persian lyrics as an input. |
Farshad Jafari; Farzad Didehvar; Amin Gheibi; | arxiv-cs.SD | 2024-10-23 |
| 822 | Striking A New Chord: Neural Networks in Music Information Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely-used Markov model to predict a chord event following a sequence of chords. |
Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23 |
| 823 | Music102: An $D_{12}$-equivariant Transformer for Chord Progression Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present Music102, an advanced model aimed at enhancing chord progression accompaniment through a $D_{12}$-equivariant transformer. |
Weiliang Luo; | arxiv-cs.SD | 2024-10-22 |
| 824 | Musinger: Communication of Music Over A Distance with Wearable Haptic Display and Touch Sensitive Surface Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the integration of auditory and tactile experiences in musical haptics, focusing on enhancing sensory dimensions of music through touch. Addressing the gap in translating auditory signals to meaningful tactile feedback, our research introduces a novel method involving a touch-sensitive recorder and a wearable haptic display that captures musical interactions via force sensors and converts these into tactile sensations. |
MIGUEL ALTAMIRANO CABRERA et. al. | arxiv-cs.HC | 2024-10-21 |
| 825 | Association of Facial Action Units with Emotions in People Living with Dementia Listening to Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music is a recommended intervention for people living with dementia as it can improve behaviour and levels of happiness. The automatic association of facial action units and in … |
DIMITRIOS KOLOSOV et. al. | 2024 IEEE International Conference on Metrology for … | 2024-10-21 |
| 826 | OpenMU: Your Swiss Army Knife for Music Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. |
MENGJIE ZHAO et. al. | arxiv-cs.SD | 2024-10-20 |
| 827 | ArchiTone: A LEGO-Inspired Gamified System for Visualized Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Informed by formative investigation and inspired by LEGO, we introduce ArchiTone, a gamified system that employs constructivism by visualizing music theory concepts as musical blocks and buildings for music education. |
JIAXING YU et. al. | arxiv-cs.HC | 2024-10-20 |
| 828 | ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to obtain high quality synthetic singing voice more efficiently, we propose a singing voice synthesis method based on the consistency model, ConSinger, to achieve high-fidelity singing voice synthesis with minimal steps. |
Yulin Song; Guorui Sang; Jing Yu; Chuangbai Xiao; | arxiv-cs.SD | 2024-10-20 |
| 829 | A Low-complexity DOA Estimation Method for Nested Array Via Root-MUSIC Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The nested array results in a hole-free difference coarray (DCA) after virtualization, enhancing the utilization of virtual array elements. MUSIC algorithm is an algorithm with … |
Juntan Huang; Shufang Xu; | Proceedings of the 2024 8th International Conference on … | 2024-10-18 |
| 830 | Music Therapy for Autism Spectrum Disorder: A Comprehensive Literature Review on Therapeutic Efficacy, Limitations, and AI Integration Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Autism Spectrum Disorder (ASD) is a neurological and developmental condition that presents considerable social, behavioral, and communicative challenges to those diagnosed with … |
Beatrice Low; Xindi Liu; Richard Z. Li; Elizabeth Ren; Jasmine X Zhang; | 2024 IEEE 15th Annual Ubiquitous Computing, Electronics & … | 2024-10-17 |
| 831 | MeloTrans: A Text to Symbolic Music Generation Model Following Human Composition Habit Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To achieve this, we develop the POP909_M dataset, the first to include labels for musical motifs and their variants, providing a basis for mimicking human compositional habits. Building on this, we propose MeloTrans, a text-to-music composition model that employs principles of motif development rules. |
YUTIAN WANG et. al. | arxiv-cs.SD | 2024-10-17 |
| 832 | CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: These limitations reduce their effectiveness in a global, multimodal music environment. To address these issues, we introduce CLaMP 2, a system compatible with 101 languages that supports both ABC notation (a text-based musical notation format) and MIDI (Musical Instrument Digital Interface) for music information retrieval. |
SHANGDA WU et. al. | arxiv-cs.SD | 2024-10-17 |
| 833 | KeYric: Unsupervised Keywords Extraction and Expansion from Music for Coherent Lyrics Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We address the challenge of enhancing coherence in generated lyrics from symbolic music, particularly for creating singing-based language learning materials. Coherence, defined as … |
Xichu Ma; Varun Sharma; Min-Yen Kan; Wee Sun Lee; Ye Wang; | ACM Transactions on Multimedia Computing, Communications … | 2024-10-17 |
| 834 | MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Generating music that aligns with the visual content of a video has been a challenging task, as it requires a deep understanding of visual semantics and involves generating music … |
RUIQI LI et. al. | ArXiv | 2024-10-16 |
| 835 | Music Generation Using Dual Interactive Wasserstein Fourier Acquisitive Generative Adversarial Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music composition, an intricate blend of human creativity and emotion, presents substantial challenges when generating melodies from lyrics which hinders effective learning in … |
Tarannum Shaikh; Ashish Jadhav; | Int. J. Comput. Intell. Appl. | 2024-10-16 |
| 836 | Like It or Not: Exploring The Impact of (Dis)liked Background Music on Player Behavior and Experience Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Entertainment media, including video games, utilize background music (BGM) to enhance ambiance and gameplay, with a growing trend of players replacing in-game audio with personal … |
Marc Schubhan; Sridhar Karra; Maximilian Altmeyer; Antonio Krüger; | Proceedings of the ACM on Human-Computer Interaction | 2024-10-14 |
| 837 | Do We Need More Complex Representations for Structure? A Comparison of Note Duration Representation for Music Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we inquire if the off-the-shelf Music Transformer models perform just as well on structural similarity metrics using only unannotated MIDI information. |
Gabriel Souza; Flavio Figueiredo; Alexei Machado; Deborah Guimarães; | arxiv-cs.SD | 2024-10-14 |
| 838 | Towards Music-Aware Virtual Assistants Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We propose a system for modifying spoken notifications in a manner that is sensitive to the music a user is listening to. Spoken notifications provide convenient access to rich … |
Alexander Wang; David Lindlbauer; Chris Donahue; | Proceedings of the 37th Annual ACM Symposium on User … | 2024-10-13 |
| 839 | CrAIzy MIDI: AI-powered Wearable Musical Instrumental for Novice Player Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Playing music is a deeply fulfilling and universally cherished activity, yet the steep learning curve often discourages novice amateurs. Traditional music creation demands … |
Hongni Ye; Xiangrong Zhu; Yongbo Yang; | Adjunct Proceedings of the 37th Annual ACM Symposium on … | 2024-10-13 |
| 840 | M2M-Gen: A Multimodal Framework for Automated Background Music Generation in Japanese Manga Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces M2M-Gen, a multimodal framework for generating background music tailored to Japanese manga. |
Megha Sharma; Muhammad Taimoor Haseeb; Gus Xia; Yoshimasa Tsuruoka; | arxiv-cs.SD | 2024-10-13 |
| 841 | Small Tunes Transformer: Exploring Macro & Micro-Level Hierarchies for Skeleton-Conditioned Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we delve into the multi-level structures within music from macro-level and micro-level hierarchies. |
Yishan Lv; Jing Luo; Boyuan Ju; Xinyu Yang; | arxiv-cs.SD | 2024-10-11 |
| 842 | Explainability in Music Recommender System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recommendation systems play a crucial role in our daily lives, influencing many of our significant and minor decisions. These systems also have become integral to the music … |
Shahrzad Shashaani; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 843 | LyricLure: Mining Catchy Hooks in Song Lyrics to Enhance Music Discovery and Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music Search encounters a significant challenge as users increasingly rely on catchy lines from lyrics to search for both new releases and other popular songs. Integrating lyrics … |
Siddharth Sharma; Akshay Shukla; Ajinkya Walimbe; Tarun Sharma; Joaquin Delgado; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 844 | Song Emotion Classification of Lyrics with Out-of-Domain Data Under Label Scarcity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We examine the novel usage of a large out-of-domain dataset as a creative solution to the challenge of training data scarcity in the emotional classification of song lyrics. |
Jonathan Sakunkoo; Annabella Sakunkoo; | arxiv-cs.CL | 2024-10-08 |
| 845 | MuRS 2024: 2nd Music Recommender Systems Workshop Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation has been relevant to the Recommender Systems (RecSys) community since the early days. With the growth of music streaming platforms, algorithmic … |
Andres Ferraro; Lorenzo Porcaro; Peter Knees; Christine Bauer; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 846 | Music-triggered Fashion Design: from Songs to The Metaverse Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Artistic collectives are not an exception, and we here aim to put special attention into musicians. |
MARTINA DELGADO et. al. | arxiv-cs.HC | 2024-10-07 |
| 847 | Thunder: A Design Process to Build Emotionally Engaging Music Visualizations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music’s profound emotional impact extends beyond traditional listening experiences, playing a critical role in shaping user engagement in entertainment contexts, including digital … |
Caio Nunes; Isabelle Reinbold; Mariana Castro; Ticianne Darin; | Proceedings of the XXIII Brazilian Symposium on Human … | 2024-10-07 |
| 848 | Art2Mus: Bridging Visual Arts and Music Through Cross-Modal Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing image-to-music models are limited to simple images, lacking the capability to generate music from complex digitized artworks. To address this gap, we introduce Art2Mus, a novel model designed to create music from digitized artworks or text inputs. |
Ivan Rinaldi; Nicola Fanelli; Giovanna Castellano; Gennaro Vessio; | arxiv-cs.MM | 2024-10-07 |
| 849 | UniMuMo: Unified Text, Music and Motion Generation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To … |
HAN YANG et. al. | ArXiv | 2024-10-06 |
| 850 | Music Statistics: Uncertain Logistic Regression Models with Applications in Analyzing Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jue Lu; Lianlian Zhou; Wenxing Zeng; Anshui Li; | Fuzzy Optimization and Decision Making | 2024-10-04 |
| 851 | Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, users also articulate a need to explore music that shares similarities with their favorite tracks or artists, such as “I need a similar track to Superstition by Stevie Wonder”. To address these concerns, this paper proposes an improved Text-to-Music Retrieval model, denoted as TTMR++, which utilizes rich text descriptions generated with a finetuned large language model and metadata. |
SeungHeon Doh; Minhee Lee; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2024-10-04 |
| 852 | SoundSignature: What Type of Music Do You Like? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we highlight the application’s innovative features and educational potential, and present findings from a pilot user study that evaluates its efficacy and usability. |
Brandon James Carone; Pablo Ripollés; | arxiv-cs.SD | 2024-10-04 |
| 853 | SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present SONIQUE, a model for generating background music tailored to video content. |
Liqian Zhang; Magdalena Fuentes; | arxiv-cs.SD | 2024-10-04 |
| 854 | CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Contrastive Long-form Language-Audio Pretraining (CoLLAP) to significantly extend the perception window for both the input audio (up to 5 minutes) and the language descriptions (exceeding 250 words), while enabling contrastive learning across modalities and temporal dynamics. |
JUNDA WU et. al. | arxiv-cs.SD | 2024-10-03 |
| 855 | Generating Symbolic Music from Natural Language Prompts Using An LLM-Enhanced Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present MetaScore, a new dataset consisting of 963K musical scores paired with rich metadata, including free-form user-annotated tags, collected from an online music forum. |
Weihan Xu; Julian McAuley; Taylor Berg-Kirkpatrick; Shlomo Dubnov; Hao-Wen Dong; | arxiv-cs.SD | 2024-10-02 |
| 856 | Agent-Driven Large Language Models for Mandarin Lyric Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this research, we developed a multi-agent system that decomposes the melody-to-lyric task into sub-tasks, with each agent controlling rhyme, syllable count, lyric-melody alignment, and consistency. |
Hong-Hsiang Liu; Yi-Wen Liu; | arxiv-cs.CL | 2024-10-02 |
| 857 | Do Music Generation Models Encode Music Theory? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thus, we introduce SynTheory, a synthetic MIDI and audio music theory dataset, consisting of tempos, time signatures, notes, intervals, scales, chords, and chord progressions concepts. |
Megan Wei; Michael Freeman; Chris Donahue; Chen Sun; | arxiv-cs.SD | 2024-10-01 |
| 858 | Part-of-Speech Features in Bob Dylan’s Song Lyrics: A Stylometric Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Honoured as a Nobel Laureate in 2016, Bob Dylan’s song lyrics have garnered well-deserved recognition and appreciation for their themes, content and artistic performances. … |
Zheyuan Dai; Haitao Liu; | Int. J. Humanit. Arts Comput. | 2024-10-01 |
| 859 | An Adaptive Melody Search Algorithm Based on Low-level Heuristics for Material Feeding Scheduling Optimization in A Hybrid Kitting System Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yufan Huang; Ling-zhi Zhao; Binghai Zhou; | Adv. Eng. Informatics | 2024-10-01 |
| 860 | Melody-Guided Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present the Melody-Guided Music Generation (MG2) model, a novel approach using melody to guide the text-to-music generation that, despite a simple method and limited resources, achieves excellent performance. |
Shaopeng Wei; Manzhen Wei; Haoyu Wang; Yu Zhao; Gang Kou; | arxiv-cs.SD | 2024-09-30 |
| 861 | Presence and Flow in Virtual and Mixed Realities for Music-Related Educational Settings Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music in Extended Reality (XR) is increasingly present in both academic and industrial research. While XR applications are more prevalent in STEM education, there is growing … |
Leonard Bruns; Benedict Saurbier; Tray Minh Voong; Michael Oehler; | 2024 IEEE 5th International Symposium on the Internet of … | 2024-09-30 |
| 862 | Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes integrating a text-to-music model with a large language model to generate music with form. |
Lilac Atassi; | arxiv-cs.SD | 2024-09-30 |
| 863 | GENPIA: A Genre-Conditioned Piano Music Generation System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the demand for music continuing to grow as people seek variety and personal resonance, many works focus on music generation. In this study, we propose GENPIA, a … |
QUOC-VIET NGUYEN et. al. | 2024 IEEE 5th International Symposium on the Internet of … | 2024-09-30 |
| 864 | Characteristics and Development Trends of Internet Plus Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The ‘Internet Plus Music Education’ utilizes online technologies to transcend traditional educational constraints of time and space, offering students flexible and efficient … |
Xu Ni; Xiyang Chen; | Int. J. Web Based Learn. Teach. Technol. | 2024-09-26 |
| 865 | Tuning Into Bias: A Computational Study of Gender Bias in Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an analysis of gender bias in English song lyrics using topic modeling and bias measurement techniques. |
Danqing Chen; Adithi Satish; Rasul Khanbayov; Carolin M. Schuster; Georg Groh; | arxiv-cs.CL | 2024-09-24 |
| 866 | Transforming Music Education Through Artificial Intelligence: A Systematic Literature Review on Enhancing Music Teaching and Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The advent of artificial intelligence (AI) has brought significant and transformative alterations to traditional music education. This study examines the progress of AI technology … |
Yifang Zhang; Beh Wen Fen; Chao Zhang; Sheng Pi; | Int. J. Interact. Mob. Technol. | 2024-09-24 |
| 867 | SongTrans: An Unified Song Transcription and Alignment Method for Lyrics and Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While there are tools available for lyric or note transcription tasks, they all need pre-processed data which is relatively time-consuming (e.g., vocal and accompaniment separation). Besides, most of these tools are designed to address a single task and struggle with aligning lyrics and notes (i.e., identifying the corresponding notes of each word in lyrics). |
SIWEI WU et. al. | arxiv-cs.SD | 2024-09-22 |
| 868 | Meta-Learning-Based Supervised Domain Adaptation for Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The task of extracting the dominant pitch from polyphonic audio is crucial in the music information retrieval field. A substantial amount of labeled audio data is required to … |
Kavya Ranjan Saxena; Vipul Arora; | 2024 IEEE 34th International Workshop on Machine Learning … | 2024-09-22 |
| 869 | Research on Multi-Modal Music Score Alignment Model for Online Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As music data storage becomes increasingly diverse in the era of big data, ensuring alignment of music works with the same semantics for online music education is crucial. To … |
Dexin Ren; | J. Adv. Comput. Intell. Intell. Informatics | 2024-09-20 |
| 870 | MuCodec: Ultra Low-Bitrate Music Codec Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this issue, we propose MuCodec, specifically targeting music compression and reconstruction tasks at ultra low bitrates. |
YAOXUN XU et. al. | arxiv-cs.SD | 2024-09-20 |
| 871 | Designing Audio Processing Strategies to Enhance Cochlear Implant Users’ Music Enjoyment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cochlear implants (CIs) provide hundreds of thousands of users with increased access to sound, particularly speech, but experiences of music are more varied. Can greater … |
Lloyd May; Aaron Hodges; So Yeon Park; Blair Kaneshiro; Jonathan Berger; | Frontiers Comput. Sci. | 2024-09-19 |
| 872 | FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents FruitsMusic, a metadata corpus of Japanese idol-group songs in the real world, precisely annotated with who sings what and when. |
Hitoshi Suda; Shunsuke Yoshida; Tomohiko Nakamura; Satoru Fukayama; Jun Ogata; | arxiv-cs.SD | 2024-09-19 |
| 873 | M6(GPT)3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This work introduces the $\text{M}^\text{6}(\text{GPT})^\text{3}$ composer system, capable of generating complete, multi-minute musical compositions with complex structures in any … |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | ArXiv | 2024-09-19 |
| 874 | Exploring Bat Song Syllable Representations in Self-supervised Audio Encoders Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: How well can deep learning models trained on human-generated sounds distinguish between another species’ vocalization types? |
Marianne de Heer Kloots; Mirjam Knörnschild; | arxiv-cs.SD | 2024-09-19 |
| 875 | M6(GPT)3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
| 876 | METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation Via Transformer VAE Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose METEOR, a model for generating Melody-aware Texture-controllable re-Orchestration with a Transformer-based variational auto-encoder (VAE). |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-09-18 |
| 877 | Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. |
Tornike Karchkhadze; Mohammad Rasool Izadi; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-18 |
| 878 | Modeling Musical Knowledge With Quantum Bayesian Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music is a multifaceted art form that requires a nuanced and comprehensive framework for analysis. This framework should encompass correlations among diverse musical attributes, … |
Florian Krebs; Hermann Fürntratt; Roland Unterberger; Franz Graf; | 2024 International Conference on Content-Based Multimedia … | 2024-09-18 |
| 879 | Harmonious Touch: Haptic Wristband for Home Listening Experience Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The integration of wearable technology has paved the way for innovative approaches to enhancing the music listening experience. This study introduces a haptic wristband designed … |
Trinity Melder; Richard Savery; | Proceedings of the 19th International Audio Mostly … | 2024-09-18 |
| 880 | Evaluation of Pretrained Language Models on Music Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Music-text multimodal systems have enabled new approaches to Music Information Research (MIR) applications such as audio-to-text and text-to-audio retrieval, text-based song … |
Yannis Vasilakis; Rachel M. Bittner; Johan Pauwels; | ArXiv | 2024-09-17 |
| 881 | PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such issues highlight the need for publicly available, copyright-free musical data, in which there is a large shortage, particularly for symbolic music data. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it the largest available copyright-free symbolic music dataset to our knowledge. |
Phillip Long; Zachary Novack; Taylor Berg-Kirkpatrick; Julian McAuley; | arxiv-cs.SD | 2024-09-16 |
| 882 | Unveiling and Mitigating Bias in Large Language Model Recommendations: A Path to Fairness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the interplay between bias and LLM-based recommendation systems, focusing on music, song, and book recommendations across diverse demographic and cultural groups. |
Anindya Bijoy Das; Shahnewaz Karim Sakib; | arxiv-cs.IR | 2024-09-16 |
| 883 | MusicLIME: Explainable Multimodal Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce MusicLIME, a model-agnostic feature importance explanation method designed for multimodal music models. |
Theodoros Sotirou; Vassilis Lyberatos; Orfeas Menis Mastromichalakis; Giorgos Stamou; | arxiv-cs.SD | 2024-09-16 |
| 884 | ELMI: Interactive and Intelligent Sign Language Translation of Lyrics for Song Signing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ELMI, an accessible song-signing tool that assists in translating lyrics into sign language. |
Suhyeon Yoo; Khai N. Truong; Young-Ho Kim; | arxiv-cs.HC | 2024-09-15 |
| 885 | Prevailing Research Areas for Music AI in The Era of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We then overview different generative models, forms of evaluating these models, and their computational constraints/limitations. Subsequently, we highlight applications of these generative models towards extensions to multiple modalities and integration with artists’ workflow as well as music education systems. |
Megan Wei; Mateusz Modrzejewski; Aswin Sivaraman; Dorien Herremans; | arxiv-cs.SD | 2024-09-14 |
| 886 | A Survey of Foundation Models for Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The rapid advancement of artificial intelligence (AI) has introduced new ways to analyze music, aiming to replicate human understanding of music and provide related services. |
WENJUN LI et. al. | arxiv-cs.SD | 2024-09-14 |
| 887 | Computational Musicking: Music + Coding As A Hybrid Practice Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While there is a growing body of research that explores the integration of music and coding in learning environments, much of this work has either emphasised the technical aspects … |
Cameron L. Roberts; Mike Horn; | Behaviour & Information Technology | 2024-09-13 |
| 888 | Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our experiments demonstrate that neural embeddings, particularly those generated with the Contrastive Language-Audio Pretraining (CLAP) model, present a promising approach to enhancing music recommendation tasks within graph-based frameworks. |
Florian Grötschla; Luca Strässle; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.SD | 2024-09-13 |
| 889 | Seed-Music: A Unified Framework for High Quality and Controlled Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. |
YE BAI et. al. | arxiv-cs.SD | 2024-09-13 |
| 890 | Bridging Paintings and Music – Exploring Emotion Based Music Generation Through Paintings IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Rapid advancements in artificial intelligence have significantly enhanced generative tasks involving music and images, employing both unimodal and multimodal approaches. This … |
Tanisha Hisariya; Huan Zhang; Jinhua Liang; | ArXiv | 2024-09-12 |
| 891 | Bridging Paintings and Music — Exploring Emotion Based Music Generation Through Paintings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research develops a model capable of generating music that resonates with the emotions depicted in visual arts, integrating emotion labeling, image captioning, and language models to transform visual inputs into musical compositions. |
Tanisha Hisariya; Huan Zhang; Jinhua Liang; | arxiv-cs.SD | 2024-09-12 |
| 892 | VMAS: Video-to-Music Generation Via Semantic Alignment in Web Music Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a framework for learning to generate background music from video inputs. |
Yan-Bo Lin; Yu Tian; Linjie Yang; Gedas Bertasius; Heng Wang; | arxiv-cs.MM | 2024-09-11 |
| 893 | A Two-Stage Band-Split Mamba-2 Network For Music Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper applies Mamba-2 with a two-stage strategy, which introduces residual mapping based on the mask method, effectively compensating for the details absent in the mask and further improving separation performance. |
Jinglin Bai; Yuan Fang; Jiajie Wang; Xueliang Zhang; | arxiv-cs.SD | 2024-09-10 |
| 894 | RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current separating methods struggle to fully remove noise or excessively suppress signal components, affecting the naturalness and similarity of the processed audio. To tackle this, our study introduces RobustSVC, a novel any-to-one SVC framework that converts noisy vocals into clean vocals sung by the target singer. |
WEI CHEN et. al. | arxiv-cs.SD | 2024-09-10 |
| 895 | An End-to-End Approach for Chord-Conditioned Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given the inaccuracy of automatic chord extractors, we devise a robust cross-attention mechanism augmented with dynamic weight sequence to integrate extracted chord information into song generations and reduce frame-level flaws, and propose a novel model termed Chord-Conditioned Song Generator (CSG) based on it. |
SHUOCHEN GAO et. al. | arxiv-cs.SD | 2024-09-10 |
| 896 | Benchmarking Sub-Genre Classification For Mainstage Dance Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We employ a continuous soft labeling approach to accommodate tracks blending multiple sub-genres, preserving their inherent complexity. |
Hongzhi Shu; Xinglin Li; Hongyu Jiang; Minghao Fu; Xinyu Li; | arxiv-cs.SD | 2024-09-10 |
| 897 | Musical Chords: A Novel Java Algorithm and App Utility to Enumerate Chord-Progressions Adhering to Music Theory Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these limitations, a novel Java Algorithm and automated music theory chord progression and variations generator App has been developed. This App offers a piano user interface, that applies music theory to generate all possible four-chord and eight-chord progressions and produces three alternate variations of the generated progressions selected by the user. |
Aditya Lakshminarasimhan; | arxiv-cs.SD | 2024-09-09 |
| 898 | SongCreator: Lyrics-based Universal Song Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While various aspects of song generation have been explored by previous works, such as singing voice, vocal composition and instrumental arrangement, etc., generating songs with both vocals and accompaniment given lyrics remains a significant challenge, hindering the application of music generation models in the real world. In this light, we propose SongCreator, a song-generation system designed to tackle this challenge. |
SHUN LEI et. al. | arxiv-cs.SD | 2024-09-09 |
| 899 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method based on dual diffusion bridges, trained using the CocoChorales Dataset, which consists of unpaired monophonic single-instrument audio data. |
MICHELE MANCUSI et. al. | arxiv-cs.SD | 2024-09-09 |
| 900 | Mel-RoFormer for Vocal Separation and Vocal Melody Transcription IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Mel-RoFormer, a spectrogram-based model featuring two key designs: a novel Mel-band Projection module at the front-end to enhance the model’s capability to capture informative features across multiple frequency bands, and interleaved RoPE Transformers to explicitly model the frequency and time dimensions as two separate sequences. |
Ju-Chiang Wang; Wei-Tsung Lu; Jitong Chen; | arxiv-cs.SD | 2024-09-06 |
| 901 | Enhancing Sequential Music Recommendation with Personalized Popularity Awareness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moreover, music consumption is characterized by a prevalence of repeated listening, i.e., users frequently return to their favourite tracks, an important signal that could be framed as individual or personalized popularity. This paper addresses these challenges by introducing a novel approach that incorporates personalized popularity information into sequential recommendation. |
Davide Abbattista; Vito Walter Anelli; Tommaso Di Noia; Craig Macdonald; Aleksandr Vladimirovich Petrov; | arxiv-cs.IR | 2024-09-06 |
| 902 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This process involves composing each instrument to align with existing ones in terms of beat, dynamics, harmony, and melody, requiring greater precision and control over tracks than text prompts usually provide. In this work, we address these challenges by extending the MusicLDM, a latent diffusion model for music, into a multi-track generative model. |
Tornike Karchkhadze; Mohammad Rasool Izadi; Ke Chen; Gerard Assayag; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-04 |
| 903 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing research primarily focuses on Western music and encounters challenges in generating melodies for Chinese traditional music, especially in capturing modal characteristics and emotional expression. To address these issues, we propose a new architecture, the Dual-Feature Modeling Module, which integrates the long-range dependency modeling of the Mamba Block with the global structure capturing capabilities of the Transformer Block. |
JIATAO CHEN et. al. | arxiv-cs.SD | 2024-09-04 |
| 904 | Applications and Advances of Artificial Intelligence in Music Generation: A Review IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, artificial intelligence (AI) has made significant progress in the field of music generation, driving innovation in music creation and applications. This paper … |
Yanxu Chen; Linshu Huang; Tian Gou; | ArXiv | 2024-09-03 |
| 905 | A Progressive-Adaptive Music Generator (PAMG): An Approach to Interactive Procedural Music for Videogames Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Alvaro Eduardo Lopez Duarte; | El Farmaceutico | 2024-09-02 |
| 906 | Considerations and Concerns of Professional Game Composers Regarding Artificially Intelligent Music Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Artificially intelligent music technology (AIMT) is a promising field with great potential for creating innovation in music. However, the considerations and concerns surrounding … |
Kyle Worrall; Tom Collins; | IEEE Transactions on Games | 2024-09-01 |
| 907 | MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current techniques dedicated to symbolic music generation generally encounter two significant challenges: training data’s lack of information about chords and scales and the requirement of specially designed model architecture adapted to the unique format of symbolic music representation. In this paper, we solve the above problems by introducing new symbolic music representation with MusicLang chord analysis model. |
Jinlong Zhu; Keigo Sakurai; Ren Togo; Takahiro Ogawa; Miki Haseyama; | arxiv-cs.SD | 2024-09-01 |
| 908 | Challenge of Singing Voice Synthesis Using Only Text-To-Speech Corpus With FIRNet Source-Filter Neural Vocoder Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing voice synthesis (SVS) corpora are more costly to collect than TTS corpora. SVS using only a TTS corpus is challenging because the ranges of fundamental frequency ($f_o$) … |
T. Okamoto; Yamato Ohtani; Sota Shimizu; T. Toda; Hisashi Kawai; | Interspeech 2024 | 2024-09-01 |
| 909 | X-Singer: Code-Mixed Singing Voice Synthesis Via Cross-Lingual Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing voice synthesis (SVS) systems have exhibited a remarkable ability to synthesize natural singing voices. However, existing methods still depend on the phoneme annotation in … |
Ji-Sang Hwang; Hyeongrae Noh; Yoonseok Hong; Insoo Oh; | Interspeech 2024 | 2024-09-01 |
| 910 | Application Research of Short-Time Fourier Transform in Music Generation Based on The Parallel WaveGan System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the widespread use of Fourier transform (FT) networks and generative adversarial networks (GANs) in audio signal processing, their practical effectiveness in unsupervised … |
Jun Min; Zhiwei Gao; Lei Wang; Aihua Zhang; | IEEE Transactions on Industrial Informatics | 2024-09-01 |
| 911 | Improved-MUSIC Algorithm Based on Compressive Subspace Learning for Multiantenna Cognitive Radio Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recently, cognitive radio (CR) has become an effective method for wideband spectrum sensing (WBSS) based on sub-Nyquist sampling to enhance the sensing range as much as possible … |
Liqi Yang; Ronghua Qin; Hongying Tang; Hui Ma; | IEEE Transactions on Vehicular Technology | 2024-09-01 |
| 912 | FLUX That Plays Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper explores a simple extension of diffusion-based rectified flow Transformers for text-to-music generation, termed as FluxMusic. |
Zhengcong Fei; Mingyuan Fan; Changqian Yu; Junshi Huang; | arxiv-cs.SD | 2024-08-31 |
| 913 | Toward A More Complete OMR Solution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we focus on the MUSCIMA++ v2.0 dataset, which represents musical notation as a graph with pairwise relationships among detected music objects, and we consider both stages together. |
Guang Yang; Muru Zhang; Lin Qiu; Yanming Wan; Noah A. Smith; | arxiv-cs.CV | 2024-08-30 |
| 914 | Research on Music Teaching Systems Assisted By Artificial Intelligence Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In music teaching, the application of artificial intelligence has brought revolutionary changes to teaching and greatly improved the teaching quality and learning effect. The … |
Kuixing Yuan; | Int. J. e Collab. | 2024-08-29 |
| 915 | Music Grounding By Short Video Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to bridge the gap between the practical need for music moment localization and V2MR, we propose a new task termed Music Grounding by Short Video (MGSV). |
ZIJIE XIN et. al. | arxiv-cs.MM | 2024-08-29 |
| 916 | Do Recommender Systems Promote Local Music? A Reproducibility Study Using Music Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To assess the robustness of this study’s conclusions, we conduct a comparative analysis using proprietary listening data from a global music streaming service, which we publicly release alongside this paper. |
KRISTINA MATROSOVA et. al. | arxiv-cs.IR | 2024-08-29 |
| 917 | Transformers Meet ACT-R: Repeat-Aware and Sequential Listening Session Recommendation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This is a crucial limitation for music recommendation, as repeatedly listening to the same song over time is a common phenomenon that can even change the way users perceive this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system that overcomes this limitation. |
Viet-Anh Tran; Guillaume Salha-Galvan; Bruno Sguerra; Romain Hennequin; | arxiv-cs.IR | 2024-08-29 |
| 918 | Royalty Management By Using Blockchain Network: A Multiple Case Study Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The rise of music streaming services has led to significant challenges in royalty management, particularly affecting independent musicians who face poor royalty payments and a … |
Thalea Christy Nathaniela; Elfindah Princes; Gunawan Wang; | 2024 International Conference on Information Management and … | 2024-08-28 |
| 919 | Lyrically Speaking: Exploring The Link Between Lyrical Emotions, Themes and Depression Risk Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we examine online music consumption of individuals at risk of depression in light of lyrical themes and emotions. |
Pavani Chowdary; Bhavyajeet Singh; Rajat Agarwal; Vinoo Alluri; | arxiv-cs.IR | 2024-08-28 |
| 920 | Multimodal Music Datasets? Challenges and Future Goals in Music Processing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Anna-Maria Christodoulou; Olivier Lartillot; Alexander Refsum Jensenius; | Int. J. Multim. Inf. Retr. | 2024-08-28 |
| 921 | Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. |
LONGSHEN OU et. al. | arxiv-cs.SD | 2024-08-27 |
| 922 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We make our implementation, pre-processing scripts, trained models, and evaluation results publicly available to support further research and development. |
Elona Shatri; George Fazekas; | arxiv-cs.IR | 2024-08-27 |
| 923 | Foundation Models for Music: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The paper offers insights into future challenges and trends on FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm. |
YINGHAO MA et. al. | arxiv-cs.SD | 2024-08-26 |
| 924 | How Robots Influence Human Perception: Investigating The Role of Body Language and Music in Emotion Perception for Social HRI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Emotional dance is an engaging and stimulating multimodal social activity involving the display of both body language and music. An interesting area of research is in the … |
Nan Liang; G. Nejat; | 2024 33rd IEEE International Conference on Robot and Human … | 2024-08-26 |
| 925 | SONICS: Synthetic Or Not — Identifying Counterfeit Songs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah; | arxiv-cs.SD | 2024-08-26 |
| 926 | Scoring Synchronization Between Music and Motion: Local Vs Global Approaches Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper compares methods for scoring the synchronization between music and motion. A wide range of local and global methods such as the Gaussian-based method, relative phase, … |
Hamza Bayd; Patrice Guyot; Benoît G. Bardy; Pierre Slangen; | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 927 | Envelope Based Deep Source Separation and EEG Auditory Attention Decoding for Speech and Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study focuses on auditory attention decoding (AAD) for speech and music. We propose an envelope-based deep source separation strategy on a single microphone system, where the … |
M. A. Tanveer; Jesper Jensen; Zheng-Hua Tan; Jan Østergaard; | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 928 | LyCon: Lyrics Reconstruction from The Bag-of-Words Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our study introduces a novel method for generating copyright-free lyrics from publicly available Bag-of-Words (BoW) datasets, which contain the vocabulary of lyrics but not the lyrics themselves. |
Haven Kim; Kahyun Choi; | arxiv-cs.CL | 2024-08-26 |
| 929 | Research on The Application of Intelligent Algorithms in The Automation of Music Generation and Composition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study explores recent advances in intelligent algorithms for the automation of music generation and arrangement, with a particular focus on the potential applications of … |
Hanxiao Ye; | 2024 International Conference on Computers, Information … | 2024-08-26 |
| 930 | Teaching Indian Classical Music Using Web-Based Interactive Platform and Real-Time Audio Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music education has traditionally relied on theoretical instruction and sheet music. However, integrating real-time audio analysis and interactive learning tools introduces a … |
ASHWIN P JOBY et. al. | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 931 | SONICS: Synthetic Or Not – Identifying Counterfeit Songs IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The recent surge in AI-generated songs presents exciting possibilities and challenges. While these inventions democratize music creation, they also necessitate the ability to … |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; S. Fattah; | ArXiv | 2024-08-26 |
| 932 | Hybrid Music Recommendation with Graph Neural Networks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Modern music streaming services rely on recommender systems to help users navigate within their large collections. Collaborative filtering (CF) methods, that leverage past … |
Matej Bevec; Marko Tkalcic; Matevž Pesek; | User Model. User Adapt. Interact. | 2024-08-24 |
| 933 | Fairness of Large Music Models: From A Culturally Diverse Perspective Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper explores the fairness of large music models in generating culturally diverse musical compositions, emphasizing the need for inclusivity and equity in AI-generated … |
Qinyuan Wang; Bruce Gu; He Zhang; Yunfeng Li; | 2024 IEEE 9th International Conference on Data Science in … | 2024-08-23 |
| 934 | Charting The Universe of Metal Music Lyrics and Analyzing Their Relation to Perceived Audio Hardness Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We analyze the relationship between the musical and the lyrical content of metal music by combining automated audio feature extraction and quantitative text analysis on a corpus … |
Isabella Czedik-Eysenberg; Oliver Wieczorek; Arthur Flexer; Christoph Reuter; | Trans. Int. Soc. Music. Inf. Retr. | 2024-08-22 |
| 935 | Melody Predominates Over Harmony in The Evolution of Musical Scales Across 96 Countries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While recent analyses provide mixed support for a role of melody as well as harmony, we lack a comparative analysis based on cross-cultural data. We address this longstanding problem through a rigorous computational comparison of the main theories using 1,314 scales from 96 countries. |
John M McBride; Elizabeth Phillips; Patrick E Savage; Steven Brown; Tsvi Tlusty; | arxiv-cs.SD | 2024-08-22 |
| 936 | Information and Motor Constraints Shape Melodic Diversity Across Cultures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The motor constraint hypothesis accounts for certain similarities, such as scalar motion and contour shape, but not for other major common features, such as repetition, song length, and scale size. Here we investigate the role of information constraints in shaping these hallmarks of melodies. |
JOHN M MCBRIDE et. al. | arxiv-cs.SD | 2024-08-22 |
| 937 | Towards Estimating Personal Values in Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, as highly subjective text, song lyrics present a challenge in terms of sampling songs to be annotated, annotation methods, and in choosing a method for aggregation. In this project, we take a perspectivist approach, guided by social science theory, to gathering annotations, estimating their quality, and aggregating them. |
Andrew M. Demetriou; Jaehun Kim; Sandy Manolios; Cynthia C. S. Liem; | arxiv-cs.CL | 2024-08-22 |
| 938 | A Tighter Complexity Analysis of SparseGPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we improved the analysis of the running time of SparseGPT [Frantar, Alistarh ICML 2023] from $O(d^{3})$ to $O(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a})$ for any $a \in [0, 1]$, where $\omega$ is the exponent of matrix multiplication. |
Xiaoyu Li; Yingyu Liang; Zhenmei Shi; Zhao Song; | arxiv-cs.DS | 2024-08-22 |
| 939 | SentHYMNent: An Interpretable and Sentiment-Driven Model for Algorithmic Melody Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce two major novel elements: a nuanced mixture-based representation for musical sentiment, including a web tool to gather data, as well as a sentiment- and theory-driven harmonization model, SentHYMNent. |
STEPHEN HAHN et. al. | kdd | 2024-08-21 |
| 940 | Oh, Behave! Country Representation Dynamics Created By Feedback Loops in Music Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate the dynamics of representation of local (i.e., country-specific) and US-produced music in user profiles and recommendations. |
Oleg Lesota; Jonas Geiger; Max Walder; Dominik Kowald; Markus Schedl; | arxiv-cs.IR | 2024-08-21 |
| 941 | Rage Music Classification and Analysis Using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We compare methods of classification in the application of audio analysis with machine learning and identify optimal models. |
Akul Kumar; | arxiv-cs.SD | 2024-08-20 |
| 942 | Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel task called Text-to-Song synthesis which incorporates both vocal and accompaniment generation. |
ZHIQING HONG et. al. | acl | 2024-08-20 |
| 943 | Moûsai: Efficient Text-to-Music Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. |
Flavio Schneider; Ojasv Kamal; Zhijing Jin; Bernhard Schölkopf; | acl | 2024-08-20 |
| 944 | DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To fill the gap, we propose DisMix, a generative framework in which the pitch and timbre representations act as modular building blocks for constructing the melody and instrument of a source, and the collection of which forms a set of per-instrument latent representations underlying the observed mixture. |
YIN-JYUN LUO et. al. | arxiv-cs.SD | 2024-08-20 |
| 945 | Rhyme-aware Chinese Lyric Generator Based on GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
| 946 | The Evolution of Inharmonicity and Noisiness in Contemporary Popular Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we use modified MPEG-7 features to explore and characterise the evolution of noise and inharmonicity in popular music since 1961. |
Emmanuel Deruty; David Meredith; Stefan Lattner; | arxiv-cs.SD | 2024-08-15 |
| 947 | A Theory-Based Explainable Deep Learning Architecture for Music Emotion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper develops a theory-based, explainable deep learning convolutional neural network classifier to predict the time-varying emotional response to music. … |
H. Fong; Vineet Kumar; K. Sudhir; | Mark. Sci. | 2024-08-13 |
| 948 | A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With a larger corpus of Schenkerian data, it may be possible to infuse machine learning models with a deeper understanding of musical structure, thus leading to more human results. To encourage further research in Schenkerian analysis and its potential benefits for music informatics and generation, this paper presents three main contributions: 1) a new and growing dataset of SchAs, the largest in human- and computer-readable formats to date (>140 excerpts), 2) a novel software for visualization and collection of SchA data, and 3) a novel, flexible representation of SchA as a heterogeneous-edge graph data structure. |
STEPHEN NI-HAHN et. al. | arxiv-cs.SD | 2024-08-13 |
| 949 | Surveying More Than Two Decades of Music Information Retrieval Research on Playlists Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we present an extensive survey of music information retrieval (MIR) research into music playlists. Our survey spans more than 20 years, and includes around 300 … |
Giovanni Gabbolini; Derek Bridge; | ACM Trans. Intell. Syst. Technol. | 2024-08-12 |
| 950 | Tactile Melodies: A Desk-Mounted Haptics for Perceiving Musical Experiences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel interface for experiencing music through haptic impulses to the palm of the hand. |
Raj Varshith Moora; Gowdham Prabhakar; | arxiv-cs.HC | 2024-08-12 |
| 951 | The Algorithmic Nature of Song-sequencing: Statistical Regularities in Music Albums Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on a review of anecdotal beliefs, we explored patterns of track-sequencing within professional music albums. |
Pedro Neto; Martin Hartmann; Geoff Luck; Petri Toiviainen; | arxiv-cs.MM | 2024-08-08 |
| 952 | Quantifying The Corpus Bias Problem in Automatic Music Transcription Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We identify two primary sources of distribution shift: the music, and the sound. |
Lukáš Samuel Marták; Patricia Hu; Gerhard Widmer; | arxiv-cs.SD | 2024-08-08 |
| 953 | The Billboard Melodic Music Dataset (BiMMuDa) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce the Billboard Melodic Music Dataset (BiMMuDa), which contains the lead vocal melodies of the top five songs of each year from 1950 to 2022 according to the Billboard … |
Madeline Hamilton; Ana Clemente; Edward T. R. Hall; Marcus Pearce; | Trans. Int. Soc. Music. Inf. Retr. | 2024-08-07 |
| 954 | InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. |
BING HAN et. al. | ijcai | 2024-08-03 |
| 955 | MusicMagus: Zero-Shot Text-to-Music Editing Via Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the task of editing these generated music remains a significant challenge. This paper introduces a novel approach to edit music generated by such models, enabling the modification of specific attributes, such as genre, mood, and instrument, while maintaining other aspects unchanged. |
YIXIAO ZHANG et. al. | ijcai | 2024-08-03 |
| 956 | Retrieval Guided Music Captioning Via Multimodal Prefixes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we put forward a new approach to music captioning, the task of automatically generating natural language descriptions for songs. |
Nikita Srivatsan; Ke Chen; Shlomo Dubnov; Taylor Berg-Kirkpatrick; | ijcai | 2024-08-03 |
| 957 | XAI-Lyricist: Improving The Singability of AI-Generated Lyrics with Prosody Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents XAI-Lyricist, leveraging musical prosody to guide LMs in generating singable lyrics and providing human-understandable singability explanations. |
Qihao Liang; Xichu Ma; Finale Doshi-Velez; Brian Lim; Ye Wang; | ijcai | 2024-08-03 |
| 958 | MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in The Field of Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music. |
ZIHAO WANG et. al. | ijcai | 2024-08-03 |
| 959 | Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Current lyric-to-melody generation methods struggle with the lack of paired lyric-melody data to train, and the lack of adherence to composition guidelines, resulting in melodies that do not sound human-composed. To address these issues, we propose a novel paradigm called Re-creation of Creations (ROC) that combines the strengths of both rule-based and neural-based methods. |
Ang Lv; Xu Tan; Tao Qin; Tie-Yan Liu; Rui Yan; | ijcai | 2024-08-03 |
| 960 | Generating High-quality Symbolic Music Using Fine-grained Discriminators Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose to decouple the melody and rhythm from music, and design corresponding fine-grained discriminators to tackle the aforementioned issues. |
ZHEDONG ZHANG et. al. | arxiv-cs.SD | 2024-08-03 |
| 961 | Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing Via Content-based Controls IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While Large Language Models (LLMs) have shown promise in generating high-quality music, their focus on autoregressive generation limits their utility in music editing tasks. To bridge this gap, we propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme. |
Liwei Lin; Gus Xia; Yixiao Zhang; Junyan Jiang; | ijcai | 2024-08-03 |
| 962 | Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the Nested Music Transformer (NMT), an architecture tailored for decoding compound tokens autoregressively, similar to processing flattened tokens, but with low memory usage. |
Jiwoo Ryu; Hao-Wen Dong; Jongmin Jung; Dasaem Jeong; | arxiv-cs.SD | 2024-08-02 |
| 963 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, their evaluation poses considerable challenges, and it remains unclear how to effectively assess their ability to correctly interpret music-related inputs with current methods. Motivated by this, we introduce MuChoMusic, a benchmark for evaluating music understanding in multimodal language models focused on audio. |
BENNO WECK et. al. | arxiv-cs.SD | 2024-08-02 |
| 964 | PiCoGen2: Piano Cover Generation with Transfer Learning Approach and Weakly Aligned Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This would, however, result in the loss of piano information and accordingly cause inconsistencies between the original and remapped piano versions. To overcome this limitation, we propose a transfer learning approach that pre-trains our model on piano-only data and fine-tunes it on weakly-aligned paired data constructed without note remapping. |
Chih-Pin Tan; Hsin Ai; Yi-Hsin Chang; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-08-02 |
| 965 | Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a project that revives a piece of 15th-century Korean court music, Chihwapyeong and Chwipunghyeong, composed upon the poem Songs of the Dragon Flying to Heaven. |
DANBINAERIN HAN et. al. | arxiv-cs.SD | 2024-08-02 |
| 966 | METEOR: Melody-aware Texture-controllable Symbolic Music Re-Orchestration Via Transformer VAE Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Re-orchestration is the process of adapting a music piece for a different set of instruments. By altering the original instrumentation, the orchestrator often modifies the musical … |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; | International Joint Conference on Artificial Intelligence | 2024-08-01 |
| 967 | Research and Application of A Dual Filtering Music Hybrid Recommendation Model Based on Cat Boost Algorithm and DCN Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the increase of Internet users, the traditional music recommendation model can not meet the increasing personalized needs of users. The single deep cross network model has … |
Juncai Hou; | Scalable Comput. Pract. Exp. | 2024-08-01 |
| 968 | ChordSync: Conformer-Based Alignment of Chord Annotations to Music Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce ChordSync, a novel conformer-based model designed to seamlessly align chord annotations with audio, eliminating the need for weak alignment. |
Andrea Poltronieri; Valentina Presutti; Martín Rocamora; | arxiv-cs.SD | 2024-08-01 |
| 969 | Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Through our baseline, we illustrate how building on top of past research can offer alternatives for music difficulty assessment which are explainable and interpretable. With this, we aim to promote a more effective communication between the Music Information Retrieval (MIR) community and the music education one. |
Pedro Ramoneda; Vsevolod Eremenko; Alexandre D’Hooge; Emilia Parada-Cabaleiro; Xavier Serra; | arxiv-cs.SD | 2024-08-01 |
| 970 | Can LLMs Reason in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain including understanding and generation. Yet scant research explores the details of how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step reasoning perspective, which is a critical aspect in the conditioned, editable, and interactive human-computer co-creation process. |
ZIYA ZHOU et. al. | arxiv-cs.SD | 2024-07-31 |
| 971 | Lyrics Transcription for Humans: A Readability-Aware Benchmark Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Writing down lyrics for human consumption involves not only accurately capturing word sequences, but also incorporating punctuation and formatting for clarity and to convey … |
Ondrej Cífka; Hendrik Schreiber; Luke Miner; Fabian-Robert Stöter; | ArXiv | 2024-07-30 |
| 972 | PiCoGen: Generate Piano Covers with A Two-stage Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cover song generation stands out as a popular way of music making in the music-creative community. In this study, we introduce Piano Cover Generation (PiCoGen), a two-stage … |
Chih-Pin Tan; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-30 |
| 973 | Emotion-Driven Melody Harmonization Via Melodic Variation and Functional Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel functional representation for symbolic music. |
Jingyue Huang; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-29 |
| 974 | Futga: Towards Fine-grained Music Understanding Through Temporally-enhanced Generative Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing music captioning methods are limited to generating concise global descriptions of short music clips, which fail to capture fine-grained musical characteristics and time-aware musical changes. To address these limitations, we propose FUTGA, a model equipped with fine-grained music understanding capabilities through learning from generative augmentation with temporal compositions. |
JUNDA WU et. al. | arxiv-cs.SD | 2024-07-29 |
| 975 | Towards Music Instrument Classification Using Convolutional Neural Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recognizing musical instruments from an audio signal is a challenging yet valuable endeavor within the realm of music study. The recognition and classification of musical … |
Paul Tiemeijer; Mahyar Shahsavari; Mahmood Fazlali; | 2024 IEEE International Conference on Omni-layer … | 2024-07-29 |
| 976 | Start from Video-Music Retrieval: An Inter-Intra Modal Loss for Cross Modal Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Experiments also show that II loss improves various self-supervised and supervised uni-modal and cross-modal retrieval tasks, and can obtain good retrieval models with a small amount of training samples. |
ZEYU CHEN et. al. | arxiv-cs.MM | 2024-07-28 |
| 977 | Exploring Genre and Success Classification Through Song Lyrics Using DistilBERT: A Fun NLP Venture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a natural language processing (NLP) approach to the problem of thoroughly comprehending song lyrics, with particular attention on genre classification, … |
Servando Pizarro Martinez; Moritz Zimmermann; Miguel Serkan Offermann; Florian Reither; | ArXiv | 2024-07-28 |
| 978 | Automatic Detection of Moral Values in Music Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. |
Vjosa Preniqi; Iacopo Ghinassi; Julia Ive; Kyriaki Kalimeri; Charalampos Saitis; | arxiv-cs.CY | 2024-07-26 |
| 979 | Playlists and Genre: The Role of Music Genre in Spotify’s Playlists Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Purpose: Genre is a valuable access point for popular music collections; however, the blurring of genre boundaries combined with changing listening habits and new forms of … |
Callum McDonald; A. Foster; Pauline Rafferty; | J. Documentation | 2024-07-26 |
| 980 | Simulation of Neural Responses to Classical Music Using Organoid Intelligence Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Hence, we present the PyOrganoid library, an innovative tool that facilitates the simulation of organoid learning models, integrating sophisticated machine learning techniques with biologically inspired organoid simulations. |
Daniel Szelogowski; | arxiv-cs.NE | 2024-07-25 |
| 981 | Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, editing music audios remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. |
FANG-DUO TSAI et. al. | arxiv-cs.SD | 2024-07-23 |
| 982 | Enhancing Mental Health With PHILOI: A Comprehensive Analysis of Mood Music and Chatbot Module Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The project aim was to develop an app that would enable the recording and monitoring of behaviour related to specific aspects of wellness, as well as support those aspects of … |
Kavita Kumavat; D. Gatagat; K. Wakode; S. Gundawar; V. Jain; | EAI Endorsed Trans. Mob. Commun. Appl. | 2024-07-22 |
| 983 | Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A relevant discussion and related technical challenge is the potential replication and plagiarism of the training set in AI-generated music, which could lead to misuse of data and intellectual property rights violations. To tackle this issue, we present the Music Replication Assessment (MiRA) tool: a model-independent open evaluation method based on diverse audio music similarity metrics to assess data replication. |
Roser Batlle-Roca; Wei-Hisang Liao; Xavier Serra; Yuki Mitsufuji; Emilia Gómez; | arxiv-cs.SD | 2024-07-19 |
| 984 | Reducing Barriers to The Use of Marginalised Music Genres in AI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: XAI opportunities identified included topics of improving transparency and control of AI models, explaining the ethics and bias of AI models, fine tuning large models with small datasets to reduce bias, and explaining style-transfer opportunities with AI models. Participants in the research emphasised that whilst it is hard to work with small datasets such as marginalised music and AI, such approaches strengthen cultural representation of underrepresented cultures and contribute to addressing issues of bias of deep learning models. |
Nick Bryan-Kinns; Zijin Li; | arxiv-cs.SD | 2024-07-18 |
| 985 | LYRICEL: A Rule-Augmented Artificial Intelligence-Empowered Cultural E-Learning with GPT and Machine Learning for Song Lyrics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we present LYRICEL, an advanced AI-enhanced system that combines a rule-based decision-making mechanism, OpenAI’s application programming interface (API), and … |
DIMITRIOS P. PANAGOULIAS et. al. | 2024 15th International Conference on Information, … | 2024-07-17 |
| 986 | A New Model of Vocal Music Teaching in The Context of Internet Distance Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As Internet technology evolves, distance learning emerges as a pivotal mode of education. In music education, vocal teaching faces limitations in traditional face-to-face methods. … |
Xiaochen Zhang; Junkai Zhang; | Int. J. Web Based Learn. Teach. Technol. | 2024-07-17 |
| 987 | GraphMuse: A Library for Symbolic Music Graph Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Graph Neural Networks (GNNs) have recently gained traction in symbolic music tasks, yet a lack of a unified framework impedes progress. Addressing this gap, we present GraphMuse, a graph processing framework and library that facilitates efficient music graph processing and GNN training for symbolic music tasks. |
Emmanouil Karystinaios; Gerhard Widmer; | arxiv-cs.SD | 2024-07-17 |
| 988 | Audio Conditioning for Music Generation Via Discrete Bottleneck Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For the second model we train a music language model from scratch jointly with a text conditioner and a quantized audio feature extractor. |
Simon Rouard; Yossi Adi; Jade Copet; Axel Roebel; Alexandre Défossez; | arxiv-cs.SD | 2024-07-17 |
| 989 | The Effects of Selected Preferred Music on Perceived Emotions Through Audiovisual Stimuli Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music surrounds us, and there is no denying that music in visual media can shape and evoke emotions. Yet, understanding how musical preference influences emotions through audio … |
Thunrada Thaiwong; Makoto Fukumoto; | 2024 IEEE/ACIS 9th International Conference on Big Data, … | 2024-07-16 |
| 990 | Popular Hooks: A Multimodal Dataset of Musical Hooks for Music Understanding and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Internet is rich in unimodal music data, available in either symbolic or audio representations. However, there is a notable scarcity of multimodal music datasets that offer … |
Xinda Wu; Jiaming Wang; Jiaxing Yu; Tieyao Zhang; Kejun Zhang; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 991 | RevNet: A Review Network with Group Aggregation Fusion for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing melody extraction (SME) is a critical task in the field of music information retrieval (MIR). Recently, deep learning based methods have achieved remarkable successes for … |
Shuai Yu; Xiaoliang He; Yanting Zhang; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 992 | Multitrack Emotion-Based Music Generation Network Using Continuous Symbolic Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Musicians need an efficient composition process to yield a multitude of musical pieces. To enhance the artistry and emotional relevance of AI-generated music, a novel Multi-Track … |
DONGHUI ZHANG et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 993 | Harmonic Frequency-Separable Transformer for Instrument-Agnostic Music Transcription Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatic Music Transcription (AMT) aims to convert music audio into symbolic representations. Recently, transformer-based methods have been successfully applied to … |
YULUN WU et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 994 | A User-Guided Generation Framework for Personalized Music Synthesis Using Interactive Evolutionary Computation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The development of generative artificial intelligence (AI) has demonstrated notable advancements in the domain of music synthesis. However, a perceived lack of creativity in the … |
Yanan Wang; Yan Pei; Zerui Ma; Jianqiang Li; | GECCO Companion | 2024-07-14 |
| 995 | Striking The Right Chord: A Comprehensive Approach to Amazon Music Search Spell Correction Highlight: In this work, we build a multi-stage framework for spell correction solution for music, media and named entity heavy search engines. |
Siddharth Sharma; Shiyun Yang; Ajinkya Walimbe; Tarun Sharma; Joaquin Delgado; | SIGIR | 2024-07-14 |
| 996 | The Interpretation Gap in Text-to-Music Generation Models Highlight: In this paper, we propose a framework to describe the musical interaction process, which includes expression, interpretation, and execution of controls. |
Yongyi Zang; Yixiao Zhang; | arxiv-cs.SD | 2024-07-14 |
| 997 | A Preliminary Investigation on Flexible Singing Voice Synthesis Through Decomposed Framework with Inferrable Features Highlight: As collecting large singing datasets labeled with music scores is an expensive task, we investigate an alternative approach by decomposing the SVS system and inferring different singing voice features. |
Lester Phillip Violeta; Taketo Akama; | arxiv-cs.SD | 2024-07-12 |
| 998 | From Real to Cloned Singer Identification Highlight: As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. |
Dorian Desblancs; Gabriel Meseguer-Brocal; Romain Hennequin; Manuel Moussallam; | arxiv-cs.SD | 2024-07-11 |
| 999 | Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale Adversarial Pre-training Highlight: This bias often arises when masked tokens cannot be inferred from their context, forcing the model to overfit the training set instead of generalizing. To address this challenge, we propose Adversarial-MidiBERT for SMU, which adaptively determines what to mask during MLM via a masker network, rather than employing random masking. |
Zijian Zhao; | arxiv-cs.SD | 2024-07-11 |
| 1000 | Music Genre Classification Using Contrastive Dissimilarity Abstract: In the digital age, streaming platforms have revolutionized how we access and interact with music, highlighting the need for more intuitive ways to organize and categorize our … |
Gabriel Henrique Costanzi; Lucas O. Teixeira; G. Felipe; George D. C. Cavalcanti; Yandre M. G. Costa; | 2024 31st International Conference on Systems, Signals and … | 2024-07-09 |